bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[AArch64] Fix a silent codegen fault in BUILD_VECTOR lowering.	James Molloy	2014-10-17	2	-9/+31
	We should be talking about the number of source elements, not the number of destination elements, given we know at this point that the source and dest element numbers are not the same. While we're at it, avoid writing to std::vector::end()... Bug found with random testing and a lot of coffee. llvm-svn: 220051
*	Don't crash if find_executable returns None.	Rafael Espindola	2014-10-17	1	-4/+4
	This was crashing when trying to run the tests on Windows. llvm-svn: 220048
*	[PowerPC] Enable use of lxvw4x/stxvw4x in VSX code generation	Bill Schmidt	2014-10-17	9	-12/+157
	Currently the VSX support enables use of lxvd2x and stxvd2x for 2x64 types, but does not yet use lxvw4x and stxvw4x for 4x32 types. This patch adds that support. As with lxvd2x/stxvd2x, this involves straightforward overriding of the patterns normally recognized for lvx/stvx, with preference given to the VSX patterns when VSX is enabled. In addition, the logic for permitting misaligned memory accesses is modified so that v4f32 and v4i32 are treated the same as v2f64 and v2i64 when VSX is enabled. Finally, the DAG generation for unaligned loads is changed to just use a normal LOAD (which will become lxvw4x) on P8 and later hardware, where unaligned loads are preferred over lvsl/lvx/lvx/vperm. A number of tests now generate the VSX loads/stores instead of lvx/stvx, so this patch adds VSX variants to those tests. I've also added <4 x float> tests to the vsx.ll test case, and created a vsx-p8.ll test case to be used for testing code generation for the P8Vector feature. For now, that simply tests the unaligned load/store behavior. This has been tested along with a temporary patch to enable the VSX and P8Vector features, with no new regressions encountered with or without the temporary patch applied. llvm-svn: 220047
*	Mips: Only set divrem i64 to custom on 64bit	Jan Vesely	2014-10-17	1	-2/+2
	Reviewed-by: Daniel Sanders <daniel.sanders@imgtec.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 220046
*	R600: Add EG to FMA test	Jan Vesely	2014-10-17	1	-1/+14
	Reviewed-by: Tom Stellard <tom@stellard.net> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 220045
*	SelectionDAG: Add sext_inreg optimizations	Jan Vesely	2014-10-17	2	-0/+48
	v2: use dyn_cast; fix up comments. v3: use cast. Reviewed-by: Matt Arsenault <arsenm2@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 220044
*	[mips] Add support for COP1's Branch-On-Cond-Likely instructions	Vasileios Kalintiris	2014-10-17	24	-16/+104
	Summary: Depends on D5782 Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5802 llvm-svn: 220042
*	[mips] Add support for COP0's Branch-On-Cond-Likely instructions	Vasileios Kalintiris	2014-10-17	16	-30/+217
	Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5782 llvm-svn: 220036
*	[DSE] Remove no-data-layout-only type-based overlap checking	Hal Finkel	2014-10-17	5	-20/+22
	DSE's overlap checking contained special logic, used only when no DataLayout was available, which inferred a complete overwrite when the pointee types were equal. This logic seems fine for regular loads/stores, but does not work for memcpy and friends. Instead of fixing this, I'm just removing it. Philosophically, transformations should not contain enhanced behavior used only when data layout is lacking (data layout should be strictly additive), and maintaining these rarely-tested code paths seems not worthwhile at this stage. Credit to Aliaksei Zasenka for the bug report and the diagnosis. The test case (slightly reduced from that provided by Aliaksei) replaces the original contents of test/Transforms/DeadStoreElimination/no-targetdata.ll -- a few other tests have been updated to have a data layout. llvm-svn: 220035
*	Fix bashism in build.sh.	Peter Collingbourne	2014-10-17	1	-1/+1
	llvm-svn: 220027
*	Add back commits r219835 and a fixed version of r219829.	Rafael Espindola	2014-10-17	8	-51/+193
	The only difference from r219829 is using getOrCreateSectionSymbol(*ELFSec) instead of GetOrCreateSymbol(ELFSec->getSectionName()) in ELFObjectWriter which causes us to use the correct section symbol even if we have multiple sections with the same name. Original messages: r219829: Correctly handle references to section symbols. When processing assembly like .long .text we were creating a new undefined symbol .text. GAS on the other hand would handle that as a reference to the .text section. This patch implements that by creating the section symbols earlier so that they are visible during asm parsing. The patch also updates llvm-readobj to print the symbol number in the relocation dump so that the test can differentiate between two sections with the same name. r219835: Allow forward references to section symbols. llvm-svn: 220021
*	[PPC] Adjust some PowerPC tests to account for presence/absence of VSX	Bill Schmidt	2014-10-17	19	-37/+243
	Patch by Bill Seurer; committed on his behalf. These test cases generate slightly different code sequences when VSX is activated and thus fail. The update turns off VSX explicitly for the existing checks and then adds a second set of checks for most of them that test the VSX instruction output. llvm-svn: 220019
*	Add a test that would have found the bug in r219829.	Rafael Espindola	2014-10-17	2	-1/+33
	llvm-svn: 220016
*	ARM: Fix a bug which was causing convergence failure in constant-island pass.	Akira Hatanaka	2014-10-17	2	-1/+401
	The bug is in ARMConstantIslands::createNewWater where the upper bound of the new water split point is computed:
	    // This could point off the end of the block if we've already got constant
	    // pool entries following this block; only the last one is in the water list.
	    // Back past any possible branches (allow for a conditional and a maximally
	    // long unconditional).
	    if (BaseInsertOffset + 8 >= UserBBI.postOffset()) {
	      BaseInsertOffset = UserBBI.postOffset() - UPad - 8;
	      DEBUG(dbgs() << format("Move inside block: %#x\n", BaseInsertOffset));
	    }
	The split point is supposed to be somewhere between the machine instruction that loads from the constant pool entry and the end of the basic block, before branch instructions. The code above is fine if the basic block is large enough and there are a sufficient number of instructions following the machine instruction. However, if the machine instruction is near the end of the basic block, BaseInsertOffset can point to the machine instruction or another instruction that precedes it, and this can lead to convergence failure. This commit fixes this bug by ensuring BaseInsertOffset is larger than the offset of the instruction following the constant-loading instruction. rdar://problem/18581150 llvm-svn: 220015
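
The clamp described in this message can be sketched in isolation. This is a minimal illustration under stated assumptions, not the code from r220015: `chooseSplitOffset` and all of its parameter names are hypothetical, offsets are plain byte counts, and the block is assumed to be at least `uPad + 8` bytes long so the subtraction cannot wrap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM function): pick a split offset that is
// strictly greater than the offset of the instruction following the
// constant-pool user.
//   baseInsertOffset : initial candidate split offset
//   blockEndOffset   : offset just past the end of the user's basic block
//   uPad             : worst-case padding reserved before the new island
//   userOffset/Size  : offset and size of the constant-loading instruction
std::uint32_t chooseSplitOffset(std::uint32_t baseInsertOffset,
                                std::uint32_t blockEndOffset,
                                std::uint32_t uPad,
                                std::uint32_t userOffset,
                                std::uint32_t userSize) {
  assert(blockEndOffset >= uPad + 8 && "sketch assumes no unsigned wrap");
  if (baseInsertOffset + 8 >= blockEndOffset) {
    // Back past a conditional branch and a maximally long unconditional one.
    const std::uint32_t backedOff = blockEndOffset - uPad - 8;
    // Offset of the instruction that follows the constant-loading one.
    const std::uint32_t afterUser = userOffset + userSize;
    // Without this lower bound the candidate could land on the user itself
    // or on an instruction before it.
    baseInsertOffset = std::max(backedOff, afterUser + 1);
  }
  return baseInsertOffset;
}
```

With a 200-byte block, 4 bytes of padding, and a 4-byte user at offset 192, backing off alone would give 188, which precedes the user; the lower bound moves the candidate to 197.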
*	Revert commit r219835 and r219829.	Rafael Espindola	2014-10-17	9	-195/+52
	Revert "Correctly handle references to section symbols." Revert "Allow forward references to section symbols." Rui found a regression I am debugging. llvm-svn: 220010
*	[OCaml] Add Llvm.instr_clone.	Peter Zotov	2014-10-17	4	-0/+31
	llvm-svn: 220008
*	[LLVM-C] Add LLVMInstructionClone.	Peter Zotov	2014-10-17	2	-0/+16
	llvm-svn: 220007
*	[llvm-symbolizer] Introduce the -dsym-hint option.	Alexander Potapenko	2014-10-17	12	-50/+177
	llvm-symbolizer will consult one of the .dSYM paths passed via -dsym-hint if it fails to find the .dSYM bundle at the default location. llvm-svn: 220004
*	R600/SI: Simplify debug printing	Matt Arsenault	2014-10-17	1	-5/+3
	llvm-svn: 219999
*	Add our own copy of the find_executable function to cope with installations	Peter Collingbourne	2014-10-16	1	-2/+20
	that do not have the distutils.spawn package. Should hopefully fix the aarch64 buildbot. llvm-svn: 219991
*	R600/SI: Remove another VALU pattern	Matt Arsenault	2014-10-16	1	-5/+0
	llvm-svn: 219988
*	Initial version of Go bindings.	Peter Collingbourne	2014-10-16	45	-6/+4426
	This code is based on the existing LLVM Go bindings project hosted at: https://github.com/go-llvm/llvm Note that all contributors to the gollvm project have agreed to relicense their changes under the LLVM license and submit them to the LLVM project. Differential Revision: http://reviews.llvm.org/D5684 llvm-svn: 219976
*	Introduce LLVMParseCommandLineOptions C API function.	Peter Collingbourne	2014-10-16	2	-0/+17
	llvm-svn: 219975
*	Reduce code duplication between patchpoint and non-patchpoint lowering. NFC.	Juergen Ributzka	2014-10-16	2	-44/+58
	This is in preparation for another patch that makes patchpoints invokable. Reviewers: atrick, ributzka Reviewed By: ributzka Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5657 llvm-svn: 219967
*	[SROA] Switch the common variable name for the 'AllocaSlices' class to	Chandler Carruth	2014-10-16	1	-40/+42
	'AS'. Using 'S' as this was a terrible idea. Arguably, 'AS' is not much better, but it at least follows the idea of using initialisms and removes active confusion about the AllocaSlices variable and a Slice variable. llvm-svn: 219963
*	[SROA] More range-based cleanups to SROA, these brought to you by	Chandler Carruth	2014-10-16	1	-25/+12
	clang-modernize. I did have to clean up the variable types and whitespace a bit because the use of auto made the code much less readable here. llvm-svn: 219962
*	[SROA] Switch a couple of overly complex iterator accessors to just be	Chandler Carruth	2014-10-16	1	-26/+10
	ArrayRef accessors. I think this even came up in review that this was over-engineered, and indeed it was. Time to un-build it. llvm-svn: 219958
*	Erase fence insertion from SelectionDAGBuilder.cpp (NFC)	Robin Morisset	2014-10-16	6	-83/+75
	Summary: Backends can use setInsertFencesForAtomic to signal to the middle-end that monotonic is the only memory ordering they can accept for stores/loads/rmws/cmpxchg. The code lowering those accesses with a stronger ordering to fences + monotonic accesses is currently living in SelectionDAGBuilder.cpp. In this patch I propose moving this logic out of it for several reasons:
	- There is lots of redundancy to avoid: extremely similar logic already exists in AtomicExpand.
	- The current code in SelectionDAGBuilder does not use any target-hooks, it does the same transformation for every backend that requires it.
	- As a result it is plain unsound, as it was apparently designed for ARM. It happens to mostly work for the other targets because they are extremely conservative, but Power for example had to switch to AtomicExpand to be able to use lwsync safely (see r218331).
	- Because it produces IR-level fences, it cannot be made sound! This is noted in the C++11 standard (section 29.3, page 1140):
	```
	Fences cannot, in general, be used to restore sequential consistency for atomic operations with weaker ordering semantics.
	```
	It can also be seen by the following example (called IRIW in the literature):
	```
	atomic<int> x = y = 0;
	int r1, r2, r3, r4;
	Thread 0: x.store(1);
	Thread 1: y.store(1);
	Thread 2: r1 = x.load(); r2 = y.load();
	Thread 3: r3 = y.load(); r4 = x.load();
	```
	r1 = r3 = 1 and r2 = r4 = 0 is impossible as long as the accesses are all seq_cst. But if they are lowered to monotonic accesses, no amount of fences can prevent it.
	This patch does three things (I could cut it into parts, but then some of them would not be tested/testable, please tell me if you would prefer that):
	- it provides a default implementation for emitLeadingFence/emitTrailingFence in terms of IR-level fences, that mimic the original logic of SelectionDAGBuilder. As we saw above, this is unsound, but the best that can be done without knowing the targets well (and there is a comment warning about this risk).
	- it then switches Mips/Sparc/XCore to use AtomicExpand, relying on this default implementation (that exactly replicates the logic of SelectionDAGBuilder, so no functional change)
	- it finally erases this logic from SelectionDAGBuilder as it is dead-code.
	Ideally, each target would define its own override for emitLeading/TrailingFence using target-specific fences, but I do not know the Sparc/Mips/XCore memory model well enough to do this, and they appear to be dealing fine with the ARM-inspired default expansion for now (probably because they are overly conservative, as Power was). If anyone wants to compile fences more aggressively on these platforms, the long comment should make it clear why they should first override emitLeading/TrailingFence. Test Plan: make check-all, no functional change Reviewers: jfb, t.p.northover Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D5474 llvm-svn: 219957
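
The claim that the IRIW outcome is impossible under seq_cst can be checked by brute force. The sketch below is an illustration only and does not come from the patch: it models sequential consistency as a single total order over the six operations that respects each thread's program order, enumerates every such order, and reports whether r1 = r3 = 1 with r2 = r4 = 0 ever occurs. The function name is hypothetical, and no real threads or atomics are involved.

```cpp
#include <algorithm>
#include <array>

// Operation ids, matching the four threads of the IRIW example:
//   0: Thread 0  x.store(1)      1: Thread 1  y.store(1)
//   2: Thread 2  r1 = x.load()   3: Thread 2  r2 = y.load()
//   4: Thread 3  r3 = y.load()   5: Thread 3  r4 = x.load()
// Returns true if some total order of the six operations that respects
// program order yields r1 = r3 = 1 and r2 = r4 = 0.
bool forbiddenOutcomeReachableUnderSC() {
  std::array<int, 6> order = {0, 1, 2, 3, 4, 5};  // sorted: first permutation
  do {
    auto pos = [&order](int op) {
      return std::find(order.begin(), order.end(), op) - order.begin();
    };
    // Program order: Thread 2 loads x then y; Thread 3 loads y then x.
    if (pos(2) > pos(3) || pos(4) > pos(5)) continue;

    int x = 0, y = 0, r1 = 0, r2 = 0, r3 = 0, r4 = 0;
    for (int op : order) {
      switch (op) {
        case 0: x = 1; break;
        case 1: y = 1; break;
        case 2: r1 = x; break;
        case 3: r2 = y; break;
        case 4: r3 = y; break;
        case 5: r4 = x; break;
      }
    }
    if (r1 == 1 && r3 == 1 && r2 == 0 && r4 == 0) return true;
  } while (std::next_permutation(order.begin(), order.end()));
  return false;
}
```

The function returns false: Thread 2's result requires the store to x to precede the store to y in the total order, while Thread 3's result requires the opposite. Once the accesses are weakened to monotonic there is no longer a single order that all threads must agree on, which is the situation the message says fences cannot repair.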
*	R600/SI: Remove unnecessary VALU patterns	Matt Arsenault	2014-10-16	1	-41/+0
	These haven't been necessary since allowing selecting SALU instructions in non-entry blocks was enabled. llvm-svn: 219956
*	[SROA] Start more deeply moving SROA to use ranges rather than just	Chandler Carruth	2014-10-16	1	-45/+42
	iterators. There are a ton of places where it essentially wants ranges rather than just iterators. This is just the first step that adds the core slice range typedefs and uses them in a couple of places. I still have to explicitly construct them because they've not been punched throughout the entire set of code. More range-based cleanups incoming. llvm-svn: 219955
*	R600: Fix nonsensical implementation of computeKnownBits for BFE	Matt Arsenault	2014-10-16	2	-5/+16
	This was resulting in invalid simplifications of sdiv. llvm-svn: 219953
*	Delete -std-compile-opts.	Rafael Espindola	2014-10-16	18	-107/+43
	These days -std-compile-opts was just a silly alias for -O3. llvm-svn: 219951
*	Allow call-slot optimization for destinations with a suitable dereferenceable attribute	Bjorn Steinbrink	2014-10-16	2	-14/+45
	Summary: Currently, call slot optimization requires that if the destination is an argument, the argument has the sret attribute. This is to ensure that the memory access won't trap. In addition to sret, we can also allow the optimization to happen for arguments that have the new dereferenceable attribute, which gives the same guarantee. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5832 llvm-svn: 219950
*	Fix lang-ref doc bug: s/icmp lt/icmp slt/	Jonathan Roelofs	2014-10-16	1	-1/+1
	llvm-svn: 219947
*	[llvm-objdump] Fix -private-headers for mach-o to print all LC_*_DYLIB variants	Nick Kledzik	2014-10-16	1	-1/+6
	llvm-svn: 219945
*	fold: sqrt(x * x * y) -> fabs(x) * sqrt(y)	Sanjay Patel	2014-10-16	3	-1/+258
	If a square root call has an FP multiplication argument that can be reassociated, then we can hoist a repeated factor out of the square root call and into a fabs(). In the simplest case, this: y = sqrt(x * x); becomes this: y = fabs(x); This patch relies on an earlier optimization in instcombine or reassociate to put the multiplication tree into a canonical form, so we don't have to search over every permutation of the multiplication tree. Because there are no IR-level FastMathFlags for intrinsics (PR21290), we have to use function-level attributes to do this optimization. This needs to be fixed for both the intrinsics and in the backend. Differential Revision: http://reviews.llvm.org/D5787 llvm-svn: 219944
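
The two sides of the fold can be written out as ordinary functions. This is only a sketch of the arithmetic identity, not the InstCombine code; both function names are hypothetical.

```cpp
#include <cmath>

// Original form: one multiplication chain feeding a single sqrt call.
double sqrtOfProduct(double x, double y) { return std::sqrt(x * x * y); }

// Folded form: the repeated factor x is hoisted out of the sqrt as fabs(x).
double foldedSqrt(double x, double y) { return std::fabs(x) * std::sqrt(y); }
```

For x = -3 and y = 4 both forms return 6. The forms are not interchangeable in strict IEEE arithmetic, which is why the message ties the fold to reassociation being permitted: with x = 1e200 and y = 1e-300 the product x * x overflows in double precision before the square root is taken, whereas the folded form stays finite.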
*	[AArch64] Fix miscompile of sdiv-by-power-of-2.	Juergen Ributzka	2014-10-16	3	-5/+17
	When the constant divisor was larger than 32 bits, the AArch64 backend would emit the wrong optimized code, because the shift was defined as a shift of a 32-bit constant '(1<<Lg2(divisor))' and we would lose the upper 32 bits. This fixes rdar://problem/18678801. llvm-svn: 219934
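
The effect of dropping the upper 32 bits can be modelled in a few lines. This is a hedged sketch, not the backend code: every name is hypothetical, the truncation is simulated with an explicit cast to a 32-bit type, and the division uses the usual add-bias-then-arithmetic-shift scheme. It assumes two's-complement conversions and an arithmetic right shift for negative values, which C++20 guarantees and mainstream compilers provided earlier.

```cpp
#include <cassert>
#include <cstdint>

// 2^lg2 built as a full 64-bit value.
std::uint64_t pow2Wide(unsigned lg2) {
  assert(lg2 < 63);
  return std::uint64_t{1} << lg2;
}

// Models the bug: the same constant squeezed through 32 bits, so any
// lg2 >= 32 yields 0.
std::uint64_t pow2Narrow(unsigned lg2) {
  assert(lg2 < 63);
  return static_cast<std::uint32_t>(std::uint64_t{1} << lg2);
}

// Signed division by 2^lg2, rounding toward zero: negative dividends get a
// bias of (2^lg2 - 1) before the arithmetic shift.
std::int64_t sdivPow2(std::int64_t x, unsigned lg2, std::uint64_t pow2) {
  const std::int64_t bias = x < 0 ? static_cast<std::int64_t>(pow2 - 1) : 0;
  return (x + bias) >> lg2;
}
```

Dividing -(2^41) - 1 by 2^40 gives -2 with the wide constant, matching the `/` operator, while the narrowed constant produces a bias of -1 and a result of -3.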
*	[mips] Account for endianness when expanding BuildPairF64/ExtractElementF64 nodes.	Vasileios Kalintiris	2014-10-16	2	-68/+35
	Summary: In order to support big endian targets for the BuildPairF64 nodes we just need to swap the low/high pair registers. Additionally, for the ExtractElementF64 nodes we have to calculate the correct stack offset with respect to the node's register/operand that we want to extract. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5753 llvm-svn: 219931
*	[mips] Marked the DI/EI instruction aliases as MIPS32r2	Vasileios Kalintiris	2014-10-16	12	-6/+77
	Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5751 llvm-svn: 219927
*	Test commit access: remove extra new line at the end of file	Vasileios Kalintiris	2014-10-16	1	-1/+0
	llvm-svn: 219925
*	Add missing header guard.	Benjamin Kramer	2014-10-16	1	-0/+5
	llvm-svn: 219922
*	Reapply r219832 - InstCombine: Narrow switch instructions using known bits.	Akira Hatanaka	2014-10-16	2	-0/+124
	The code committed in r219832 asserted when it attempted to shrink a switch statement whose type was larger than 64-bit. llvm-svn: 219902
*	TRE: make TRE a bit more aggressive	Saleem Abdulrasool	2014-10-16	4	-10/+39
	Make tail recursion elimination a bit more aggressive. This allows us to get tail recursion on functions that are just branches to a different function. The fact that the function takes a byval argument does not restrict it from being optimised into just a tail call. llvm-svn: 219899
*	Revert r219832.	Akira Hatanaka	2014-10-16	2	-92/+0
	llvm-svn: 219884
*	[LVI] Add some additional comments about caching and context instructions	Hal Finkel	2014-10-16	1	-0/+13
	Philip Reames and I had a long conversation about this, mostly because it is not obvious why the current logic is correct. Hopefully, these comments will prevent such confusion in the future. llvm-svn: 219882
*	llvm/Support/Options.h: Use \tparam. [-Wdocumentation]	NAKAMURA Takumi	2014-10-16	1	-6/+6
	llvm-svn: 219881
*	R600: Remove dead function	Matt Arsenault	2014-10-16	2	-15/+0
	llvm-svn: 219879
*	Revert "r219834 - Teach ScalarEvolution to sharpen range information"	Sanjoy Das	2014-10-15	3	-78/+2
	This change breaks the asan buildbots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/13468 llvm-svn: 219878
*	Preserve non-byval pointer alignment attributes using @llvm.assume when inlining	Hal Finkel	2014-10-15	2	-0/+143
	For pointer-typed function arguments, enhanced alignment can be asserted using the 'align' attribute. When inlining, if this enhanced alignment information is not otherwise available, preserve it using @llvm.assume-based alignment assumptions. llvm-svn: 219876
*	Add CreateAlignmentAssumption to IRBuilder	Hal Finkel	2014-10-15	2	-0/+53
	Clang CodeGen had a utility function for creating pointer alignment assumptions using the @llvm.assume intrinsic. This functionality will also be needed by the inliner (to preserve function-argument alignment attributes when inlining), so this moves the utility function into IRBuilder where it can be used both by Clang CodeGen and also other LLVM-level code. llvm-svn: 219875
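
An alignment assumption of this kind amounts to asserting that the low bits of the pointer's integer value are zero. The sketch below shows that condition in plain C++ rather than through IRBuilder. The function name is hypothetical, and the exact IR sequence emitted by the utility is an assumption here, not something quoted from the commit.

```cpp
#include <cassert>
#include <cstdint>

// The condition an alignment assumption asserts: masking the address with
// (alignment - 1) leaves nothing. `alignment` must be a nonzero power of two.
bool satisfiesAlignment(std::uintptr_t address, std::uintptr_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (address & (alignment - 1)) == 0;
}
```

For example, address 72 satisfies an 8-byte alignment but not a 16-byte one.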