bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[DAGCombiner] SCALAR_TO_VECTOR(EXTRACT_VECTOR_ELT(V,C)) -> VECTOR_SHUFFLE	Simon Pilgrim	2015-03-07	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \|	This patch attempts to convert a SCALAR_TO_VECTOR using an operand from an EXTRACT_VECTOR_ELT into a VECTOR_SHUFFLE. This prevents many cases of spilling scalar data between the gpr + simd registers. At present the optimization only accepts cases where there is no TRUNC of the scalar type (i.e. all types must match). Differential Revision: http://reviews.llvm.org/D8132 llvm-svn: 231554
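The rewrite described above can be pictured as a shuffle mask. The following is a minimal sketch in plain C++, not LLVM code: `scalarToVectorShuffleMask` and `applyShuffle` are hypothetical helpers, and vectors are modelled as `std::vector<int>`. It relies on the convention that a mask entry of -1 marks an undefined lane, which fits here because SCALAR_TO_VECTOR only defines element 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not LLVM API: builds the single-input shuffle mask that
// moves lane `extractIdx` of the source vector into lane 0 of the result.
// Every other lane is marked undefined (-1).
std::vector<int> scalarToVectorShuffleMask(std::size_t numElts, int extractIdx) {
  assert(numElts > 0 && extractIdx >= 0 &&
         static_cast<std::size_t>(extractIdx) < numElts);
  std::vector<int> mask(numElts, -1);
  mask[0] = extractIdx;
  return mask;
}

// Applies a single-input shuffle mask. Undefined lanes are filled with 0 here
// purely so the model has a concrete value to return.
std::vector<int> applyShuffle(const std::vector<int> &src,
                              const std::vector<int> &mask) {
  std::vector<int> out(mask.size(), 0);
  for (std::size_t i = 0; i < mask.size(); ++i)
    if (mask[i] >= 0)
      out[i] = src[static_cast<std::size_t>(mask[i])];
  return out;
}
```

Because the value stays in a vector register throughout, no round trip through a scalar register is needed, which is the spill the message refers to.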
*	DAGCombiner: Canonicalize select(and/or,x,y) depending on target.	Matthias Braun	2015-03-06	1	-0/+63
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is based on the following equivalences: select(C0 & C1, X, Y) <=> select(C0, select(C1, X, Y), Y) and select(C0 \| C1, X, Y) <=> select(C0, X, select(C1, X, Y)). Many targets cannot perform and/or on the CPU flags, and therefore the right side should be chosen to avoid materializing the i1 flags in an integer register. If the target can perform this operation efficiently we normalize to the left form. Differential Revision: http://reviews.llvm.org/D7622 llvm-svn: 231507
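The two equivalences in this message can be checked exhaustively over the four combinations of conditions. This is a small sketch in plain C++ rather than DAG nodes; `selectValue` and `selectEquivalencesHold` are illustrative names that do not appear in the patch.

```cpp
#include <cassert>

// Plain C++ model of a select node: yields x when cond is true, otherwise y.
int selectValue(bool cond, int x, int y) { return cond ? x : y; }

// Checks both equivalences for every combination of the two conditions.
bool selectEquivalencesHold(int x, int y) {
  for (int bits = 0; bits < 4; ++bits) {
    const bool c0 = (bits & 1) != 0;
    const bool c1 = (bits & 2) != 0;
    // select(C0 & C1, X, Y) <=> select(C0, select(C1, X, Y), Y)
    if (selectValue(c0 && c1, x, y) !=
        selectValue(c0, selectValue(c1, x, y), y))
      return false;
    // select(C0 | C1, X, Y) <=> select(C0, X, select(C1, X, Y))
    if (selectValue(c0 || c1, x, y) !=
        selectValue(c0, x, selectValue(c1, x, y)))
      return false;
  }
  return true;
}
```

The nested forms never compute `C0 & C1` or `C0 | C1` as a value, which is why they suit targets that keep conditions in CPU flags.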
*	DAGCombiner: Factor out some and/or combines.	Matthias Braun	2015-03-06	1	-225/+252
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is in preparation for changing visitSELECT to normalize towards select(Cond0, select(Cond1, X, Y), Y); select(Cond0, X, select(Cond1, X, Y)) which perform an implicit and/or of the conditions. The factored function contains all DAGCombine rules which reduce two values combined by an And/Or operation to a single value. This does not include rules involving constants as visitSELECT already handles that case. Differential Revision: http://reviews.llvm.org/D8026 llvm-svn: 231506
*	ExecutionDepsFix: Indizes -> Indices.	Matthias Braun	2015-03-06	1	-10/+10
\| \| \| \| \| \|	Translate German to English. llvm-svn: 231500
*	Fix typo.	Eric Christopher	2015-03-06	1	-1/+1
\| \| \| \|	llvm-svn: 231495
*	[AsmPrinter][TLOF] 32-bit MachO support for replacing GOT equivalents	Bruno Cardoso Lopes	2015-03-06	2	-4/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add MachO 32-bit (i.e. arm and x86) support for replacing global GOT equivalent symbol accesses. Unlike 64-bit targets, there's no GOTPCREL relocation, and access through a non_lazy_symbol_pointers section is used instead. -- before _extgotequiv: .long _extfoo _delta: .long _extgotequiv-_delta -- after _delta: .long L_extfoo$non_lazy_ptr-_delta .section __IMPORT,__pointers,non_lazy_symbol_pointers L_extfoo$non_lazy_ptr: .indirect_symbol _extfoo .long 0 llvm-svn: 231475
*	[AsmPrinter][TLOF] ARM64 MachO support for replacing GOT equivalents	Bruno Cardoso Lopes	2015-03-06	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Follow up r230264 and add ARM64 support for replacing global GOT equivalent symbol accesses by references to the GOT entry for the final symbol instead, example: -- before .globl _foo _foo: .long 42 .globl _gotequivalent _gotequivalent: .quad _foo .globl _delta _delta: .long _gotequivalent-_delta -- after .globl _foo _foo: .long 42 .globl _delta Ltmp3: .long _foo@GOT-Ltmp3 llvm-svn: 231474
*	LegalizeTypes: Handle shift by 0 in ExpandShiftByConstant.	Michael Zolotukhin	2015-03-06	1	-1/+8
\| \| \| \| \| \| \|	Though such shifts are usually optimized away by combiner, we still can encounter them after a vector shift is legalized. llvm-svn: 231443
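The message does not say why a zero amount needs its own case. One plausible reading, offered here as an assumption, is that the expansion combines the halves with a complementary shift of (half width minus amount), which becomes a full-width shift when the amount is 0. The sketch below models this in plain C++ with a 64-bit value held as two 32-bit halves; `Halves`, `expandShlByConstant`, `split` and `join` are hypothetical names, not LLVM code.

```cpp
#include <cassert>
#include <cstdint>

// A 64-bit value held as two 32-bit halves, as after type legalization.
struct Halves {
  std::uint32_t lo;
  std::uint32_t hi;
};

Halves split(std::uint64_t v) {
  return {static_cast<std::uint32_t>(v), static_cast<std::uint32_t>(v >> 32)};
}

std::uint64_t join(Halves h) {
  return (static_cast<std::uint64_t>(h.hi) << 32) | h.lo;
}

// Shifts the 64-bit value left by a constant amount using 32-bit operations.
Halves expandShlByConstant(Halves in, unsigned amt) {
  assert(amt < 64);
  if (amt == 0)
    return in; // Without this case the code below would compute in.lo >> 32.
  if (amt >= 32)
    return {0u, in.lo << (amt - 32)};
  return {in.lo << amt, (in.hi << amt) | (in.lo >> (32 - amt))};
}
```

For example, `join(expandShlByConstant(split(v), 0))` returns `v` unchanged through the early return, while every other amount goes through the two-half formula.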
*	SelectionDAGBuilder: Merge 3 copies of the limited precision exp2 emission code.	Benjamin Kramer	2015-03-05	1	-251/+93
\| \| \| \| \| \|	NFC intended. llvm-svn: 231406
*	Fix uninitialized memory references in WinEHPrepare	Andrew Kaylor	2015-03-05	1	-1/+3
\| \| \| \|	llvm-svn: 231405
*	SDAG: Merge the meat of two ExpandAtomic implementations.	Benjamin Kramer	2015-03-05	3	-212/+42
\| \| \| \| \| \| \|	The copies already diverged, don't let them become any worse. Reduce redundancy in code with a little macro metaprogramming. llvm-svn: 231401
*	Use the correct func begin symbol in all places in ppc.	Rafael Espindola	2015-03-05	1	-7/+9
\| \| \| \| \| \|	I missed an occurrence of the old symbol in my previous patch. llvm-svn: 231398
*	Use the generic Lfunc_begin label on ppc.	Rafael Espindola	2015-03-05	1	-1/+5
\| \| \| \| \| \|	This removes yet another custom label to mark the start of a function. llvm-svn: 231390
*	X86: Optimize address mode matching for FRAME_ALLOC_RECOVER nodes	David Majnemer	2015-03-05	1	-0/+1
\| \| \| \| \| \| \|	We know that the absolute symbol will be less than 2GB and thus will always fit. llvm-svn: 231389
*	Replace llvm.frameallocate with llvm.frameescape	Reid Kleckner	2015-03-05	3	-192/+78
\| \| \| \| \| \| \| \| \| \|	Turns out it's pretty straightforward and simplifies the implementation. Reviewers: andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D8051 llvm-svn: 231386
*	[DagCombiner] Allow shuffles to merge through bitcasts	Simon Pilgrim	2015-03-05	1	-0/+83
\| \| \| \| \| \| \| \| \| \| \| \|	Currently shuffles may only be combined if they are of the same type, despite the fact that bitcasts are often introduced in between shuffle nodes (e.g. x86 shuffle type widening). This patch allows a single input shuffle to peek through bitcasts and if the input is another shuffle will merge them, shuffling using the smallest sized type, and re-applying the bitcasts at the inputs and output instead. Dropped old ShuffleToZext test - this patch removes the use of the zext and vector-zext.ll covers these anyhow. Differential Revision: http://reviews.llvm.org/D7939 llvm-svn: 231380
*	Revert change r231366 as it broke clang-native-arm-cortex-a9 Analysis/properties.m test.	Igor Laevsky	2015-03-05	3	-92/+22
\| \| \| \| \| \|	llvm-svn: 231374
*	AVX-512, SKX: Enabled masked_load/store operations for this target.	Elena Demikhovsky	2015-03-05	1	-2/+2
\| \| \| \| \| \| \|	Added lowering for ISD::CONCAT_VECTORS and ISD::INSERT_SUBVECTOR for i1 vectors, it is needed to pass all masked_memop.ll tests for SKX. llvm-svn: 231371
*	Teach lowering to correctly handle invoke statepoint and gc results tied to them.	Igor Laevsky	2015-03-05	3	-22/+92
\| \| \| \| \| \| \| \|	Note that we still cannot lower gc.relocates for invoke statepoints. Also it extracts getCopyFromRegs helper function in SelectionDAGBuilder as we need to be able to customize type of the register exported from basic block during lowering of the gc.result. llvm-svn: 231366
*	[PBQP] Use a local bit-matrix to speedup searching an edge in the graph.	Arnaud A. de Grandmaison	2015-03-05	1	-3/+10
\| \| \| \| \| \| \| \| \|	Build time (user time) for building llvm+clang+lldb in release mode: - default allocator: 9086 seconds - with PBQP: 9126 seconds - with PBQP + local bit matrix cache: 9097 seconds llvm-svn: 231360
*	Remove useless break after return.	Frederic Riss	2015-03-05	1	-1/+0
\| \| \| \| \| \|	Pointed out by Paul Robinson. llvm-svn: 231353
*	[MBP] Use range based for-loops throughout this code. Several had	Chandler Carruth	2015-03-05	1	-140/+108
\| \| \| \| \| \| \|	already been added and the inconsistency made choosing names and changing code more annoying. Plus, wow are they better for this code! llvm-svn: 231347
*	[MBP] NFC, run clang-format over this code and tweak things to make the	Chandler Carruth	2015-03-05	1	-71/+62
\| \| \| \| \| \| \| \| \| \| \|	result reasonable. This code predated clang-format and so there was a reasonable amount of crufty formatting that had accumulated. This should ensure that neither myself nor others end up with formatting-only changes sneaking into other fixes. llvm-svn: 231341
*	[MBP] This is no longer 'block-placement2'. ;] The old variants are long	Chandler Carruth	2015-03-05	1	-3/+3
\| \| \| \| \| \|	gone, update this code to reflect that. llvm-svn: 231340
*	Use the existing begin and end symbol for debug info.	Rafael Espindola	2015-03-05	5	-27/+11
\| \| \| \|	llvm-svn: 231338
*	[MBP] Revert r231238 which attempted to fix a nasty bug where MBP is	Chandler Carruth	2015-03-05	1	-26/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	just arbitrarily interleaving unrelated control flows once they get moved "out-of-line" (both outside of natural CFG ordering and with diamonds that cannot be fully laid out by chaining fallthrough edges). This easy solution doesn't work in practice, and it isn't just a small bug. It looks like a very different strategy will be required. I'm working on that now, and it'll again go behind some flag so that everyone can experiment and make sure it is working well for them. llvm-svn: 231332
*	Turn off .debug_pubnames/pubtypes for PS4.	Paul Robinson	2015-03-05	1	-2/+2
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D8067 llvm-svn: 231322
*	Teach DIEInteger to emit FORM_strp and FORM_ref_addr attributes.	Frederic Riss	2015-03-04	1	-0/+10
\| \| \| \| \| \| \| \|	To be used/tested by llvm-dsymutil. (llvm-dsymutil does a 'static' link, no need for relocations for most things, so it'll just emit raw integers for most attributes) llvm-svn: 231298
*	Support standard DWARF TLS opcode; Darwin and PS4 use it.	Paul Robinson	2015-03-04	3	-2/+17
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D8018 llvm-svn: 231286
*	Make DataLayout Non-Optional in the Module	Mehdi Amini	2015-03-04	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: DataLayout keeps the string used for its creation. As a side effect it is no longer needed in the Module. This is "almost" NFC: the string is no longer canonicalized, so you can't rely on two "equal" DataLayouts having the same string returned by getStringRepresentation(). Get rid of DataLayoutPass: the DataLayout is in the Module. The DataLayout is "per-module", let's enforce this by not duplicating it more than necessary. One more step toward non-optionality of the DataLayout in the module. Make DataLayout Non-Optional in the Module: Module->getDataLayout() will never return nullptr anymore. Reviewers: echristo Subscribers: resistor, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D7992 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231270
*	Revert the test commit.	Wei Mi	2015-03-04	1	-1/+0
\| \| \| \|	llvm-svn: 231264
*	Test commit. It will be reverted in the next commit.	Wei Mi	2015-03-04	1	-0/+1
\| \| \| \|	llvm-svn: 231262
*	Fix DwarfExpression::AddMachineRegExpression so it doesn't read past the	Adrian Prantl	2015-03-04	1	-11/+15
\| \| \| \| \| \| \|	end of an expression that ends with DW_OP_plus. Caught by the ASAN build bots. llvm-svn: 231260
*	Mutate TargetLowering::shouldExpandAtomicRMWInIR to specifically dictate how AtomicRMWInsts are expanded.	JF Bastien	2015-03-04	1	-9/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In PNaCl, most atomic instructions have their own @llvm.nacl.atomic.* function, each one, with a few exceptions, represents a consistent behaviour across all NaCl-supported targets. Unfortunately, the atomic RMW operations nand, [u]min, and [u]max aren't directly represented by any such @llvm.nacl.atomic.* function. This patch refines shouldExpandAtomicRMWInIR in TargetLowering so that a future `Le32TargetLowering` class can selectively inform the caller how the target desires the atomic RMW instruction to be expanded (ie via load-linked/store-conditional for ARM/AArch64, via cmpxchg for X86/others?, or not at all for Mips) if at all. This does not represent a behavioural change and as such no tests were added. Patch by: Richard Diamond. Reviewers: jfb Reviewed By: jfb Subscribers: jfb, aemerson, t.p.northover, llvm-commits Differential Revision: http://reviews.llvm.org/D7713 llvm-svn: 231250
*	[MBP] Fix a really horrible bug in MachineBlockPlacement, but behind	Chandler Carruth	2015-03-04	1	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	a flag for now. First off, thanks to Daniel Jasper for really pointing out the issue here. It's been here forever (at least, I think it was there when I first wrote this code) without getting really noticed or fixed. The key problem is what happens when two reasonably common patterns happen at the same time: we outline multiple cold regions of code, and those regions in turn have diamonds or other CFGs for which we can't just topologically lay them out. Consider some C code that looks like: if (a1()) { if (b1()) c1(); else d1(); f1(); } if (a2()) { if (b2()) c2(); else d2(); f2(); } done(); Now consider the case where a1() and a2() are unlikely to be true. In that case, we might lay out the first part of the function like: a1, a2, done; And then we will be out of successors in which to build the chain. We go to find the best block to continue the chain with, which is perfectly reasonable here, and find "b1" let's say. Laying out successors gets us to: a1, a2, done; b1, c1; At this point, we will refuse to lay out the successor to c1 (f1) because there are still un-placed predecessors of f1 and we want to try to preserve the CFG structure. So we go get the next best block, d1. ... wait for it ... Except that the next best block isn't d1. It is b2! d1 is waaay down inside these conditionals. It is much less important than b2. Except that this is exactly what we didn't want. If we keep going we get the entire set of the rest of the CFG interleaved!!! a1, a2, done; b1, c1; b2, c2; d1, f1; d2, f2; So we clearly need a better strategy here. =] My current favorite strategy is to actually try to place the block whose predecessor is closest. 
This very simply ensures that we unwind these kinds of CFGs the way that is natural and fitting, and should minimize the number of cache lines instructions are spread across. It also happens to be dead simple. It's like the datastructure was specifically set up for this use case or something. We only push blocks onto the work list when the last predecessor for them is placed into the chain. So the back of the worklist is the nearest next block. Unfortunately, a change like this is going to cause soooo many benchmarks to swing wildly. So for now I'm adding this under a flag so that we and others can validate that this is fixing the problems described, that it seems possible to enable, and hopefully that it fixes more of our problems long term. llvm-svn: 231238
*	Add a flag to experiment with outlining optional branches.	Daniel Jasper	2015-03-04	1	-2/+46
\| \| \| \| \| \| \| \| \| \| \| \| \|	In a CFG with the edges A->B->C and A->C, B is an optional branch. LLVM's default behavior is to lay the blocks out naturally, i.e. A, B, C, in order to improve code locality and fallthroughs. However, if a function contains many of those optional branches only a few of which are taken, this leads to a lot of unnecessary icache misses. Moving B out of line can work around this. Review: http://reviews.llvm.org/D7719 llvm-svn: 231230
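The A->B->C / A->C shape can be recognised with a simple successor check. The sketch below is illustrative only: `Cfg` and `isOptionalBranch` are hypothetical names, blocks are identified by strings, and it assumes that an optional block has C as its only successor, which the message implies but does not state.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Block name -> set of successor block names.
using Cfg = std::map<std::string, std::set<std::string>>;

// Returns true when `b` matches the pattern from the message: some block A has
// edges A->B and A->C, and B's single successor is C.
bool isOptionalBranch(const Cfg &cfg, const std::string &b) {
  const auto bIt = cfg.find(b);
  if (bIt == cfg.end() || bIt->second.size() != 1)
    return false; // Assumption: an optional block falls through to exactly one block.
  const std::string &c = *bIt->second.begin();
  if (c == b)
    return false; // A self-loop is not the triangle shape.
  for (const auto &entry : cfg) {
    const std::set<std::string> &succs = entry.second;
    if (entry.first != b && succs.count(b) != 0 && succs.count(c) != 0)
      return true;
  }
  return false;
}
```

With the CFG `{A: {B, C}, B: {C}, C: {}}`, only B is reported as optional, and a layout pass could then consider moving it out of line.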
*	[DAGCombine] Fix a bug in a BUILD_VECTOR combine	Michael Kuperstein	2015-03-04	1	-2/+3
\| \| \| \| \| \| \| \| \| \|	When trying to convert a BUILD_VECTOR into a shuffle, we try to split a single source vector that is twice as wide as the destination vector. We can not do this when we also need the zero vector to create a blend. This fixes PR22774. Differential Revision: http://reviews.llvm.org/D8040 llvm-svn: 231219
*	Move emitDIE and emitAbbrevs to AsmPrinter. NFC.	Frederic Riss	2015-03-04	5	-68/+66
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	(They are called emitDwarfDIE and emitDwarfAbbrevs in their new home) llvm-dsymutil wants to reuse that code, but it doesn't have a DwarfUnit or a DwarfDebug object to call those. It has access to an AsmPrinter though. Having emitDIE in the AsmPrinter also removes the DwarfFile dependency on DwarfDebug, and thus the patch drops that field. Differential Revision: http://reviews.llvm.org/D8024 llvm-svn: 231210
*	Constify AsmPrinter passed to DIE methods.	Frederic Riss	2015-03-04	1	-22/+22
\| \| \| \|	llvm-svn: 231209
*	Use report_fatal_error instead of unreachable for -fast-isel-abort	Mehdi Amini	2015-03-04	1	-3/+3
\| \| \| \| \| \| \|	Suggestion by Andrea Di Biagio From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231201
*	Use the vanilla func_end symbol for .size.	Rafael Espindola	2015-03-04	1	-7/+4
\| \| \| \| \| \|	No need to create yet another temp symbol. llvm-svn: 231198
*	Recommit r231168: unique_ptrify LiveRange::segmentSet	David Blaikie	2015-03-04	2	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	GCC 4.7's libstdc++ doesn't have std::map::emplace, but it does have std::unordered_map::emplace, and the use case here doesn't appear to need ordering. The container has been changed in a separate/precursor patch, and now this patch should hopefully build cleanly even with GCC 4.7. & then I realized the order of the container did matter, so extra handling of ordering was added in r231189. Original commit message: This makes LiveRange non-copyable, and LiveInterval is already non-movable (due to the explicit dtor), so now it's non-copyable and non-movable. Fix the one case where we were relying on the (deprecated in C++11) implicit copy ctor of LiveInterval (which happened to work because the ctor created an object with a null segmentSet, so double-deleting the null pointer was fine). llvm-svn: 231192
*	Recommit r231175: Change LiveStackAnalysis::SS2IntervalMap from std::map to std::unordered_map	David Blaikie	2015-03-04	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The order of this container was needed at one point - so, at that point create a temporary array of pointers, sort those, then iterate them. This keeps lookup efficient (& the lesser issue, of allowing the use of emplace... ), object identity preserved, and ordered iteration in the one place that requires it. While this has no functional change, I realize it does mean allocating an extra data structure and performing a sort - so if this looks suspect to anyone regarding perf characteristics, I'm all ears. llvm-svn: 231189
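The "temporary array of pointers, sort those, then iterate them" approach could look roughly like this. It is a sketch under assumptions: `SlotMap` is a stand-in for SS2IntervalMap with `int` keys and `std::string` values, and `sortedEntries` is a hypothetical helper, not the code from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Stand-in for the real map type; the actual key and value types differ.
using SlotMap = std::unordered_map<int, std::string>;

// Collects pointers to the map's entries and sorts them by key. This gives a
// deterministic iteration order without copying or moving the mapped objects,
// so their identity (address) is preserved.
std::vector<const SlotMap::value_type *> sortedEntries(const SlotMap &map) {
  std::vector<const SlotMap::value_type *> entries;
  entries.reserve(map.size());
  for (const auto &entry : map)
    entries.push_back(&entry);
  std::sort(entries.begin(), entries.end(),
            [](const SlotMap::value_type *lhs, const SlotMap::value_type *rhs) {
              return lhs->first < rhs->first;
            });
  return entries;
}
```

Lookups still go through the hash map. Only the one caller that needs ordering pays for the extra allocation and the sort, which is the trade-off the message flags.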
*	RegisterCoalescer: Gracefully continue if subrange merging fails.	Matthias Braun	2015-03-04	1	-18/+48
\| \| \| \| \| \| \| \| \| \| \|	There is a known bug where the register coalescer fails to merge subranges when multiple ranges end up in the "overflow" bit 32 of the lanemasks. A proper fix for this is complicated so for now this is a workaround which lets the register coalescer drop the subregister liveness information (we just lose some precision by that) and continue. llvm-svn: 231186
*	Drop the "eh_" from eh_func_begin and eh_func_end.	Rafael Espindola	2015-03-04	1	-2/+2
\| \| \| \| \| \|	They will be used for more than eh tables. llvm-svn: 231185
*	Revert "unique_ptrify LiveRange::segmentSet"	David Blaikie	2015-03-04	2	-4/+3
\| \| \| \| \| \| \| \|	Apparently something does care about ordering of LiveIntervals... so revert all that stuff (r231175, r231176, r231177) & take some time to re-evaluate. llvm-svn: 231184
*	Recommit r231168: unique_ptrify LiveRange::segmentSet	David Blaikie	2015-03-03	2	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	GCC 4.7's libstdc++ doesn't have std::map::emplace, but it does have std::unordered_map::emplace, and the use case here doesn't appear to need ordering. The container has been changed in a separate/precursor patch, and now this patch should hopefully build cleanly even with GCC 4.7. Original commit message: This makes LiveRange non-copyable, and LiveInterval is already non-movable (due to the explicit dtor), so now it's non-copyable and non-movable. Fix the one case where we were relying on the (deprecated in C++11) implicit copy ctor of LiveInterval (which happened to work because the ctor created an object with a null segmentSet, so double-deleting the null pointer was fine). llvm-svn: 231176
*	Revert "unique_ptrify LiveRange::segmentSet"	David Blaikie	2015-03-03	2	-4/+3
\| \| \| \| \| \| \| \|	GCC 4.7 shakes fist (doesn't have std::map::emplace... ) This reverts commit r231168. llvm-svn: 231173
*	unique_ptrify LiveRange::segmentSet	David Blaikie	2015-03-03	2	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	This makes LiveRange non-copyable, and LiveInterval is already non-movable (due to the explicit dtor), so now it's non-copyable and non-movable. Fix the one case where we were relying on the (deprecated in C++11) implicit copy ctor of LiveInterval (which happened to work because the ctor created an object with a null segmentSet, so double-deleting the null pointer was fine). llvm-svn: 231168
*	WinEH: Remove vestigial EH object	Reid Kleckner	2015-03-03	1	-43/+13
\| \| \| \| \| \| \| \|	Ultimately, we'll need to leave something behind to indicate which alloca will hold the exception, but we can figure that out when it comes time to emit the __CxxFrameHandler3 catch handler table. llvm-svn: 231164