bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	ARM: add support for R_ARM_ABS16	Saleem Abdulrasool	2015-01-09	1	-0/+8
	Add support for R_ARM_ABS16 relocation mapping. Addresses PR22156. llvm-svn: 225510
*	ARM: add support for R_ARM_ABS8 relocations	Saleem Abdulrasool	2015-01-09	1	-0/+8
	Add support for R_ARM_ABS8 relocation. Addresses PR22126. llvm-svn: 225507
*	RegisterCoalescer: Fix removeCopyByCommutingDef with subreg liveness	Matthias Braun	2015-01-09	1	-1/+3
	The code that eliminated additional coalescable copies in removeCopyByCommutingDef() used MergeValueNumberInto() which internally may merge A into B or B into A. In this case A and B had different Def points, so we have to reset ValNo.Def to the intended one after merging. llvm-svn: 225503
*	RegisterCoalescer: Some cleanup in removeCopyByCommutingDef(), NFC	Matthias Braun	2015-01-09	1	-15/+19
	llvm-svn: 225502
*	RegisterCoalescer: No need to set kill flags, they are recomputed later anyway	Matthias Braun	2015-01-09	1	-2/+0
	llvm-svn: 225501
*	RegisterCoalescer: Turn some impossible conditions into asserts	Matthias Braun	2015-01-09	1	-15/+9
	llvm-svn: 225500
*	Bitcode: Share logic for last instruction, NFC	Duncan P. N. Exon Smith	2015-01-09	1	-14/+10
	Share logic for getting the last instruction emitted. llvm-svn: 225499
*	Bitcode: Move the DEBUG_LOC record to DEBUG_LOC_OLD	Duncan P. N. Exon Smith	2015-01-09	2	-2/+2
	Prepare to simplify the `DebugLoc` record. llvm-svn: 225498
*	[PowerPC] Add a flag for experimenting with subreg liveness tracking	Hal Finkel	2015-01-09	2	-0/+10
	This cannot yet be enabled by default because it causes ~50 miscompiles in the test suite. llvm-svn: 225497
*	[PowerPC] Fold [sz]ext with fp_to_int lowering where possible	Hal Finkel	2015-01-09	2	-4/+61
	On modern cores with lfiw[az]x, we can fold a sign or zero extension from i32 to i64 into the load necessary for an i64 -> fp conversion. llvm-svn: 225493
*	[DAGCombine] Remainder of fix to r225380 (More FMA folding opportunities)	Hal Finkel	2015-01-09	1	-10/+24
	As pointed out by Aditya (and Owen), when we elide an FP extend to form an FMA, we need to extend the incoming operands so that the resulting node will really be legal. This is currently enabled only for PowerPC, and it happens to work there regardless, but this should fix the functionality for everyone else should anyone else wish to use it. llvm-svn: 225492
*	[x86] Add a flag to control the vector shuffle legality predicates that	Chandler Carruth	2015-01-09	1	-0/+23
	complements the new vector shuffle lowering code path. This flag, naturally, is off because we've not tested or evaluated the results of this at all. However, the flag will make it much easier to evaluate whether we can be this aggressive and whether there are missing vector shuffle lowering optimizations. llvm-svn: 225491
*	Cleanup ValueHandle to no longer keep a PointerIntPair for the Value*.	Chandler Carruth	2015-01-09	1	-12/+12
	This was used previously for metadata but is no longer needed there. Not doing this simplifies ValueHandle and will make it easier to fix things like AssertingVH's DenseMapInfo. llvm-svn: 225487
*	Partial fix to r225380 (More FMA folding opportunities)	Hal Finkel	2015-01-09	1	-96/+95
	As pointed out by Aditya (and Owen), there are two things wrong with this code. First, it adds patterns which elide FP extends when forming FMAs, and that might not be profitable on all targets (it belongs behind the pre-existing aggressive-FMA-formation flag). This is fixed by this change. Second, the resulting nodes might have operands of different types (the extensions need to be re-added). That will be fixed in the follow-up commit. llvm-svn: 225485
*	[REFACTOR] Push logic from MemDepPrinter into getNonLocalPointerDependency	Philip Reames	2015-01-09	2	-35/+24
	Previously, MemDepPrinter handled volatile and unordered accesses without involving MemoryDependencyAnalysis. By making a slight tweak to the documented interface - which is respected by both callers - we can move this responsibility to MDA for the benefit of any future callers. This is basically just cleanup. In the future, we may decide to extend MDA's non-local dependency analysis to return useful results for ordered or volatile loads. I believe (but have not really checked in detail) that local dependency analysis does get useful results for ordered, but not volatile, loads. llvm-svn: 225483
*	[Refactor] Have getNonLocalPointerDependency take the query instruction	Philip Reames	2015-01-09	3	-11/+50
	Previously, MemoryDependenceAnalysis::getNonLocalPointerDependency was taking a list of properties about the instruction being queried. Since I'm about to need one more property to be passed down through the infrastructure - I need to know a query instruction is non-volatile in an inner helper - fix the interface once and for all. I also added some assertions and behaviour clarifications around volatile and ordered field accesses. At the moment, this is mostly to document expected behaviour. The only non-standard instructions which can currently reach this are atomic, but unordered, loads and stores. Neither ordered nor volatile accesses can reach here. The call in GVN is protected by an isSimple check when it first considers the load. The calls in MemDepPrinter are protected by isUnordered checks. Both utilities also check isVolatile for loads and stores. llvm-svn: 225481
*	Utils: Keep distinct MDNodes distinct in MapMetadata()	Duncan P. N. Exon Smith	2015-01-08	1	-0/+14
	Create new copies of distinct `MDNode`s instead of following the uniquing `MDNode` logic. Just like self-references (or other cycles), `MapMetadata()` creates a new node. In practice most calls use `RF_NoModuleLevelChanges`, in which case nothing is duplicated anyway. Part of PR22111. llvm-svn: 225476
*	IR: Add 'distinct' MDNodes to bitcode and assembly	Duncan P. N. Exon Smith	2015-01-08	7	-6/+26
	Propagate whether `MDNode`s are 'distinct' through the other types of IR (assembly and bitcode). This adds the `distinct` keyword to assembly. Currently, no one actually calls `MDNode::getDistinct()`, so these nodes only get created for self-references, which are never uniqued, and for nodes whose operands are replaced that hit a uniquing collision. The concept of distinct nodes is still not quite first-class, since distinct-ness doesn't yet survive across `MapMetadata()`. Part of PR22111. llvm-svn: 225474
*	[PowerPC] Mark all instructions as non-cheap for MachineLICM	Hal Finkel	2015-01-08	1	-0/+9
	MachineLICM uses a callback named hasLowDefLatency to determine if an instruction def operand has a 'low' latency. If all relevant operands have a 'low' latency, the instruction is considered too cheap to hoist out of loops even in low-register-pressure situations. On PowerPC cores, both the embedded cores and the others, there is no reason to believe that this is a good choice: all instructions have a cost inside a loop, and hoisting them when not limited by register pressure is a reasonable default. llvm-svn: 225471
*	[MachineLICM] A command-line option to hoist even cheap instructions	Hal Finkel	2015-01-08	1	-1/+6
	Add a command-line option to enable hoisting even cheap instructions (in low-register-pressure situations). This is turned off by default, but has proved useful for testing purposes. llvm-svn: 225470
*	CodeGen: Use handy new-fangled post-increment, NFC	Duncan P. N. Exon Smith	2015-01-08	1	-1/+1
	Drive-by cleanup; I noticed this when reviewing the patch that became r225466. llvm-svn: 225468
*	[ARM] Fix a bug in constant island pass that was triggering an assertion.	Akira Hatanaka	2015-01-08	1	-1/+1
	The assert was being triggered when the distance between a constant pool entry and its user exceeded the maximally allowed distance after thumb2 branch shortening. A padding was inserted after a thumb2 branch instruction was shrunk, which caused the user to be out of range. This is wrong as the padding should have been inserted by the layout algorithm so that the distance between two instructions doesn't grow later during thumb2 instruction optimization. This commit fixes the code in ARMConstantIslands::createNewWater to call computeBlockSize and set BasicBlock::Unalign when a branch instruction is inserted to create new water after a basic block. A non-zero Unalign causes the worst-case padding to be inserted when adjustBBOffsetsAfter is called to recompute the basic block offsets. rdar://problem/19130476 llvm-svn: 225467
*	CodeGen: Use range-based for loops, NFC	Duncan P. N. Exon Smith	2015-01-08	1	-5/+5
	Patch by Ramkumar Ramachandra! llvm-svn: 225466
*	Fix fcmp + fabs instcombines when using the intrinsic	Matt Arsenault	2015-01-08	1	-26/+28
	This was only handling the libcall. This is another example of why only the intrinsic should ever be used when it exists. llvm-svn: 225465
*	Make the TargetMachine in MipsSubtarget a reference rather	Eric Christopher	2015-01-08	3	-15/+15
	than a pointer to make unifying code a bit easier. llvm-svn: 225459
*	Update include - this class doesn't use the target machine, but	Eric Christopher	2015-01-08	1	-1/+1
	only the subtarget. llvm-svn: 225458
*	Fix a couple of odd formatting issues.	Eric Christopher	2015-01-08	1	-6/+4
	llvm-svn: 225457
*	This routine is in InstrInfo, there's no need to access it again.	Eric Christopher	2015-01-08	1	-8/+3
	llvm-svn: 225456
*	[X86] Reflow comment. NFC.	Ahmed Bougacha	2015-01-08	1	-3/+4
	llvm-svn: 225455
*	clang-format. NFC.	Rafael Espindola	2015-01-08	1	-11/+22
	llvm-svn: 225454
*	Add saving and restoring of r30 to the prologue and epilogue, respectively	Justin Hibbits	2015-01-08	2	-0/+17
	Summary: The PIC additions didn't update the prologue and epilogue code to save and restore r30 (PIC base register). This does that. Test Plan: Tests updated. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6876 llvm-svn: 225450
*	Explicitly handle LinkOnceODRAutoHideLinkage. NFC. We already have a test.	Rafael Espindola	2015-01-08	1	-0/+2
	llvm-svn: 225449
*	Update naming style and clang-format. NFC.	Rafael Espindola	2015-01-08	1	-17/+30
	llvm-svn: 225448
*	Fix large stack alignment codegen for ARM and Thumb2 targets	Kristof Beyls	2015-01-08	2	-22/+84
	This partially fixes PR13007 (ARM CodeGen fails with large stack alignment): for ARM and Thumb2 targets, but not for Thumb1, as it seems stack alignment for Thumb1 targets hasn't been supported at all. Producing an aligned stack pointer is done by zeroing out the lower bits of the stack pointer. The BIC instruction was used for this. However, the immediate field of the BIC instruction can only encode an immediate that zeroes out up to a maximum of the 8 lower bits. When a larger alignment is requested, a BIC instruction cannot be used; llvm was silently producing incorrect code in this case. This commit fixes code generation for large stack alignments by using the BFC instruction instead, when the BFC instruction is available. When not, it uses 2 instructions: a right shift, followed by a left shift to zero out the lower bits. The lowering of ARM::Int_eh_sjlj_dispatchsetup still has code that unconditionally uses BIC to realign the stack pointer, so it very likely has the same problem. However, I wasn't able to produce a test case for that. This commit adds an assert so that the compiler will fail the assert instead of silently generating wrong code if this is ever reached. llvm-svn: 225446
*	R600/SI: Remove SIISelLowering::legalizeOperands()	Tom Stellard	2015-01-08	2	-176/+1
	Its functionality has been replaced by calling SIInstrInfo::legalizeOperands() from SIISelLowering::AdjustInstrPostInstrSelection() and running the SIFoldOperands and SIShrinkInstructions passes. llvm-svn: 225445
*	Masked Load/Store - fixed a bug in type legalization.	Elena Demikhovsky	2015-01-08	3	-3/+107
	llvm-svn: 225441
*	Fix include ordering, NFC.	Michael Kuperstein	2015-01-08	1	-1/+1
	llvm-svn: 225439
*	[X86] Don't try to generate direct calls to TLS globals	Michael Kuperstein	2015-01-08	1	-1/+2
	The call lowering assumes that if the callee is a global, we want to emit a direct call. This is correct for regular globals, but not for TLS ones. Differential Revision: http://reviews.llvm.org/D6862 llvm-svn: 225438
*	Move SPAdj logic from PEI into the targets (NFC)	Michael Kuperstein	2015-01-08	2	-11/+34
	PEI tries to keep track of how much starting or ending a call sequence adjusts the stack pointer by, so that it can resolve frame-index references. Currently, it takes a very simplistic view of how SP adjustments are done - both FrameStartOpcode and FrameDestroyOpcode adjust it by exactly the amount written in their first argument. This view is in fact incorrect for some targets (e.g. due to stack re-alignment, or because a target may want to adjust the stack pointer in multiple steps). However, that doesn't cause breakage, because most targets (the only in-tree exception appears to be 32-bit ARM) rely on being able to simplify the call frame pseudo-instructions earlier, so this code is never hit. Moving the computation into TargetInstrInfo allows targets to override the way the adjustment is computed if they need to have a non-zero SPAdj. Differential Revision: http://reviews.llvm.org/D6863 llvm-svn: 225437
*	[X86] Don't print 'dword ptr' or 'qword ptr' on the operand to some of the ↵	Craig Topper	2015-01-08	4	-4/+14
	LEA variants in Intel syntax. The memory operand is inherently unsized. llvm-svn: 225432
*	Revert "Reapply: Teach SROA how to update debug info for fragmented variables."	Adrian Prantl	2015-01-08	1	-60/+8
	This reverts commit r225379 while investigating an assertion failure reported by Alexey. llvm-svn: 225424
*	[RegAllocGreedy] Introduce a late pass to repair broken hints.	Quentin Colombet	2015-01-08	3	-2/+212
	A broken hint is a copy where both ends are assigned different colors. When a variable gets evicted in the neighborhood of such copies, it is likely we can reconcile some of them.
	Context
	Copies are inserted during the register allocation via splitting. These split points are required to relax the constraints on the allocation problem. When such a point is inserted, both ends of the copy would not share the same color with respect to the current allocation problem. When variables get evicted, the allocation problem becomes different and some split points may not be required anymore. However, the related variables may already have been colored. This usually shows up in the assembly with a pattern like this:
	  def A
	  ...
	  save A to B
	  def A
	  use A
	  restore A from B
	  ...
	  use B
	Whereas we could simply have done:
	  def B
	  ...
	  def A
	  use A
	  ...
	  use B
	Proposed Solution
	A variable having a broken hint is marked for late recoloring if and only if selecting a register for it evicts another variable. Indeed, if no eviction happens it is pointless to look for recoloring opportunities, as it means the situation was the same as the initial allocation problem where we had to break the hint. Finally, when everything has been allocated, we look for recoloring opportunities for all the identified candidates. The recoloring is performed very late to rely on accurate copy cost (all involved variables are allocated). The recoloring is simple, unlike the last chance recoloring. It propagates the color of the broken hint to all its copy-related variables. If the color is available for them, the recoloring uses it; otherwise it gives up on that hint even if a more complex coloring would have worked. The recoloring happens only if it is profitable. The profitability is evaluated using the expected frequency of the copies of the currently recolored variable with a) its current color and b) the target color. If a) is greater than or equal to b), then it is profitable and the recoloring happens.
	Example
	Consider the following example:
	  BB1:
	    a =
	    b =
	  BB2:
	    ...
	    = b
	    = a
	Let us assume b gets split:
	  BB1:
	    a =
	    b =
	  BB2:
	    c = b
	    ...
	    d = c
	    = d
	    = a
	Because of how the allocation works, b, c, and d may be assigned different colors. Now, if a gets evicted to make room for c, assuming b and d were assigned to something different than a, we end up with:
	  BB1:
	    a =
	    st a, SpillSlot
	    b =
	  BB2:
	    c = b
	    ...
	    d = c
	    = d
	    e = ld SpillSlot
	    = e
	It is likely that we can assign the same register for b, c, and d, getting rid of 2 copies.
	Performance
	Both ARM64 and x86_64 show performance improvements of up to 3% for the llvm-testsuite + externals with Os and O3. There are a few regressions too that come from the (in)accuracy of the block frequency estimate. <rdar://problem/18312047> llvm-svn: 225422
*	[SelectionDAG] Allow targets to specify legality of extloads' result	Ahmed Bougacha	2015-01-08	20	-169/+230
	type (in addition to the memory type). The LoadExt legalization handling used to only have one type, the memory type. This forced users to assume that as long as the extload for the memory type was declared legal, and the result type was legal, the whole extload was legal. However, this isn't always the case. For instance, on X86, with AVX, this is legal: "v4i32 load, zext from v4i8", but this isn't: "v4i64 load, zext from v4i8", whereas v4i64 is (arguably) legal, even without AVX2. Note that the same thing was done a while ago for truncstores (r46140), but I assume no one needed it yet for extloads, so here we go. Calls to getLoadExtAction were changed to add the value type, found manually in the surrounding code. Calls to setLoadExtAction were mechanically changed, by wrapping the call in a loop, to match previous behavior. The loop iterates over the MVT subrange corresponding to the memory type (FP vectors, etc.). I also pulled neighboring setTruncStoreActions into some of the loops; those shouldn't make a difference, as the additional types are illegal (e.g., i128->i1 truncstores on PPC). No functional change intended. Differential Revision: http://reviews.llvm.org/D6532 llvm-svn: 225421
*	Remove empty statement. No functionality change.	Nick Lewycky	2015-01-08	1	-1/+0
	llvm-svn: 225420
*	X86: VZeroUpperInserter: shortcut should not trigger if we have any function ↵	Matthias Braun	2015-01-08	1	-8/+12
	live-ins. llvm-svn: 225419
*	RegisterCoalescer: Do not remove IMPLICIT_DEFS if they are required for ↵	Matthias Braun	2015-01-08	1	-1/+7
	subranges. The register coalescer used to remove implicit_defs when they are covered by the main range anyway. With subreg liveness tracking we can't do that anymore in places where the IMPLICIT_DEF is required as the beginning of a subregister live range. llvm-svn: 225416
*	RegisterCoalescer: Fix valuesIdentical() in some subrange merge cases.	Matthias Braun	2015-01-07	1	-90/+81
	I got confused and assumed SrcIdx/DstIdx of the CoalescerPair is a subregister index in SrcReg/DstReg, but they are actually subregister indices of the coalesced register that get you back to SrcReg/DstReg when applied. Fixed the bug, improved comments and simplified code accordingly. Testcase by Tom Stellard! llvm-svn: 225415
*	LiveInterval: Implement feedback by Quentin Colombet.	Matthias Braun	2015-01-07	1	-25/+32
	llvm-svn: 225413
*	[GC] improve testing around gc.relocate and fix a test	Philip Reames	2015-01-07	1	-9/+32
	Patch by: Ramkumar Ramachandra <artagnon@gmail.com> "This patch started out as an exploration of gc.relocate, and an attempt to write a simple test in call-lowering. I then noticed that the arguments of gc.relocate were not checked fully, so I went in and fixed a few things. Finally, the most important outcome of this patch is that my new error handling code caught a bug in a callsite in stackmap-format." Differential Revision: http://reviews.llvm.org/D6824 llvm-svn: 225412
*	R600/SI: Commute instructions to enable more folding opportunities	Tom Stellard	2015-01-07	2	-19/+51
	llvm-svn: 225410