The loop optimizers were assuming that scales > 1 were OK. I think this
is actually a bug in TargetLoweringBase::isLegalAddressingMode(): it
seems to be trying to reject anything that isn't r+i or r+r, but it has
no default case for scales other than 0, 1 or 2. Implementing the hook
for z does mean that z can no longer be used to test any change there, though.
llvm-svn: 187497
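To make the r+i / r+r point concrete, here is a minimal standalone sketch of such a check. This is an illustration of the intent rather than the actual SystemZ hook: the AddrMode struct is redefined locally and only mirrors the fields of LLVM's TargetLowering::AddrMode that matter here.

```cpp
#include <cstdint>

// Local stand-in for the relevant fields of TargetLowering::AddrMode.
// Only Scale matters for the r+i / r+r point; the other fields are
// shown for context.
struct AddrMode {
  bool HasBaseReg = false; // is a base register present?
  int64_t BaseOffs = 0;    // constant displacement
  int64_t Scale = 0;       // 0 = no index register, 1 = unscaled index
};

// Accept only reg, reg+imm (r+i) and reg+reg (r+r) forms; any scaled
// index register is rejected explicitly rather than falling through to
// a default that happens to accept it.
bool isLegalAddressingModeSketch(const AddrMode &AM) {
  return AM.Scale == 0 || AM.Scale == 1;
}
```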
Extend r187495 to conditional loads. I split this out because the
easiest way seemed to be to force a particular operand order in
SystemZISelDAGToDAG.cpp.
llvm-svn: 187496
System z branches have a mask to select which of the 4 CC values should
cause the branch to be taken. We can invert a branch by inverting the mask.
However, not all instructions can produce all 4 CC values, so inverting
the branch like this can lead to some oddities. For example, integer
comparisons only produce a CC of 0 (equal), 1 (less) or 2 (greater).
If an integer EQ is reversed to NE before instruction selection,
the branch will test for 1 or 2. If instead the branch is reversed
after instruction selection (by inverting the mask), it will test for
1, 2 or 3. Both are correct, but the second isn't really canonical.
This patch therefore keeps track of which CC values are possible
and uses this when inverting a mask.
Although this is mostly cosmetic, it fixes undefined behavior
for the CIJNLH in branch-08.ll. Another fix would have been
to mask out bit 0 when generating the fused compare and branch,
but the point of this patch is that we shouldn't need to do that
in the first place.
The patch also makes it easier to reuse CC results from other instructions.
llvm-svn: 187495
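The bookkeeping can be pictured with a small standalone sketch (my own illustration; the real code uses SystemZ's CC-mask constants and a different bit order): a branch mask is a 4-bit set of CC values, and inversion is taken relative to the CC values the producing instruction can actually set.

```cpp
#include <cstdint>

// Sketch only: bit n stands for CC value n (the real encoding orders
// the bits differently, which doesn't affect the idea).
using CCMask = uint8_t;

// Invert a branch mask, but only within the CC values that the
// instruction producing CC can generate.  An integer compare produces
// CC 0, 1 or 2, so ValidCC = 0b0111 and inverting "equal" (0b0001)
// yields "less or greater" (0b0110) instead of the non-canonical
// 0b1110 that a plain complement would give.
CCMask invertCCMask(CCMask Mask, CCMask ValidCC) {
  return static_cast<CCMask>(~Mask & ValidCC);
}
```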
r187116 moved compare-and-branch generation from the instruction-selection
pass to the peephole optimizer (via optimizeCompare). It turns out that even
this is a bit too early. Fused compare-and-branch instructions don't
interact well with predication, where a CC result is needed. They also
make it harder to reuse the CC side-effects of earlier instructions
(not yet implemented, but the subject of a later patch).
Another problem was that the AnalyzeBranch family of routines wasn't
handling fused compares and branches, so we weren't able to reverse the
fused form in cases where we would reverse a separate branch. This could have
been fixed by extending AnalyzeBranch, but given the other problems,
I've instead moved the fusing to the long-branch pass, which is also
responsible for the opposite transformation: splitting out-of-range
compares and branches into separate compares and long branches.
I've added a test for the AnalyzeBranch problem. A test for the
predication problem is included in the next patch, which fixes a bug
in the choice of CC mask.
llvm-svn: 187494
r186399 aggressively used the RISBG instruction for immediate ANDs,
both because it can handle some values that AND IMMEDIATE can't,
and because it allows the destination register to be different from
the source. I realized later while implementing the distinct-ops
support that it would be better to leave the choice up to
convertToThreeAddress() instead. The AND IMMEDIATE form is shorter
and is less likely to be cracked.
Using RISBG is a problem for 32-bit ANDs, though, because we assume that
all 32-bit operations leave the high word untouched, whereas RISBG used
in this way will either clear the high word or copy it from the source
register. The patch uses the z196 instruction RISBLG for this case instead.
This means that z10 will be restricted to NILL, NILH and NILF for
32-bit ANDs, but I think that should be OK for now. Although we're
using z10 as the base architecture, the optimization work is going
to be focused more on z196 and zEC12.
llvm-svn: 187492
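A small standalone model of the high-word issue (an illustration under the simplifications noted in the comments, not the backend code): a 32-bit AND IMMEDIATE leaves the upper 32 bits of the 64-bit register alone, while a rotate-and-insert used for the same AND does not.

```cpp
#include <cstdint>

// 32-bit AND immediate (NILF-style): only the low word changes; the
// high word of the 64-bit register is untouched, which is what the
// backend assumes for every 32-bit operation.
uint64_t and32Imm(uint64_t Reg, uint32_t Imm) {
  uint64_t High = Reg & 0xFFFFFFFF00000000ULL;
  return High | (static_cast<uint32_t>(Reg) & Imm);
}

// The same AND expressed as a 64-bit rotate-and-insert that keeps only
// selected low-word bits: here the high word comes out zero (the
// three-address form instead copies it from the source register), so it
// breaks the "32-bit ops leave the high word alone" assumption.
uint64_t and32ViaRotateInsert(uint64_t Reg, uint32_t Imm) {
  return static_cast<uint64_t>(static_cast<uint32_t>(Reg) & Imm);
}
```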
All insertf*/extractf* functions were replaced with insert/extract, since we have both insertf and inserti forms.
Added lowering for INSERT_VECTOR_ELT / EXTRACT_VECTOR_ELT for 512-bit vectors.
Added lowering for EXTRACT/INSERT subvector for 512-bit vectors.
Added a test.
llvm-svn: 187491
Intel X86 assembler syntax.
Patch by Richard Mitton.
llvm-svn: 187476
llvm-svn: 187442
llvm-svn: 187435
llvm-svn: 187421
When simplifying (or (and B A) (and C ~A)) to (VBSL A B C), ensure that the
bitwidths of the second operands to both ANDs match before comparing the
negations of the values.
Split the check on the values of the second operands to the two ANDs. Move the
cast and variable declaration slightly higher to make the code slightly easier to follow.
Bug-Id: 16700
Signed-off-by: Saleem Abdulrasool <compnerd@compnerd.org>
llvm-svn: 187404
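For reference, the identity the combine relies on, as a tiny self-contained sketch (plain 32-bit integers standing in for the NEON vectors; VBSL computes exactly this bitwise select):

```cpp
#include <cassert>
#include <cstdint>

// (or (and B A) (and C ~A)): take bit i from B where A has a 1 and from
// C where A has a 0.  The combine only fires when the second AND's mask
// is the bitwise negation of the first and both masks have equal width.
uint32_t bitwiseSelect(uint32_t A, uint32_t B, uint32_t C) {
  return (B & A) | (C & ~A);
}

int main() {
  // High half comes from B, low half from C.
  assert(bitwiseSelect(0xFFFF0000u, 0x12345678u, 0x9ABCDEF0u) == 0x1234DEF0u);
  return 0;
}
```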
build_vector is lowered to REG_SEQUENCE, which is something the register
allocator does a good job at optimizing.
llvm-svn: 187397
This patch prevents the following combine when the input vector is used more
than once.
insert_vector_elt (build_vector elt0, ..., eltN), NewEltIdx, idx
=>
build_vector elt0, ..., NewEltIdx, ..., eltN
The reasons are:
- Building a vector may be expensive, so try to reuse the existing part of a
vector instead of creating a new one (think big vectors).
- elt0 to eltN now have two users instead of one. This may prevent some other
optimizations.
llvm-svn: 187396
llvm-svn: 187375
llvm-svn: 187362
Win64 uses CharPtrBuiltinVaList instead of X86_64ABIBuiltinVaList like
other 64-bit targets.
llvm-svn: 187355
The patch also adds the VFP4 feature to Cortex-A15 and fixes the DontUseFusedMAC predicate so that we can still generate vmla.f32 instructions on non-Darwin targets with VFP4.
llvm-svn: 187349
Also always add DIType, DISubprogram and DIGlobalVariable to the list
in DebugInfoFinder without checking them, so we can verify them later
on.
llvm-svn: 187285
llvm-svn: 187260
Patch by Sasa Stankovic.
llvm-svn: 187244
We used to call Verify before adding a DICompileUnit to the list; now we
remove the check and always add the DICompileUnit to the list in DebugInfoFinder,
so we can verify the units later on.
llvm-svn: 187237
instructions "beqz", "bnez" and "move", when possible.
beq $2, $zero, $L1 => beqz $2, $L1
bne $2, $zero, $L1 => bnez $2, $L1
or $2, $3, $zero => move $2, $3
llvm-svn: 187229
CustomLowerNode was not being called during SplitVectorOperand,
meaning custom legalization could not be used by targets.
This also adds a test case for NVPTX that depends on this custom
legalization.
Differential Revision: http://llvm-reviews.chandlerc.com/D1195
Attempt to fix the buildbots by making the X86 test I just added platform-independent.
llvm-svn: 187202
This reverts commit 187198. It broke the bots.
The soft float test probably needs a -triple because of name differences.
On the hard float test I am getting a "roundss $1, %xmm0, %xmm0", instead of
"vroundss $1, %xmm0, %xmm0, %xmm0".
llvm-svn: 187201
CustomLowerNode was not being called during SplitVectorOperand,
meaning custom legalization could not be used by targets.
This also adds a test case for NVPTX that depends on this custom
legalization.
Differential Revision: http://llvm-reviews.chandlerc.com/D1195
llvm-svn: 187198
structure, not just a pointer. This implements that and thus fixes va_copy
on PPC32. Fixes #15286. Both bug and patch by Florian Zeitz!
llvm-svn: 187158
Make sure the context field of DIType is an MDNode.
Fix testing cases to make them pass the verifier.
llvm-svn: 187150
Approval in here http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-July/064169.html
llvm-svn: 187145
The previous change to local live range allocation also suppressed
eviction of local ranges. In rare cases, this could result in more
expensive register choices. This commit actually revives a feature
that I added long ago: check if live ranges can be reassigned before
eviction. But now it only happens in rare cases of evicting a local
live range because another local live range wants a cheaper register.
The benefit is improved code size for some benchmarks on x86 and armv7.
I measured no significant compile time increase and performance
changes are noise.
llvm-svn: 187140
Also avoid locals evicting locals just because they want a cheaper register.
Problem: MI Sched knows exactly how many registers we have and assumes
they can be colored. In cases where we have large blocks, usually from
unrolled loops, greedy coloring fails. This is a source of
"regressions" from the MI Scheduler on x86. I noticed this issue on
x86 where we have long chains of two-address defs in the same live
range. It's easy to see this in matrix multiplication benchmarks like
IRSmk and even the unit test misched-matmul.ll.
A fundamental difference between the LLVM register allocator and
conventional graph coloring is that in our model a live range can't
discover its neighbors; it can only verify its neighbors. That's why
we initially went for greedy coloring and added eviction to deal with
the hard cases. However, for singly defined and two-address live
ranges, we can optimally color without visiting neighbors simply by
processing the live ranges in instruction order.
Other beneficial side effects:
It is much easier to understand and debug regalloc for large blocks
when the live ranges are allocated in order. Yes, global allocation is
still very confusing, but it's nice to be able to comprehend what
happened locally.
Heuristics could be added to bias register assignment based on
instruction locality (think late register pairing, banks...).
Intuitively, this will make some test cases that are on the threshold
of register pressure more stable.
llvm-svn: 187139
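To illustrate why in-order processing is enough for such ranges, here is a toy left-edge-style sketch (my own model with invented names, not the allocator's code): live ranges are simple intervals with one def, processed in instruction order, and each one just takes any register that is no longer live.

```cpp
#include <algorithm>
#include <vector>

// Toy model: each local live range is a half-open interval [Start, End)
// over instruction numbers, with a single def at Start.
struct LiveRange { int Start; int End; int Reg = -1; };

// Walk the ranges in instruction (def) order, recycle registers whose
// ranges have already ended, and hand out any register that is free.
// For such interval-shaped ranges this "left edge" scheme colors
// optimally, which is the intuition behind allocating local ranges in
// order rather than by spill weight.
void colorInOrder(std::vector<LiveRange> &Ranges, int NumRegs) {
  std::sort(Ranges.begin(), Ranges.end(),
            [](const LiveRange &A, const LiveRange &B) { return A.Start < B.Start; });
  std::vector<int> RegFreeAt(NumRegs, 0); // instruction at which each reg becomes free
  for (LiveRange &LR : Ranges) {
    for (int R = 0; R < NumRegs; ++R) {
      if (RegFreeAt[R] <= LR.Start) {     // register no longer live here
        LR.Reg = R;
        RegFreeAt[R] = LR.End;
        break;
      }
    }
    // LR.Reg == -1 would mean genuine overpressure: spilling or eviction,
    // which is outside the scope of this sketch.
  }
}
```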
llvm-svn: 187130
Better to have tests run even on non-AArch64 platforms.
llvm-svn: 187128
Before the patch we took advantage of the fact that the compare and
branch are glued together in the selection DAG and fused them together
(where possible) while emitting them. This seemed to work well in practice.
However, fusing the compare so early makes it harder to remove redundant
compares in cases where CC already has a suitable value. This patch
therefore uses the peephole analyzeCompare/optimizeCompareInstr pair of
functions instead.
No behavioral change intended, but it paves the way for a later patch.
llvm-svn: 187116
llvm-svn: 187113
As with the stores, these instructions can trap when the condition is false,
so they are only used for things like (cond ? x : *ptr).
llvm-svn: 187112
These instructions are allowed to trap even if the condition is false,
so for now they are only used for "*ptr = (cond ? x : *ptr)"-style
constructs.
llvm-svn: 187111
Make sure the context and type fields are MDNodes. We will generate
verification errors if those fields are non-empty strings.
Fix testing cases to make them pass the verifier.
llvm-svn: 187106
There's no need to specify a flag to omit frame pointer elimination on non-leaf
nodes...(Honestly, I can't parse that option out.) Use the function attribute
stuff instead.
llvm-svn: 187093
llvm-svn: 187083
Prior to this patch, IfConverter could widen the set of cases in which a
sequence of instructions was executed, because of the way it uses nested
predicates. This results in incorrect execution.
For instance, let A be a basic block that flows conditionally into B, and let B
be a predicated block.
B can be predicated with A.BrToBPredicate into A iff B.Predicate is less
"permissive" than A.BrToBPredicate, i.e., iff A.BrToBPredicate subsumes
B.Predicate.
The IfConverter was checking the opposite: that B.Predicate subsumes
A.BrToBPredicate.
<rdar://problem/14379453>
llvm-svn: 187071
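A standalone sketch of the subsumption direction (illustrative only: predicates are modeled here as sets of conditions forming a conjunction, whereas IfConverter queries target hooks):

```cpp
#include <algorithm>
#include <set>
#include <string>

// Model a predicate as the conjunction of the conditions it contains;
// more conditions means a more restrictive (less "permissive") guard.
using Predicate = std::set<std::string>;

// P subsumes Q when Q is at least as restrictive as P, i.e. Q contains
// every condition of P, so whenever Q allows execution, P does too.
bool subsumes(const Predicate &P, const Predicate &Q) {
  return std::includes(Q.begin(), Q.end(), P.begin(), P.end());
}

// Legality of predicating B into A with A's branch-to-B predicate:
// A.BrToBPredicate must subsume B.Predicate.  The pre-patch bug checked
// subsumes(BPredicate, BrToBPredicate), i.e. the opposite direction.
bool canPredicateBIntoA(const Predicate &BrToBPredicate,
                        const Predicate &BPredicate) {
  return subsumes(BrToBPredicate, BPredicate);
}
```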
Improve the Finder to handle context of a DIVariable used by DbgValueInst.
Fix testing cases to make them pass the verifier.
llvm-svn: 187052
llvm-svn: 187049
llvm-svn: 187016
This commit also implements these functions for R600 and removes a test
case that was relying on the buggy behavior.
llvm-svn: 187007
These are really the same address space in hardware. The only
difference is that CONSTANT_ADDRESS uses a special cache for faster
access. When we are unable to use the constant kcache for some reason
(e.g. smaller types or lack of indirect addressing) then the instruction
selector must use GLOBAL_ADDRESS loads instead.
llvm-svn: 187006
Improve the Finder to handle context of a DIVariable.
If Scope is a DICompileUnit, add it to the list of CUs.
llvm-svn: 187003
When vectors are built from a single value, the ARM lowering issues a
scalar_to_vector node.
This node is then always morphed into a move from the general purpose unit to
the vector unit.
When the value comes from a load, this can be simplified into a vector load to
the right lane.
This patch changes the lowering of insert_vector_elt to expose a
vector-friendly pattern in this situation.
This is a step toward fixing <rdar://problem/14170854>.
llvm-svn: 186999
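As an illustration (my own example, not taken from the patch, and it only builds for ARM targets with NEON): the kind of source the new lowering benefits, written with an explicit lane load so the value goes straight from memory into the vector register rather than through a general-purpose register.

```cpp
#include <arm_neon.h>

// Insert a loaded float into lane 0 of a 128-bit vector.  Before the
// patch this tended to become a scalar load plus a GPR-to-NEON move;
// a lane-wise vector load (vld1.32 {dN[0]}) avoids the trip through the
// general purpose unit.
float32x4_t insertLoadedLane0(float32x4_t Vec, const float *Ptr) {
  return vld1q_lane_f32(Ptr, Vec, 0);
}
```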
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 186923
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 186922
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 186921