boolean flag to an enum: { Fast, Standard, Strict } (default = Standard).
This option controls the creation by optimizations of fused FP ops that store
intermediate results in higher precision than IEEE allows (e.g. FMAs). The
behavior of this option is intended to match the behavior specified by a
soon-to-be-introduced frontend flag: '-ffuse-fp-ops'.
Fast mode - allows formation of fused FP ops whenever they're profitable.
Standard mode - allows fusion only for 'blessed' FP ops. At present the only
blessed op is the fmuladd intrinsic. In the future more blessed ops may be
added.
Strict mode - allows fusion only when it can be proven that the excess
precision won't affect the result.
Note: This option only controls formation of fused ops by the optimizers. Fused
operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic)
will always be honored, regardless of the value of this option.
Internally, TargetOptions::AllowExcessFPPrecision has been replaced by
TargetOptions::AllowFPOpFusion.
llvm-svn: 158956
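As a standalone illustration of the excess precision involved (this snippet is
not part of the patch and only assumes C++11's std::fma), a fused multiply-add
keeps the product exact internally and can therefore produce a different
result than the separately rounded operations:

  #include <cmath>
  #include <cstdio>

  int main() {
    // a*b is exactly 1 - 2^-54, which rounds to 1.0 in double precision.
    double eps = 1.0 / (1 << 27); // 2^-27, exactly representable
    double a = 1.0 + eps, b = 1.0 - eps, c = -1.0;
    double separate = a * b + c;      // the multiply rounds first: 0.0
    double fused = std::fma(a, b, c); // keeps a*b exact: -2^-54
    printf("separate = %g, fused = %g\n", separate, fused);
    return 0;
  }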
|
llvm-svn: 158927
|
node is removed. Sorry, no test case; found it by inspection of the code.
llvm-svn: 158839
|
The test case for this will come with the PPC indexed preinc loads commit.
llvm-svn: 158822
|
This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or
FSUB + FMUL. The combines are performed when:
(a) Either the
AllowExcessFPPrecision option (-enable-excess-fp-precision for llc)
OR the
UnsafeFPMath option (-enable-unsafe-fp-math)
is set, and
(b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of
the FADD/FSUB, and
(c) The FMUL only has one user (the FADD/FSUB).
If your target has fast FMA instructions you can make use of these combines by
overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for
types supported by your FMA instruction, and adding patterns to match ISD::FMA
to your FMA instructions.
llvm-svn: 158757
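To make this concrete, a target hook override might look like the sketch
below. 'FooTargetLowering' is a hypothetical out-of-tree target, the hook name
and signature are taken from the message above, and the snippet assumes the
usual LLVM target scaffolding rather than compiling standalone:

  // Advertise fast FMAs for f32/f64 so the new combines may form ISD::FMA
  // nodes of those types (patterns matching ISD::FMA to the target's FMA
  // instructions must be added separately).
  class FooTargetLowering : public TargetLowering {
  public:
    virtual bool isFMAFasterThanMulAndAdd(EVT VT) const {
      return VT == MVT::f32 || VT == MVT::f64;
    }
  };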
|
llvm-svn: 158467
|
wrote and the usual LLVM convention.
llvm-svn: 157708
|
operands of an FMA node.
llvm-svn: 157707
|
When a combine twiddles an extract_vector, care should be taken to preserve
the type of the index operand. No luck extracting a reasonable testcase,
unfortunately.
rdar://11391009
llvm-svn: 156419
|
llvm-svn: 156324
|
just like it now knows for FMULs.
llvm-svn: 156029
|
llvm-svn: 156023
|
llvm-svn: 155309
|
Instead of passing listener pointers to RAUW, let SelectionDAG itself
keep a linked list of interested listeners.
This makes it possible to have multiple listeners active at once, like
RAUWUpdateListener was already doing. It also makes it possible to
register listeners up the call stack without controlling all RAUW calls
below.
DAGUpdateListener uses an RAII pattern to add itself to the SelectionDAG
list of active listeners.
llvm-svn: 155248
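A sketch of the resulting usage pattern, assuming the post-patch interface
(listener type and hooks as described by this commit; 'DeletedNodeCounter' and
'replaceAndCount' are hypothetical names):

  // Counts SDNodes deleted while the listener is in scope.
  struct DeletedNodeCounter : public SelectionDAG::DAGUpdateListener {
    unsigned NumDeleted;
    explicit DeletedNodeCounter(SelectionDAG &DAG)
        : SelectionDAG::DAGUpdateListener(DAG), NumDeleted(0) {}
    virtual void NodeDeleted(SDNode *N, SDNode *E) { ++NumDeleted; }
  };

  void replaceAndCount(SelectionDAG &DAG, SDValue From, SDValue To) {
    DeletedNodeCounter Counter(DAG);  // constructor links it into the DAG list
    DAG.ReplaceAllUsesWith(From, To); // no listener argument threaded through
    // Counter's destructor unlinks it again when this scope exits.
  }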
|
llvm-svn: 154786
|
Fix a dagcombine optimization which assumes that the vsetcc result type is always
of the same size as the compared values. This is true for SSE/AVX/NEON but not
for all targets.
llvm-svn: 154490
|
multiplication by a denormal, and some tests checking that.
llvm-svn: 154431
|
llvm-svn: 154414
|
| |
always
of the same size as the compared values. This is true for SSE/AVX/NEON but not
for all targets.
llvm-svn: 154397
|
This fixes PR12516 and uncovers one weird problem in legalize (worked around).
llvm-svn: 154394
|
not fit in an i64.
llvm-svn: 154364
|
llvm-svn: 154322
|
some checks to allow better early out.
llvm-svn: 154309
|
llvm-svn: 154308
|
llvm-svn: 154307
|
happen.
llvm-svn: 154305
|
llvm-svn: 154297
|
when -ffast-math, i.e. don't just always do it if the reciprocal can
be formed exactly. There is already an IR level transform that does
that, and it does it more carefully.
llvm-svn: 154296
|
shuffle node because it could introduce new shuffle nodes that were not
supported efficiently by the target.
2. Add a more restrictive shuffle-of-shuffle optimization for cases where the
second shuffle reverses the transformation of the first shuffle.
llvm-svn: 154266
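To illustrate what "the second shuffle reverses the transformation of the
first shuffle" means, here is a small standalone scalar model (illustration
only, not DAG code): a second shuffle applying the inverse of the first
shuffle's mask makes the pair an identity, so it can be folded away.

  #include <cstdio>

  int main() {
    int x[4] = {10, 20, 30, 40};
    int mask[4] = {2, 0, 3, 1}; // first shuffle: out[i] = in[mask[i]]
    int inv[4];                 // second shuffle: the inverse permutation
    for (int i = 0; i != 4; ++i)
      inv[mask[i]] = i;
    int tmp[4], out[4];
    for (int i = 0; i != 4; ++i)
      tmp[i] = x[mask[i]];
    for (int i = 0; i != 4; ++i)
      out[i] = tmp[inv[i]];
    for (int i = 0; i != 4; ++i)
      printf("%d ", out[i]);    // prints "10 20 30 40": the original vector
    return 0;
  }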
|
reciprocal if converting to the reciprocal is exact. Do it even if inexact
when -ffast-math is enabled. This substantially speeds up ac.f90 from the
Polyhedron benchmarks.
llvm-svn: 154265
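A standalone illustration of the exactness condition (not from the patch): 1/4
is exactly representable, so x/4 and x*0.25 are always bit-identical and the
rewrite is unconditionally safe, whereas 1/3 is inexact, so the rewritten form
can be off by one ulp and requires -ffast-math.

  #include <cstdio>

  int main() {
    const float QuarterRecip = 0.25f;     // exact reciprocal
    const float ThirdRecip = 1.0f / 3.0f; // inexact reciprocal
    for (float x = 1.0f; x <= 100.0f; x += 1.0f) {
      if (x / 4.0f != x * QuarterRecip)
        printf("exact case mismatch at %g (never happens)\n", x);
      if (x / 3.0f != x * ThirdRecip)
        printf("x = %g: x/3 = %.9g but x*(1/3) = %.9g\n",
               x, x / 3.0f, x * ThirdRecip);
    }
    return 0;
  }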
|
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.
llvm-svn: 154011
|
operations, and prevent the DAGCombiner from turning them into bitwise operations if they do.
llvm-svn: 153901
|
shuffles.
Do not try to optimize swizzles of shuffles if the source shuffle has more than
a single user, except when the source shuffle is also a swizzle.
llvm-svn: 153864
|
1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B))
(and also scalar_to_vector).
2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src).
Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B))
3. Optimize swizzles of shuffles: shuff(shuff(x, y), undef) -> shuff(x, y).
4. Fix an X86ISelLowering optimization which was very bitcast-sensitive.
Code which was previously compiled to this:
  movd (%rsi), %xmm0
  movdqa .LCPI0_0(%rip), %xmm2
  pshufb %xmm2, %xmm0
  movd (%rdi), %xmm1
  pshufb %xmm2, %xmm1
  pxor %xmm0, %xmm1
  pshufb .LCPI0_1(%rip), %xmm1
  movd %xmm1, (%rdi)
  ret
Now compiles to this:
  movl (%rsi), %eax
  xorl %eax, (%rdi)
  ret
llvm-svn: 153848
|
llvm-svn: 153513
|
users of the final load to the worklist too. Needed by changes I'm preparing to make to the X86 backend.
llvm-svn: 153078
|
llvm-svn: 153035
|
add the new node to the worklist because there is a potential for further optimizations.
llvm-svn: 152784
|
Transform:
(fsub x, (fadd x, y)) -> (fneg y) and
(fsub x, (fadd y, x)) -> (fneg y)
if 'unsafe math' is specified.
<rdar://problem/7540295>
llvm-svn: 152777
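A standalone demonstration of why this folding is gated on 'unsafe math'
(illustration only, not from the patch): the intermediate fadd rounds, so the
folded and unfolded forms can disagree.

  #include <cstdio>

  int main() {
    float x = 1.0e8f, y = 1.0f;
    float unfolded = x - (x + y); // x + y rounds back to 1.0e8f, so this is 0
    float folded = -y;            // the transformed form: exactly -1
    printf("unfolded = %g, folded = %g\n", unfolded, folded);
    return 0;
  }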
|
that would trigger the truncation case.
llvm-svn: 152678
|
(i16 load $addr+c*sizeof(i16)) and replace uses of (i32 vextract) with the
i16 load. It should issue an extload instead: (i32 extload $addr+c*sizeof(i16)).
rdar://11035895
llvm-svn: 152675
|
llvm-svn: 152454
|
performance regression (due to increased register pressure from overly aggressive pre-inc formation).
llvm-svn: 152162
|
providing a default expansion (FADD+FNEG), and teaching DAGCombine not to form FSUBs post-legalize if they are not legal.
llvm-svn: 152079
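The expansion is exact because IEEE negation only flips the sign bit, so
FADD(a, FNEG(b)) rounds identically to FSUB(a, b); a quick standalone spot
check (illustration only; NaNs are excluded since they compare unequal to
themselves):

  #include <cstdio>

  int main() {
    float vals[] = {1.5f, -0.0f, 1e-40f, 3.0e38f, -2.75f};
    for (float a : vals)
      for (float b : vals)
        if (a - b != a + (-b))
          printf("mismatch for a = %g, b = %g\n", a, b); // never triggers
    return 0;
  }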
|
converted to zeroexts.
llvm-svn: 150957
|
llvm-svn: 150670
|
N) for all operations. This fixes a horrible worst case with lots of nodes where 99% of the time was being spent in std::remove.
llvm-svn: 150669
|
generate a shuffle node from two vectors of different types.
llvm-svn: 150383
|
v8i8 -> v8i32 on AVX machines. The codegen often scalarizes ANY_EXTEND nodes.
The DAGCombiner has two optimizations that can mitigate the problem. First,
if all of the operands of a BUILD_VECTOR node are extracted from ZEXT/ANYEXT
nodes, then it is possible to create a new simplified BUILD_VECTOR which uses
UNDEFS/ZERO values to eliminate the scalar ZEXT/ANYEXT nodes.
Second, another dag combine optimization lowers BUILD_VECTOR into a shuffle
vector instruction.
In the case of zext v8i8->v8i32 on AVX, a value in an XMM register is to be
shuffled into a wide YMM register.
This patch modifies the second optimization and allows the creation of
shuffle vectors even when the newly generated vector and the original vector
from which we extract the values are of different types.
llvm-svn: 150340
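For reference, a function of the kind this combine targets, written with the
GCC/Clang vector extensions (hypothetical example, not from the patch);
compiled with AVX enabled it performs the zext v8i8 -> v8i32 widening
discussed above:

  // Each i8 lane is zero-extended into the corresponding i32 lane.
  typedef unsigned char V8x8 __attribute__((vector_size(8)));
  typedef unsigned int V8x32 __attribute__((vector_size(32)));

  V8x32 zext8to32(V8x8 In) {
    V8x32 Out;
    for (int i = 0; i != 8; ++i)
      Out[i] = In[i];
    return Out;
  }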
|
llvm-svn: 149823