summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Constify this properly. Found by gcc48 -Wcast-qual.Roman Divacky2012-09-051-4/+4
| | | | llvm-svn: 163256
* Fixed the DAG combiner to better handle the folding of AND nodes for vector ↵Silviu Baranga2012-09-051-1/+11
| | | | | | types. The previous code was making the assumption that the length of the bitmask returned by isConstantSplat was equal to the size of the vector type. Now we first make sure that the splat value has at least the length of the vector lane type, then we only use as many fields as we have available in the splat value. llvm-svn: 163203
* Teach DAG combine a number of tricks to simplify FMA expressions in ↵Owen Anderson2012-09-011-0/+64
| | | | | | fast-math mode. llvm-svn: 163051
* Fix typoMichael Liao2012-09-011-1/+1
| | | | llvm-svn: 163049
* Teach the DAG combiner to turn chains of FADDs (x+x+x+x+...) into FMULs by ↵Owen Anderson2012-08-301-1/+122
| | | | | | constants. This is only enabled in unsafe FP math mode, since it does not preserve rounding effects for all such constants. llvm-svn: 162956
* Rejected 169195. As Duncan commented, bitcasting to proper type is wrong ↵Stepan Dyatkovskiy2012-08-221-23/+3
| | | | | | approach. We need to insert some valid TRANCATE node here. llvm-svn: 162354
* Fixed DAGCombiner bug (found and localized by James Malloy):Stepan Dyatkovskiy2012-08-201-3/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The DAGCombiner tries to optimise a BUILD_VECTOR by checking if it consists purely of get_vector_elts from one or two source vectors. If so, it either makes a concat_vectors node or a shufflevector node. However, it doesn't check the element type width of the underlying vector, so if you have this sequence: Node0: v4i16 = ... Node1: i32 = extract_vector_elt Node0 Node2: i32 = extract_vector_elt Node0 Node3: v16i8 = BUILD_VECTOR Node1, Node2, ... It will attempt to: Node0: v4i16 = ... NewNode1: v16i8 = concat_vectors Node0, ... Where this is actually invalid because the element width is completely different. This causes an assertion failure on DAG legalization stage. Fix: If output item type of BUILD_VECTOR differs from input item type. Make concat_vectors based on input element type and then bitcast it to the output vector type. So the case described above will transformed to: Node0: v4i16 = ... NewNode1: v8i16 = concat_vectors Node0, ... NewNode2: v16i8 = bitcast NewNode1 llvm-svn: 162195
* Add a roundToIntegral method to APFloat, which can be parameterized over ↵Owen Anderson2012-08-131-0/+42
| | | | | | various rounding modes. Use this to implement SelectionDAG constant folding of FFLOOR, FCEIL, and FTRUNC. llvm-svn: 161807
* Added FMA functionality to X86 target.Elena Demikhovsky2012-08-011-8/+20
| | | | llvm-svn: 161110
* Fixed DAGCombine optimizations which generate select_cc for targetsNadav Rotem2012-07-231-33/+47
| | | | | | | | | | that do not support it (X86 does not lower select_cc). PR: 13428 Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160619
* Remove tabs.Bill Wendling2012-07-191-1/+1
| | | | llvm-svn: 160475
* Back out r160101 and instead implement a dag combine to recover from ↵Evan Cheng2012-07-171-0/+28
| | | | | | instcombine transformation. llvm-svn: 160387
* Refactor the code that checks that all operands of a node are UNDEFs.Nadav Rotem2012-07-151-13/+7
| | | | | | | | | Add a micro-optimization to getNode of CONCAT_VECTORS when both operands are undefs. Can't find a testcase for this because VECTOR_SHUFFLE already handles undef operands, but Duncan suggested that we add this. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160229
* Add a dagcombine optimization to convert concat_vectors of undefs into a ↵Nadav Rotem2012-07-141-0/+11
| | | | | | | | single undef. The unoptimized concat_vectors isd prevented the canonicalization of the vector_shuffle node. llvm-svn: 160221
* Only apply the SETCC+SITOFP -> SELECTCC optimization when the SETCC returns ↵Owen Anderson2012-07-111-1/+2
| | | | | | | | an MVT::i1, i.e. before type legalization. This is a speculative fix for a problem on Mips reported by Akira Hatanaka. llvm-svn: 160036
* Improve the loading of load-anyext vectors by allowing the codegen to loadNadav Rotem2012-07-101-1/+1
| | | | | | | | | multiple scalars and insert them into a vector. Next, we shuffle the elements into the correct places, as before. Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the migration of bitcasts happened too late in the SelectionDAG process. llvm-svn: 159991
* Teach the DAG combiner to turn sitofp/uitofp from i1 into a conditional ↵Owen Anderson2012-07-091-0/+36
| | | | | | | | move, since there are only two possible values. Previously, this would become an integer extension operation, followed by a real integer->float conversion. llvm-svn: 159957
* Make sure type is not extended or untyped before create a constant of the ↵Evan Cheng2012-06-261-0/+4
| | | | | | type. No test case. Found by inspection. llvm-svn: 159179
* Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from aLang Hames2012-06-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | boolean flag to an enum: { Fast, Standard, Strict } (default = Standard). This option controls the creation by optimizations of fused FP ops that store intermediate results in higher precision than IEEE allows (E.g. FMAs). The behavior of this option is intended to match the behaviour specified by a soon-to-be-introduced frontend flag: '-ffuse-fp-ops'. Fast mode - allows formation of fused FP ops whenever they're profitable. Standard mode - allow fusion only for 'blessed' FP ops. At present the only blessed op is the fmuladd intrinsic. In the future more blessed ops may be added. Strict mode - allow fusion only if/when it can be proven that the excess precision won't effect the result. Note: This option only controls formation of fused ops by the optimizers. Fused operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic) will always be honored, regardless of the value of this option. Internally TargetOptions::AllowExcessFPPrecision has been replaced by TargetOptions::AllowFPOpFusion. llvm-svn: 158956
* Fix potential crash if DAGCombine on stores sees a half typePete Cooper2012-06-211-1/+2
| | | | llvm-svn: 158927
* Add users of a MERGE_VALUE node to the worklist to process again when the ↵Pete Cooper2012-06-201-0/+3
| | | | | | node is removed. Sorry, no test case. Foudn it by inspection of the code llvm-svn: 158839
* Fix DAGCombine to deal with ext-conversion of pre/post_inc loads.Hal Finkel2012-06-201-1/+8
| | | | | | The test case for this will come with the PPC indexed preinc loads commit. llvm-svn: 158822
* Add DAG-combines for aggressive FMA formation.Lang Hames2012-06-191-0/+43
| | | | | | | | | | | | | | | | | | | | This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or FSUB + FMUL. The combines are performed when: (a) Either AllowExcessFPPrecision option (-enable-excess-fp-precision for llc) OR UnsafeFPMath option (-enable-unsafe-fp-math) are set, and (b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of the FADD/FSUB, and (c) The FMUL only has one user (the FADD/FSUB). If your target has fast FMA instructions you can make use of these combines by overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for types supported by your FMA instruction, and adding patterns to match ISD::FMA to your FMA instructions. llvm-svn: 158757
* Make comment slightly more helpful.Lang Hames2012-06-141-1/+1
| | | | llvm-svn: 158467
* Switch the canonical FMA term operand order to match both the comment I ↵Owen Anderson2012-05-301-1/+1
| | | | | | wrote and the usual LLVM convention. llvm-svn: 157708
* Teach DAGCombine to canonicalize the position of a constant in the term ↵Owen Anderson2012-05-301-0/+4
| | | | | | operands of an FMA node. llvm-svn: 157707
* DAGCombiner should not change the type of an extract_vector index.Jim Grosbach2012-05-081-3/+4
| | | | | | | | | | When a combine twiddles an extract_vector, care should be take to preserve the type of the index operand. No luck extracting a reasonable testcase, unfortunately. rdar://11391009 llvm-svn: 156419
* Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled.Owen Anderson2012-05-071-0/+4
| | | | llvm-svn: 156324
* Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, ↵Owen Anderson2012-05-021-0/+18
| | | | | | just like it now knows for FMULs. llvm-svn: 156029
* Teach DAG combine that multiplication by 1.0 can always be constant folded.Owen Anderson2012-05-021-0/+3
| | | | llvm-svn: 156023
* ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2Elena Demikhovsky2012-04-221-0/+2
| | | | llvm-svn: 155309
* Register DAGUpdateListeners with SelectionDAG.Jakob Stoklund Olesen2012-04-201-48/+30
| | | | | | | | | | | | | | | Instead of passing listener pointers to RAUW, let SelectionDAG itself keep a linked list of interested listeners. This makes it possible to have multiple listeners active at once, like RAUWUpdateListener was already doing. It also makes it possible to register listeners up the call stack without controlling all RAUW calls below. DAGUpdateListener uses an RAII pattern to add itself to the SelectionDAG list of active listeners. llvm-svn: 155248
* Remove dead SD nodes after the combining pass. Fixes PR12201.Hal Finkel2012-04-161-0/+1
| | | | llvm-svn: 154786
* Reapply 154397. Original message:Nadav Rotem2012-04-111-11/+18
| | | | | | | | Fix a dagcombine optimization which assumes that the vsetcc result type is always of the same size as the compared values. This is ture for SSE/AVX/NEON but not for all targets. llvm-svn: 154490
* Add a comment noting that the fdiv -> fmul conversion won't generateDuncan Sands2012-04-101-3/+3
| | | | | | multiplication by a denormal, and some tests checking that. llvm-svn: 154431
* Revert r154397, which was causing make check failures on the buildbots.Owen Anderson2012-04-101-13/+6
| | | | llvm-svn: 154414
* Fix a dagcombine optimization which assumes that the vsetcc result type is ↵Nadav Rotem2012-04-101-6/+13
| | | | | | | | | always of the same size as the compared values. This is ture for SSE/AVX/NEON but not for all targets. llvm-svn: 154397
* Transform div to mul with reciprocal only when fp imm is legal.Anton Korobeynikov2012-04-101-2/+9
| | | | | | This fixes PR12516 and uncovers one weird problem in legalize (workarounded) llvm-svn: 154394
* Don't try to zExt just to check if an integer constant is zero, it mightRafael Espindola2012-04-101-2/+2
| | | | | | not fit in a i64. llvm-svn: 154364
* Pattern match a setcc of boolean value with 0 as a truncate.Rafael Espindola2012-04-091-9/+48
| | | | llvm-svn: 154322
* Remove unnecessary type check when combining and/or/xor of swizzles. Move ↵Craig Topper2012-04-091-13/+12
| | | | | | some checks to allow better early out. llvm-svn: 154309
* Remove unnecessary 'else' on an 'if' that always returnsCraig Topper2012-04-091-1/+2
| | | | llvm-svn: 154308
* Optimize code slightly. No functionality change.Craig Topper2012-04-091-6/+7
| | | | llvm-svn: 154307
* Replace some explicit checks with asserts for conditions that should never ↵Craig Topper2012-04-091-14/+7
| | | | | | happen. llvm-svn: 154305
* Silence sign-compare warning.Benjamin Kramer2012-04-081-1/+1
| | | | llvm-svn: 154297
* Only have codegen turn fdiv by a constant into fmul by the reciprocalDuncan Sands2012-04-081-5/+3
| | | | | | | | when -ffast-math, i.e. don't just always do it if the reciprocal can be formed exactly. There is already an IR level transform that does that, and it does it more carefully. llvm-svn: 154296
* 1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a newNadav Rotem2012-04-071-12/+14
| | | | | | | | | | shuffle node because it could introduce new shuffle nodes that were not supported efficiently by the target. 2. Add a more restrictive shuffle-of-shuffle optimization for cases where the second shuffle reverses the transformation of the first shuffle. llvm-svn: 154266
* Convert floating point division by a constant into multiplication by theDuncan Sands2012-04-071-0/+13
| | | | | | | | reciprocal if converting to the reciprocal is exact. Do it even if inexact if -ffast-math. This substantially speeds up ac.f90 from the polyhedron benchmarks. llvm-svn: 154265
* Always compute all the bits in ComputeMaskedBits.Rafael Espindola2012-04-041-15/+10
| | | | | | | | This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011
* Add predicates for checking whether targets have free FNEG and FABS ↵Owen Anderson2012-04-021-3/+5
| | | | | | operations, and prevent the DAGCombiner from turning them into bitwise operations if they do. llvm-svn: 153901
OpenPOWER on IntegriCloud