|  | Commit message | Author | Age | Files | Lines | 
|---|
| | 
| 
| 
| | llvm-svn: 163256 | 
| | 
| 
| 
| 
| 
| | types. The previous code was making the assumption that the length of the bitmask returned by isConstantSplat was equal to the size of the vector type. Now we first make sure that the splat value has at least the length of the vector lane type, then we only use as many fields as we have available in the splat value.
llvm-svn: 163203 | 
| | 
| 
| 
| 
| 
| | fast-math mode.
llvm-svn: 163051 | 
| | 
| 
| 
| | llvm-svn: 163049 | 
| | 
| 
| 
| 
| 
| | constants.  This is only enabled in unsafe FP math mode, since it does not preserve rounding effects for all such constants.
llvm-svn: 162956 | 
| | 
| 
| 
| 
| 
| | approach. We need to insert some valid TRUNCATE node here.
llvm-svn: 162354 | 
| | 
| | The DAGCombiner tries to optimise a BUILD_VECTOR by checking if it
consists purely of get_vector_elts from one or two source vectors. If
so, it either makes a concat_vectors node or a shufflevector node.
However, it doesn't check the element type width of the underlying
vector, so if you have this sequence:
Node0: v4i16 = ...
Node1: i32 = extract_vector_elt Node0
Node2: i32 = extract_vector_elt Node0
Node3: v16i8 = BUILD_VECTOR Node1, Node2, ...
It will attempt to:
Node0:    v4i16 = ...
NewNode1: v16i8 = concat_vectors Node0, ...
Where this is actually invalid because the element width is completely
different. This causes an assertion failure in the DAG legalization stage.
Fix:
If the output element type of BUILD_VECTOR differs from the input element type,
make the concat_vectors based on the input element type and then bitcast it to
the output vector type. So the case described above will be transformed to:
Node0:    v4i16 = ...
NewNode1: v8i16 = concat_vectors Node0, ...
NewNode2: v16i8 = bitcast NewNode1
llvm-svn: 162195 | 
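The type arithmetic behind this fix can be sketched in standalone C++. This is an illustrative model only, not the DAGCombiner code: `VecTy`, `concatTypeFor`, and `needsBitcast` are hypothetical names, and the sketch assumes the concat result must cover the same number of bits as the BUILD_VECTOR result using whole copies of the input vector.

```cpp
#include <cassert>

// Hypothetical stand-in for a vector value type such as v4i16.
struct VecTy {
  unsigned NumElts;  // number of lanes
  unsigned EltBits;  // width of each lane in bits
  unsigned totalBits() const { return NumElts * EltBits; }
};

// Pick the type of the intermediate concat_vectors: it keeps the *input*
// element width and spans as many bits as the BUILD_VECTOR result.
// Returns false when the output is not a whole number of input vectors.
bool concatTypeFor(VecTy InTy, VecTy OutTy, VecTy &ConcatTy) {
  if (InTy.NumElts == 0 || InTy.EltBits == 0)
    return false;
  if (OutTy.totalBits() % InTy.totalBits() != 0)
    return false;
  ConcatTy.NumElts = OutTy.totalBits() / InTy.EltBits;
  ConcatTy.EltBits = InTy.EltBits;
  return true;
}

// The extra bitcast is only required when the element widths differ.
bool needsBitcast(VecTy ConcatTy, VecTy OutTy) {
  return ConcatTy.EltBits != OutTy.EltBits;
}
```

For the case in the commit message, an input of v4i16 and an output of v16i8 yields a v8i16 concat followed by a bitcast.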
| | 
| 
| 
| 
| 
| | various rounding modes.  Use this to implement SelectionDAG constant folding of FFLOOR, FCEIL, and FTRUNC.
llvm-svn: 161807 | 
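Constant folding of FFLOOR, FCEIL, and FTRUNC amounts to evaluating the rounding operation at compile time when the operand is a known constant. A minimal sketch follows, using the C++ standard library on `double` rather than APFloat; the `RoundOp` enum and the `foldRound` helper are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical opcode tags mirroring the three node kinds named above.
enum class RoundOp { FFLOOR, FCEIL, FTRUNC };

// Fold a rounding operation applied to a constant operand.
double foldRound(RoundOp Op, double C) {
  switch (Op) {
  case RoundOp::FFLOOR:
    return std::floor(C);  // round toward negative infinity
  case RoundOp::FCEIL:
    return std::ceil(C);   // round toward positive infinity
  case RoundOp::FTRUNC:
    return std::trunc(C);  // round toward zero
  }
  return C;  // unreachable for valid opcodes
}
```

The three operations differ only in rounding direction, which is why a single rounding-mode-aware primitive is enough to implement all of them.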
| | 
| 
| 
| | llvm-svn: 161110 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | that do not support it (X86 does not lower select_cc).
PR: 13428
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
llvm-svn: 160619 | 
| | 
| 
| 
| | llvm-svn: 160475 | 
| | 
| 
| 
| 
| 
| | instcombine transformation.
llvm-svn: 160387 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Add a micro-optimization to getNode of CONCAT_VECTORS when both operands are undefs.
Can't find a testcase for this because VECTOR_SHUFFLE already handles undef operands, but Duncan suggested that we add this.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>
llvm-svn: 160229 | 
| | 
| 
| 
| 
| 
| 
| 
| | single undef.
The unoptimized concat_vectors node prevented the canonicalization of the vector_shuffle node.
llvm-svn: 160221 | 
| | 
| 
| 
| 
| 
| 
| 
| | an MVT::i1, i.e. before type legalization.
This is a speculative fix for a problem on Mips reported by Akira Hatanaka.
llvm-svn: 160036 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | multiple scalars and insert them into a vector. Next, we shuffle the elements
into the correct places, as before.
Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the
migration of bitcasts happened too late in the SelectionDAG process.
llvm-svn: 159991 | 
| | 
| 
| 
| 
| 
| 
| 
| | move, since there are only two possible values.
Previously, this would become an integer extension operation, followed by a real integer->float conversion.
llvm-svn: 159957 | 
| | 
| 
| 
| 
| 
| | type. No test case. Found by inspection.
llvm-svn: 159179 | 
| | 
| | boolean flag to an enum: { Fast, Standard, Strict } (default = Standard).
This option controls the creation by optimizations of fused FP ops that store
intermediate results in higher precision than IEEE allows (e.g. FMAs). The
behavior of this option is intended to match the behavior specified by a
soon-to-be-introduced frontend flag: '-ffuse-fp-ops'.
Fast mode - allow formation of fused FP ops whenever they're profitable.
Standard mode - allow fusion only for 'blessed' FP ops. At present the only
blessed op is the fmuladd intrinsic. In the future more blessed ops may be
added.
Strict mode - allow fusion only if/when it can be proven that the excess
precision won't affect the result.
Note: This option only controls formation of fused ops by the optimizers.  Fused
operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic)
will always be honored, regardless of the value of this option.
Internally TargetOptions::AllowExcessFPPrecision has been replaced by
TargetOptions::AllowFPOpFusion.
llvm-svn: 158956 | 
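The three modes can be read as a small decision function. The following is a hedged sketch, not the LLVM implementation: `FPOpFusionMode`, `FusionCandidate`, and `mayFormFusedOp` are hypothetical names, and the three boolean fields stand in for whatever analysis an optimizer would actually perform.

```cpp
#include <cassert>

// Hypothetical model of the three-state option described above.
enum class FPOpFusionMode { Fast, Standard, Strict };

// Hypothetical summary of a fusion opportunity found by an optimizer.
struct FusionCandidate {
  bool Profitable;          // fusing is expected to be a win on the target
  bool IsBlessedOp;         // e.g. originates from the fmuladd intrinsic
  bool ProvablySameResult;  // excess precision cannot change the result
};

// Decide whether an optimizer may form a fused op. Explicitly requested
// fused ops (e.g. llvm.fma.*) never pass through this check; they are
// always honored.
bool mayFormFusedOp(FPOpFusionMode Mode, const FusionCandidate &C) {
  switch (Mode) {
  case FPOpFusionMode::Fast:
    return C.Profitable;
  case FPOpFusionMode::Standard:
    return C.IsBlessedOp;
  case FPOpFusionMode::Strict:
    return C.ProvablySameResult;
  }
  return false;  // unreachable for valid modes
}
```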
| | 
| 
| 
| | llvm-svn: 158927 | 
| | 
| 
| 
| 
| 
| | node is removed.  Sorry, no test case.  Found it by inspection of the code.
llvm-svn: 158839 | 
| | 
| 
| 
| 
| 
| | The test case for this will come with the PPC indexed preinc loads commit.
llvm-svn: 158822 | 
| | 
| | This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or
FSUB + FMUL. The combines are performed when:
(a) Either
      AllowExcessFPPrecision option (-enable-excess-fp-precision for llc)
        OR
      UnsafeFPMath option (-enable-unsafe-fp-math)
    are set, and
(b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of
    the FADD/FSUB, and
(c) The FMUL only has one user (the FADD/FSUB).
If your target has fast FMA instructions you can make use of these combines by
overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for
types supported by your FMA instruction, and adding patterns to match ISD::FMA
to your FMA instructions.
llvm-svn: 158757 | 
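Conditions (a) through (c) combine into a single predicate. The sketch below models them in standalone C++; the `Options` and `Candidate` structs and the `shouldFormFMA` function are hypothetical, with the target hook's answer represented as a plain boolean.

```cpp
#include <cassert>

// Hypothetical stand-ins for the two options named in condition (a).
struct Options {
  bool AllowExcessFPPrecision;  // -enable-excess-fp-precision
  bool UnsafeFPMath;            // -enable-unsafe-fp-math
};

// Hypothetical summary of an FADD/FSUB whose operand is an FMUL.
struct Candidate {
  bool FMAFasterThanMulAndAdd;  // answer of the target hook for this type
  unsigned FMulUses;            // number of users of the FMUL
};

bool shouldFormFMA(const Options &O, const Candidate &C) {
  bool OptionSet = O.AllowExcessFPPrecision || O.UnsafeFPMath;  // (a)
  return OptionSet &&
         C.FMAFasterThanMulAndAdd &&  // (b)
         C.FMulUses == 1;             // (c)
}
```

Condition (c) matters because an FMUL with other users would still have to be computed separately, so fusing it would not remove any work.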
| | 
| 
| 
| | llvm-svn: 158467 | 
| | 
| 
| 
| 
| 
| | wrote and the usual LLVM convention.
llvm-svn: 157708 | 
| | 
| 
| 
| 
| 
| | operands of an FMA node.
llvm-svn: 157707 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | When a combine twiddles an extract_vector, care should be taken to preserve
the type of the index operand. No luck extracting a reasonable testcase,
unfortunately.
rdar://11391009
llvm-svn: 156419 | 
| | 
| 
| 
| | llvm-svn: 156324 | 
| | 
| 
| 
| 
| 
| | just like it now knows for FMULs.
llvm-svn: 156029 | 
| | 
| 
| 
| | llvm-svn: 156023 | 
| | 
| 
| 
| | llvm-svn: 155309 | 
| | 
| | Instead of passing listener pointers to RAUW, let SelectionDAG itself
keep a linked list of interested listeners.
This makes it possible to have multiple listeners active at once, like
RAUWUpdateListener was already doing. It also makes it possible to
register listeners up the call stack without controlling all RAUW calls
below.
DAGUpdateListener uses an RAII pattern to add itself to the SelectionDAG
list of active listeners.
llvm-svn: 155248 | 
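The RAII registration scheme can be illustrated with a small standalone model. This is a sketch under assumptions, not the SelectionDAG code: `Graph`, `UpdateListener`, `CountingListener`, and the integer node ids are hypothetical, and the sketch assumes listeners are destroyed in reverse order of construction, as stack-allocated objects are.

```cpp
#include <cassert>

class Graph;

// A listener links itself into the graph's list on construction and
// unlinks itself on destruction.
class UpdateListener {
  Graph &Owner;
  UpdateListener *Next;  // listener registered before this one, if any

public:
  explicit UpdateListener(Graph &G);
  virtual ~UpdateListener();
  virtual void nodeReplaced(int From, int To) = 0;
  UpdateListener *next() const { return Next; }
};

class Graph {
  friend class UpdateListener;
  UpdateListener *Listeners = nullptr;  // head of the intrusive list

public:
  // Callers no longer pass a listener; every registered one is notified.
  void replaceAllUsesWith(int From, int To) {
    for (UpdateListener *L = Listeners; L; L = L->next())
      L->nodeReplaced(From, To);
  }
};

inline UpdateListener::UpdateListener(Graph &G)
    : Owner(G), Next(G.Listeners) {
  G.Listeners = this;
}

inline UpdateListener::~UpdateListener() {
  assert(Owner.Listeners == this && "listeners must unregister in LIFO order");
  Owner.Listeners = Next;
}

// Example listener that counts the replacements it observes.
struct CountingListener : UpdateListener {
  int Count;
  explicit CountingListener(Graph &G) : UpdateListener(G), Count(0) {}
  void nodeReplaced(int, int) override { ++Count; }
};
```

Because registration is tied to object lifetime, a caller higher up the stack can observe replacements made by code it does not control, and nested scopes can add further listeners without disturbing it.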
| | 
| 
| 
| | llvm-svn: 154786 | 
| | 
| 
| 
| 
| 
| 
| 
| | Fix a dagcombine optimization which assumes that the vsetcc result type is always
of the same size as the compared values. This is true for SSE/AVX/NEON but not
for all targets.
llvm-svn: 154490 | 
| | 
| 
| 
| 
| 
| | multiplication by a denormal, and some tests checking that.
llvm-svn: 154431 | 
| | 
| 
| 
| | llvm-svn: 154414 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | always
of the same size as the compared values. This is true for SSE/AVX/NEON but not
for all targets.
llvm-svn: 154397 | 
| | 
| 
| 
| 
| 
| | This fixes PR12516 and uncovers one weird problem in legalize (worked around)
llvm-svn: 154394 | 
| | 
| 
| 
| 
| 
| | not fit in an i64.
llvm-svn: 154364 | 
| | 
| 
| 
| | llvm-svn: 154322 | 
| | 
| 
| 
| 
| 
| | some checks to allow better early out.
llvm-svn: 154309 | 
| | 
| 
| 
| | llvm-svn: 154308 | 
| | 
| 
| 
| | llvm-svn: 154307 | 
| | 
| 
| 
| 
| 
| | happen.
llvm-svn: 154305 | 
| | 
| 
| 
| | llvm-svn: 154297 | 
| | 
| 
| 
| 
| 
| 
| 
| | when -ffast-math, i.e. don't just always do it if the reciprocal can
be formed exactly.  There is already an IR level transform that does
that, and it does it more carefully.
llvm-svn: 154296 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | shuffle node because it could introduce new shuffle nodes that were not
   supported efficiently by the target.
2. Add a more restrictive shuffle-of-shuffle optimization for cases where the
   second shuffle reverses the transformation of the first shuffle.
llvm-svn: 154266 | 
| | 
| 
| 
| 
| 
| 
| 
| | reciprocal if converting to the reciprocal is exact.  Do it even if inexact
if -ffast-math.  This substantially speeds up ac.f90 from the polyhedron
benchmarks.
llvm-svn: 154265 | 
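Whether converting to the reciprocal is exact can be tested directly on the constant. The following standalone sketch works on `double`; `getExactReciprocal` is a hypothetical helper, and the sketch assumes "exact" means the divisor is a power of two whose reciprocal is a normal number, so that dividing by C and multiplying by 1/C agree for every input.

```cpp
#include <cassert>
#include <cmath>

// Returns true and sets Recip when 1/C is exactly representable as a
// normal double, making X / C interchangeable with X * Recip.
bool getExactReciprocal(double C, double &Recip) {
  if (C == 0.0 || !std::isfinite(C))
    return false;
  int Exp = 0;
  // frexp splits C into Mant * 2^Exp with |Mant| in [0.5, 1).
  double Mant = std::frexp(C, &Exp);
  if (std::fabs(Mant) != 0.5)
    return false;  // C is not a power of two
  double R = 1.0 / C;
  if (!std::isnormal(R))
    return false;  // reciprocal overflowed or is denormal
  Recip = R;
  return true;
}
```

A divisor such as 4.0 qualifies (its reciprocal is 0.25), while 3.0 does not, since 1/3 has no finite binary representation.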
| | 
| 
| 
| 
| 
| 
| 
| | This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.
llvm-svn: 154011 | 
| | 
| 
| 
| 
| 
| | operations, and prevent the DAGCombiner from turning them into bitwise operations if they do.
llvm-svn: 153901 |