| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
RAII helper.
Defines away the issue where cast<Instruction> would fail because constant
folding happened. Also slightly cleaner.
llvm-svn: 191674
|
|
|
|
|
|
|
|
|
| |
when it was actually a Constant*.
There are quite a few other casts to Instruction that might have the same problem,
but this is the only one I have a test case for.
llvm-svn: 191668
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If "C1/X" were having multiple uses, the only benefit of this
transformation is to potentially shorten critical path. But it is at the
cost of instroducing additional div.
The additional div may or may not incur cost depending on how div is
implemented. If it is implemented using Newton–Raphson iteration, it dosen't
seem to incur any cost (FIXME). However, if the div blocks the entire
pipeline, that sounds to be pretty expensive. Let CodeGen to take care
this transformation.
This patch sees 6% on a benchmark.
rdar://15032743
llvm-svn: 191037
|
|
|
|
|
|
| |
for consistency.
llvm-svn: 187225
|
|
|
|
| |
llvm-svn: 186759
|
|
|
|
| |
llvm-svn: 186533
|
|
|
|
| |
llvm-svn: 186235
|
|
|
|
|
|
|
|
|
|
|
| |
This transform was originally added in r185257 but later removed in
r185415. The original transform would create instructions speculatively
and then discard them if the speculation was proved incorrect. This has
been replaced with a scheme that splits the transform into two parts:
preflight and fold. While we preflight, we build up fold actions that
inform the folding stage on how to act.
llvm-svn: 185667
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
'select' denoms)
I'm reverting this commit because:
1. As discussed during review, it needs to be rewritten (to avoid creating and
then deleting instructions).
2. This is causing optimizer crashes. Specifically, I'm seeing things like
this:
While deleting: i1 %
Use still stuck around after Def is destroyed: <badref> = select i1 <badref>, i32 0, i32 1
opt: /src/llvm-trunk/lib/IR/Value.cpp:79: virtual llvm::Value::~Value(): Assertion `use_empty() && "Uses remain when a value is destroyed!"' failed.
I'd guess that these will go away once we're no longer creating/deleting
instructions here, but just in case, I'm adding a regression test.
Because the code is bring rewritten, I've just XFAIL'd the original regression test. Original commit message:
InstCombine: Be more agressive optimizing 'udiv' instrs with 'select' denoms
Real world code sometimes has the denominator of a 'udiv' be a
'select'. LLVM can handle such cases but only when the 'select'
operands are symmetric in structure (both select operands are a constant
power of two or a left shift, etc.). This falls apart if we are dealt a
'udiv' where the code is not symetric or if the select operands lead us
to more select instructions.
Instead, we should treat the LHS and each select operand as a distinct
divide operation and try to optimize them independently. If we can
to simplify each operation, then we can replace the 'udiv' with, say, a
'lshr' that has a new select with a bunch of new operands for the
select.
llvm-svn: 185415
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Real world code sometimes has the denominator of a 'udiv' be a
'select'. LLVM can handle such cases but only when the 'select'
operands are symmetric in structure (both select operands are a constant
power of two or a left shift, etc.). This falls apart if we are dealt a
'udiv' where the code is not symetric or if the select operands lead us
to more select instructions.
Instead, we should treat the LHS and each select operand as a distinct
divide operation and try to optimize them independently. If we can
to simplify each operation, then we can replace the 'udiv' with, say, a
'lshr' that has a new select with a bunch of new operands for the
select.
llvm-svn: 185257
|
|
|
|
|
|
| |
!APFloat.isDenormal => APFloat.isNormal.
llvm-svn: 185037
|
|
|
|
|
|
|
|
| |
APFloat::isFiniteNonZero.
Turns out all the references were in llvm and not in clang.
llvm-svn: 184356
|
|
|
|
| |
llvm-svn: 183461
|
|
|
|
|
|
| |
Patch by Andrea Di Biagio.
llvm-svn: 183005
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The earlier change list introduced the following inst combines:
B * (uitofp i1 C) —> select C, B, 0
A * (1 - uitofp i1 C) —> select C, 0, A
select C, 0, B + select C, A, 0 —> select C, A, B
Together these 3 changes would simplify :
A * (1 - uitofp i1 C) + B * uitofp i1 C
down to :
select C, B, A
In practice we found that the first two substitutions can have a
negative effect on performance, because they reduce opportunities to
use FMA contractions; between the two options FMAs are often the
better choice. This change list amends the previous one to enable
just these inst combines:
select C, B, 0 + select C, 0, A —> select C, B, A
A * (1 - uitofp i1 C) + B * uitofp i1 C —> select C, B, A
llvm-svn: 182499
|
|
|
|
| |
llvm-svn: 181848
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are two transforms in visitUrem that conflict with each other.
*) One, if a divisor is a power of two, subtracts one from the divisor
and turns it into a bitwise-and.
*) The other unwraps both operands if they are surrounded by zext
instructions.
Flipping the order allows the subtraction to go beneath the sign
extension.
llvm-svn: 181668
|
|
|
|
|
|
|
| |
Use isKnownToBeAPowerOfTwo in visitUrem so that we may more aggressively
fold away urem instructions.
llvm-svn: 181661
|
|
|
|
|
|
| |
PR15952.
llvm-svn: 181586
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A * (1 - (uitofp i1 C)) -> select C, 0, A
B * (uitofp i1 C) -> select C, B, 0
select C, 0, A + select C, B, 0 -> select C, B, A
These come up in code that has been hand-optimized from a select to a linear blend,
on platforms where that may have mattered. We want to undo such changes
with the following transform:
A*(1 - uitofp i1 C) + B*(uitofp i1 C) -> select C, A, B
llvm-svn: 181216
|
|
|
|
| |
llvm-svn: 178915
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The instcombine recognized pattern looks like:
a = b * c
d = a +/- Cst
or
a = b * c
d = Cst +/- a
When creating the new operands for fadd or fsub instruction following the related fmul, the first operand was created with the second original operand (M0 was created with C1) and the second with the first (M1 with Opnd0).
The fix consists in creating the new operands with the appropriate original operand, i.e., M0 with Opnd0 and M1 with C1.
llvm-svn: 176300
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
some optimization opportunities (in the enclosing supper-expressions).
rule 1. (-0.0 - X ) * Y => -0.0 - (X * Y)
if expression "-0.0 - X" has only one reference.
rule 2. (0.0 - X ) * Y => -0.0 - (X * Y)
if expression "0.0 - X" has only one reference, and
the instruction is marked "noSignedZero".
2. Eliminate negation (The compiler was already able to handle these
opt if the 0.0s are replaced with -0.0.)
rule 3: (0.0 - X) * (0.0 - Y) => X * Y
rule 4: (0.0 - X) * C => X * -C
if the expr is flagged "noSignedZero".
3.
Rule 5: (X*Y) * X => (X*X) * Y
if X!=Y and the expression is flagged with "UnsafeAlgebra".
The purpose of this transformation is two-fold:
a) to form a power expression (of X).
b) potentially shorten the critical path: After transformation, the
latency of the instruction Y is amortized by the expression of X*X,
and therefore Y is in a "less critical" position compared to what it
was before the transformation.
4. Remove the InstCombine code about simplifiying "X * select".
The reasons are following:
a) The "select" is somewhat architecture-dependent, therefore the
higher level optimizers are not able to precisely predict if
the simplification really yields any performance improvement
or not.
b) The "select" operator is bit complicate, and tends to obscure
optimization opportunities. It is btter to keep it as low as
possible in expr tree, and let CodeGen to tackle the optimization.
llvm-svn: 172551
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
---------------------------------------------------------------------------
C_A: reassociation is allowed
C_R: reciprocal of a constant C is appropriate, which means
- 1/C is exact, or
- reciprocal is allowed and 1/C is neither a special value nor a denormal.
-----------------------------------------------------------------------------
rule1: (X/C1) / C2 => X / (C2*C1) (if C_A)
=> X * (1/(C2*C1)) (if C_A && C_R)
rule 2: X*C1 / C2 => X * (C1/C2) if C_A
rule 3: (X/Y)/Z = > X/(Y*Z) (if C_A && at least one of Y and Z is symbolic value)
rule 4: Z/(X/Y) = > (Z*Y)/X (similar to rule3)
rule 5: C1/(X*C2) => (C1/C2) / X (if C_A)
rule 6: C1/(X/C2) => (C1*C2) / X (if C_A)
rule 7: C1/(C2/X) => (C1/C2) * X (if C_A)
llvm-svn: 172488
|
|
|
|
|
|
| |
Thank Eric Christopher for figuring out these problems!
llvm-svn: 171805
|
|
|
|
|
|
|
|
|
|
|
| |
o. X/C1 * C2 => X * (C2/C1) (if C2/C1 is neither special FP nor denormal)
o. X/C1 * C2 -> X/(C1/C2) (if C2/C1 is either specical FP or denormal, but C1/C2 is a normal Fp)
Let MDC denote multiplication or dividion with one & only one operand being a constant
o. (MDC ± C1) * C2 => (MDC * C2) ± (C1 * C2)
(so long as the constant-folding doesn't yield any denormal or special value)
llvm-svn: 171793
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.
There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.
The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.
I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).
I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.
llvm-svn: 171366
|
|
|
|
|
|
| |
Implement rule : "x * (select cond 1.0, 0.0) -> select cond x, 0.0"
llvm-svn: 170226
|
|
|
|
|
|
|
|
| |
In a previous thread it was pointed out that isPowerOfTwo is not a very precise
name since it can return false for powers of two if it is unable to show that
they are powers of two.
llvm-svn: 170093
|
|
|
|
|
|
|
|
|
|
| |
been used in the first place. It simply was passed to the function and to the
recursive invocations. Simply drop the parameter and update the callers for the
new signature.
Patch by Saleem Abdulrasool!
llvm-svn: 169988
|
|
|
|
|
|
| |
functions from SimplifyInstruction
llvm-svn: 169941
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.
Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]
llvm-svn: 169131
|
|
|
|
|
|
| |
nested ifs
llvm-svn: 169049
|
|
|
|
| |
llvm-svn: 169043
|
|
|
|
|
|
| |
reviewed by Michael Ilseman <milseman@apple.com>
llvm-svn: 169025
|
|
|
|
| |
llvm-svn: 165402
|
|
|
|
|
|
| |
See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767
llvm-svn: 164768
|
|
|
|
| |
llvm-svn: 164767
|
|
|
|
|
|
|
|
| |
a value that is zext'd.
Fixes PR13250.
llvm-svn: 164377
|
|
|
|
| |
llvm-svn: 162911
|
|
|
|
|
|
|
|
| |
because C always rounds towards zero.
Thanks Dirk and Ben.
llvm-svn: 162899
|
|
|
|
|
|
|
|
|
| |
the bit width.
No test case, undefined shifts get folded early, but can occur when other
transforms generate a constant. Thanks to Duncan for bringing this up.
llvm-svn: 162755
|
|
|
|
|
|
| |
and non-const shifts.
llvm-svn: 162751
|
|
|
|
|
|
| |
Thanks Benjamin for noticing this.
llvm-svn: 162749
|
|
|
|
|
|
|
|
|
|
| |
For example:
%1 = lshr i32 %x, 2
%2 = udiv i32 %1, 100
rdar://12182093
llvm-svn: 162743
|
|
|
|
|
|
| |
rdar://11721329
llvm-svn: 158946
|
|
|
|
|
|
| |
instead of always using ConstantVector.
llvm-svn: 149912
|
|
|
|
|
|
|
| |
we should (theoretically optimize and codegen ConstantDataVector as well
as ConstantVector.
llvm-svn: 149116
|
|
|
|
| |
llvm-svn: 148929
|
|
|
|
|
|
| |
Fixes r8429
llvm-svn: 144036
|