| Commit message (Collapse) | Author | Age | Files | Lines | 
| ... |  | 
| | 
| 
| 
|  | 
llvm-svn: 177863
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
This simplification happens in two places:
 - using the nsw attribute when the shl / mul is used by a sign test
 - when the shl / mul is compared for (in)equality to zero
llvm-svn: 177856
 | 
| | 
|  | 
(gep GV, i)) C) to a bit test.
The original code used i32, and i64 if legal. This introduced unneeded
casts when they aren't legal, or when the index variable i has another
type. In order of preference: try to use i's type; use the smallest
fitting legal type (using an added DataLayout method); default to i32.
A testcase checks that this works when the index gep operand is i16.
Patch by: Ahmed Bougacha <ahmed.bougacha@gmail.com>
Reviewed by: Duncan
llvm-svn: 177712
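As a rough illustration of the fold described in r177712, here is a sketch in Python rather than LLVM IR. The names `TABLE`, `C`, `build_mask`, and the two compare functions are hypothetical and only model the idea: comparing a load from a constant table against a constant can be replaced by testing one bit of a precomputed mask.

```python
# Hypothetical constant table standing in for the global GV.
TABLE = [7, 3, 7, 0, 7, 1]
# Hypothetical constant the loaded value is compared against.
C = 7


def build_mask(table, c):
    """Set bit i when table[i] == c (computed once, ahead of time)."""
    mask = 0
    for i, value in enumerate(table):
        if value == c:
            mask |= 1 << i
    return mask


MASK = build_mask(TABLE, C)


def compare_via_load(i):
    """Original form: load the element, then compare it with C."""
    return TABLE[i] == C


def compare_via_bit_test(i):
    """Folded form: shift the mask by the index and test the low bit."""
    return (MASK >> i) & 1 == 1
```

The mask needs one bit per table element, which is why the width of the integer type used for the bit test matters in the commit message above.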
 | 
| | 
|  | 
Rules include:
  1) x*y +/- x*z => x*(y +/- z)
     (the order of operands doesn't matter)
  2) y/x +/- z/x => (y +/- z)/x
The transformation is disabled if the new add/sub expr "y +/- z" is a
denormal/NaN/infinity.
rdar://12911472
llvm-svn: 177088
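The guard mentioned in r177088 can be sketched as follows. This is a Python model, not the InstCombine code; `is_acceptable_sum` and `factor_mul` are hypothetical names, and treating zero as acceptable is an assumption, since the message only lists denormal, NaN, and infinity.

```python
import math
import sys


def is_acceptable_sum(v):
    """Return False when v is a denormal, NaN, or infinity."""
    if math.isnan(v) or math.isinf(v):
        return False
    # Assumption: zero is fine; only nonzero values below the smallest
    # normal double are treated as denormal.
    return v == 0.0 or abs(v) >= sys.float_info.min


def factor_mul(x, y, z):
    """Rewrite x*y + x*z as x*(y + z) unless the new sum is rejected."""
    s = y + z
    if not is_acceptable_sum(s):
        return x * y + x * z  # keep the original form
    return x * s
```

For example, `factor_mul(0.5, 1e308, 1e308)` keeps the original form because `1e308 + 1e308` overflows to infinity.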
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
%v, C1), C2 :
Only combine when the shl is only used by the icmp
llvm-svn: 176950
 | 
| | 
| 
| 
|  | 
llvm-svn: 176765
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
When considering folding a bitcast of an alloca into the alloca itself,
make sure we don't shrink the amount of memory being allocated, or
things rapidly go sideways.
rdar://13324424
llvm-svn: 176547
 | 
| | 
|  | 
The instcombine recognized pattern looks like:
a = b * c
d = a +/- Cst
or
a = b * c
d = Cst +/- a
When creating the new operands for the fadd or fsub instruction that follows the related fmul, the first operand was created with the second original operand (M0 was created with C1) and the second with the first (M1 with Opnd0).
The fix creates the new operands from the appropriate original operands, i.e., M0 with Opnd0 and M1 with C1.
llvm-svn: 176300
 | 
| | 
|  | 
(or (bool?A:B),(bool?C:D)) --> (bool?(or A,C):(or B,D))
By the time the OR is visited, both the SELECTs have been visited and not
optimized and the OR itself hasn't been transformed so we do this transform in
the hopes that the new ORs will be optimized.
The transform is explicitly disabled for vector-selects until "codegen matures
to handle them better".
Patch by Muhammad Tauqir!
llvm-svn: 175380
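The identity behind r175380 can be checked exhaustively on small integers. The sketch below is a Python model with hypothetical function names; it only verifies that both forms compute the same value when the two selects share one condition.

```python
from itertools import product


def or_of_selects(cond, a, b, c, d):
    """(or (cond ? a : b), (cond ? c : d))"""
    return (a if cond else b) | (c if cond else d)


def select_of_ors(cond, a, b, c, d):
    """(cond ? (or a, c) : (or b, d))"""
    return (a | c) if cond else (b | d)


def forms_agree(bits=2):
    """Compare both forms for every combination of small operands."""
    values = range(1 << bits)
    return all(
        or_of_selects(cond, a, b, c, d) == select_of_ors(cond, a, b, c, d)
        for cond in (False, True)
        for a, b, c, d in product(values, repeat=4)
    )
```

The rewritten form has one select instead of two, and the new ORs get another chance to be simplified.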
 | 
| | 
| 
| 
| 
| 
|  | 
types..."
llvm-svn: 175273
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
|  | 
This makes it possible to work with a smaller constant, which is friendlier to targets that can compare against immediates.
It also avoids inserting a shift in favor of a trunc, which can be free on some targets.
This used to work until LLVM-3.1, but regressed with the 3.2 release.
llvm-svn: 175270
 | 
| | 
| 
| 
| 
| 
|  | 
visitSExt is an adapted copy of the related visitZExt method, so adapt the comment accordingly.
llvm-svn: 175019
 | 
| | 
| 
| 
| 
| 
|  | 
bitcast X to ...
llvm-svn: 174905
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
included."
This reverts commit 3854a5d90fee52af1065edbed34521fff6cdc18d.
This causes a clang unit test to hang: vtable-available-externally.cpp.
llvm-svn: 174692
 | 
| | 
| 
| 
|  | 
llvm-svn: 174675
 | 
| | 
| 
| 
|  | 
llvm-svn: 174571
 | 
| | 
| 
| 
|  | 
llvm-svn: 174438
 | 
| | 
| 
| 
| 
| 
|  | 
Found by running instcombine on a fabricated test case for the constant folder.
llvm-svn: 174430
 | 
| | 
| 
| 
| 
| 
|  | 
transformation is illegal.
llvm-svn: 174156
 | 
| | 
| 
| 
|  | 
llvm-svn: 174152
 | 
| | 
| 
| 
| 
| 
| 
|  | 
There are still places which treat the Attribute object as a collection of
attributes. I'm systematically removing them.
llvm-svn: 173990
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
sext-not-and --> select.
Patch by Muhammad Tauqir Ahmad.
llvm-svn: 173901
 | 
| | 
| 
| 
| 
| 
| 
|  | 
In the future, AttributeWithIndex won't be used anymore. Besides, it exposes the
internals of the AttributeSet to outside users, which isn't goodness.
llvm-svn: 173602
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
The 'getSlot' function and its ilk allow introspection into the AttributeSet
class. However, that class should be opaque. Allow access through accessor
methods instead.
llvm-svn: 173522
 | 
| | 
| 
| 
|  | 
llvm-svn: 173499
 | 
| | 
| 
| 
|  | 
llvm-svn: 173322
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
|  | 
long gone."
This causes crashes during the build of compiler-rt during selfhost. Add a
testcase for coverage.
llvm-svn: 173279
 | 
| | 
| 
| 
| 
| 
| 
|  | 
This does the right thing unless the multiplication overflows, but the old code
didn't handle that case either.
llvm-svn: 173276
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
|  | 
attributes.
Collections of attributes are handled via the AttributeSet class now. This
finally frees us up to make significant changes to how attributes are structured.
llvm-svn: 173228
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
|  | 
Attribute.
This further restricts the use of the Attribute class to the Attribute family of
classes.
llvm-svn: 173098
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
|  | 
Attribute.
This is more code to isolate the use of the Attribute class to that of just
holding one attribute instead of a collection of attributes.
llvm-svn: 173094
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
|  | 
(sub 0, (sext bool to A)) to (zext bool to A).
Patch by Muhammad Ahmad
Reviewed by Duncan Sands
llvm-svn: 173093
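The fold in r173093 relies on a sign-extended true being all ones, so negating it yields 1. A small Python model of fixed-width arithmetic makes this visible; the helper names are hypothetical.

```python
def sext_i1(b, width):
    """Sign-extend a 1-bit value to `width` bits: true becomes all ones."""
    return (1 << width) - 1 if b else 0


def zext_i1(b, width):
    """Zero-extend a 1-bit value to `width` bits: true becomes 1."""
    return 1 if b else 0


def sub_wrapping(x, y, width):
    """Two's-complement subtraction truncated to `width` bits."""
    return (x - y) & ((1 << width) - 1)


def fold_holds(width):
    """Check (sub 0, (sext b)) == (zext b) for both boolean values."""
    return all(
        sub_wrapping(0, sext_i1(b, width), width) == zext_i1(b, width)
        for b in (False, True)
    )
```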
 | 
| | 
| 
| 
| 
| 
| 
|  | 
Further encapsulation of the Attribute object. Don't allow direct access to the
Attribute object as an aggregate.
llvm-svn: 172853
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
Because the Attribute class is going to stop representing a collection of
attributes, limit the use of it as an aggregate in favor of using AttributeSet.
This replaces some of the uses for querying the function attributes.
llvm-svn: 172844
 | 
| | 
| 
| 
| 
| 
|  | 
with other code related to shuffles and easier to implement in compiled code.
llvm-svn: 172788
 | 
| | 
| 
| 
|  | 
llvm-svn: 172784
 | 
| | 
| 
| 
| 
| 
|  | 
with a constant zero.
llvm-svn: 172576
 | 
| | 
|  | 
some optimization opportunities (in the enclosing super-expressions).
   rule 1. (-0.0 - X) * Y => -0.0 - (X * Y)
     if expression "-0.0 - X" has only one reference.
   rule 2. (0.0 - X) * Y => -0.0 - (X * Y)
     if expression "0.0 - X" has only one reference, and
     the instruction is marked "noSignedZero".
2. Eliminate negation (the compiler was already able to handle these
   optimizations if the 0.0s are replaced with -0.0).
   rule 3. (0.0 - X) * (0.0 - Y) => X * Y
   rule 4. (0.0 - X) * C => X * -C
   if the expression is flagged "noSignedZero".
3. rule 5. (X*Y) * X => (X*X) * Y
   if X != Y and the expression is flagged "UnsafeAlgebra".
   The purpose of this transformation is two-fold:
    a) to form a power expression (of X).
    b) to potentially shorten the critical path: after the transformation,
       the latency of the instruction Y is amortized by the expression X*X,
       so Y is in a "less critical" position than it was before the
       transformation.
4. Remove the InstCombine code for simplifying "X * select".
   The reasons are as follows:
    a) The "select" is somewhat architecture-dependent, so the
       higher-level optimizers cannot precisely predict whether
       the simplification really yields any performance improvement.
    b) The "select" operator is a bit complicated and tends to obscure
       optimization opportunities. It is better to keep it as low as
       possible in the expression tree and let CodeGen tackle the
       optimization.
llvm-svn: 172551
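The difference between rule 1 and rule 2 in r172551 comes down to the sign of zero. The Python sketch below uses IEEE 754 doubles (which Python floats are on common platforms) and hypothetical helper names to show why rule 1 is always safe while rule 2 needs the "noSignedZero" flag.

```python
import math


def sign_bit(v):
    """True when v carries a negative sign, including -0.0."""
    return math.copysign(1.0, v) < 0


def same_float(a, b):
    """Equality that also distinguishes +0.0 from -0.0."""
    return a == b and sign_bit(a) == sign_bit(b)


def rule1_lhs(x, y):
    return (-0.0 - x) * y


def hoisted(x, y):
    """Right-hand side shared by rules 1 and 2: -0.0 - (X * Y)."""
    return -0.0 - (x * y)


def rule2_lhs(x, y):
    return (0.0 - x) * y


def rule1_holds(values):
    """Rule 1 preserves the result bit-for-bit on the given values."""
    return all(
        same_float(rule1_lhs(x, y), hoisted(x, y))
        for x in values
        for y in values
    )
```

With `x = 0.0` and `y = 1.0`, `rule2_lhs` yields `+0.0` while `hoisted` yields `-0.0`. The two compare equal numerically but differ in sign, so the rewrite is only valid when signed zeros can be ignored.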
 | 
| | 
| 
| 
|  | 
llvm-svn: 172489
 | 
| | 
|  | 
 -----------------------------------------------------------------------------
 C_A: reassociation is allowed
 C_R: reciprocal of a constant C is appropriate, which means
    - 1/C is exact, or
    - reciprocal is allowed and 1/C is neither a special value nor a denormal.
 -----------------------------------------------------------------------------
 rule 1: (X/C1) / C2 => X / (C2*C1)      (if C_A)
                     => X * (1/(C2*C1))  (if C_A && C_R)
 rule 2: X*C1 / C2 => X * (C1/C2)  (if C_A)
 rule 3: (X/Y)/Z => X/(Y*Z)  (if C_A && at least one of Y and Z is a symbolic value)
 rule 4: Z/(X/Y) => (Z*Y)/X  (similar to rule 3)
 rule 5: C1/(X*C2) => (C1/C2) / X  (if C_A)
 rule 6: C1/(X/C2) => (C1*C2) / X  (if C_A)
 rule 7: C1/(C2/X) => (C1/C2) * X  (if C_A)
llvm-svn: 172488
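The C_R condition from r172488 can be modelled in Python as below. This is a sketch with hypothetical names, and it assumes `c` is a finite, nonzero double; exactness is tested with rational arithmetic rather than by inspecting the bit pattern.

```python
import math
import sys
from fractions import Fraction


def reciprocal_is_exact(c):
    """True when the double 1.0/c equals the mathematical value 1/c."""
    r = 1.0 / c
    if not math.isfinite(r):
        return False
    return Fraction(r) * Fraction(c) == 1


def reciprocal_is_normal(c):
    """True when 1.0/c is neither a special value nor a denormal."""
    r = 1.0 / c
    return math.isfinite(r) and abs(r) >= sys.float_info.min


def c_r(c, allow_reciprocal):
    """C_R: 1/C is exact, or reciprocal is allowed and 1/C is normal."""
    return reciprocal_is_exact(c) or (
        allow_reciprocal and reciprocal_is_normal(c)
    )
```

Powers of two such as 4.0 satisfy the first clause on their own, while a constant such as 3.0 qualifies only when the reciprocal is explicitly allowed.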
 | 
| | 
| 
| 
| 
| 
|  | 
Add a const version of getFpValPtr to avoid a cast-away-const warning.
llvm-svn: 172467
 | 
| | 
| 
| 
|  | 
llvm-svn: 172460
 | 
| | 
| 
| 
| 
| 
|  | 
application of these operations commutes with the truncation, so we should prefer to do them in the smallest size we can, to save register space, use smaller constant pool entries, etc.
llvm-svn: 172117
 | 
| | 
| 
| 
| 
| 
| 
|  | 
  - this expression is explicitly marked no-signed-zero, or
  - no-signed-zero of this expression can be derived from some context.
llvm-svn: 171922
 | 
| | 
| 
| 
| 
| 
|  | 
Thanks to Eric Christopher for figuring out these problems!
llvm-svn: 171805
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
  o. X/C1 * C2 => X * (C2/C1) (if C2/C1 is neither a special FP value nor a denormal)
  o. X/C1 * C2 => X/(C1/C2)   (if C2/C1 is either a special FP value or a denormal, but C1/C2 is a normal FP value)
     Let MDC denote a multiplication or division with one and only one operand being a constant
  o. (MDC ± C1) * C2 => (MDC * C2) ± (C1 * C2)
     (so long as the constant folding doesn't yield any denormal or special value)
llvm-svn: 171793
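The choice between the first two rewrites in r171793 can be sketched as a small decision function. This is a Python model with hypothetical names (`is_normal`, `fold_div_then_mul`), and it assumes both constants are finite and nonzero.

```python
import math
import sys


def is_normal(v):
    """True when v is finite and not below the smallest normal double."""
    return math.isfinite(v) and abs(v) >= sys.float_info.min


def fold_div_then_mul(c1, c2):
    """Pick a rewrite for X/C1 * C2.

    Returns ("fmul", C2/C1) for X * (C2/C1), ("fdiv", C1/C2) for
    X / (C1/C2), or None when neither folded constant is normal.
    """
    quotient = c2 / c1
    if is_normal(quotient):
        return ("fmul", quotient)
    inverse = c1 / c2
    if is_normal(inverse):
        return ("fdiv", inverse)
    return None
```

For instance, `fold_div_then_mul(5e307, 1.0)` falls back to the division form because `1.0 / 5e307` is a denormal.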
 | 
| | 
|  | 
turning code like this:
if (foo)
   free(foo);
into this:
free(foo);
Move a call to free from basic block FB into FB's predecessor, P,
when the path from P to FB is taken only if the argument of free is
not equal to NULL.
Some restrictions apply on P and FB to be sure that this code motion
is profitable. Namely:
1. FB must have only one predecessor P.
2. FB must contain only the call to free plus an unconditional
   branch to S.
3. P's successors are FB and S.
Because of 1., we will not increase the code size when moving the call
to free from FB to P.
Because of 2., FB will be empty after the move.
Because of 2. and 3., P's branch instruction becomes useless, and so does FB
(simplifycfg will remove them).
llvm-svn: 171762
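The three structural restrictions from r171762 can be expressed as a check over a toy control-flow graph. The representation below (a dict of block names to instruction and successor lists) and the function name are hypothetical, and the sketch does not model the requirement that P's branch tests the argument of free against NULL.

```python
def can_hoist_free(cfg, p, fb):
    """Check the structural restrictions for moving free from FB into P.

    `cfg` maps a block name to {"insts": [...], "succs": [...]}.
    """
    # 1. FB must have only one predecessor, P.
    preds = [name for name, blk in cfg.items() if fb in blk["succs"]]
    if preds != [p]:
        return False
    # 2. FB contains only the call to free plus an unconditional branch to S.
    block = cfg[fb]
    if block["insts"] != ["call free", "br"] or len(block["succs"]) != 1:
        return False
    s = block["succs"][0]
    # 3. P's successors are exactly FB and S.
    return s != fb and sorted(cfg[p]["succs"]) == sorted([fb, s])


EXAMPLE_CFG = {
    "P": {"insts": ["icmp ne foo, null", "br cond"], "succs": ["FB", "S"]},
    "FB": {"insts": ["call free", "br"], "succs": ["S"]},
    "S": {"insts": ["ret"], "succs": []},
}
```

Here `can_hoist_free(EXAMPLE_CFG, "P", "FB")` returns `True`. Adding a second predecessor of FB, or another instruction inside FB, makes the check fail.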
 | 
| | 
| 
| 
| 
| 
|  | 
when merging two TBAA tags, pointed out by Nuno.
llvm-svn: 171627
 | 
| | 
|  | 
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.
There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.
The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.
I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).
I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.
llvm-svn: 171366
 | 
| | 
| 
| 
| 
| 
|  | 
Also add an assert to avoid confusion in the code where it is known that C1 <= C2.
llvm-svn: 171310
 |