| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
| |
with a constant zero.
llvm-svn: 172576
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
some optimization opportunities (in the enclosing supper-expressions).
rule 1. (-0.0 - X ) * Y => -0.0 - (X * Y)
if expression "-0.0 - X" has only one reference.
rule 2. (0.0 - X ) * Y => -0.0 - (X * Y)
if expression "0.0 - X" has only one reference, and
the instruction is marked "noSignedZero".
2. Eliminate negation (The compiler was already able to handle these
opt if the 0.0s are replaced with -0.0.)
rule 3: (0.0 - X) * (0.0 - Y) => X * Y
rule 4: (0.0 - X) * C => X * -C
if the expr is flagged "noSignedZero".
3.
Rule 5: (X*Y) * X => (X*X) * Y
if X!=Y and the expression is flagged with "UnsafeAlgebra".
The purpose of this transformation is two-fold:
a) to form a power expression (of X).
b) potentially shorten the critical path: After transformation, the
latency of the instruction Y is amortized by the expression of X*X,
and therefore Y is in a "less critical" position compared to what it
was before the transformation.
4. Remove the InstCombine code about simplifiying "X * select".
The reasons are following:
a) The "select" is somewhat architecture-dependent, therefore the
higher level optimizers are not able to precisely predict if
the simplification really yields any performance improvement
or not.
b) The "select" operator is bit complicate, and tends to obscure
optimization opportunities. It is btter to keep it as low as
possible in expr tree, and let CodeGen to tackle the optimization.
llvm-svn: 172551
|
| |
|
|
| |
llvm-svn: 172549
|
| |
|
|
|
|
| |
They are failing on the bots.
llvm-svn: 172540
|
| |
|
|
|
|
| |
Also improve test coveration of the handling of relational comparisons.
llvm-svn: 172539
|
| |
|
|
|
|
|
|
|
|
|
| |
Test was failing for clang-native-arm-cortex-a9 build-bot configuration.
The reason for the failure was the test was using hardcoded names.
The attached patch fixes this failure by replacing the hard-coded variables
names with pattern-matched variable names.
Patch by Manish Verma, ARM
llvm-svn: 172534
|
| |
|
|
|
|
|
| |
- Also, update the LangRef documentation on module flags to match the
implementation.
llvm-svn: 172498
|
| |
|
|
|
|
|
|
|
|
|
| |
we need to generate a N64 compound relocation
R_MIPS_GPREL_32/R_MIPS_64/R_MIPS_NONE.
The bug was exposed by the SingleSourcetest case
DuffsDevice.c.
Contributer: Jack Carter
llvm-svn: 172496
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
---------------------------------------------------------------------------
C_A: reassociation is allowed
C_R: reciprocal of a constant C is appropriate, which means
- 1/C is exact, or
- reciprocal is allowed and 1/C is neither a special value nor a denormal.
-----------------------------------------------------------------------------
rule1: (X/C1) / C2 => X / (C2*C1) (if C_A)
=> X * (1/(C2*C1)) (if C_A && C_R)
rule 2: X*C1 / C2 => X * (C1/C2) if C_A
rule 3: (X/Y)/Z = > X/(Y*Z) (if C_A && at least one of Y and Z is symbolic value)
rule 4: Z/(X/Y) = > (Z*Y)/X (similar to rule3)
rule 5: C1/(X*C2) => (C1/C2) / X (if C_A)
rule 6: C1/(X/C2) => (C1*C2) / X (if C_A)
rule 7: C1/(C2/X) => (C1/C2) * X (if C_A)
llvm-svn: 172488
|
| |
|
|
|
|
|
| |
have an arbitrary ordering of the base register, index register and displacement.
rdar://12527141
llvm-svn: 172484
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The included test case is derived from one of the GCC compatibility tests.
The problem arises after the selection DAG has been converted to type-legalized
form. The combiner first sees a 64-bit load that can be converted into a
pre-increment form. The original load feeds into a SRL that isolates the
upper 32 bits of the loaded doubleword. This looks like an opportunity for
DAGCombiner::ReduceLoadWidth() to replace the 64-bit load with a 32-bit load.
However, this transformation is not valid, as the replacement load is not
a pre-increment load. The pre-increment load produces an extra result,
which feeds a subsequent add instruction. The replacement load only has
one result value, and this value is propagated to all uses of the pre-
increment load, including the add. Because the add is looking for the
second result value as its operand, it ends up attempting to add a constant
to a token chain, resulting in a crash.
So the patch simply disables this transformation for any load with more than
two result values.
llvm-svn: 172480
|
| |
|
|
|
|
|
|
| |
Note that this bug is only exposed because LTO fails to use TTI.
Fixes self-LTO of clang. rdar://13007381.
llvm-svn: 172462
|
| |
|
|
| |
llvm-svn: 172369
|
| |
|
|
|
|
|
|
| |
2x blocks each assigned a value via a phi-node causing each to depend on the other.
A test case is provided as well.
llvm-svn: 172368
|
| |
|
|
|
|
|
| |
Those can occur when something between the sextload and the store is on the same
chain and blocks isel. Fixes PR14887.
llvm-svn: 172353
|
| |
|
|
|
|
| |
and i16).
llvm-svn: 172348
|
| |
|
|
|
|
|
| |
Shifting right two times will only yield zero. Should fix
SingleSource/UnitTests/SignlessTypes/factor.
llvm-svn: 172322
|
| |
|
|
|
|
| |
=> objc_autorelease but were not updating the InstructionClass to IC_Autorelease.
llvm-svn: 172288
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
not be placed into an autorelease pool.
The reason that this occurs is that tail calling objc_autorelease eventually
tail calls -[NSObject autorelease] which supports fast autorelease. This can
cause us to violate the semantic gaurantees of __autoreleasing variables that
assignment to an __autoreleasing variables always yields an object that is
placed into the innermost autorelease pool.
The fix included in this patch works by:
1. In the peephole optimization function OptimizeIndividualFunctions, always
remove tail call from objc_autorelease.
2. Whenever we convert to/from an objc_autorelease, set/unset the tail call
keyword as appropriate.
*NOTE* I also handled the case where objc_autorelease is converted in
OptimizeReturns to an autoreleaseRV which still violates the ARC semantics. I
will be removing that in a later patch and I wanted to make sure that the tree
is in a consistent state vis-a-vis ARC always.
Additionally some test cases are provided and all tests that have tail call marked
objc_autorelease keywords have been modified so that tail call has been removed.
*NOTE* One test fails due to a separate bug that I am going to commit soon. Thus
I marked the check line TMP: instead of CHECK: so make check does not fail.
llvm-svn: 172287
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
register names in the standalone assembler llvm-mc.
Registers such as $A1 can represent either a 32 or
64 bit register based on the instruction using it.
In addition, based on the abi, $T0 can represent different
32 bit registers.
The problem is resolved by the Mips specific AsmParser
td definitions changing to work together. Many cases of
RegisterClass parameters are now RegisterOperand.
Contributer: Vladimir Medic
llvm-svn: 172284
|
| |
|
|
| |
llvm-svn: 172269
|
| |
|
|
|
|
|
|
|
| |
Adds a check for -Oz, changes the code to not re-visit BBs,
and skips over DBG_VALUE instrs.
Patch by Andy Zhang.
llvm-svn: 172258
|
| |
|
|
|
|
|
|
| |
the target if it supports the different CAST types. We didn't do this
on X86 because of the different register sizes and types, but on ARM
this makes sense.
llvm-svn: 172245
|
| |
|
|
|
|
|
|
|
|
|
| |
- recognize string "{memory}" in the MI generation
- mark as mayload/maystore when there's a memory clobber constraint.
PR14859.
Patch by Krzysztof Parzyszek
llvm-svn: 172228
|
| |
|
|
|
|
|
| |
This removes previous special cases for each floating-point type in favour of a
shared codepath.
llvm-svn: 172189
|
| |
|
|
|
|
|
|
|
|
|
| |
order to select the max vectorization factor.
We don't have a detailed analysis on which values are vectorized and which stay scalars in the vectorized loop so we use
another method. We look at reduction variables, loads and stores, which are the only ways to get information in and out
of loop iterations. If the data types are extended and truncated then the cost model will catch the cost of the vector
zext/sext/trunc operations.
llvm-svn: 172178
|
| |
|
|
| |
llvm-svn: 172172
|
| |
|
|
|
|
|
|
|
|
|
|
| |
use FileCheck instead of grep.
Messages:
Converted test case trivial_codegen_tailcall.ll to use FileCheck.
Converted test return_constant.ll to use FileCheck instead of grep.
Converted test reorder_load.ll to use FileCheck instead of grep.
Converted test intervening-inst.ll to use FileCheck instead of grep.
llvm-svn: 172171
|
| |
|
|
|
|
|
|
| |
The root cause is mistakenly taking for granted that
"dyn_cast<Instruction>(a-Value)"
return a non-NULL instruction.
llvm-svn: 172145
|
| |
|
|
|
|
| |
Value's current type. The casting is trivial even for aggregate type.
llvm-svn: 172143
|
| |
|
|
|
|
| |
underscore in symbol on linux.
llvm-svn: 172139
|
| |
|
|
| |
llvm-svn: 172130
|
| |
|
|
|
|
|
|
|
| |
This fixes va_start/va_copy of a va_list field which happens to not
be laid out at a 16-byte boundary.
Differential Revision: http://llvm-reviews.chandlerc.com/D276
llvm-svn: 172128
|
| |
|
|
|
|
| |
than the string size.
llvm-svn: 172124
|
| |
|
|
|
|
| |
Part of rdar://12991541
llvm-svn: 172121
|
| |
|
|
|
|
| |
application of these operations commutes with the truncation, so we should prefer to do them in the smallest size we can, to save register space, use smaller constant pool entries, etc.
llvm-svn: 172117
|
| |
|
|
|
|
|
|
| |
BinaryOperator can be folded to an Undef, and we don't want to set NSW flags to undef vals.
PR14878
llvm-svn: 172079
|
| |
|
|
|
|
| |
VectorType.
llvm-svn: 172054
|
| |
|
|
|
|
| |
address spaces.
llvm-svn: 172051
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
requirement when creating stack objects in MachineFrameInfo.
Add CreateStackObjectWithMinAlign to throw error when the minimal alignment
can't be achieved and to clamp the alignment when the preferred alignment
can't be achieved. Same is true for CreateVariableSizedObject.
Will not emit error in CreateSpillStackObject or CreateStackObject.
As long as callers of CreateStackObject do not assume the object will be
aligned at the requested alignment, we should not have miscompile since
later optimizations which look at the object's alignment will have the correct
information.
rdar://12713765
llvm-svn: 172027
|
| |
|
|
|
|
| |
instruction to determine the max vectorization factor.
llvm-svn: 172010
|
| |
|
|
|
|
|
|
|
| |
It cahced XOR's operands before calling visitXOR() but failed to update the
operands when visitXOR changed the XOR node.
rdar://12968664
llvm-svn: 171999
|
| |
|
|
|
|
| |
Fixes PR14854.
llvm-svn: 171984
|
| |
|
|
|
|
|
|
|
| |
This patch adjust the r171506 to make all DWARF enconding pc-relative
for PPC64. It also adds the R_PPC64_REL32 relocation handling in MCJIT
(since the eh_frame will not generate PIC-relative relocation) and also
adds the emission of stubs created by the TTypeEncoding.
llvm-svn: 171979
|
| |
|
|
| |
llvm-svn: 171956
|
| |
|
|
|
|
|
|
|
|
|
| |
two constant.
PR 14848. The lowered sequence is based on the existing sequence the target-independent
DAG Combiner creates for the scalar case.
Patch by Zvi Rackover.
llvm-svn: 171953
|
| |
|
|
|
|
|
|
|
|
| |
This was an experimental option, but needs to be defined
per-target. e.g. PPC A2 needs to aggressively hide latency.
I converted some in-order scheduling tests to A2. Hal is working on
more test cases.
llvm-svn: 171946
|
| |
|
|
| |
llvm-svn: 171931
|
| |
|
|
|
|
| |
vectorizer does it now.
llvm-svn: 171930
|
| |
|
|
|
|
| |
Cost Model support on ARM.
llvm-svn: 171928
|