| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
llvm-svn: 255511
|
|
|
|
|
|
|
|
|
|
|
|
| |
In conditional store merging, we were creating PHIs when we didn't
need to. If the value to be predicated isn't defined in the block
we're predicating, then it doesn't need a PHI at all (because we only
deal with triangles and diamonds, any value not in the predicated BB
must dominate the predicated BB).
This fixes a large code size increase in some benchmarks in a popular embedded benchmark suite.
llvm-svn: 255489
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While we have successfully implemented a funclet-oriented EH scheme on
top of LLVM IR, our scheme has some notable deficiencies:
- catchendpad and cleanupendpad are necessary in the current design
but they are difficult to explain to others, even to seasoned LLVM
experts.
- catchendpad and cleanupendpad are optimization barriers. They cannot
be split and force all potentially throwing call-sites to be invokes.
This has a noticable effect on the quality of our code generation.
- catchpad, while similar in some aspects to invoke, is fairly awkward.
It is unsplittable, starts a funclet, and has control flow to other
funclets.
- The nesting relationship between funclets is currently a property of
control flow edges. Because of this, we are forced to carefully
analyze the flow graph to see if there might potentially exist illegal
nesting among funclets. While we have logic to clone funclets when
they are illegally nested, it would be nicer if we had a
representation which forbade them upfront.
Let's clean this up a bit by doing the following:
- Instead, make catchpad more like cleanuppad and landingpad: no control
flow, just a bunch of simple operands; catchpad would be splittable.
- Introduce catchswitch, a control flow instruction designed to model
the constraints of funclet oriented EH.
- Make funclet scoping explicit by having funclet instructions consume
the token produced by the funclet which contains them.
- Remove catchendpad and cleanupendpad. Their presence can be inferred
implicitly using coloring information.
N.B. The state numbering code for the CLR has been updated but the
veracity of it's output cannot be spoken for. An expert should take a
look to make sure the results are reasonable.
Reviewers: rnk, JosephTremoulet, andrew.w.kaylor
Differential Revision: http://reviews.llvm.org/D15139
llvm-svn: 255422
|
|
|
|
|
|
|
|
| |
Mem2Reg shouldn't be optimizing a function that is marked
optnone. There is a test checking this that fails when mem2reg is
explicitly added to the standard pass pipeline.
llvm-svn: 255336
|
|
|
|
|
|
|
|
|
|
| |
- This simplifies the CallSite class, arg_begin / arg_end are now
simple wrapper getters.
- In several places, we were creating CallSite instances solely to call
arg_begin and arg_end. With this change, that's no longer required.
llvm-svn: 255226
|
|
|
|
|
|
|
|
| |
`CloneAndPruneIntoFromInst` can DCE instructions after cloning them into
the new function, and so an AssertingVH is too strong. This change
switches CloneCodeInfo to use a std::vector<WeakVH>.
llvm-svn: 255148
|
|
|
|
| |
llvm-svn: 255147
|
|
|
|
|
|
|
|
|
|
|
| |
loop-unrolling when possible.""
The bug in IndVarSimplify was fixed in r254976, r254977, so I'm
reapplying the original patch for avoiding redundant LCSSA recomputation.
This reverts commit ffe3b434e505e403146aff00be0c177bb6d13466.
llvm-svn: 255133
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ScalarEvolution.h, in order to avoid cyclic dependencies between the Transform
and Analysis modules:
[LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV expressions
Summary:
This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the
usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge
by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is
that both LAA and LV should use this interface everywhere.
This also solves a problem involving the result of SCEV expression rewritting when
the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates
P1: {a,+,b} has nsw
P2: b = 1.
Applying P1 and then P2 gives us {a,+,1}, while applying P2 and the P1 gives us
sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies).
The SCEVPredicatedLayer maintains the order of transformations by feeding back
the results of previous transformations into new transformations, and therefore
avoiding this issue.
The SCEVPredicatedLayer maintains a cache to remember the results of previous
SCEV rewritting results. This also has the benefit of reducing the overall number
of expression rewrites.
Reviewers: mzolotukhin, anemet
Subscribers: jmolloy, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D14296
llvm-svn: 255122
|
|
|
|
| |
llvm-svn: 255117
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
expressions
Summary:
This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the
usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge
by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is
that both LAA and LV should use this interface everywhere.
This also solves a problem involving the result of SCEV expression rewritting when
the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates
P1: {a,+,b} has nsw
P2: b = 1.
Applying P1 and then P2 gives us {a,+,1}, while applying P2 and the P1 gives us
sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies).
The SCEVPredicatedLayer maintains the order of transformations by feeding back
the results of previous transformations into new transformations, and therefore
avoiding this issue.
The SCEVPredicatedLayer maintains a cache to remember the results of previous
SCEV rewritting results. This also has the benefit of reducing the overall number
of expression rewrites.
Reviewers: mzolotukhin, anemet
Subscribers: jmolloy, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D14296
llvm-svn: 255115
|
|
|
|
| |
llvm-svn: 255078
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: pcc, majnemer, reames
Subscribers: reames, llvm-commits
Differential Revision: http://reviews.llvm.org/D15345
llvm-svn: 255062
|
|
|
|
|
|
|
|
| |
The StringRef constructor is unnecessary (since we're converting to
std::string anyway), and having it requires an explicit call to
StringRef's or std::string's constructor.
llvm-svn: 255000
|
|
|
|
| |
llvm-svn: 254878
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In order to avoid calling pow function we generate repeated fmul when n is a
positive or negative whole number.
For each exponent we pre-compute Addition Chains in order to minimize the no.
of fmuls.
Refer: http://wwwhomes.uni-bielefeld.de/achim/addition_chain.html
We pre-compute addition chains for exponents upto 32 (which results in a max of
7 fmuls).
For eg:
4 = 2+2
5 = 2+3
6 = 3+3 and so on
Hence,
pow(x, 4.0) ==> y = fmul x, x
x = fmul y, y
ret x
For negative exponents, we simply compute the reciprocal of the final result.
Note: This transformation is only enabled under fast-math.
Patch by Mandeep Singh Grang <mgrang@codeaurora.org>
Reviewers: weimingz, majnemer, escha, davide, scanon, joerg
Subscribers: probinson, escha, llvm-commits
Differential Revision: http://reviews.llvm.org/D13994
llvm-svn: 254776
|
|
|
|
|
|
| |
No functionality change is intended.
llvm-svn: 254562
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The difference is that now we don't error on out-of-comdat access to
internal global values. We copy them instead. This seems to match the
expectation of COFF linkers (see pr25686).
Original message:
Start deciding earlier what to link.
A traditional linker is roughly split in symbol resolution and
"copying
stuff".
The two tasks are badly mixed in lib/Linker.
This starts splitting them apart.
With this patch there are no direct call to linkGlobalValueBody or
linkGlobalValueProto. Everything is linked via WapValue.
This also includes a few fixes:
* A GV goes undefined if the comdat is dropped (comdat11.ll).
* We error if an internal GV goes undefined (comdat13.ll).
* We don't link an unused comdat.
The first two match the behavior of an ELF linker. The second one is
equivalent to running globaldce on the input.
llvm-svn: 254418
|
|
|
|
|
|
|
| |
Detect unsafe byval function arguments and move them to the unsafe
stack.
llvm-svn: 254353
|
|
|
|
|
|
| |
They broke a bot and I am debugging why.
llvm-svn: 254347
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A traditional linker is roughly split in symbol resolution and "copying
stuff".
The two tasks are badly mixed in lib/Linker.
This starts splitting them apart.
With this patch there are no direct call to linkGlobalValueBody or
linkGlobalValueProto. Everything is linked via WapValue.
This also includes a few fixes:
* A GV goes undefined if the comdat is dropped (comdat11.ll).
* We error if an internal GV goes undefined (comdat13.ll).
* We don't link an unused comdat.
The first two match the behavior of an ELF linker. The second one is
equivalent to running globaldce on the input.
llvm-svn: 254336
|
|
|
|
| |
llvm-svn: 254317
|
|
|
|
| |
llvm-svn: 254265
|
|
|
|
| |
llvm-svn: 254264
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This one is enabled only under -ffast-math. There are cases where the
difference between the value computed and the correct value is huge
even for ffast-math, e.g. as Steven pointed out:
x = -1, y = -4
log(pow(-1), 4) = 0
4*log(-1) = NaN
I checked what GCC does and apparently they do the same optimization
(which result in the dramatic difference). Future work might try to
make this (slightly) less worse.
Differential Revision: http://reviews.llvm.org/D14400
llvm-svn: 254263
|
|
|
|
| |
llvm-svn: 254239
|
|
|
|
|
|
|
|
| |
memory read below.
Found by msan!
llvm-svn: 254238
|
|
|
|
|
|
|
|
| |
Now the ValueMapper has two callbacks. The first one maps the
declaration. The ValueMapper records the mapping and then materializes
the body/initializer.
llvm-svn: 254209
|
|
|
|
| |
llvm-svn: 254193
|
|
|
|
|
|
|
|
| |
be an indirect call.
Fixes the crasher in PR25651 and related crashers using the same pattern.
llvm-svn: 254145
|
|
|
|
| |
llvm-svn: 254047
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Followed the guidelines in:
http://llvm.org/docs/CodingStandards.html#include-style
However, I noticed that uppercase named headers come before lowercase ones
throughout the codebase. So kept them as is.
Patch by Mandeep Singh Grang <mgrang@codeaurora.org>
Reviewers: majnemer, davide, jmolloy, atrick
Subscribers: sanjoy
Differential Revision: http://reviews.llvm.org/D14939
llvm-svn: 254005
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
D14302 implements tan(atan(x)) -> x
D14045 implements pow(exp(x), y) -> exp(x*y)
Patch by Mandeep Singh Grang <mgrang@codeaurora.org>
Reviewers: majnemer, davide
Differential Revision: http://reviews.llvm.org/D14882
llvm-svn: 253768
|
|
|
|
| |
llvm-svn: 253597
|
|
|
|
|
|
|
|
|
|
|
|
| |
possible."
The change exposed a bug in IndVarSimplify (PR25578), which led to a
failure (PR25538). When the bug is fixed, this patch can be reapplied.
The tests are kept in tree, as they're useful anyway, and will not break
with this revert.
llvm-svn: 253596
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: The new algorithm is more efficient (O(n), n is number of basic blocks). And it is guaranteed to cover all cases of multiple BB mapped to same line.
Reviewers: dblaikie, davidxl, dnovillo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14738
llvm-svn: 253594
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r253511.
This likely broke the bots in
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202
http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787
llvm-svn: 253543
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D14466
llvm-svn: 253521
|
|
|
|
| |
llvm-svn: 253514
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
These intrinsics currently have an explicit alignment argument which is
required to be a constant integer. It represents the alignment of the
source and dest, and so must be the minimum of those.
This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments. The alignment
argument itself is removed.
There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe. For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.
For example, code which used to read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)
For out of tree owners, I was able to strip alignment from calls using sed by replacing:
(call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
$1i1 false)
and similarly for memmove and memcpy.
I then added back in alignment to test cases which needed it.
A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.
In IRBuilder itself, a new argument was added. Instead of calling:
CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)
There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool. This is to prevent isVolatile here from passing its default
parameter to the source alignment.
Note, changes in future can now be made to codegen. I didn't change anything here, but this
change should enable better memcpy code sequences.
Reviewed by Hal Finkel.
llvm-svn: 253511
|
|
|
|
|
|
|
|
| |
(r252604)"
Failing clang test is now fixed by the r253458.
llvm-svn: 253459
|
|
|
|
| |
llvm-svn: 253446
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change teaches LLVM's inliner to track and suitably adjust
deoptimization state (tracked via deoptimization operand bundles) as it
inlines through call sites. The operation is described in more detail
in the LangRef changes.
Reviewers: reames, majnemer, chandlerc, dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14552
llvm-svn: 253438
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In r253126 we stopped to recompute LCSSA after loop unrolling in all
cases, except the unrolling is full and at least one of the loop exits
is outside the parent loop. In other cases the transformation should not
break LCSSA, but it turned out, that we also call SimplifyLoop on the
parent loop, which might break LCSSA by itself. This fix just triggers
LCSSA recomputation in this case as well.
I'm committing it without a test case for now, but I'll try to invent
one. It's a bit tricky because in an isolated test LoopSimplify would
be scheduled before LoopUnroll, and thus will change the test and hide
the problem.
llvm-svn: 253253
|
|
|
|
| |
llvm-svn: 253224
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fails a check in Verifier.cpp, which checks for location matches between the declared
variable and the !dbg attachments.
Reviewers: dnovillo, dblaikie, danielcdh
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14657
llvm-svn: 253194
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: The Old personality function gets copied over, but the
Materializer didn't have a chance to inspect it (e.g. to fix up
references to the correct module for the target function).
Also add a verifier check that makes sure the personality routine
is in the same module as the function whose personality it is.
Reviewers: majnemer
Subscribers: jevinskie, llvm-commits
Differential Revision: http://reviews.llvm.org/D14474
llvm-svn: 253183
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The patch to move metadata linking after global value linking didn't
correctly map unmaterialized global values to null as desired. They
were in fact mapped to the source copy. It largely worked by accident
since most module linker clients destroyed the source module which
caused the source GVs to be replaced by null, but caused a failure with
LTO linking on Windows:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312869.html
The problem is that a null return value from materializeValueFor is
handled by mapping the value to self. This is the desired behavior when
materializeValueFor is passed a non-GlobalValue. The problem is how to
distinguish that case from the case where we really do want to map to
null.
This patch addresses this by passing in a new flag to the value mapper
indicating that unmapped global values should be mapped to null. Other
Value types are handled as before.
Note that the documented behavior of asserting on unmapped values when
the flag RF_IgnoreMissingValues isn't set is currently disabled with
FIXME notes due to bootstrap failures. I modified these disabled asserts
so when they are eventually enabled again it won't assert for the
unmapped values when the new RF_NullMapMissingGlobalValues flag is set.
I also considered using a callback into the value materializer, but a
flag seemed cleaner given that there are already existing flags.
I also considered modifying materializeValueFor to return the input
value when we want to map to source and then treat a null return
to mean map to null. However, there are other value materializer
subclasses that implement materializeValueFor, and they would all need
to be audited and the return values possibly changed, which seemed
error-prone.
Reviewers: dexonsmith, joker.eph
Subscribers: pcc, llvm-commits
Differential Revision: http://reviews.llvm.org/D14682
llvm-svn: 253170
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently we always recompute LCSSA for outer loops after unrolling an
inner loop. That leads to compile time problem when we have big loop
nests, and we can solve it by avoiding unnecessary work. For instance,
if w eonly do partial unrolling, we don't break LCSSA, so we don't need
to rebuild it. Also, if all exits from the inner loop are inside the
enclosing loop, then complete unrolling won't break LCSSA either.
I replaced unconditional LCSSA recomputation with conditional recomputation +
unconditional assert and added several tests, which were failing when I
experimented with it.
Soon I plan to follow up with a similar patch for recalculation of dominators
tree.
Reviewers: hfinkel, dexonsmith, bogner, joker.eph, chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14526
llvm-svn: 253126
|
|
|
|
| |
llvm-svn: 252970
|