| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
| |
A seg-fault occurs due to a reference of a null pointer, which is
the value returned by getConstantPart. This function returns
null if the constant part is not found. The code that calls this
function needs to check for the null return value.
Differential Revision: http://reviews.llvm.org/D18718
llvm-svn: 265319
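
The fix pattern described above can be sketched as follows. This is a minimal, self-contained illustration and not the actual LLVM code: `Term`, `findConstantPart`, and `constantOrZero` are hypothetical stand-ins for the expression type, for getConstantPart, and for its caller.

```cpp
#include <vector>

// Hypothetical stand-in for an expression term; not an LLVM type.
struct Term {
  bool IsConstant;
  long Value;
};

// Returns a pointer to the first constant term, or nullptr if there is none.
// This mirrors the "returns null if the constant part is not found" contract.
const Term *findConstantPart(const std::vector<Term> &Terms) {
  for (const Term &T : Terms)
    if (T.IsConstant)
      return &T;
  return nullptr;
}

// Caller that checks the result before dereferencing it.
long constantOrZero(const std::vector<Term> &Terms) {
  const Term *C = findConstantPart(Terms);
  if (!C)
    return 0; // no constant part: bail out instead of dereferencing null
  return C->Value;
}
```

Returning 0 in the not-found case is an arbitrary choice for the sketch; the real caller may need to abandon the transformation entirely.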
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Floating point intrinsics in LLVM are generally not speculatively
executed, since most of them are defined to behave the same as libm
functions, which set errno.
However, the only error that can happen when executing ceil, floor,
nearbyint, rint and round libm functions per POSIX.1-2001 is ERANGE,
and that requires the maximum value of the exponent to be smaller
than the number of mantissa bits, which is not the case with any of
the floating point types supported by LLVM.
The trunc and copysign functions never set errno per POSIX.1-2001.
Differential Revision: http://reviews.llvm.org/D18643
llvm-svn: 265262
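
One way to observe the claim on a given platform is to clear errno, call each rounding function on an extreme input, and check that errno is still zero. The helper name `roundingLeavesErrnoClear` is made up for this sketch, and the result only reflects the libm it is run against; it is not a statement about LLVM's own implementation.

```cpp
#include <cerrno>
#include <cfloat>
#include <cmath>

// Returns true if calling the rounding functions on X leaves errno untouched.
bool roundingLeavesErrnoClear(double X) {
  errno = 0;
  volatile double Sink; // volatile so the calls are not optimized away
  Sink = std::ceil(X);
  Sink = std::floor(X);
  Sink = std::nearbyint(X);
  Sink = std::rint(X);
  Sink = std::round(X);
  Sink = std::trunc(X);
  (void)Sink;
  return errno == 0;
}
```

For an IEEE double, any finite value of magnitude 2^52 or more is already an integer, so these functions return it unchanged and have no result that could fall out of range.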
|
| |
|
|
|
|
|
| |
used in assertion, NFC)
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265220
|
| |
|
|
|
|
|
|
|
|
| |
This patch simply mirrors the attributes we give to @llvm.nvvm.reflect
to the __nvvm_reflect libdevice call. This shaves about 30% of the code
in libdevice away because of CSE opportunities. It also helps us
figure out that libdevice implementations of transcendental functions
don't have side-effects.
llvm-svn: 265060
|
| |
|
|
|
|
|
| |
The rule for SMIN introduced in rL236202 doesn't work as advertised: the
check for Pred == ICmpInst::ICMP_SGT was missing.
llvm-svn: 264996
|
| |
|
|
| |
llvm-svn: 264995
|
| |
|
|
|
|
|
|
|
| |
This way once we teach MatchBinaryOp to map more things into arithmetic,
the non-wrapping add recurrence construction would understand it too.
Right now MatchBinaryOp still only understands arithmetic, so this is
solely a code-reorganization change.
llvm-svn: 264994
|
| |
|
|
| |
llvm-svn: 264993
|
| |
|
|
|
|
|
|
|
|
| |
We already try not to truncate PHIs in computeMinimalBitwidths. LoopVectorize can't handle it and we really don't need to, because both induction and reduction PHIs are truncated by other means.
However, we weren't bailing out in all the places we should have, and we ended up returning a PHI to be truncated, which caused PR27018.
This fixes PR27018.
llvm-svn: 264852
|
| |
|
|
|
|
|
|
|
| |
MatchBinaryOp abstracts out the IR instructions from the operations they
represent. While this change is NFC, we will use this factoring later
to map things like `(extractvalue 0 (sadd.with.overflow X Y))` to `(add
X Y)`.
llvm-svn: 264747
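
The mapping mentioned above rests on the fact that the first result of a signed add-with-overflow is just the wrapped sum. The sketch below models that in plain C++; `AddWithOverflow`, `saddWithOverflow`, and `wrappingAdd` are hypothetical names for illustration and are not part of MatchBinaryOp.

```cpp
#include <cstdint>

// Hypothetical model of the {result, overflow-bit} pair produced by
// sadd.with.overflow on 32-bit integers; not an LLVM type.
struct AddWithOverflow {
  int32_t Sum;
  bool Overflow;
};

// Two's complement add. The arithmetic is done on unsigned values so that
// wraparound is well defined; the conversion back to int32_t is modular on
// all mainstream compilers (and guaranteed to be so since C++20).
int32_t wrappingAdd(int32_t X, int32_t Y) {
  return static_cast<int32_t>(static_cast<uint32_t>(X) +
                              static_cast<uint32_t>(Y));
}

AddWithOverflow saddWithOverflow(int32_t X, int32_t Y) {
  int64_t Wide = static_cast<int64_t>(X) + static_cast<int64_t>(Y);
  int32_t Sum = wrappingAdd(X, Y);
  return {Sum, Wide != static_cast<int64_t>(Sum)};
}
```

Element 0 of the pair always equals `wrappingAdd(X, Y)`, which is why an analysis can treat the extractvalue as an ordinary add.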
|
| |
|
|
| |
llvm-svn: 264746
|
| |
|
|
|
|
|
|
| |
minor fixes.
Differential revision: http://reviews.llvm.org/D18469
llvm-svn: 264598
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-load forwarding across a release fence.
We choose to be much more conservative about stores. In theory, nothing prevents us from shifting a store from after a release fence to before it, and then eliminating the preceding (previously fenced) store. Doing this without actually moving the second store is likely also legal, but we chose to be conservative at this time.
The LangRef indicates only atomic loads and stores are affected by fences. This patch chooses to be far more conservative than that.
This is the GVN companion to http://reviews.llvm.org/D11434 which applied the same logic in EarlyCSE and has been baking in tree for a while now.
Differential Revision: http://reviews.llvm.org/D11436
llvm-svn: 264472
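
A source-level analogy of the store-load forwarding case, written with C++11 atomics, might look like the sketch below. The variable and function names are invented for the example, and it only illustrates the shape of code the optimization applies to; it is not a test case from the patch.

```cpp
#include <atomic>

int Data = 0;                    // plain, non-atomic payload
std::atomic<bool> Ready{false};  // flag other threads may poll

int publishAndReread(int V) {
  Data = V; // store before the fence
  // Release fence: makes the store above visible to threads that later
  // observe Ready == true. It does not pull in stores from other threads.
  std::atomic_thread_fence(std::memory_order_release);
  Ready.store(true, std::memory_order_relaxed);
  // Load after the fence. Since the fence does not require this thread to
  // observe foreign stores, the compiler may forward V here.
  return Data;
}
```

Any other thread writing `Data` concurrently would be a data race, so replacing the final load with `V` does not change the behavior of a correct program.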
|
| |
|
|
|
|
|
|
|
|
| |
This reserves an MDKind for !llvm.loop, which allows callers to avoid a
string-based lookup. I'm not sure why it was missing.
There should be no functionality change here, just a small compile-time
speedup.
llvm-svn: 264371
|
| |
|
|
| |
llvm-svn: 264244
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We used to only allow SCEVAddRecExpr for pointer expressions in order to
be able to compute the bounds. However this is also trivially possible
for loop-invariant addresses (scUnknown) since then the bounds are the
address itself.
Interestingly, we used to allow this for the special case when the
loop-invariant address happens to also be an SCEVAddRecExpr (in an outer
loop).
There are a couple more loops that are vectorized in SPEC after this.
My guess is that the main reason we don't see more is that, for example, a
loop-invariant load is vectorized into a splat vector with several
vector-inserts. This is likely to make the vectorization unprofitable.
I.e. we don't notice that a later LICM will move all of this out of the
loop so the cost estimate should really be 0.
llvm-svn: 264243
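
The bounds computation for the two kinds of pointer expression can be sketched as follows. This is a simplified model using integers for addresses; `Bounds`, `addRecBounds`, and `invariantBounds` are hypothetical names and do not correspond to the actual interfaces.

```cpp
#include <algorithm>

// Lowest and highest address touched by an access over the whole loop.
struct Bounds {
  long Low;
  long High;
};

// Add recurrence {Start,+,Step}: the address moves by Step on each of
// BackedgeCount back-edges, so the extremes are the first and last values.
Bounds addRecBounds(long Start, long Step, long BackedgeCount) {
  long End = Start + Step * BackedgeCount;
  return {std::min(Start, End), std::max(Start, End)};
}

// Loop-invariant address: every iteration touches the same location, so
// both bounds are the address itself.
Bounds invariantBounds(long Addr) { return {Addr, Addr}; }
```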
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D18233
llvm-svn: 264179
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
return SCEVAddRecExpr* instead of SCEV*
Summary:
This changes the conversion functions from SCEV * to SCEVAddRecExpr from
ScalarEvolution and PredicatedScalarEvolution to return a SCEVAddRecExpr*
instead of a SCEV* (which removes the need of most clients to do a
dyn_cast right after calling these functions).
We also don't add new predicates if the transformation was not successful.
This is not entirely an NFC (as it can theoretically remove some predicates
from LAA when we have an unknown dependence), but I couldn't find an obvious
regression test for it.
Reviewers: sanjoy
Subscribers: sanjoy, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D18368
llvm-svn: 264161
|
| |
|
|
|
|
|
| |
This is more coherent with usual containers.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 264026
|
| |
|
|
| |
llvm-svn: 263945
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
replaceCongruentIVs can break LCSSA when trying to replace IV increments
since it tries to replace all uses of a phi node with another phi node
while both of the phi nodes are not necessarily in the processed loop.
This will cause an assert in IndVars.
To fix this, we add a check to make sure that the replacement maintains
LCSSA.
Reviewers: sanjoy
Subscribers: mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D18266
llvm-svn: 263941
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
It can hurt performance to prefetch ahead too much. Be conservative for
now and don't prefetch ahead more than 3 iterations on Cyclone.
Reviewers: hfinkel
Subscribers: llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D17949
llvm-svn: 263772
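
One plausible reading of this limit is sketched below: work out how many iterations ahead a prefetch would land, and skip the loop if that exceeds the cap. All names and parameters are hypothetical, and treating the cap as a bail-out rather than a clamp is an assumption of the sketch.

```cpp
#include <cassert>

// PrefetchDistance and LoopSize are both measured in instructions.
unsigned iterationsAhead(unsigned PrefetchDistance, unsigned LoopSize) {
  assert(LoopSize > 0 && "loop must contain at least one instruction");
  unsigned Ahead = PrefetchDistance / LoopSize;
  return Ahead == 0 ? 1 : Ahead; // always look at least one iteration ahead
}

// Conservative policy: give up on the loop instead of prefetching too far.
bool shouldPrefetch(unsigned PrefetchDistance, unsigned LoopSize,
                    unsigned MaxIterationsAhead) {
  return iterationsAhead(PrefetchDistance, LoopSize) <= MaxIterationsAhead;
}
```

With a cap of 3, a small loop body that would need prefetches dozens of iterations ahead is left alone, while a large body that needs only one or two is still prefetched.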
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
And use this TTI for Cyclone. As it was explained in the original RFC
(http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758), the HW
prefetcher works up to 2KB strides.
I am also adding tests for this and the previous change (D17943):
* Cyclone prefetching accesses with a large stride
* Cyclone not prefetching accesses with a small stride
* Generic AArch64 subtarget not prefetching either
Reviewers: hfinkel
Subscribers: aemerson, rengolin, llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D17945
llvm-svn: 263771
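
The stride filter can be pictured as a simple threshold test. In the sketch below the constant and function names are hypothetical; the 2048-byte value comes from the 2KB figure in the message, and counting a stride of exactly 2048 bytes as "large" is an assumption.

```cpp
// Strides below this are assumed to be covered by the hardware prefetcher.
constexpr long CycloneMinPrefetchStride = 2048;

// Emit a software prefetch only when the access stride is large enough that
// the hardware prefetcher is not expected to follow it.
bool worthSoftwarePrefetch(long StrideBytes, long MinStride) {
  long Magnitude = StrideBytes < 0 ? -StrideBytes : StrideBytes;
  return Magnitude >= MinStride;
}
```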
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18192
llvm-svn: 263581
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This form was replaced by a form taking an instruction instead of opcode and
return type in r258391. After committing this change (and some depending,
follow-up changes) it turned out in the review thread to be controversial. The
discussion hasn't come to a conclusion yet. I'm re-adding the old form to fix
the API regression and to provide a better base for discussion, possibly on
llvm-dev.
A difference to the original function is that it can't be called with GEPs
(similarly to how it was already the case for compares). In order to support
opaque pointers in the future, folding GEPs needs to be passed the source
element type, which is not possible with the current API.
Reviewers: dberlin, reames
Subscribers: dblaikie, eddyb
Differential Revision: http://reviews.llvm.org/D17901
llvm-svn: 263501
|
| |
|
|
|
|
| |
This fixes PR26843.
llvm-svn: 263462
|
| |
|
|
|
|
|
| |
Check to see if all operands are constant before calling simplify on them
so that we don't perform wasted simplifications.
llvm-svn: 263374
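
The idea of checking first and simplifying second can be sketched as below. `Operand`, `simplifyToConstant`, and `foldSum` are hypothetical, and the counter exists only to show that no simplification work happens when the check fails.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical operand model; not an LLVM type.
struct Operand {
  bool IsConstant;
  int Value;
};

int SimplifyCalls = 0; // counts how often the expensive step actually runs

int simplifyToConstant(const Operand &Op) {
  ++SimplifyCalls;
  return Op.Value;
}

// Folds the operands into Result and returns true, or returns false without
// simplifying anything if at least one operand is not constant.
bool foldSum(const std::vector<Operand> &Ops, int &Result) {
  bool AllConstant =
      std::all_of(Ops.begin(), Ops.end(),
                  [](const Operand &Op) { return Op.IsConstant; });
  if (!AllConstant)
    return false; // cheap early exit before any simplification work
  Result = 0;
  for (const Operand &Op : Ops)
    Result += simplifyToConstant(Op);
  return true;
}
```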
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This doesn't change how many times we construct domtrees in the normal
pipeline, and it removes fragility and instability where basic-aa may
not be run in time to see domtrees because they happen to be constructed
afterward.
This isn't quite as clean as the change to memdep because there is
a mode where basic-aa specifically runs without domtrees -- in the
hacky version used by function-attrs with the legacy pass manager.
llvm-svn: 263234
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This doesn't cause us to construct dominator trees any more often in the
normal pipeline, and removes an entire mode of memdep that needed to be
reasoned about and maintained. Perhaps more importantly, it removes the
ability for the results of memdep to be different because of accidental
pass scheduling goofs or the order of evaluation of 'getResult' calls.
Essentially, 'getCachedResult', unless across IR-unit boundaries, is
extremely dangerous. We need to work much harder to avoid it (or its
analog in the old pass manager).
llvm-svn: 263232
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This was originally a pointer to support pass managers which didn't use
AnalysisManagers. However, that doesn't realistically come up much and
the complexity of supporting it doesn't really make sense.
In fact, *many* parts of the pass manager were just assuming the pointer
was never null already. This at least makes it much more explicit and
clear.
llvm-svn: 263219
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
work in the face of the limitations of DLLs and templated static
variables.
This requires passes that use the AnalysisBase mixin to provide a static
variable themselves. So as to keep their APIs clean, I've made these
private and befriended the CRTP base class (which is the common
practice).
I've added documentation to AnalysisBase for why this is necessary and
at what point we can go back to the much simpler system.
This is clearly a better pattern than the extern template as it caught
*numerous* places where the template magic hadn't been applied and
things were "just working" but would eventually have broken
mysteriously.
llvm-svn: 263216
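
The pattern described above, a private static member in each analysis with the CRTP base befriended, can be sketched as follows. This is a standalone illustration: `FooAnalysis` and `BarAnalysis` are invented, and the member names are assumptions rather than a copy of the real headers.

```cpp
// CRTP base: derives a unique ID from the address of a static variable that
// each derived class must define itself.
template <typename DerivedT> struct AnalysisBase {
  static void *ID() { return static_cast<void *>(&DerivedT::PassID); }
};

class FooAnalysis : public AnalysisBase<FooAnalysis> {
  friend AnalysisBase<FooAnalysis>; // lets the base see the private member
  static char PassID;               // kept private to keep the API clean
};

class BarAnalysis : public AnalysisBase<BarAnalysis> {
  friend AnalysisBase<BarAnalysis>;
  static char PassID;
};

// One definition per analysis, in exactly one translation unit.
char FooAnalysis::PassID;
char BarAnalysis::PassID;
```

An analysis that forgets to declare `PassID` fails to compile as soon as `ID()` is used, which is the kind of early error the message credits this pattern with.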
|
| |
|
|
|
|
|
| |
function analyses, and use it to wire up globals-aa to the new pass
manager.
llvm-svn: 263211
|
| |
|
|
|
|
| |
instantiation needed for the mingw dll build bot.
llvm-svn: 263114
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
actually finish wiring up the old call graph.
There were bugs in the old call graph that hadn't been caught because it
wasn't being tested. It wasn't being tested because it wasn't in the
pipeline system and we didn't have a printing pass to run in tests. This
fixes all of that.
As for why I'm still keeping the old call graph alive, it's so that I can
port GlobalsAA to the new pass manager without forking it to work with
the lazy call graph. That's clearly the right eventual design, but it
seems pragmatic to defer that until it's necessary. The old call graph
works just fine for GlobalsAA.
llvm-svn: 263104
|
| |
|
|
|
|
|
|
|
|
| |
location in the opt tool to live along side the analysis in LLVM's
libraries.
No functionality changed here, but this will allow me to port the
printer to the new pass manager as well.
llvm-svn: 263101
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
There is another pass by the generic name 'CallGraphPrinter' which is
actually just a call graph printer tucked away inside the opt tool. I'd
like to bring it out and make it follow the same patterns as the rest of
the CallGraph code, but doing so would end up conflicting with the name
of the DOT printing pass. So this makes the DOT printing pass name be
more precise.
No functionality changed here.
llvm-svn: 263100
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a fairly straightforward port to the new pass manager with one
exception. It removes a very questionable use of releaseMemory() in
the old pass to invalidate its caches between runs on a function.
I don't think this is really guaranteed to be safe. I've just used the
more direct port to the new PM to address this by nuking the results
object each time the pass runs. While this could cause some minor malloc
traffic increase, I don't expect the compile time performance hit to be
noticeable, and it makes the correctness and other aspects of the pass
much easier to reason about. In some cases, it may make things faster by
making the sets and maps smaller with better locality. Indeed, the
measurements collected by Bruno (thanks!!!) show mostly compile time
improvements.
There is sadly very limited testing at this point as there are only two
tests of memdep, and both rely on GVN. I'll be porting GVN next and that
will exercise this heavily though.
Differential Revision: http://reviews.llvm.org/D17962
llvm-svn: 263082
|
| |
|
|
|
|
|
|
|
|
| |
MemoryDependenceAnalysis had a hard-coded exception to the general aliasing rules for malloc and calloc. The reasoning that applied there is equally valid in BasicAA and clarifies the remaining logic in MDA.
In principle, this can expose slightly more optimization opportunities, but since essentially all of our aliasing aware memory optimization passes go through MDA, this will likely be NFC in practice.
Differential Revision: http://reviews.llvm.org/D15912
llvm-svn: 263075
|
| |
|
|
|
|
| |
Extract out a generic interface from a recently landed patch and document a TODO in case compile time becomes a problem.
llvm-svn: 263062
|
| |
|
|
|
|
|
|
|
| |
Building on the previous change, this generalizes
ScalarEvolution::getRangeViaFactoring to work with
{Ext(C?A:B)+k0,+,Ext(C?A:B)+k1} where Ext can be a zero extend, sign
extend or truncate operation, and k0 and k1 are constants.
llvm-svn: 262979
|
| |
|
|
|
|
|
|
| |
This change generalizes ScalarEvolution::getRangeViaFactoring to work
with {Ext(C?A:B),+,Ext(C?A:B)} where Ext can be a zero extend, sign
extend or truncate operation.
llvm-svn: 262978
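
The factoring idea is that a recurrence whose start and step are both selects on the same condition C behaves either like the "true" recurrence or like the "false" recurrence, so its range is the union of the two. The sketch below models that with closed integer intervals and ignores the Ext wrapper; all names are hypothetical and this is not the ScalarEvolution implementation.

```cpp
#include <algorithm>

// Closed interval [Lo, Hi].
struct Range {
  long Lo;
  long Hi;
};

// Range of {Start,+,Step} over at most MaxBackedgeCount back-edges.
Range addRecRange(long Start, long Step, long MaxBackedgeCount) {
  long End = Start + Step * MaxBackedgeCount;
  return {std::min(Start, End), std::max(Start, End)};
}

// Smallest interval covering both inputs.
Range unionRange(Range X, Range Y) {
  return {std::min(X.Lo, Y.Lo), std::max(X.Hi, Y.Hi)};
}

// {C?StartT:StartF,+,C?StepT:StepF} is {StartT,+,StepT} when C is true and
// {StartF,+,StepF} when C is false, so take the union of both ranges.
Range rangeViaFactoring(long StartT, long StartF, long StepT, long StepF,
                        long MaxBackedgeCount) {
  return unionRange(addRecRange(StartT, StepT, MaxBackedgeCount),
                    addRecRange(StartF, StepF, MaxBackedgeCount));
}
```

Without factoring, an analysis would have to combine the range of the start select with the range of the step select independently, which also admits mixed pairs such as a "true" start with a "false" step and therefore yields a wider result.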
|
| |
|
|
| |
llvm-svn: 262956
|
| |
|
|
| |
llvm-svn: 262883
|
| |
|
|
| |
llvm-svn: 262831
|
| |
|
|
|
|
|
|
| |
This is much more clear and less surprising IMO. It also makes things
more consistent with the increasingly large chunk of LLVM code that
assumes true-on-success.
llvm-svn: 262826
|
| |
|
|
|
|
|
|
|
|
| |
duplicated comments.
In several cases these had diverged making them especially nice to
canonicalize. I checked to make sure we weren't losing important
information of course.
llvm-svn: 262825
|
| |
|
|
|
|
|
|
|
|
|
| |
the new pass manager.
The port will involve substantial edits here, and would likely introduce
bad formatting if formatted in isolation, so just get all the formatting
up to snuff. I'll also go through and try to freshen the doxygen here as
well as modernizing some of the code.
llvm-svn: 262821
|
| |
|
|
|
|
| |
The diff is relatively large since I took a chance to rearrange the code I had to touch in a more obvious way, but the key bit is merely using the !range metadata when we can't analyze the instruction further. The previous !range metadata code was essentially just dead since no binary operator or cast will have !range metadata (per Verifier) and it was otherwise dropped on the floor.
llvm-svn: 262751
|
| |
|
|
| |
llvm-svn: 262682
|
| |
|
|
| |
llvm-svn: 262648
|