| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
| |
This fixes the issue exposed in PR31393, where we weren't trying
sufficiently hard to diagnose bad TBAA metadata.
This does reduce the variety in the error messages we print out, but I
think the tradeoff of verifying more, simply and quickly overrules the
need for more helpful error messags here.
llvm-svn: 290713
|
|
|
|
|
|
|
|
| |
analysis handles become invalid.
Add a test case for its invalidation logic.
llvm-svn: 290620
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BasicAA in r290603.
I've kept the basic testing in the new PM test file as that also covers
the AAManager invalidation logic. If/when there is a good place for
broader AA testing it could move there.
This test is somewhat unsatisfying as I can't get it to fail even with
ASan outside of explicit checks of the invalidation. Apparently we don't
yet have any test coverage of the BasicAA code paths using either the
domtree or loopinfo -- I made both of them always be null and check-llvm
passed.
llvm-svn: 290612
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D27034
llvm-svn: 290526
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As mentioned on PR30845, we were performing our vXi64 multiplication as:
AloBlo = pmuludq(a, b);
AloBhi = pmuludq(a, psrlqi(b, 32));
AhiBlo = pmuludq(psrlqi(a, 32), b);
return AloBlo + psllqi(AloBhi, 32)+ psllqi(AhiBlo, 32);
when we could avoid one of the upper shifts with:
AloBlo = pmuludq(a, b);
AloBhi = pmuludq(a, psrlqi(b, 32));
AhiBlo = pmuludq(psrlqi(a, 32), b);
return AloBlo + psllqi(AloBhi + AhiBlo, 32);
This matches the lowering on gcc/icc.
Differential Revision: https://reviews.llvm.org/D27756
llvm-svn: 290267
|
|
|
|
|
|
|
|
|
|
| |
For vector GEPs, CastGEPIndices can end up in an infinite recursion, because
we compare the vector type to the scalar pointer type, find them different,
and then try to cast a type to itself.
Differential Revision: https://reviews.llvm.org/D28009
llvm-svn: 290260
|
|
|
|
|
|
| |
Sorry!
llvm-svn: 290087
|
|
|
|
|
|
|
| |
This creates non-linear behavior in the inliner (see more details in
r289755's commit thread).
llvm-svn: 290086
|
|
|
|
|
|
|
|
|
| |
This patch checks that the SlowMisaligned128Store subtarget feature is set
when penalizing such stores in getMemoryOpCost.
Differential Revision: https://reviews.llvm.org/D27677
llvm-svn: 289845
|
|
|
|
| |
llvm-svn: 289819
|
|
|
|
|
|
| |
Incorrect 'undef' mask index matching meant that broadcast shuffles could be detected as reverse shuffles
llvm-svn: 289811
|
|
|
|
| |
llvm-svn: 289800
|
|
|
|
|
|
|
|
|
| |
After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is more computationally efficient, and also we need much less
code...
llvm-svn: 289756
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There was an efficiency problem with how we processed @llvm.assume in
ValueTracking (and other places). The AssumptionCache tracked all of the
assumptions in a given function. In order to find assumptions relevant to
computing known bits, etc. we searched every assumption in the function. For
ValueTracking, that means that we did O(#assumes * #values) work in InstCombine
and other passes (with a constant factor that can be quite large because we'd
repeat this search at every level of recursion of the analysis).
Several of us discussed this situation at the last developers' meeting, and
this implements the discussed solution: Make the values that an assume might
affect operands of the assume itself. To avoid exposing this detail to
frontends and passes that need not worry about it, I've used the new
operand-bundle feature to add these extra call "operands" in a way that does
not affect the intrinsic's signature. I think this solution is relatively
clean. InstCombine adds these extra operands based on what ValueTracking, LVI,
etc. will need and then those passes need only search the users of the values
under consideration. This should fix the computational-complexity problem.
At this point, no passes depend on the AssumptionCache, and so I'll remove
that as a follow-up change.
Differential Revision: https://reviews.llvm.org/D27259
llvm-svn: 289755
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change adds some verification in the IR verifier around struct path
TBAA metadata.
Other than some basic sanity checks (e.g. we get constant integers where
we expect constant integers), this checks:
- That by the time an struct access tuple `(base-type, offset)` is
"reduced" to a scalar base type, the offset is `0`. For instance, in
C++ you can't start from, say `("struct-a", 16)`, and end up with
`("int", 4)` -- by the time the base type is `"int"`, the offset
better be zero. In particular, a variant of this invariant is needed
for `llvm::getMostGenericTBAA` to be correct.
- That there are no cycles in a struct path.
- That struct type nodes have their offsets listed in an ascending
order.
- That when generating the struct access path, you eventually reach the
access type listed in the tbaa tag node.
Reviewers: dexonsmith, chandlerc, reames, mehdi_amini, manmanren
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D26438
llvm-svn: 289402
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ConstantFolding tried to cast one of the scalar indices to a vector
type. Instead, use the vector type only for the first index (which
is the only one allowed to be a vector) and use its scalar type
otherwise.
Fixes PR31250.
Reviewers: majnemer
Differential Revision: https://reviews.llvm.org/D27389
llvm-svn: 289073
|
|
|
|
|
|
|
|
| |
In the addressing mode, signed 9-bit imm is [-256, 255], not [-512, 511].
Differential Revision: https://reviews.llvm.org/D27480
llvm-svn: 288876
|
|
|
|
|
|
|
|
| |
Fix a bug when we call isLegalAddressingMode() from getGEPCost().
Differential Revision: https://reviews.llvm.org/D27357
llvm-svn: 288569
|
|
|
|
|
|
|
|
| |
VSX has instructions lxsiwax/lxsdx that can load 32/64 bit value into VSX register cheaply. That patch makes it known to memory cost model, so the vectorization of the test case in pr30990 is beneficial.
Differential Revision: https://reviews.llvm.org/D26713
llvm-svn: 288560
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently when cost of scalar operations is evaluated the vector type is
used for scalar operations. Patch fixes this issue and fixes evaluation
of the vector operations cost.
Several test showed that vector cost model is too optimistic. It
allowed vectorization of 8 or less add/fadd operations, though scalar
code is faster. Actually, only for 16 or more operations vector code
provides better performance.
Differential Revision: https://reviews.llvm.org/D26277
llvm-svn: 288398
|
|
|
|
| |
llvm-svn: 288377
|
|
|
|
|
|
| |
This reverts commit a61718435fc4118c82f8aa6133fd81f803789c1e.
llvm-svn: 288371
|
|
|
|
| |
llvm-svn: 288369
|
|
|
|
|
|
|
|
| |
Note that the non-splat lshr+lshr test folded, but that does not
work in general. Something is missing or wrong in computeKnownBits
as the non-splat shl+shl test still shows.
llvm-svn: 288005
|
|
|
|
| |
llvm-svn: 287998
|
|
|
|
|
|
|
|
| |
AVX512DQ-only targets
Use 512-bit instructions with subvector insertion/extraction like we do in a number of similar circumstances
llvm-svn: 287882
|
|
|
|
|
|
|
|
| |
AVX512DQ-only targets
Use 512-bit instructions with subvector insertion/extraction like we do in a number of similar circumstances
llvm-svn: 287762
|
|
|
|
| |
llvm-svn: 287760
|
|
|
|
| |
llvm-svn: 287756
|
|
|
|
|
|
|
|
| |
Better coverage of all legal types + special cases.
Removed old fptoui tests which are all handled in fptoui.ll
llvm-svn: 287678
|
|
|
|
|
|
|
|
|
|
| |
Currently LLVM assumes that a pointer addrspacecasted to a different addr space is equivalent to trunc or zext bitwise, which is not true. For example, in amdgcn target, when a null pointer is addrspacecasted from addr space 4 to 0, its value is changed from i64 0 to i32 -1.
This patch teaches LLVM not to assume known bits of addrspacecast instruction to its operand.
Differential Revision: https://reviews.llvm.org/D26803
llvm-svn: 287545
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This extends FCOPYSIGN support to 512-bit vectors.
I've also added tests to show what the 128-bit and 256-bit cases look like with broadcast loads.
Reviewers: delena, zvi, RKSimon, spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26791
llvm-svn: 287298
|
|
|
|
|
|
| |
More realistic v16i8/v32i8/v64i8 MUL costs - we have to extend to vXi16, use PMULLW and then truncate the result
llvm-svn: 286838
|
|
|
|
|
|
|
|
| |
Add explicit v16i16/v32i8 ADD/SUB costs, matching the costs of v4i64/v8i32 - they were missing for some reason.
This has side effects on the LV max bandwidth tests (AVX1 now prefers 128-bit vectors vs AVX2 which still prefers 256-bit)
llvm-svn: 286832
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the inrange keyword is present before any index, loading from or
storing to any pointer derived from the getelementptr has undefined
behavior if the load or store would access memory outside of the bounds of
the element selected by the index marked as inrange.
This can be used, e.g. for alias analysis or to split globals at element
boundaries where beneficial.
As previously proposed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-July/102472.html
Differential Revision: https://reviews.llvm.org/D22793
llvm-svn: 286514
|
|
|
|
|
|
|
|
| |
These examples are variations that were inspired from a small subgraph taken
from paper.ll which are interesting as they show certain issues with infinite
loops.
llvm-svn: 286450
|
|
|
|
|
|
|
|
| |
inaccessiblemem_or_argmemonly attributes
Differential Revision: https://reviews.llvm.org/D26382
llvm-svn: 286294
|
|
|
|
|
|
|
|
|
|
| |
This patch avoids scalarization of CTLZ by instead expanding to use CTPOP (ref: "Hacker's Delight") when the necessary operations are available.
This also adds the necessary cost models for X86 SSE2 targets (the main beneficiary) to ensure vectorization only happens when its useful.
Differential Revision: https://reviews.llvm.org/D25910
llvm-svn: 286233
|
|
|
|
|
|
|
|
| |
actually affect memory.
Differential Revision: https://reviews.llvm.org/D26252
llvm-svn: 286108
|
|
|
|
|
|
|
|
|
|
| |
There is a bug describing poor cost model for floating point operations:
Bug 29083 - [X86][SSE] Improve costs for floating point operations. This
patch is the second one in series of patches dealing with cost model.
Differential Revision: https://reviews.llvm.org/D25722
llvm-svn: 285564
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We were trying to add APInt values with different bit sizes after
visiting an addrspacecast instruction which changed the bit width
of the pointer.
Reviewers: majnemer, hfinkel
Subscribers: hfinkel, wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D24774
llvm-svn: 285407
|
|
|
|
| |
llvm-svn: 285329
|
|
|
|
|
|
|
|
|
|
| |
With DQI but without VLX, lower v2i64 and v4i64 MUL operations with v8i64 MUL (vpmullq).
Updated cost table accordingly.
Differential Revision: https://reviews.llvm.org/D26011
llvm-svn: 285304
|
|
|
|
|
|
|
|
|
|
|
| |
actually affect memory."
This reverts commit r285191.
LICM appears to rely on the Alias Set Tracker hitting lifetime markers to prevent
code from being moved outside of the original scope.
llvm-svn: 285227
|
|
|
|
|
|
|
|
| |
affect memory.
Differential Revision: https://reviews.llvm.org/D25969
llvm-svn: 285191
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are two fixes here: one, AnalyzeUsesOfPointer can't return
false until it has checked all the uses of the pointer. Two, if a
global uses another global, we have to assume the address of the
first global escapes.
Fixes https://llvm.org/bugs/show_bug.cgi?id=30707 .
Differential Revision: https://reviews.llvm.org/D25798
llvm-svn: 285034
|
|
|
|
| |
llvm-svn: 284940
|
|
|
|
|
|
| |
We were defaulting to SSE2 costs which weren't taking into account the availability of PBLENDW/PBLENDVB to improve merging of per-element shift results.
llvm-svn: 284939
|
|
|
|
| |
llvm-svn: 284938
|
|
|
|
|
|
|
|
|
|
|
|
| |
In BasicAA GEP operand values get adjusted ("wrap-around") based on the
pointersize. Otherwise, in non-64b modes, AA could report false negatives.
However, a wrap-around is valid only for a fully evaluated expression.
It had been introduced to fix an alias problem in
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160118/326163.html.
This commit restricts the wrap-around to constant gep operands only where the
value is known at compile-time.
llvm-svn: 284908
|