llvm-svn: 277066
We were able to fold masked loads with an all-ones mask to a normal
load. However, we couldn't turn a masked load whose mask mixes ones
and undefs into a normal load.
llvm-svn: 275380
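
A scalar model of the fold may help (a sketch of masked-load semantics, not the actual llvm.masked.load transform):

#include <stddef.h>

/* Lanes with a set mask bit take the loaded value; clear lanes take the
   passthru. When every lane is 1 -- or undef, which the optimizer may
   choose to treat as 1 -- the operation is just a plain load. */
void masked_load_model(float *dst, const float *src, const int *mask,
                       const float *passthru, size_t n) {
  for (size_t i = 0; i < n; ++i)
    dst[i] = mask[i] ? src[i] : passthru[i];
}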
This transform doesn't require any new instructions, it can safely live
in InstSimplify.
llvm-svn: 275344
This actually uncovered a surprisingly large chain of ultimately unused
TLI args.
From what I can gather, this argument is a remnant of when
isKnownNonNull would look at the TLI directly.
The current approach seems to be that InferFunctionAttrs runs early in
the pipeline and uses TLI to annotate the TLI-dependent non-null
information as return attributes.
This also removes the dependence of functionattrs on TLI altogether.
llvm-svn: 274455
Specific instances of intrinsic calls may want to be convergent (such
as certain register reads) even though the intrinsic declaration itself
is not.
llvm-svn: 273188
for one of the signatures of CreateShuffleVector. This better emphasises that you can't use it to get the -1-as-undef behavior.
llvm-svn: 272491
As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.
llvm-svn: 272126
native shifts
Unlike native shifts, the AVX2 per-element shift instructions VPSRAV/VPSRLV/VPSLLV handle out-of-range shift values (logical shifts set the result to zero, arithmetic shifts splat the sign bit).
If the shift amount is constant we can sometimes convert these instructions to native shifts:
1 - if all shift amounts are in range, the conversion is trivial.
2 - out-of-range arithmetic shifts can be clamped to (bitwidth - 1), a legal shift amount, before conversion.
3 - logical shifts just return zero if all elements have out-of-range shift amounts.
In addition, UNDEF shift amounts are handled - either as an UNDEF shift amount in a native shift, or as an UNDEF in the 'all out of range' zero-constant special case for logical shifts.
Differential Revision: http://reviews.llvm.org/D19675
llvm-svn: 271996
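
A sketch of cases 2 and 3 using the AVX2 128-bit intrinsics (my own illustration; the equivalences follow from the splat/zero semantics described above):

#include <immintrin.h>

__m128i logical_all_out_of_range(__m128i x) {
  /* case 3: every shift amount >= 32, so VPSRLVD zeroes each lane;
     the whole call folds to _mm_setzero_si128() */
  return _mm_srlv_epi32(x, _mm_set1_epi32(33));
}

__m128i arithmetic_clamped(__m128i x) {
  /* case 2: out-of-range arithmetic shifts splat the sign bit, so the
     amount clamps to 31 and converts to the native _mm_srai_epi32(x, 31) */
  return _mm_srav_epi32(x, _mm_set1_epi32(40));
}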
This patch adds support for folding undef/zero/constant inputs to MOVMSK instructions.
The SSE/AVX versions can be fully folded, but the MMX version can only handle undef inputs.
Differential Revision: http://reviews.llvm.org/D20998
llvm-svn: 271990
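
For example (my own constant-input sketch; the fold itself runs on the llvm.x86.*.movmsk intrinsic calls):

#include <immintrin.h>

int movmsk_constant(void) {
  /* element 0 is -4.0f (sign set, bit 0) and element 2 is -2.0f (bit 2),
     so the call folds to the constant 0b0101 = 5 */
  __m128 v = _mm_set_ps(1.0f, -2.0f, 3.0f, -4.0f);
  return _mm_movemask_ps(v);
}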
them. Auto-upgrade to native unaligned store instructions.
llvm-svn: 271236
intrinsics with generic IR (llvm)
This patch removes the llvm VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades them to SEXT/ZEXT calls instead. We already did this for the SSE41 PMOVSX intrinsics some time ago, so much of that implementation can be reused.
Reapplied now that the companion patch (D20684), which removes/auto-upgrades the clang intrinsics, has been committed.
Differential Revision: http://reviews.llvm.org/D20686
llvm-svn: 271131
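
For example (an assumed mapping, sketched from the C side):

#include <immintrin.h>

__m128i widen_u8_to_u32(__m128i v) {
  /* formerly lowered through an llvm.x86.* pmovzx intrinsic; after the
     auto-upgrade this becomes a shuffle plus a generic ZEXT in IR, and
     the backend still selects PMOVZXBD */
  return _mm_cvtepu8_epi32(v);
}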
extension intrinsics with generic IR (llvm)
llvm-svn: 270976
generic IR (llvm)
This patch removes the llvm VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades them to SEXT/ZEXT calls instead. We already did this for the SSE41 PMOVSX intrinsics some time ago, so much of that implementation can be reused.
A companion patch (D20684) removes/auto-upgrades the clang intrinsics.
Differential Revision: http://reviews.llvm.org/D20686
llvm-svn: 270973
the same places that the SSE/SSE2 storeu intrinsics appear.
I don't really know how to test this; it just seemed like we should be consistent.
llvm-svn: 270819
When a va_start or va_copy is immediately followed by a va_end (ignoring
debug information or other start/end pairs in between), it is safe to
remove the pair. As this code shares some commonalities with the lifetime
markers, it has been factored into helper functions.
This InstCombine pattern kicks in 3 times when running the LLVM test
suite.
llvm-svn: 269033
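
A minimal C reproducer of the pattern (mine, not from the commit):

#include <stdarg.h>

int first_only(int first, ...) {
  va_list ap;
  va_start(ap, first);
  va_end(ap);   /* immediately follows va_start: the pair is dead and
                   InstCombine can delete both calls */
  return first;
}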
accept UNDEF elements.
llvm-svn: 268206
UNDEF elements.
llvm-svn: 268204
UNDEF elements.
llvm-svn: 268202
shufflevector.
llvm-svn: 268199
elements. NFCI.
Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements.
llvm-svn: 268158
elements. NFCI.
Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements.
llvm-svn: 268115
We neglected to transfer operand bundles for some transforms. These
were found via inspection; I'll try to come up with some test cases.
llvm-svn: 268010
This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:
1 - recognise that binary scalar operations only need the lowest element of the second input (but still all the elements of the first input)
2 - recognise that the roundss/roundsd intrinsics use the lowest element of the second input and the remaining elements from the first input
Differential Revision: http://reviews.llvm.org/D17490
llvm-svn: 267356
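
Both cases, sketched with the C intrinsics (illustrative only; the transform works on the underlying llvm.x86.* calls):

#include <smmintrin.h>

__m128 binary_scalar(__m128 a, __m128 b) {
  /* case 1: only b[0] is demanded; lanes 1-3 of the result come from a */
  return _mm_add_ss(a, b);
}

__m128 round_scalar(__m128 a, __m128 b) {
  /* case 2: roundss takes the rounded b[0] and the upper lanes from a */
  return _mm_round_ss(a, b, _MM_FROUND_TO_NEAREST_INT);
}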
As discussed on D17490, we should attempt to update an intrinsic's arguments demanded elements in one pass if we can.
llvm-svn: 267355
function. NFCI.
llvm-svn: 267352
function. NFCI.
llvm-svn: 267351
This bug was introduced with:
http://reviews.llvm.org/rL262269
AVX masked loads are specified to set vector lanes to zero when the high bit of the mask
element for that lane is zero:
"If the mask is 0, the corresponding data element is set to zero in the load form of these
instructions, and unmodified in the store form." --Intel manual
Differential Revision: http://reviews.llvm.org/D19017
llvm-svn: 266148
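
A scalar model of the semantics the fix restores (my own paraphrase of the quoted manual text):

void avx_maskload_model(float *dst, const float *src,
                        const int *mask, int n) {
  for (int i = 0; i < n; ++i)
    /* high bit set: load the lane; otherwise the lane is set to ZERO,
       not passed through -- the detail the original fold got wrong */
    dst[i] = (mask[i] < 0) ? src[i] : 0.0f;
}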
`allocsize` is a function attribute that allows users to request that
LLVM treat arbitrary functions as allocation functions.
This patch makes LLVM accept the `allocsize` attribute, and makes
`@llvm.objectsize` recognize said attribute.
The review for this was split into two patches for ease of reviewing:
D18974 and D14933. As promised on the revisions, I'm landing both
patches as a single commit.
Differential Revision: http://reviews.llvm.org/D14933
llvm-svn: 266032
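
From C the attribute is reachable through clang's alloc_size (a hedged sketch; my_buffer_alloc is a hypothetical allocator):

#include <stddef.h>

/* clang lowers alloc_size(1) to the LLVM `allocsize` attribute */
__attribute__((alloc_size(1))) void *my_buffer_alloc(size_t n);

size_t query(void) {
  void *p = my_buffer_alloc(32);
  /* @llvm.objectsize can now answer 32 even though the allocator body
     is unknown */
  return __builtin_object_size(p, 0);
}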
Two or more identical assumes are occasionally next to each other in a
basic block.
While our generic machinery will turn a redundant assume into a no-op,
it is not super cheap.
We can perform a simpler check to achieve the same result for this case.
llvm-svn: 265801
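
The case in question, roughly (using clang's __builtin_assume for illustration):

int positive_twice(int x) {
  __builtin_assume(x > 0);
  __builtin_assume(x > 0);  /* identical and adjacent: the cheap check
                               removes it without the generic machinery */
  return x;
}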
llvm-svn: 264944
llvm-svn: 264943
Summary:
Previously we had a notion of convergent functions but not of convergent
calls. This is insufficient to correctly analyze calls where the target
is unknown, e.g. indirect calls.
Now a call is convergent if it targets a known-convergent function, or
if it's explicitly marked as convergent. As usual, we can remove
convergent where we can prove that no convergent operations are
performed in the call.
Originally landed as r261544, then reverted in r261549 for (incidental)
build breakage. Re-landed here with no changes.
Reviewers: chandlerc, jingyue
Subscribers: llvm-commits, tra, jhen, hfinkel
Differential Revision: http://reviews.llvm.org/D17739
llvm-svn: 263481
This follows up on the related AVX instruction transforms, but this
one is too strange to do anything more with. Intel's behavioral
description of this instruction in its Software Developer's Manual
is tragi-comic.
llvm-svn: 263340
Continuation of:
http://reviews.llvm.org/rL262269
llvm-svn: 262273
The intended effect of this patch in conjunction with:
http://reviews.llvm.org/rL259392
http://reviews.llvm.org/rL260145
is that customers using the AVX intrinsics in C will benefit from combines when
the load mask is constant:
__m128 mload_zeros(float *f) {
  return _mm_maskload_ps(f, _mm_set1_epi32(0));
}

__m128 mload_fakeones(float *f) {
  return _mm_maskload_ps(f, _mm_set1_epi32(1));
}

__m128 mload_ones(float *f) {
  return _mm_maskload_ps(f, _mm_set1_epi32(0x80000000));
}

__m128 mload_oneset(float *f) {
  return _mm_maskload_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0));
}
...so none of the above will actually generate a masked load for optimized code.
This is the masked load counterpart to:
http://reviews.llvm.org/rL262064
llvm-svn: 262269
We ended up removing a save/restore pair around an inalloca call,
leading to a miscompile in Chromium.
llvm-svn: 262095
Replicate everything for integers...because x86.
Continuation of:
http://reviews.llvm.org/rL262064
llvm-svn: 262077
The intended effect of this patch in conjunction with:
http://reviews.llvm.org/rL259392
http://reviews.llvm.org/rL260145
is that customers using the AVX intrinsics in C will benefit from combines when
the store mask is constant:
void mstore_zero_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set1_epi32(0), v);
}

void mstore_fake_ones_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set1_epi32(1), v);
}

void mstore_ones_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set1_epi32(0x80000000), v);
}

void mstore_one_set_elt_mask(float *f, __m128 v) {
  _mm_maskstore_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0), v);
}
...so none of the above will actually generate a masked store for optimized code.
Differential Revision: http://reviews.llvm.org/D17485
llvm-svn: 262064
This is part of the refactoring to unify the isSafeToLoadUnconditionally and isDereferenceablePointer functions. In a subsequent change I'm going to eliminate isDereferenceableAndAlignedPointer from the Loads API, leaving isSafeToLoadSpeculatively as the only function to check whether a load instruction can be speculated.
Reviewed By: hfinkel
Differential Revision: http://reviews.llvm.org/D16180
llvm-svn: 261736
This reverts r261544, which was causing a test failure in
Transforms/FunctionAttrs/readattrs.ll.
llvm-svn: 261549
Summary:
Previously we had a notion of convergent functions but not of convergent
calls. This is insufficient to correctly analyze calls where the target
is unknown, e.g. indirect calls.
Now a call is convergent if it targets a known-convergent function, or
if it's explicitly marked as convergent. As usual, we can remove
convergent where we can prove that no convergent operations are
performed in the call.
Reviewers: chandlerc, jingyue
Subscribers: hfinkel, jhen, tra, llvm-commits
Differential Revision: http://reviews.llvm.org/D17317
llvm-svn: 261544
llvm-svn: 261484
Originally part of:
http://reviews.llvm.org/D17485
We need this when simplifying masked memory ops too.
llvm-svn: 261483
the lowest vector element
llvm-svn: 261460
Cleanup for upcoming Clang warning -Wcomma. No functionality change intended.
llvm-svn: 261270
Reviewed By: reames
Differential Revision: http://reviews.llvm.org/D16143
llvm-svn: 260509
We introduced gc.relocates of vector-of-pointer types a couple of weeks back. Somehow, I missed updating the InstCombine rule to account for this. If we hit this code path with a vector-of-pointers gc.relocate, we'd crash on a cast<PointerType>.
I also took the chance to do a bit of code style cleanup.
llvm-svn: 260279
llvm-svn: 259425
A masked scatter with a zero mask means there's no store.
A masked gather with a zero mask means the passthru arg is returned.
This is a continuation of:
http://reviews.llvm.org/rL259369
http://reviews.llvm.org/rL259392
llvm-svn: 259421
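
An analogous illustration with the AVX2 gather intrinsic (the commit itself targets the generic llvm.masked.gather/scatter; the function name is mine):

#include <immintrin.h>

__m128 gather_nothing(const float *base, __m128i idx, __m128 passthru) {
  /* all-zero mask: no lanes are loaded, so the result is simply the
     passthru operand */
  return _mm_mask_i32gather_ps(passthru, base, idx, _mm_setzero_ps(), 4);
}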
A masked store with a zero mask means there's no store.
A masked store with an allOnes mask means it's a normal vector store.
This is a continuation of:
http://reviews.llvm.org/rL259369
llvm-svn: 259392