| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
| |
Without this, we could end up trying to get the Nth (0-indexed) element
from a subvector of size N.
Differential Revision: https://reviews.llvm.org/D37880
llvm-svn: 314380
|
|
|
|
|
|
| |
You Use warnings; other minor fixes (NFC).
llvm-svn: 313941
|
|
|
|
|
|
|
|
| |
countTrailingOnes instead of getting active bits and checking if all the bits below that make a mask.
At least for the 64-bit and less case, we should be able to determine if we even have a mask without counting any bits. This also removes the need to explicitly check for 0 active bits, isMask will return false for 0.
llvm-svn: 313908
|
|
|
|
|
|
|
|
| |
This exact block of code exists right below.
Differential Revision: https://reviews.llvm.org/D38122
llvm-svn: 313891
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we have an AssertZext of a truncated value that has already been AssertZext'ed,
we can assert on the wider source op to improve the zext-y knowledge:
assert (trunc (assert X, i8) to iN), i1 --> trunc (assert X, i1) to iN
This moves a fold from being Mips-specific to general combining, and x86 shows
improvements.
Differential Revision: https://reviews.llvm.org/D37017
llvm-svn: 313577
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
rL310710 allowed store merging to occur after legalization to catch stores that are created late,
but this exposes a logic hole seen in PR34217:
https://bugs.llvm.org/show_bug.cgi?id=34217
We will miss merging stores if the target lowers vector extracts into target-specific operations.
This patch allows store merging to occur both before and after legalization if the target chooses
to get maximum merging.
I don't think the potential regressions in the other tests are relevant. The tests are for
correctness of weird IR constructs rather than perf tests, and I think those are still correct.
Differential Revision: https://reviews.llvm.org/D37987
llvm-svn: 313564
|
|
|
|
|
|
|
|
|
|
| |
We already have a combine for this pattern when the input to shl is add, so we just need to enable the transformation when the input is or.
Original patch by @tstellar
Differential Revision: https://reviews.llvm.org/D19325
llvm-svn: 313251
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes some combine issues for AMDGPU where we weren't
getting the many extract_vector_elt combines expected
in a future patch.
This should really be checking isOperationLegalOrCustom on
the extract. That improves a number of x86 lit tests, but
a few get stuck in an infinite loop from one place
where a similar looking extract is created. I have a
different workaround in the backend for that which
keeps many of those improvements, but also adds a few
regressions.
llvm-svn: 312730
|
|
|
|
|
|
| |
we don't create a BUILD_VECTOR with an illegal type after type legalization.
llvm-svn: 312621
|
|
|
|
|
|
|
|
|
|
| |
The function combineShuffleToVectorExtend in DAGCombine might generate an illegal typed node after "legalize types" phase, causing assertion on non-simple type to fail afterwards.
Adding a type check in case the combine is running after the type legalize pass.
Differential Revision: https://reviews.llvm.org/D37330
llvm-svn: 312438
|
|
|
|
|
|
| |
combining an extract_subvector of a bitcasted build_vector.
llvm-svn: 312253
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The loop dependence check looks for dependencies between store merge
candidates not captured by the chain sub-DAG doing a check of
predecessors which may be very large. Conservatively bound number of
nodes checked for compilation time. (Resolves PR34326).
Landing on behalf of Nirav Dave to unblock the 5.0.0 release.
Differential Revision: https://reviews.llvm.org/D37220
llvm-svn: 312022
|
|
|
|
|
|
|
|
|
|
| |
into smaller BUILD_VECTORs
Only do this before operations are legalized of BUILD_VECTOR is Legal for the target.
Differential Revision: https://reviews.llvm.org/D37186
llvm-svn: 311892
|
|
|
|
|
|
|
|
| |
As noted in the FIXME, this could be improved more, but this is the smallest fix
that helps:
https://bugs.llvm.org/show_bug.cgi?id=34111
llvm-svn: 311853
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If all the operands of a BUILD_VECTOR extract elements from same vector then split the
vector efficiently based on the maximum vector access index.
This will also fix PR 33784
Reviewers: zvi, delena, RKSimon, thakis
Reviewed By: RKSimon
Subscribers: chandlerc, eladcohen, llvm-commits
Differential Revision: https://reviews.llvm.org/D35788
llvm-svn: 311833
|
|
|
|
|
|
|
|
| |
Summary: This reverts commit rL311247.
Differential Revision: https://reviews.llvm.org/D36927
llvm-svn: 311832
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This goes back to a discussion about IR canonicalization. We'd like to preserve and convert
more IR to 'select' than we currently do because that's likely the best choice in IR:
http://lists.llvm.org/pipermail/llvm-dev/2016-September/105335.html
...but that's often not true for codegen, so we need to account for this pattern coming in
to the backend and transform it to better DAG ops.
Steps in this patch:
1. Add an EVT param to the existing convertSelectOfConstantsToMath() TLI hook to more finely
enable this transform. Other targets will probably want that anyway to distinguish scalars
from vectors. We're using that here to exclude AVX512 targets, but it may not be necessary.
2. Convert a vselect to ext+add. This eliminates a constant load/materialization, and the
vector ext is often free.
Implementing a more general fold using xor+and can be a follow-up for targets that don't have
a legal vselect. It's also possible that we can remove the TLI hook for the special case fold
implemented here because we're eliminating a constant, but it needs to be tested on other
targets.
Differential Revision: https://reviews.llvm.org/D36840
llvm-svn: 311731
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When one operand is a user of another in a promoted binary operation
we may replace and delete the returned value before returning
triggering an assertion. Reorder node replacements to prevent this.
Fixes PR34137.
Landing on behalf of Nirav.
Differential Revision: https://reviews.llvm.org/D36581
llvm-svn: 311623
|
|
|
|
|
|
|
|
|
|
|
|
| |
sized APInt.
This partially reverts r311429 in favor of making ISD::isConstantSplatVector do something not confusing. Turns out the only other user of it was also having to deal with the weird property of it returning a smaller size.
So rather than continue to deal with this quirk everywhere, just make the interface do something sane.
Differential Revision: https://reviews.llvm.org/D37039
llvm-svn: 311510
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If all the operands of a BUILD_VECTOR extract elements from same vector then split the
vector efficiently based on the maximum vector access index.
Reviewers: zvi, delena, RKSimon, thakis
Reviewed By: RKSimon
Subscribers: chandlerc, eladcohen, llvm-commits
Differential Revision: https://reviews.llvm.org/D35788
llvm-svn: 311255
|
|
|
|
|
|
|
|
| |
Summary: This reverts commit rL311247.
Differential Revision: https://reviews.llvm.org/D36927
llvm-svn: 311252
|
|
|
|
| |
llvm-svn: 311247
|
|
|
|
|
|
|
|
|
|
|
|
| |
post rebase."
Summary:
This reverts commit rL311242.
Differential Revision: https://reviews.llvm.org/D36924
llvm-svn: 311246
|
|
|
|
| |
llvm-svn: 311242
|
|
|
|
|
|
| |
code and what they need to be to make sense. NFC
llvm-svn: 311144
|
|
|
|
|
|
| |
c)) -> x << c
llvm-svn: 311083
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
operand are the same.
Summary: It is creating useless work as the commuted nodes is the same as the node we are working on in that case.
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33840
llvm-svn: 310832
|
|
|
|
|
|
|
|
| |
(REAPPLIED)"
This reverts commit r310782.
llvm-svn: 310822
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Without the SrcVT its hard to know what is really being asked for. For example if your target has 128, 256, and 512 bit vectors. Maybe extracting 128 from 256 is cheap, but maybe extracting 128 from 512 is not.
For x86 we do support extracting a quarter of a 512-bit register. But for i1 vectors we don't have isel patterns for extracting arbitrary pieces. So we need this to have a correct implementation of isExtractSubvectorCheap for mask vectors.
Reviewers: RKSimon, zvi, efriedma
Reviewed By: RKSimon
Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D36649
llvm-svn: 310793
|
|
|
|
|
|
|
|
|
|
|
|
| |
If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index.
Reapplied with fix to only work with simple value types.
Committed on behalf of @jbhateja (Jatin Bhateja)
Differential Revision: https://reviews.llvm.org/D35788
llvm-svn: 310782
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous rev (r310208) failed to account for overflow when subtracting the
constants to see if they're suitable for shift/lea. This version add a check
for that and more test were added in r310490.
We can convert any select-of-constants to math ops:
http://rise4fun.com/Alive/d7d
For this patch, I'm enhancing an existing x86 transform that uses fake multiplies
(they always become shl/lea) to avoid cmov or branching. The current code misses
cases where we have a negative constant and a positive constant, so this is just
trying to plug that hole.
The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start
with a select in IR, create a select DAG node, convert it into a sext, convert it
back into a select, and then lower it to sext machine code.
Some notes about the test diffs:
1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR.
2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. We could avoid the
push/pop in some cases if we used 'movzbl %al' instead of an xor on a different reg? That's a
post-DAG problem though.
3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if
that's a regression, but those would always be nearly equivalent.
4. pr22338.ll and sext-i1.ll - These tests have undef operands, so we don't actually care about these diffs.
5. sbb.ll - This shows a win for what is likely a common case: choose -1 or 0.
6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again.
7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops.
Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass.
Differential Revision: https://reviews.llvm.org/D35340
llvm-svn: 310717
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix insert_subvector / extract_subvector merges of bitcast values.
Reviewers: efriedma, craig.topper, RKSimon
Subscribers: RKSimon, llvm-commits
Differential Revision: https://reviews.llvm.org/D34571
llvm-svn: 310711
|
|
|
|
|
|
|
|
| |
rL310372 enabled simplifyShuffleMask to support undef shuffle mask inputs, but its causing hangs.
Removing support until I can triage the problem
llvm-svn: 310699
|
|
|
|
|
|
| |
This reverts commit r310648 which causes an unexpected assertion failure
llvm-svn: 310659
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Allow stores of bitcastable types to be merged by peeking through BITCAST nodes and recasting stored values constant and vector extract nodes as necessary.
Reviewers: jyknight, hfinkel, efriedma, RKSimon, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34569
llvm-svn: 310655
|
|
|
|
| |
llvm-svn: 310648
|
|
|
|
| |
llvm-svn: 310608
|
|
|
|
| |
llvm-svn: 310474
|
|
|
|
| |
llvm-svn: 310405
|
|
|
|
| |
llvm-svn: 310404
|
|
|
|
|
|
|
|
| |
well as BUILD_VECTOR
Minor extension to D36393
llvm-svn: 310372
|
|
|
|
|
|
|
|
|
|
| |
UNDEF
Fixes one of the cases in PR34041.
Differential Revision: https://reviews.llvm.org/D36393
llvm-svn: 310344
|
|
|
|
| |
llvm-svn: 310264
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Relanding after case to insert explicit truncation as necessary.
Allow SCALAR_TO_VECTOR of EXTRACT_VECTOR_ELT to reduce to
EXTRACT_SUBVECTOR of vector shuffle when output is smaller. Marginally
improves vector shuffle computations.
Reviewers: efriedma, RKSimon, spatel
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D35566
llvm-svn: 310256
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can convert any select-of-constants to math ops:
http://rise4fun.com/Alive/d7d
For this patch, I'm enhancing an existing x86 transform that uses fake multiplies
(they always become shl/lea) to avoid cmov or branching. The current code misses
cases where we have a negative constant and a positive constant, so this is just
trying to plug that hole.
The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start
with a select in IR, create a select DAG node, convert it into a sext, convert it
back into a select, and then lower it to sext machine code.
Some notes about the test diffs:
1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR.
2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. I
think we could avoid the push/pop in some cases if we used 'movzbl %al' instead of an xor on
a different reg? That's a post-DAG problem though.
3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if
that's a regression, but I think those would always be nearly equivalent.
4. pr22338.ll and sext-i1.ll - These tests have undef operands, so I don't think we actually care about these diffs.
5. sbb.ll - This shows a win for what I think is a common case: choose -1 or 0.
6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again.
7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops.
Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass.
Differential Revision: https://reviews.llvm.org/D35340
llvm-svn: 310208
|
|
|
|
| |
llvm-svn: 310118
|
|
|
|
|
|
|
|
|
|
| |
If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index.
Committed on behalf of @jbhateja (Jatin Bhateja)
Differential Revision: https://reviews.llvm.org/D35788
llvm-svn: 310058
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove restriction disallowing merging of stores vector loads into
larger store of larger vector load.
Reviewers: RKSimon, efriedma, spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36158
llvm-svn: 309951
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During store merge we construct a sorted list of consecutive store
candidates and consider subsequences for merging into a single
store. For each subsequence we check if the stored value type is legal
the merged store would have valid and fast and if the constructed
value to be stored is valid. The only properties that affect this
check between subsequences is the size of the subsequence, the
alignment of the first store, the alignment of the stored load value
(when merging stores-of-loads), and whether the merged value is a
constant zero.
If we do not find a viable mergeable subsequence starting from the
first store of length N, we know that a subsequence starting at a
later store of length N will also fail unless the new store's
alignment, the new load's alignment (if we're merging store-of-loads),
or we've dropped stores of nonzero value and could construct a merged
stores of zero (for merging constants).
As a result if we fail to find a valid subsequence starting from the
first store we can safely skip considering subsequences that start
with subsequent stores unless one of the above properties is
true. This significantly (2x) improves compile time in some
pathological cases.
Reviewers: RKSimon, efriedma, zvi, spatel, waltl
Subscribers: grandinj, llvm-commits
Differential Revision: https://reviews.llvm.org/D35901
llvm-svn: 309830
|
|
|
|
|
|
| |
Distribute various expressions across ifs.
llvm-svn: 309777
|