| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
| |
Will allow re-using the machinery for independent
sets of register allocators.
This will allow AMDGPU to use separate command line
options for the allocator to use for SGPRs separate
from VGPRs.
llvm-svn: 354687
|
| |
|
|
|
|
|
|
| |
This patch factor out the function hasViableTopFallthrough from rotateLoop. It is also enhanced. Original code checks only if there is a block can be placed before current loop top. This patch also checks if the loop top is the most possible successor of its predecessor. The attached test case shows its effect.
Differential Revision: https://reviews.llvm.org/D58393
llvm-svn: 354682
|
| |
|
|
| |
llvm-svn: 354677
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fold a smaller constant store into larger constant stores immediately
preceeding it.
Reviewers: rnk, courbet
Subscribers: javed.absar, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58468
llvm-svn: 354676
|
| |
|
|
|
|
|
|
|
|
|
|
| |
non-byte-sized loads.
When we need to merge two adjacent loads the AND mask for the low piece was still sized for the full src element size. But we didn't have that many bits. The upper bits are already zero due to the SRL. So we can skip the AND if we're going to combine with the high bits.
We do need an AND to clear out any bits from the high part. We were anding the high part before combining with the low part, but it looks like ANDing after the OR gets better results.
So we can just emit the final AND after the optional concatentation is done. That will handling skipping before the OR and get rid of extra high bits after the OR.
llvm-svn: 354655
|
| |
|
|
|
|
|
|
| |
VectorLegalizer::ExpandLoad. NFCI
Remove an if that should always be true. Merge the body of another into the only block that could make the if true.
llvm-svn: 354654
|
| |
|
|
| |
llvm-svn: 354649
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
if the input needs to be promoted. Use that to determine the element type to extract.
Otherwise we end up creating extract_vector_elts that then each need to have their input promoted. This can lead to truncates needing to be emitted for each of those.
But we already emitted any_extends when we legalized the extract_subvector. So now we have pairs of any_extend+trunc that partially cancel. But depending on how DAGCombiner visits them we can get weird results.
By promoting the input at the same time we can create only a single any_extend or truncate.
There's one regression in the vector-narrow-binop.ll case, but that looks easy to fix with a follow up patch.
llvm-svn: 354647
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fold can occur during legalization, so it can fight with promotion
to the larger type. It apparently takes a special sequence and subtarget
to avoid more basic simplifications that would hide the problem.
But there's a bigger question raised here: why does distributeTruncateThroughAnd()
even exist? It duplicates functionality from a more minimal pattern that we
already have. But getting rid of this function requires some preliminary steps.
https://bugs.llvm.org/show_bug.cgi?id=40793
llvm-svn: 354594
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
For AMDGPU, if an operand requires an SGPR but is only available as a
VGPR, a loop needs to be introduced to execute the instruction with
each unique combination of values across all lanes. The rest of the
instructions in the block will be moved to a new block following the
loop. Check if the next instruction's parent changed, and update the
iterators and insertion block if this happened.
Tests will be included in a future patch.
llvm-svn: 354591
|
| |
|
|
|
|
| |
This part introduces the lifetime node.
llvm-svn: 354578
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Add skipFunction to PostRA machine sinking pass.
Reviewers: junbuml
Subscribers: arsenm, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57847
llvm-svn: 354541
|
| |
|
|
|
|
|
| |
This is the 'sub0' (negate) pattern from PR31754:
https://bugs.llvm.org/show_bug.cgi?id=31754
llvm-svn: 354519
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Remove stores that are immediately overwritten by larger
stores.
Reviewers: courbet, rnk
Reviewed By: rnk
Subscribers: javed.absar, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58467
llvm-svn: 354518
|
| |
|
|
|
|
|
|
|
|
|
|
| |
when handling ISD::AND
If the LHS has known zeros, then the RHS immediate mask might have been simplified to remove those bits.
This patch adds a call to computeKnownBits to get the known zeroes to handle that possibility. I left an early out to skip the call if all of the demanded bits are set in the mask.
Differential Revision: https://reviews.llvm.org/D58464
llvm-svn: 354514
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Second part of https://bugs.llvm.org/show_bug.cgi?id=40442.
This adds an extra UnrollVectorOverflowOp() method to SDAG, because
the general UnrollOverflowOp() method can't deal with multiple results.
Additionally we need to expand UMULO/SMULO during vector op
legalization, as it may result in unrolling, which may need additional
type legalization.
Differential Revision: https://reviews.llvm.org/D57997
llvm-svn: 354513
|
| |
|
|
|
|
|
|
| |
explicit AND on the bit position from BT when it has known zeros."
I accidentally committed more than just the test.
llvm-svn: 354499
|
| |
|
|
|
|
|
|
|
|
| |
the bit position from BT when it has known zeros.
If the bit position has known zeros in it, then the AND immediate will likely be optimized to remove bits.
This can prevent GetDemandedBits from recognizing that the AND is unnecessary.
llvm-svn: 354498
|
| |
|
|
|
|
| |
Also complete the set of related operations.
llvm-svn: 354480
|
| |
|
|
| |
llvm-svn: 354477
|
| |
|
|
|
|
|
|
| |
to stack."
This is an NFC.
llvm-svn: 354476
|
| |
|
|
|
|
|
|
|
|
| |
We may leave behind incorrect dead flags on instructions that are CSE'd. Make
sure we remove the dead flags on physical registers to prevent other incorrect
code motion.
Differential Revision: https://reviews.llvm.org/D58115
llvm-svn: 354443
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is a follow-up to r353988 where tryEvict was extended to take last
chance recoloring into account. Now we do the same thing for trySplit and
tryAssign.
Now we always pass a "FixedRegisters" argument to canEvictInterference and
tryEvict so it doesn't need to have a default value anymore.
The need for this was found long ago in an out-of-tree target.
Unfortunately I don't have a reproducer for an in-tree target.
Reviewers: qcolombet, rudkx
Reviewed By: qcolombet, rudkx
Subscribers: rudkx, MatzeB, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58376
llvm-svn: 354439
|
| |
|
|
| |
llvm-svn: 354438
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Rename MemoryIndex to InitFlags and implement logic for determining
data segment layout in ObjectYAML and MC. Also adds a "passive" flag
for the .section assembler directive although this cannot be assembled
yet because the assembler does not support data sections.
Reviewers: sbc100, aardappel, aheejin, dschuff
Subscribers: jgravelle-google, hiraditya, sunfish, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57938
llvm-svn: 354397
|
| |
|
|
|
|
|
|
|
|
| |
Directly use the correct shift amount type if it is possible, and
future-proof the code against vectors. The added test makes sure that
bitwidths that do not fit into the shift amount type do not assert.
Split out from D57997.
llvm-svn: 354359
|
| |
|
|
| |
llvm-svn: 354354
|
| |
|
|
| |
llvm-svn: 354348
|
| |
|
|
| |
llvm-svn: 354345
|
| |
|
|
| |
llvm-svn: 354342
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Legalize/select llvm.ctlz.*
Add select-ctlz to show that we actually select them. Update arm64-clrsb.ll and
arm64-vclz.ll to show that we perform valid transformations in optimized builds,
and document where GISel can improve.
Differential Revision: https://reviews.llvm.org/D58155
llvm-svn: 354299
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The motivating x86 cases for forming the intrinsic are shown in PR31754 and PR40487:
https://bugs.llvm.org/show_bug.cgi?id=31754
https://bugs.llvm.org/show_bug.cgi?id=40487
..and those are shown in the IR test file and x86 codegen file.
Matching the usubo pattern is harder than uaddo because we have 2 independent values rather than a def-use.
This adds a TLI hook that should preserve the existing behavior for uaddo formation, but disables usubo
formation by default. Only x86 overrides that setting for now although other targets will likely benefit
by forming usbuo too.
Differential Revision: https://reviews.llvm.org/D57789
llvm-svn: 354298
|
| |
|
|
| |
llvm-svn: 354293
|
| |
|
|
| |
llvm-svn: 354292
|
| |
|
|
|
|
|
| |
Fixes cases with odd vectors that break into multiple requested size
pieces.
llvm-svn: 354280
|
| |
|
|
|
|
| |
Breaks some bots.
llvm-svn: 354245
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
A store to an object whose lifetime is about to end can be removed.
See PR40550 for motivation.
Reviewers: niravd
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D57541
llvm-svn: 354244
|
| |
|
|
|
|
|
|
|
|
|
|
| |
In preparation for supporting vector expansion.
Add an isPostTypeLegalization flag to makeLibCall(), because this
expansion relies on the legalized form using MERGE_VALUES. Drop
the corresponding variant of ExpandLibCall, which is no longer used.
Differential Revision: https://reviews.llvm.org/D58006
llvm-svn: 354226
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Update Flag when generating cc output.
Fixes PR40737.
Reviewers: rnk, nickdesaulniers, craig.topper, spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58283
llvm-svn: 354163
|
| |
|
|
| |
llvm-svn: 354152
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Testing based on the total size of the elements failed to catch a few
invalid scenarios, so explicitly check the number of elements/operands
and types.
This failed to catch situations like
<4 x s16> = G_BUILD_VECTOR s32, s32 since the total size added
up. This also would fail to catch an implicit conversion between
pointers and scalars.
llvm-svn: 354139
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
https://reviews.llvm.org/D58073
Speed up insertion during the initial populating phase into the
GISelWorkList by deferring repeatedly resizing the DenseMap.
This results in ~10% improvement in the combiner passes, and
~3% speedup in the Legalizer.
reviewed by: aemerson.
llvm-svn: 354093
|
| |
|
|
|
|
|
| |
This allows targets to specify the minimum alignment required for the
load/store.
llvm-svn: 354071
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Select G_BR and G_BRCOND for MIPS32.
Unconditional branch G_BR does not have register operand,
for that reason we only add tests.
Since conditional branch G_BRCOND compares register to zero on MIPS32,
explicit extension must be performed on i1 condition in order to set
high bits to appropriate value.
Differential Revision: https://reviews.llvm.org/D58182
llvm-svn: 354022
|
| |
|
|
|
|
|
|
|
| |
While rebasing a refactor in r353950 I accidentally swapped two function
arguments; one is SelectionDAGBuilders "current" DebugLoc, the other is the one
from the "current" debug intrinsic. They're probably always identical, but I
haven't proved that yet.
llvm-svn: 354019
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The declarative tablegen definitions split rules into match and apply steps.
Prepare for that by doing the same in the C++ implementations. This aids
some of the migration effort while the tablegen version is incomplete.
Reviewers: bogner, volkan, aditya_nandakumar, paquette, aemerson
Reviewed By: aditya_nandakumar
Subscribers: rovka, kristof.beyls, Petar.Avramovic, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58150
llvm-svn: 353996
|
| |
|
|
|
|
|
|
| |
interface [NFC]
For D57601, we need to know whether the instruction is volatile. We'd either have to pass yet another parameter, or just standardize on the MMO interface. I chose the second.
llvm-svn: 353989
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Last chance recoloring inserts into FixedRegisters those virtual
registers it is attempting to assign a physical register to.
We must consider these when we consider candidates for eviction so that
we do not end up evicting something while we are attempting to recolor
to assign it.
This is hitting in an out-of-tree target and no longer reproduces on
trunk. That does not appear to be a result of it having been fixed, but
rather, it appears that optimization changes and/or other changes to
register allocation mask the problem.
I haven't found a way to come up with a reasonable test case for this
(i.e. one that I can actually commit to open source, is reasonable
in size, and actually reproduces the issue).
rdar://problem/45708741
llvm-svn: 353988
|
| |
|
|
|
|
| |
The helper function was used by only two callers, and largely ended up providing distinct functionality based on optional arguments and opcode. Inline and simply to make the functionality much more clear.
llvm-svn: 353977
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In this patch SelectionDAG tries to salvage any dbg.values that are going to be
dropped, in case they can be recovered from Values in the current BB. It also
strengthens SelectionDAGs handling of dangling debug data, so that dbg.values
are *always* emitted (as Undef or otherwise) instead of dangling forever.
The motivation behind this patch exists in the new test case: a memory address
(here a bitcast and GEP) exist in one basic block, and a dbg.value referring to
the address is left in the 'next' block. The base pointer is live across all
basic blocks. In current llvm trunk the dbg.value cannot be encoded, and it
isn't even emitted as an Undef DBG_VALUE.
The change is simply: if we're definitely going to drop a dbg.value, repeatedly
apply salvageDebugInfo to its operand until either we find something that can
be encoded, or we can't salvage any further in which case we produce an Undef
DBG_VALUE. To know when we're "definitely going to drop a dbg.value",
SelectionDAG signals SelectionDAGBuilder when all IR instructions have been
encoded to force salvaging. This ensures that any dbg.value that's dangling
after DAG creation will have a corresponding DBG_VALUE encoded.
Differential Revision: https://reviews.llvm.org/D57694
llvm-svn: 353954
|