| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
| |
This lead to errors when dumping binaries with v4 and v5 units linked
together (but could've also errored on v5 units that did/didn't use
str_offsets).
Also improves error handling and messages around invalid str_offsets
contributions.
llvm-svn: 361683
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In a few places in getInstrMapping, we check if use/def instructions for the
instruction we're mapping have floating point constraints.
We can improve this check and reduce the number of copies in GISel-compiled code
if we make a couple observations:
- For a def instruction, it only matters if the def instruction must always
output a value stored on a FPR
- For a use instruction, it only matters if the use instruction must always
only take in values stored in FPRs
This adds two new functions:
- onlyUsesFP
- onlyDefinesFP
Then we can use those when we're checking the uses/defs instead.
Without this patch, the load, unmerge, store, and select in the added test
would have unnecessary copies.
Differential Revision: https://reviews.llvm.org/D62426
llvm-svn: 361679
|
| |
|
|
|
|
|
|
|
| |
Factor it out into a function, and replace places where we had the same check
with the new function.
Differential Revision: https://reviews.llvm.org/D62421
llvm-svn: 361677
|
| |
|
|
|
|
|
|
|
| |
This adds `-parent-recurse-depth` which limits the number of parent DIEs
being dumped.
Differential revision: https://reviews.llvm.org/D62359
llvm-svn: 361671
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary:dd
This patch implements call lowering for calls without parameters
on AIX as initial support.
Reviewers: sfertile, hubert.reinterpretcast, aheejin, efriedma
Differential Revision: https://reviews.llvm.org/D61948
llvm-svn: 361669
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The fcsel and csel instructions differ in only the register banks they work on.
So, they're entirely interchangeable otherwise.
With this in mind, this does two things:
- Teach AArch64RegisterBankInfo to consider the inputs to G_SELECT as well as
the outputs.
- Teach it to choose the best register bank mapping based off the constraints
of the inputs and outputs.
The "best" in this case means the one that requires the smallest number of
copies to properly emit a fcsel/csel.
For example, if the inputs are all already going to be on FPRs, we should
emit a fcsel, even if the output is a GPR. This costs one copy to produce the
result, but saves us from copying the inputs into GPRs.
Also update the regbank-select.mir to check that we end up with the right
select instruction.
Differential Revision: https://reviews.llvm.org/D62267
llvm-svn: 361665
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
It looks like since INLINEASM_BR was created off of INLINEASM, a few
checks for INLINEASM needed to be updated to check for either case.
pr/41999
Reviewers: t.p.northover, peter.smith
Reviewed By: peter.smith
Subscribers: craig.topper, javed.absar, kristof.beyls, hiraditya, llvm-commits, peter.smith, srhines
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62402
llvm-svn: 361661
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We were observing failures for arm32 allyesconfigs of the Linux kernel
with the asm goto Clang patch, where ldr's were being generated to
offsets too far away to encode in imm12.
It looks like since INLINEASM_BR was created off of INLINEASM, a few
checks for INLINEASM needed to be updated to check for either case.
pr/41999
Link: https://github.com/ClangBuiltLinux/linux/issues/490
Reviewers: peter.smith, kristof.beyls, ostannard, rengolin, t.p.northover
Reviewed By: peter.smith
Subscribers: jyu2, javed.absar, hiraditya, llvm-commits, nathanchance, craig.topper, kees, srhines
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62400
llvm-svn: 361659
|
| |
|
|
|
|
|
| |
If some lanes weren't active on entry to the function, this could
clobber their VGPR values.
llvm-svn: 361655
|
| |
|
|
|
|
|
| |
This was skipping GetUnderlyingObject for nonprivate addresses, but an
alloca could also be found through an addrspacecast if it's flat.
llvm-svn: 361649
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
values according to the divergence.
Details: To make instruction selection really divergence driven it is necessary to assign
the correct register classes to the cross block values beforehand. For the divergent targets
same value type requires different register classes dependent on the value divergence.
Reviewers: rampitec, nhaehnle
Differential Revision: https://reviews.llvm.org/D59990
llvm-svn: 361644
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For the situation, where we generate the following code:
crxor 8, 8, 8
< Some instructions>
.LBB0_1:
< Some instructions>
cror 1, 8, 8
cror (COPY of CRbit) depends on the result of the crxor instruction.
CR8 is known to be zero as crxor is equivalent to CRUNSET. We can simply use
crxor 1, 1, 1 instead to zero out CR1, which does not have any dependency on
any previous instruction.
This patch will optimize it to:
< Some instructions>
.LBB0_1:
< Some instructions>
cror 1, 1, 1
Patch By: Victor Huang (NeHuang)
Differential Revision: https://reviews.llvm.org/D62044
llvm-svn: 361632
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Patch adds support for the SVE2 character match instructions MATCH and NMATCH.
The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D62206
llvm-svn: 361627
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Patch adds support for the following instructions:
SVE2 bitwise shift right narrow:
* SQSHRUNB, SQSHRUNT, SQRSHRUNB, SQRSHRUNT, SHRNB, SHRNT, RSHRNB, RSHRNT,
SQSHRNB, SQSHRNT, SQRSHRNB, SQRSHRNT, UQSHRNB, UQSHRNT, UQRSHRNB,
UQRSHRNT
SVE2 integer add/subtract narrow high part:
* ADDHNB, ADDHNT, RADDHNB, RADDHNT, SUBHNB, SUBHNT, RSUBHNB, RSUBHNT
SVE2 saturating extract narrow:
* SQXTNB, SQXTNT, UQXTNB, UQXTNT, SQXTUNB, SQXTUNT
The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D62205
llvm-svn: 361624
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Patch adds support for the following instructions:
SVE2 bitwise shift and insert:
* SRI, SLI
SVE2 bitwise shift right and accumulate:
* SSRA, USRA, SRSRA, URSRA
SVE2 complex integer add:
* CADD, SQCADD
SVE2 integer absolute difference and accumulate:
* SABA, UABA
SVE2 integer absolute difference and accumulate long:
* SABALB, SABALT, UABALB, UABALT
SVE2 integer add/subtract long with carry:
* ADCLB, ADCLT, SBCLB, SBCLT
The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D62204
llvm-svn: 361622
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds the overridable TargetLowering::getTargetConstantFromLoad function which allows targets to return any constant value loaded by a LoadSDNode node - only X86 makes use of this so far but everything should be in place for other targets.
computeKnownBits then uses this function to improve codegen, notably vector code after legalization.
A future commit will do the same for ComputeNumSignBits but computeKnownBits sees the bigger benefit.
This required a couple of fixes:
* SimplifyDemandedBits must early-out for getTargetConstantFromLoad cases to prevent infinite loops of constant regeneration (similar to what we already do for BUILD_VECTOR).
* Fix a DAGCombiner::visitTRUNCATE issue as we had trunc(shl(v8i32),v8i16) <-> shl(trunc(v8i16),v8i32) infinite loops after legalization on AVX512 targets.
Differential Revision: https://reviews.llvm.org/D61887
llvm-svn: 361620
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch adds support for the polynomial multiplication instructions
PMULLB/PMULLT. The 64-bit source and 128-bit destination element
variants are enabled with crypto extensions (+sve2-aes), similar to the
NEON PMULL2 instruction. All other variants are enabled with +sve2.
The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D62145
llvm-svn: 361619
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Patch adds support for the following instructions:
SVE2 integer add/subtract long:
* SADDLB, SADDLT, UADDLB, UADDLT, SSUBLB, SSUBLT, USUBLB, USUBLT,
SABDLB, SABDLT, UABDLB, UABDLT
SVE2 integer add/subtract wide:
* SADDWB, SADDWT, UADDWB, UADDWT, SSUBWB, SSUBWT, USUBWB, USUBWT
The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D62142
llvm-svn: 361615
|
| |
|
|
|
|
|
|
|
| |
Just a minor refactoring to use the new helper method
DataLayout::typeSizeEqualsStoreSize(). This is done when
checking if getTypeSizeInBits is equal/non-equal to
getTypeStoreSizeInBits.
llvm-svn: 361613
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch adds support for the SVE2 saturating/rounding bitwise shift
left (predicated) group of instructions:
* SRSHL, URSHL, SRSHLR, URSHLR, SQSHL, UQSHL, SQRSHL, UQRSHL,
SQSHLR, UQSHLR, SQRSHLR, UQRSHLR
Immediate forms of the SQSHL and UQSHL instructions are also added to
the existing SVE bitwise shift by immediate (predicated) group, as well
as three new instructions SRSHR/URSHR/SQSHLU. The new instructions in
this group are encoded similarly and are implemented using the same
TableGen class with a minimal change (1 bit in encoding).
The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D62140
llvm-svn: 361612
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Patch adds support for the following instructions:
* SQADD, UQADD, SUQADD, USQADD
* SQSUB, UQSUB, SQSUBR, UQSUBR
The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest
Reviewed By: chill
Differential Revision: https://reviews.llvm.org/D62130
llvm-svn: 361611
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change relaxes the checks for hasOnlyUniformBranches such that our
region is uniform if:
1. All conditional branches that are direct children are uniform.
2. And either:
a. All sub-regions are uniform.
b. There is one or less conditional branches among the direct
children.
Differential Revision: https://reviews.llvm.org/D62198
llvm-svn: 361610
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Bit 20 in sve2_int_arith_pred TableGen class was overlapping. The
encodings are not affected as bit 20 is defined by the opc bits
and this was overwriting the earlier error of setting bit 20 to 0.
Raised by Momchil: https://reviews.llvm.org/D62130
Reviewed By: chill
Differential Revision: https://reviews.llvm.org/D62292
llvm-svn: 361609
|
| |
|
|
|
|
|
|
| |
swifterror marks an argument as a register pretending to be a pointer, so we
need a guaranteed mem2reg-like analysis of its uses. Fortunately most of the
infrastructure can be reused from the DAG world.
llvm-svn: 361608
|
| |
|
|
| |
llvm-svn: 361607
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The D45316 introduced the `shouldTransformMulToShiftsAddsSubs` function
to check that breaking down constant multiplications into a series
of shifts, adds, and subs is efficient. Unfortunately, this function
does not check maximum number of steps on all paths of the algorithm.
This patch fixes this bug.
Fix for PR41929.
Differential Revision: https://reviews.llvm.org/D62166
llvm-svn: 361606
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The DeadStoreElimination pass now skips doing
PartialStoreMerging when stores overlap according to
OW_PartialEarlierWithFullLater and at least one of
the stores is having a store size that is different
from the size of the type being stored.
This solves problems seen in
https://bugs.llvm.org/show_bug.cgi?id=41949
for which we in the past could end up with
mis-compiles or assertions.
The content and location of the padding bits is not
formally described (or undefined) in the LangRef
at the moment. So the solution is chosen based on
that we cannot assume anything about the padding bits
when having a store that clobbers more memory than
indicated by the type of the value that is stored
(such as storing an i6 using an 8-bit store instruction).
Fixes: https://bugs.llvm.org/show_bug.cgi?id=41949
Reviewers: spatel, efriedma, fhahn
Reviewed By: efriedma
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62250
llvm-svn: 361605
|
| |
|
|
|
|
|
|
|
|
|
| |
This pass wasn't printing any messages at all, which I find really inconvenient
while debugging/tracing things. It now dumps the before and after of expanded
instructions. It doesn't do this yet for all instructions, but this is a good
start I guess.
Differential Revision: https://reviews.llvm.org/D62297
llvm-svn: 361604
|
| |
|
|
|
|
|
|
|
|
| |
When we are scheduling the load and addi, if all other heuristic didn't take effect,
we will try to schedule the addi before the load, to hide the latency, and avoid the
true dependency added by RA. And this only take effects for Power9.
Differential Revision: https://reviews.llvm.org/D61930
llvm-svn: 361600
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces a wrapper class that re-implements
several mutator methods of SwitchInst to handle changes
of prof branch_weights metadata along with remove/add
switch case methods.
Subsequent patches will use this wrapper to implement
prof branch_weights metadata handling for SwitchInst.
Reviewers: davidx, eraman, reames, chandlerc
Reviewed By: davidx
Differential Revision: https://reviews.llvm.org/D62122
llvm-svn: 361596
|
| |
|
|
|
|
|
|
|
| |
DWARF64 str_offsets header
Rather than trying one and then the other - use the kind of the CU to
select which kind of header to parse.
llvm-svn: 361589
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
On Windows, X8 may be used to pass in the address of an aggregate that
is returned indirectly. Therefore, it should be forwarded to variadic
musttail calls and preserved in thunks.
Fixes PR41997
Reviewers: mgrang, efriedma
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62344
llvm-svn: 361585
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Constant stores of f32 values can create such NvCast nodes.
Reviewers: t.p.northover
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62285
llvm-svn: 361584
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This test case was incorrect because it mixed DWARF32 and DWARF64 for a
single unit (DWARF32 unit referencing a DWARF64 str_offsets section). So
fix enough of the unit parsing for DWARF64 and make the test valid.
(not sure if anyone needs DWARF64 support though - support in
libDebugInfoDWARF has been added piecemeal and LLVM doesn't produce it
at all)
llvm-svn: 361582
|
| |
|
|
|
|
|
| |
It regresses https://bugs.llvm.org/show_bug.cgi?id=38309 (represented
by the testcase test/Transforms/GlobalOpt/globalsra-multigep.ll).
llvm-svn: 361581
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: These were previously causing ISel failures.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62354
llvm-svn: 361577
|
| |
|
|
|
|
|
|
| |
This was partly handled in InstCombine (only the constant
index case), so delete that and zap it more generally in
InstSimplify.
llvm-svn: 361576
|
| |
|
|
|
|
|
|
| |
The out-of-bounds index pattern is handled by InstSimplify,
so the extractelement should be eliminated next time it is
visited.
llvm-svn: 361570
|
| |
|
|
|
|
| |
The out-of-bounds index pattern is handled by InstSimplify.
llvm-svn: 361569
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Mirror tuning option from old pass manager in new pass manager.
Reviewers: chandlerc
Subscribers: mehdi_amini, jlebar, zzheng, dmgreen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61612
llvm-svn: 361560
|
| |
|
|
|
|
|
|
| |
This was part of InstCombine, but it's better placed in
InstSimplify. InstCombine also had an unreachable but weaker
fold for insertelement with undef index, so that is deleted.
llvm-svn: 361559
|
| |
|
|
|
|
|
|
| |
bounds, step, induction variable, and guard branch.
This reverts r361517 (git commit 2049e4dd8f61100f88f14db33bd95d197bcbfbbc)
llvm-svn: 361553
|
| |
|
|
|
|
|
|
|
|
| |
This is no-functional-change-intended currently because the definition
of isBinOp() only includes opcodes that produce 1 value. But if we
share that implementation with isCommutativeBinOp() as proposed in
D62191, then we need to make sure that the callers bail out for
opcodes that they are not prepared to handle correctly.
llvm-svn: 361547
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were assuming a much larger possible per-wave visible stack
allocation than is possible:
https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/faa3ae51388517353afcdaf9c16621f879ef0a59/src/core/runtime/amd_gpu_agent.cpp#L70
Based on this, we can assume the high 15 bits of a frame index or sret
are 0. The frame index value is the per-lane offset, so the maximum
frame index value is MAX_WAVE_SCRATCH / wavesize.
Remove the corresponding subtarget feature and option that made
this configurable.
llvm-svn: 361541
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Mirror tuning option from old pass manager in new pass manager.
Reviewers: chandlerc
Subscribers: jlebar, dmgreen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61618
llvm-svn: 361540
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The refactoring in r360276 moved the `RunSLPVectorization` flag and added the default explicitly. The default should have been `false`, as before.
The new pass manager used to have SLPVectorization on by default, now it's off in opt, and needs D61617 checked in to enable it in clang.
Reviewers: chandlerc
Subscribers: mehdi_amini, jlebar, eraman, steven_wu, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61955
llvm-svn: 361537
|
| |
|
|
|
|
|
|
|
|
|
| |
This is reduced from a fuzzer test:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14890
Usually, demanded elements should be able to simplify shuffle
mask elements that are pointing to undef elements of its source
operands, but that doesn't happen in the test case.
llvm-svn: 361533
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
debug info"
Fixes https://bugs.llvm.org/show_bug.cgi?id=40969
The functions findPotentiallyBlockedCopies and buildCopy are currently not
accounting for the presence of debug instructions. In the former this results
in the optimization not being trigerred, and in the latter results in
inconsistent codegen.
This patch enables the optimization to be performed in a debug build and
ensures the codegen is consistent with non-debug builds.
Patch by Chris Dawson.
Differential Revision: https://reviews.llvm.org/D61680
llvm-svn: 361527
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61037
llvm-svn: 361526
|
| |
|
|
| |
llvm-svn: 361519
|