| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
| |
into account when determining the shift amount VT.
If the target's preferred shift amount VT can't hold any shift
amount for the promoted VT, we should use i32. The specific shift
amount shouldn't matter. The type will be adjusted later when the
shift itself is type legalized. This avoids an assert in getNode.
Fixes PR43820.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
[0|1], [EQ|NE] -> SETCC" combine.
This combine is only valid if the inner setcc produces a 0/1 result
or the inner type is MVT::i1.
I haven't seen this cause any issues, just happened to notice it
while reviewing combines in this function.
While there also fix another call to use the value type from the
SDValue for the operand instead of calling SDNode::getValueType(0).
Though its likely the use is result 0, its not guaranteed.
|
| |
|
|
|
|
|
|
|
|
|
| |
This makes the DAG behavior consistent with IR's extractelement after:
rGb32e4664a715
https://bugs.llvm.org/show_bug.cgi?id=42689
I've tried to maintain test intent for WebAssembly.
The AMDGPU test is trying to test for crashing or other bad behavior,
but I'm not sure if that's possible after this change.
|
| |
|
|
|
| |
Reverting commit b85b4e5a6f8579c137fecb59a4d75d7bfb111f79 due to some
buildbot failures/ out of memory errors.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zext (ctpop X) --> ctpop (zext X)
This is a prerequisite step for canonicalizing in the other direction (narrow the popcount) in IR - PR43688:
https://bugs.llvm.org/show_bug.cgi?id=43688
I'm not sure if any other targets are affected, but I found a missing fold for PPC, so added tests based on that.
The reason we widen all the way to 64-bit in these tests is because the initial DAG looks something like this:
t5: i8 = ctpop t4
t6: i32 = zero_extend t5 <-- created based on IR, but unused node?
t7: i64 = zero_extend t5
Differential Revision: https://reviews.llvm.org/D69127
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add instruction marker to MachineInstr ExtraInfo. This does almost the
same thing as Pre/PostInstrSymbols, except that it doesn't create a label until
printing instructions. This allows for labels to be put around instructions that
are deleted/duplicated somewhere.
Also undo the workaround in r375137.
Reviewers: rnk
Subscribers: MatzeB, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69136
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Ternary expression checks for ISD::ADD instead of ISD::UADDO inside DAGTypeLegalizer::ExpandIntRes_UADDSUBO.
This means the ternary expression will evaluate to ISD::SUBCARRY for both ISD::UADDO and ISD::USUBO nodes.
Targets are likely to implement both, so impact will be very limited in practice.
Reviewers: bogner, lebedev.ri
Reviewed By: lebedev.ri
Subscribers: lebedev.ri, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68123
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
'-c' can be folded into the x node."
This broke various Windows builds, see comments on the Phabricator
review.
This also reverts the follow-up 20bf0cf.
> Summary:
> This fold, helps recover from the rest of the D62266 ARM regressions.
> https://rise4fun.com/Alive/TvpC
>
> Note that while the fold is quite flexible, i've restricted it
> to the single interesting pattern at the moment.
>
> Reviewers: efriedma, craig.topper, spatel, RKSimon, deadalnix
>
> Reviewed By: deadalnix
>
> Subscribers: javed.absar, kristof.beyls, llvm-commits
>
> Tags: #llvm
>
> Differential Revision: https://reviews.llvm.org/D62450
|
| |
|
|
|
|
|
|
|
|
|
|
| |
checks (PR43769)
We should do the fold only if both constants are plain,
non-opaque constants, at least that is the DAG.FoldConstantArithmetic()
requirement.
And if the constant we are comparing with is zero - we shouldn't be
trying to do this fold in the first place.
Fixes https://bugs.llvm.org/show_bug.cgi?id=43769
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
into the x node.
Summary:
This fold, helps recover from the rest of the D62266 ARM regressions.
https://rise4fun.com/Alive/TvpC
Note that while the fold is quite flexible, i've restricted it
to the single interesting pattern at the moment.
Reviewers: efriedma, craig.topper, spatel, RKSimon, deadalnix
Reviewed By: deadalnix
Subscribers: javed.absar, kristof.beyls, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62450
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69248
llvm-svn: 375410
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TargetLowering::getABIAlignmentForCallingConv
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69243
llvm-svn: 375407
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(1/2)
Provides a TLI hook to allow targets to relax the emission of shifts, thus enabling
codegen improvements on targets with no multiple shift instructions and cheap selects
or branches.
Contributes to a Fix for PR43559:
https://bugs.llvm.org/show_bug.cgi?id=43559
Patch by: @joanlluch (Joan LLuch)
Differential Revision: https://reviews.llvm.org/D69116
llvm-svn: 375347
|
| |
|
|
|
|
| |
Now X86ISelLowering doesn't depend on many IR analyses.
llvm-svn: 375320
|
| |
|
|
|
|
| |
Only forward declarations are needed here. Follow-on to r375311.
llvm-svn: 375319
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds a new ISD node to replicate a scalar value across all elements of
a vector. This is needed for scalable vectors, since BUILD_VECTOR cannot
be used.
Fixes up default type legalization for scalable vectors after the
new MVT type ranges were introduced.
At present I only use this node for scalable vectors. A DAGCombine has
been added to transform a BUILD_VECTOR into a SPLAT_VECTOR if all
elements are the same, but only if the default operation action of
Expand has been overridden by the target.
I've only added result promotion legalization for scalable vector
i8/i16/i32/i64 types in AArch64 for now.
Reviewers: t.p.northover, javed.absar, greened, cameron.mcinally, jmolloy
Reviewed By: jmolloy
Differential Revision: https://reviews.llvm.org/D47775
llvm-svn: 375222
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The default promotion for the add_sat/sub_sat nodes currently does:
ANY_EXTEND iN to iM
SHL by M-N
[US][ADD|SUB]SAT
L/ASHR by M-N
If the promoted add_sat or sub_sat node is not legal, this can produce code
that effectively does a lot of shifting (and requiring large constants to be
materialised) just to use the overflow flag. It is simpler to just do the
saturation manually, using the higher bitwidth addition and a min/max against
the saturating bounds. That is what this patch attempts to do.
Differential Revision: https://reviews.llvm.org/D68926
llvm-svn: 375211
|
| |
|
|
|
|
|
|
|
|
|
| |
Add generic DAG combine for extending masked loads.
Allow us to generate sext/zext masked loads which can access v4i8,
v8i8 and v4i16 memory to produce v4i32, v8i16 and v4i32 respectively.
Differential Revision: https://reviews.llvm.org/D68337
llvm-svn: 375085
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Similar to r374970, but I don't have a test for this.
PromoteTargetBoolean is intended to be use for legalizing an
operand that needs to be promoted. It picks its type based on
the return from getSetccResultType and is intended to be used
when we have freedom to pick the new type. But the return type
we need for WidenVecOp_SETCC is completely determined by the
type of the input node.
llvm-svn: 374972
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PromoteTargetBoolean calls getSetccResultType to get the return
type. But we were passing it the setcc result type rather than the
setcc input type. This causes an issue on X86 with avx512vl where
the setcc result type for vXf16 vectors is vXi16 while the
result type for vXi16 vectors is vXi1.
There's really no guarantee that getSetccResultType is the type
we need here. So now we just grab the extend type from
getExtendForContent and extend to the original result VT of the
node we're splitting.
llvm-svn: 374970
|
| |
|
|
| |
llvm-svn: 374922
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Examples:
i32 X > -1 ? C1 : -1 --> (X >>s 31) | C1
i8 X < 0 ? C1 : 0 --> (X >>s 7) & C1
This is a small generalization of a fold requested in PR43650:
https://bugs.llvm.org/show_bug.cgi?id=43650
The sign-bit of the condition operand can be used as a mask for the true operand:
https://rise4fun.com/Alive/paT
Note that we already handle some of the patterns (isNegative + scalar) because
there's an over-specialized, yet over-reaching fold for that in foldSelectCCToShiftAnd().
It doesn't use any TLI hooks, so I can't easily rip out that code even though we're
duplicating part of it here. This fold is guarded by TLI.convertSelectOfConstantsToMath(),
so it should not cause problems for targets that prefer select over shift.
Also worth noting: I thought we could generalize this further to include the case where
the true operand of the select is not constant, but Alive says that may allow poison to
pass through where it does not in the original select form of the code.
Differential Revision: https://reviews.llvm.org/D68949
llvm-svn: 374902
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch kills off a significant user of the "IsIndirect" field of
DBG_VALUE machine insts. Brought up in in PR41675, IsIndirect is
techncally redundant as it can be expressed by the DIExpression of a
DBG_VALUE inst, and it isn't helpful to have two ways of expressing
things.
Rather than setting IsIndirect, have DBG_VALUE creators add an extra deref
to the insts DIExpression. There should now be no appearences of
IsIndirect=True from isel down to LiveDebugVariables / VirtRegRewriter,
which is ensured by an assertion in LDVImpl::handleDebugValue. This means
we also get to delete the IsIndirect handling in LiveDebugVariables. Tests
can be upgraded by for example swapping the following IsIndirect=True
DBG_VALUE:
DBG_VALUE $somereg, 0, !123, !DIExpression(DW_OP_foo)
With one where the indirection is in the DIExpression, by _appending_
a deref:
DBG_VALUE $somereg, $noreg, !123, !DIExpression(DW_OP_foo, DW_OP_deref)
Which both mean the same thing.
Most of the test changes in this patch are updates of that form; also some
changes in how the textual assembly printer handles these insts.
Differential Revision: https://reviews.llvm.org/D68945
llvm-svn: 374877
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a pass to lower is.constant and objectsize intrinsics
This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.
The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.
Differential Revision: https://reviews.llvm.org/D65280
llvm-svn: 374784
|
| |
|
|
|
|
|
| |
This reverts commit r374743. It broke the build with Ocaml enabled:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19218
llvm-svn: 374768
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.
The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.
Differential Revision: https://reviews.llvm.org/D65280
llvm-svn: 374743
|
| |
|
|
|
|
|
|
|
| |
subs
This commit is not extending the promoted integers as it should. Reverting
whilst I look into the details.
llvm-svn: 374592
|
| |
|
|
|
|
|
|
|
|
| |
The diffs suggest that we are missing some more basic
analysis/transforms, but this keeps the vector path in
sync with the scalar (rL374397). This is again a
preliminary step for introducing the reverse transform
in IR as proposed in D63382.
llvm-svn: 374555
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverses the scalar canonicalization proposed in D63382.
Pre: isPowerOf2(C1)
%r = select i1 %cond, i32 C1, i32 0
=>
%z = zext i1 %cond to i32
%r = shl i32 %z, log2(C1)
https://rise4fun.com/Alive/Z50
x86 already tries to fold this pattern, but it isn't done
uniformly, so we still see a diff. AArch64 probably should
enable the TLI hook to benefit too, but that's a follow-on.
llvm-svn: 374397
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The default promotion for the add_sat/sub_sat nodes currently does:
1. ANY_EXTEND iN to iM
2. SHL by M-N
3. [US][ADD|SUB]SAT
4. L/ASHR by M-N
If the promoted add_sat or sub_sat node is not legal, this can produce code
that effectively does a lot of shifting (and requiring large constants to be
materialised) just to use the overflow flag. It is simpler to just do the
saturation manually, using the higher bitwidth addition and a min/max against
the saturating bounds. That is what this patch attempts to do.
Differential Revision: https://reviews.llvm.org/D68643
llvm-svn: 374373
|
| |
|
|
| |
llvm-svn: 374370
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: It ensures that the bswap is generated even when a part of the subtree already matches a bswap transform.
Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68250
llvm-svn: 374340
|
| |
|
|
|
|
|
|
|
|
|
|
| |
As background, starting in D66309, I'm working on support unordered atomics analogous to volatile flags on normal LoadSDNode/StoreSDNodes for X86.
As part of that, I spent some time going through usages of LoadSDNode and StoreSDNode looking for cases where we might have missed a volatility check or need an atomic check. I couldn't find any cases that clearly miscompile - i.e. no test cases - but a couple of pieces in code loop suspicious though I can't figure out how to exercise them.
This patch adds defensive checks and asserts in the places my manual audit found. If anyone has any ideas on how to either a) disprove any of the checks, or b) hit the bug they might be fixing, I welcome suggestions.
Differential Revision: https://reviews.llvm.org/D68419
llvm-svn: 374261
|
| |
|
|
|
|
|
|
| |
Add own version of the mathematical constants from the upcoming C++20 `std::numbers`.
Differential revision: https://reviews.llvm.org/D68257
llvm-svn: 374207
|
| |
|
|
|
|
|
|
|
|
|
| |
Earlier in the year intrinsics for lrint, llrint, lround and llround were
added to llvm. The constrained versions are now implemented here.
Reviewed by: andrew.w.kaylor, craig.topper, cameron.mcinally
Approved by: craig.topper
Differential Revision: https://reviews.llvm.org/D64746
llvm-svn: 373900
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
load (PR43217)
If a fp scalar is loaded and then used as both a scalar and a vector broadcast, perform the load as a broadcast and then extract the scalar for 'free' from the 0th element.
This involved switching the order of the X86ISD::BROADCAST combines so we only convert to X86ISD::BROADCAST_LOAD once all other canonicalizations have been attempted.
Adds a DAGCombinerInfo::recursivelyDeleteUnusedNodes wrapper.
Fixes PR43217
Differential Revision: https://reviews.llvm.org/D68544
llvm-svn: 373871
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
split a setcc condition if the setcc input is legal and vXi1 conditions are supported
Summary: The VSELECT splitting code tries to split a setcc input as well. But on avx512 where mask registers are well supported it should be better to just split the mask and use a single compare.
Reviewers: RKSimon, spatel, efriedma
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68359
llvm-svn: 373863
|
| |
|
|
|
|
|
|
| |
This reverts r373850 (git commit 25ba49824d2d4f2347b4a7cb1623600a76ce9433)
This patch appears to cause multiple codegen regression test failures - http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/10680
llvm-svn: 373853
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: It ensures that the bswap is generated even when a part of the subtree already matches a bswap transform.
Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68250
llvm-svn: 373850
|
| |
|
|
|
|
| |
Fixes PR43575.
llvm-svn: 373836
|
| |
|
|
|
|
|
|
|
|
| |
This is an omission in rL371441. Loads which happened to be unordered weren't being added to the PendingLoad set, and thus weren't be ordered w/respect to side effects which followed before the end of the block.
Included test case is how I spotted this. We had an atomic load being folded into a using instruction after a fence that load was supposed to be ordered with. I'm sure it showed up a bunch of other ways as well.
Spotted via manual inspecting of assembly differences in a corpus w/and w/o the new experimental mode. Finding this with testing would have been "unpleasant".
llvm-svn: 373814
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
InstrEmitter's virtual register handling assumes that clones are emitted
after the cloned node. Make sure this assumption actually holds.
Fixes a "Node emitted out of order - early" assertion on the testcase.
This is probably a very rare case to actually hit in practice; even
without the explicit edge, the scheduler will usually end up scheduling
the nodes in the expected order due to other constraints.
Differential Revision: https://reviews.llvm.org/D68068
llvm-svn: 373782
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
As discussed on llvm-dev and:
https://bugs.llvm.org/show_bug.cgi?id=43542
...we have transforms that assume shift operations are legal and transforms to
use them are profitable, but that may not hold for simple targets.
In this case, the MSP430 target custom lowers shifts by repeating (many)
simpler/fixed ops. That can be avoided by keeping this code as setcc/select.
Differential Revision: https://reviews.llvm.org/D68397
llvm-svn: 373666
|
| |
|
|
|
|
|
|
| |
SplitVecRes_SETCC in SplitRes_SELECT.
No point in manually splitting the SETCC if it was already done.
llvm-svn: 373535
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
default is unreachable (PR43129)"
This was reverted in r373454 due to breaking the expensive-checks bot.
This version addresses that by omitting the addSuccessorWithProb() call
when omitting the range check.
> Switch lowering: omit range check for bit tests when default is unreachable (PR43129)
>
> This is modeled after the same functionality for jump tables, which was
> added in r357067.
>
> Differential revision: https://reviews.llvm.org/D68131
llvm-svn: 373477
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
is unreachable (PR43129)"
This broke http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/19967
> Switch lowering: omit range check for bit tests when default is unreachable (PR43129)
>
> This is modeled after the same functionality for jump tables, which was
> added in r357067.
>
> Differential revision: https://reviews.llvm.org/D68131
llvm-svn: 373454
|
| |
|
|
|
|
|
|
|
|
|
| |
(PR43129)
This is modeled after the same functionality for jump tables, which was
added in r357067.
Differential revision: https://reviews.llvm.org/D68131
llvm-svn: 373431
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(PR42863)
This patch converts the DAGCombine isNegatibleForFree/GetNegatedExpression into overridable TLI hooks.
The intention is to let us extend existing FNEG combines to work more generally with negatible float ops, allowing it work with target specific combines and opcodes (e.g. X86's FMA variants).
Unlike the SimplifyDemandedBits, we can't just handle target nodes through a Target callback, we need to do this as an override to allow targets to handle generic opcodes as well. This does mean that the target implementations has to duplicate some checks (recursion depth etc.).
Partial reversion of rL372756 - I've identified the infinite loop issue inside the X86 override but haven't fixed it yet so I've only (re)committed the common TargetLowering refactoring part of the patch.
Differential Revision: https://reviews.llvm.org/D67557
llvm-svn: 373343
|
| |
|
|
|
|
|
|
|
|
|
| |
Replace with the MachineFunction. X86 is the only user, and only uses
it for the function. This removes one obstacle from using this in
GlobalISel. The other is the more tolerable EVT argument.
The X86 use of the function seems questionable to me. It checks hasFP,
before frame lowering.
llvm-svn: 373292
|