| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
| |
This patch adds vecreduce_smax, vecredude_umax, vecreduce_smin, vecreduce_umin and selection for vmaxv and minv.
Differential Revision: https://reviews.llvm.org/D66413
llvm-svn: 371827
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: spatel, asbirlea, craig.topper
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D67521
llvm-svn: 371819
|
| |
|
|
|
|
|
|
|
| |
Follow-up of rL371321 that added some more FP16 FMA patterns, and an attempt to
reduce the copy-pasting and make this more readable.
Differential Revision: https://reviews.llvm.org/D67403
llvm-svn: 371818
|
| |
|
|
|
|
|
|
| |
This was added to support fp128 on x86-64, but appears to be
unneeded now. This may be because the FR128 register class
added back then was merged with the VR128 register class later.
llvm-svn: 371815
|
| |
|
|
|
|
| |
llvm.amdgcn.else hits this.
llvm-svn: 371812
|
| |
|
|
| |
llvm-svn: 371811
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D61884
llvm-svn: 371810
|
| |
|
|
|
|
| |
This reverts commit 1c340c62058d4115d21e5fa1ce3a0d094d28c792.
llvm-svn: 371809
|
| |
|
|
| |
llvm-svn: 371808
|
| |
|
|
| |
llvm-svn: 371807
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D61884
llvm-svn: 371806
|
| |
|
|
| |
llvm-svn: 371803
|
| |
|
|
|
|
|
|
| |
dead defs".
It reveals a miscompile on Hexagon. See PR43302 for details.
llvm-svn: 371802
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unlike SelectionDAG, treat this as a normally legalizable operation.
In SelectionDAG this is supposed to only ever formed if it's legal,
but I've found that to be restricting. For AMDGPU this is contextually
legal depending on whether denormal flushing is allowed in the use
function.
Technically we currently treat the denormal mode as a subtarget
feature, so custom lowering could be avoided. However I consider this
to be a defect, and this should be contextually dependent on the
controllable rounding mode of the parent function.
llvm-svn: 371800
|
| |
|
|
| |
llvm-svn: 371798
|
| |
|
|
|
|
|
|
| |
The result integer does not need to be the same width as the input.
AMDGPU, NVPTX, and Hexagon all have patterns working around the types
matching. GlobalISel defines these as being different type indexes.
llvm-svn: 371797
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This testcase is invalid, and caught by the verifier. For the verifier
to catch it, the live interval computation needs to complete. Remove
the assert so the verifier catches this, which is less confusing.
In this testcase there is an undefined use of a subregister, and lanes
which aren't used or defined. An equivalent testcase with the
super-register shrunk to have no untouched lanes already hit this
verifier error.
llvm-svn: 371792
|
| |
|
|
|
|
|
|
|
| |
This was relying on the SGPR usable for the carry out clobber to also
be used for the input. There was no carry out on gfx9. With no carry
out clobber to worry about, so the literal can just be directly used
with a VOP2 add.
llvm-svn: 371791
|
| |
|
|
|
|
| |
Implement the TODO from D66318.
llvm-svn: 371789
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Swiftself uses a callee-saved register. We can tail call when the register used
in the caller and callee is the same.
This behaviour is equivalent to that in `TargetLowering::parametersInCSRMatch`.
Update call-translator-tail-call.ll to verify that we can do this. When we
support inline assembly, we can write a check similar to the one in the
general swiftself.ll. For now, we need to verify that we get the correct COPY
instruction after call lowering.
Differential Revision: https://reviews.llvm.org/D67511
llvm-svn: 371788
|
| |
|
|
|
|
|
|
|
|
| |
to isVolatile
This is the first sweep of generic code to add isAtomic bailouts where appropriate. The intention here is to have the switch from AtomicSDNode to LoadSDNode/StoreSDNode be close to NFC; that is, I'm not looking to allow additional optimizations at this time. That will come later. See D66309 for context.
Differential Revision: https://reviews.llvm.org/D66318
llvm-svn: 371786
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for lowering sibling calls with outgoing arguments.
e.g
```
define void @foo(i32 %a)
```
Support is ported from AArch64ISelLowering's `isEligibleForTailCallOptimization`.
The only thing that is missing is a full port of
`TargetLowering::parametersInCSRMatch`. So, if we're using swiftself,
we'll never tail call.
- Rename `analyzeCallResult` to `analyzeArgInfo`, since the function is now used
for both outgoing and incoming arguments
- Teach `OutgoingArgHandler` about tail calls. Tail calls use frame indices for
stack arguments.
- Teach `lowerFormalArguments` to set the bytes in the caller's stack argument
area. This is used later to check if the tail call's parameters will fit on
the caller's stack.
- Add `areCalleeOutgoingArgsTailCallable` to perform the eligibility check on
the callee's outgoing arguments.
For testing:
- Update call-translator-tail-call to verify that we can now tail call with
outgoing arguments, use G_FRAME_INDEX for stack arguments, and respect the
size of the caller's stack
- Remove GISel-specific check lines from speculation-hardening.ll, since GISel
now tail calls like the other selectors
- Add a GISel test line to tailcall-string-rvo.ll since we can tail call in that
test now
- Add a GISel test line to tailcall_misched_graph.ll since we tail call there
now. Add specific check lines for GISel, since the debug output from the
machine-scheduler differs with GlobalISel. The dependency still holds, but
the output comes out in a different order.
Differential Revision: https://reviews.llvm.org/D67471
llvm-svn: 371780
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
register class.
Summary:
Since the SPE4RC register class contains an identical set of registers
and an identical spill size to the GPRC class its slightly confusing
the tablegen emitter. It's preventing the GPRC_and_GPRC_NOR0 synthesized
register class from inheriting VTs and AltOrders from GPRC or GPRC_NOR0.
This is because SPE4C is found first in the super register class list
when inheriting these properties and it doesn't set the VTs or
AltOrders the same way as GPRC or GPRC_NOR0.
This patch replaces all uses of GPE4RC with GPRC and allows GPRC and
GPRC_NOR0 to contain f32.
The test changes here are because the AltOrders are being inherited
to GPRC_NOR0 now.
Found while trying to determine if getCommonSubClass needs to take
a VT argument. It was originally added to support fp128 on x86-64,
I've changed some things about that so that it might be needed
anymore. But a PowerPC test crashed without it and I think its
due to this subclass issue.
Reviewers: jhibbits, nemanjai, kbarton, hfinkel
Subscribers: wuzish, nemanjai, mehdi_amini, hiraditya, kbarton, MaskRay, dexonsmith, jsji, shchenz, steven.zhang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67513
llvm-svn: 371779
|
| |
|
|
|
|
|
|
| |
We were failing to compute trip counts (both exact and maximum) for any loop which involved a comparison against either an umin or smin. It looks like this simply got missed when we added smin/umin to SCEV. (Note: umin was submitted separately earlier today. Turned out two folks hit this at the same time.)
Differential Revision: https://reviews.llvm.org/D67514
llvm-svn: 371776
|
| |
|
|
|
|
|
|
|
|
|
| |
can exclude fp128 compares.
The X86 decision assumes the compare will produce a result in an XMM
register, but that can't happen for an fp128 compare since those
go to a libcall the returns an i32. Pass the VT so X86 can check
the type.
llvm-svn: 371775
|
| |
|
|
|
|
|
|
|
| |
Expanding the folding of `nearbyint()`, `rint()` and `trunc()` to library
functions, in addition to the current support for intrinsics.
Differential revision: https://reviews.llvm.org/D67468
llvm-svn: 371774
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
r255558.
This code was changed to accomodate fp128 being softened to itself
during type legalization on x86-64. This was done in order to create
libcalls while having fp128 as a legal type. We're now doing the
libcall creation during LegalizeDAG and the type legalization changes
to enable the old behavior have been removed. So this change to
SelectionDAGBuilder is no longer needed.
llvm-svn: 371771
|
| |
|
|
| |
llvm-svn: 371770
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for SCEVUMinExpr to getRangeRef,
similar to the support for SCEVUMaxExpr.
Reviewers: sanjoy.google, efriedma, reames, nikic
Reviewed By: sanjoy.google
Differential Revision: https://reviews.llvm.org/D67177
llvm-svn: 371768
|
| |
|
|
| |
llvm-svn: 371761
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Resolves PR38513.
Credit to @bjope for debugging this.
Reviewers: hfinkel, uabelho, bjope
Subscribers: sanjoy.google, bjope, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67417
llvm-svn: 371752
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Pass MSSAU to makeLoopInvariant in order to properly update MSSA.
Reviewers: george.burgess.iv
Subscribers: Prazek, sanjoy.google, uabelho, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67470
llvm-svn: 371748
|
| |
|
|
|
|
|
|
|
|
| |
Implement a TODO from rL371452, and handle loop invariant addresses in predicated blocks. If we can prove that the load is safe to speculate into the header, then we can avoid using a masked.load in favour of a normal load.
This is mostly about vectorization robustness. In the common case, it's generally expected that LICM/LoadStorePromotion would have eliminated such loads entirely.
Differential Revision: https://reviews.llvm.org/D67372
llvm-svn: 371745
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In MVE, as of rL371218, we are attempting to sink chains of instructions such as:
%l1 = insertelement <8 x i8> undef, i8 %l0, i32 0
%broadcast.splat26 = shufflevector <8 x i8> %l1, <8 x i8> undef, <8 x i32> zeroinitializer
In certain situations though, we can end up breaking the dominance relations of
instructions. This happens when we sink the instruction into a loop, but cannot
remove the originals. The Use is updated, which might in fact be a Use from the
second instruction to the first.
This attempts to fix that by reversing the order of instruction that are sunk,
and ensuring that we update the uses on new instructions if they have already
been sunk, not the old ones.
Differential Revision: https://reviews.llvm.org/D67366
llvm-svn: 371743
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet, JDevlieghere, alexshap, rupprecht, jhenderson
Subscribers: sdardis, nemanjai, hiraditya, kbarton, jakehehrlich, jrtc27, MaskRay, atanasyan, jsji, seiya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D67499
llvm-svn: 371742
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Folding for fma/fmuladd was added here:
rL202914
...and as seen in existing/unchanged tests, that works to propagate NaN
if it's already an input, but we should fold an fma() that creates NaN too.
From IEEE-754-2008 7.2 "Invalid Operation", there are 2 clauses that apply
to fma, so I added tests for those patterns:
c) fusedMultiplyAdd: fusedMultiplyAdd(0, ∞, c) or fusedMultiplyAdd(∞, 0, c)
unless c is a quiet NaN; if c is a quiet NaN then it is implementation
defined whether the invalid operation exception is signaled
d) addition or subtraction or fusedMultiplyAdd: magnitude subtraction of
infinities, such as: addition(+∞, −∞)
Differential Revision: https://reviews.llvm.org/D67446
llvm-svn: 371735
|
| |
|
|
|
|
|
|
| |
Select G_BRINDIRECT for MIPS32.
Differential Revision: https://reviews.llvm.org/D67441
llvm-svn: 371730
|
| |
|
|
|
|
|
|
|
|
| |
IRTranslator creates G_DYN_STACKALLOC instruction during expansion of
alloca when argument that tells number of elements to allocate on stack
is a virtual register. Use default lowering for MIPS32.
Differential Revision: https://reviews.llvm.org/D67440
llvm-svn: 371728
|
| |
|
|
|
|
|
|
|
|
| |
G_IMPLICIT_DEF is used for both integer and floating point implicit-def.
Handle G_IMPLICIT_DEF as ambiguous opcode in MipsRegisterBankInfo.
Select G_IMPLICIT_DEF for MIPS32.
Differential Revision: https://reviews.llvm.org/D67439
llvm-svn: 371727
|
| |
|
|
|
|
|
|
| |
(fneg X), (fneg Y)) -> (fdiv X, Y). NFCI.
Minor cleanup to use equivalent helper code.
llvm-svn: 371724
|
| |
|
|
|
|
|
|
| |
This is the main CodeGen patch to support the arm64_32 watchOS ABI in LLVM.
FastISel is mostly disabled for now since it would generate incorrect code for
ILP32.
llvm-svn: 371722
|
| |
|
|
|
|
|
|
|
| |
Up to now, we've decided whether to sink address calculations using GEPs or
normal arithmetic based on the useAA hook, but there are other reasons GEPs
might be preferred. So this patch splits the two questions, with a default
implementation falling back to useAA.
llvm-svn: 371721
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
I don't have a direct motivational case for this,
but it would be good to have this for completeness/symmetry.
This pattern is basically the motivational pattern from
https://bugs.llvm.org/show_bug.cgi?id=43251
but with different predicate that requires that the offset is non-zero.
The completeness bit comes from the fact that a similar pattern (offset != zero)
will be needed for https://bugs.llvm.org/show_bug.cgi?id=43259,
so it'd seem to be good to not overlook very similar patterns..
Proofs: https://rise4fun.com/Alive/21b
Also, there is something odd with `isKnownNonZero()`, if the non-zero
knowledge was specified as an assumption, it didn't pick it up (PR43267)
Reviewers: spatel, nikic, xbolva00
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67411
llvm-svn: 371718
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Current implementation of estimating divisions loses precision since it
estimates reciprocal first and does multiplication. This patch is to re-order
arithmetic operations in the last iteration in DAGCombiner to improve the
accuracy.
Reviewed By: Sanjay Patel, Jinsong Ji
Differential Revision: https://reviews.llvm.org/D66050
llvm-svn: 371713
|
| |
|
|
|
|
|
|
|
| |
This was previously used to turn fp128 operations into libcalls
on X86. This is now done through op legalization after r371672.
This restores much of this code to before r254653.
llvm-svn: 371709
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have been using -switch-inst-prof-update-wrapper-strict
set to true by default for some time. It is time to remove
the safety stuff and make SwitchInstProfUpdateWrapper
intolerant to inconsistencies in !prof branch_weights
metadata of SwitchInst.
This patch gets rid of the Invalid state of
SwitchInstProfUpdateWrapper and the option
-switch-inst-prof-update-wrapper-strict. So there is only
two states: changed and unchanged.
Reviewers: davidx, nikic, eraman, reames, chandlerc
Reviewed By: davidx
Differential Revision: https://reviews.llvm.org/D67435
llvm-svn: 371707
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
later Intel CPUs.
AVX512 instructions can cause a frequency drop on these CPUs. This
can negate the performance gains from using wider vectors. Enabling
prefer-vector-width=256 will prevent generation of zmm registers
unless explicit 512 bit operations are used in the original source
code.
I believe gcc and icc both do something similar to this by default.
Differential Revision: https://reviews.llvm.org/D67259
llvm-svn: 371694
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
stack.
First we were asserting that the ValNo of a VA was the wrong value. It doesn't actually
make a difference for us in CallLowering but fix that anyway to silence the assert.
The bigger issue was that after fixing the assert we were generating invalid MIR
because the merging/unmerging of values split across multiple registers wasn't
also implemented for memory locs. This happens when we run out of registers and
have to pass the split types like i128 -> i64 x 2 on the stack. This is do-able, but
for now just fall back.
llvm-svn: 371693
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before, we only checked the callee for swifterror. However, we should also be
checking the caller to see if it has a swifterror parameter.
Since we don't currently handle outgoing arguments, this didn't show up in the
swifterror.ll testcase.
Also, remove the swifterror checks from call-translator-tail-call.ll, since
they are covered by the existing swifterror testing. Better to have it all in
one place.
Differential Revision: https://reviews.llvm.org/D67465
llvm-svn: 371692
|
| |
|
|
|
|
|
|
|
| |
Doing the CRLF translation while writing the file defeats our
optimization to not update the file if it hasn't changed.
Fixes PR43271.
llvm-svn: 371683
|