| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While updating clang tests for having clang set dso_local I noticed
that:
- There are *a lot* of tests to update.
- Many of the updates are redundant.
They are redundant because a GV is "obviously dso_local". This patch
starts formalizing that a bit by requiring that internal and private
GVs be dso_local too. Since they all are, we don't have to print
dso_local to the textual representation, making it a bit more compact
and easier to read.
llvm-svn: 322317
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D41965
llvm-svn: 322314
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When replacing a PHI the PeepholeOptimizer currently takes the register
class of the register at the first operand. This however is not correct
if this argument has a subregister index.
As there is currently no API to query the register class resulting from
applying a subregister index to all registers in a class, we can only
abort in these cases and not perform the transformation.
This changes findNextSource() to require the end of all copy chains to
not use a subregister if there is any PHI in the chain. I had to rewrite
the overly complicated inner loop there to have a good place to insert
the new check.
This fixes https://llvm.org/PR33071 (aka rdar://32262041)
Differential Revision: https://reviews.llvm.org/D40758
llvm-svn: 322313
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: efriedma, pcc
Subscribers: aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D39975
llvm-svn: 322312
|
| |
|
|
|
|
|
|
|
|
|
| |
LoadInst isn't enough; we need to include intrinsics that perform loads too.
All side-effecting intrinsics and such are already covered by the isSafe
check, so we just need to care about things that read from memory.
D41960, originally from D33179.
llvm-svn: 322311
|
| |
|
|
|
|
|
|
| |
This was causing undefined references at link time in lld.
Differential Revision: https://reviews.llvm.org/D41959
llvm-svn: 322309
|
| |
|
|
|
|
|
|
| |
sign extending the index.
We can just widen the vectors with undef and zero extend the mask.
llvm-svn: 322308
|
| |
|
|
| |
llvm-svn: 322307
|
| |
|
|
|
|
|
|
|
|
|
| |
-> (zext (truncate x))
This patch adds debug info support to the dagcombine rule (zext
(truncate x)) -> (zext (truncate x)).
Differential Revision: https://reviews.llvm.org/D41924
llvm-svn: 322304
|
| |
|
|
|
|
| |
The constants were aggregated in a reverse order.
llvm-svn: 322303
|
| |
|
|
| |
llvm-svn: 322301
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fold cases such as:
(v8i8 truncate (v8i32 extract_subvector (v16i32 sext (v16i8 V), Idx)))
->
(v8i8 extract_subvector (v16i8 V), Idx)
This can be generalized to cases where the truncate and extend do not
fully cancel each other out, but it may require querying the target
about profitability.
Reviewers: RKSimon, craig.topper, spatel, efriedma
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41927
llvm-svn: 322300
|
| |
|
|
|
|
| |
Cases were missing as observed in D41927
llvm-svn: 322297
|
| |
|
|
|
|
| |
Broken test from old attempt for folding tables - we don't peek through extract_subvector spills at all (which is why it doesn't fold), and we already have foldMemoryOperandCustom to handle insertps immediate correction anyway.
llvm-svn: 322292
|
| |
|
|
| |
llvm-svn: 322285
|
| |
|
|
|
|
|
|
|
|
|
| |
parent function
Ideally we should merge the attributes from the functions somehow, but
this is obviously an improvement over taking random attributes from the
caller which will trip up the verifier if they're nonsensical for an
unary intrinsic call.
llvm-svn: 322284
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was originally planned as the fix for:
https://bugs.llvm.org/show_bug.cgi?id=35834
...but simpler transforms handled that case, so I implemented a
lesser solution. It turns out we need to handle the case with 'not'
ops too because the real code example that we are trying to solve:
https://bugs.llvm.org/show_bug.cgi?id=35875
...has extra uses of the intermediate values, so we can't rely on
smaller canonicalizations to get us to the goal.
As with rL321672, I've tried to show every possibility in the
codegen tests because that's the simplest way to prove we're doing
the right thing in the wide variety of permutations of this pattern.
We can also show an InstCombine win because we added a fold for
this case in:
rL321998 / D41603
An Alive proof for one variant of the pattern to show that the
InstCombine and codegen results are correct:
https://rise4fun.com/Alive/vd1
Name: min3_nots
%nx = xor i8 %x, -1
%ny = xor i8 %y, -1
%nz = xor i8 %z, -1
%cmpxz = icmp slt i8 %nx, %nz
%minxz = select i1 %cmpxz, i8 %nx, i8 %nz
%cmpyz = icmp slt i8 %ny, %nz
%minyz = select i1 %cmpyz, i8 %ny, i8 %nz
%cmpyx = icmp slt i8 %y, %x
%r = select i1 %cmpyx, i8 %minxz, i8 %minyz
=>
%cmpxyz = icmp slt i8 %minxz, %ny
%r = select i1 %cmpxyz, i8 %minxz, i8 %ny
Name: min3_nots_alt
%nx = xor i8 %x, -1
%ny = xor i8 %y, -1
%nz = xor i8 %z, -1
%cmpxz = icmp slt i8 %nx, %nz
%minxz = select i1 %cmpxz, i8 %nx, i8 %nz
%cmpyz = icmp slt i8 %ny, %nz
%minyz = select i1 %cmpyz, i8 %ny, i8 %nz
%cmpyx = icmp slt i8 %y, %x
%r = select i1 %cmpyx, i8 %minxz, i8 %minyz
=>
%xz = icmp sgt i8 %x, %z
%maxxz = select i1 %xz, i8 %x, i8 %z
%xyz = icmp sgt i8 %maxxz, %y
%maxxyz = select i1 %xyz, i8 %maxxz, i8 %y
%r = xor i8 %maxxyz, -1
llvm-svn: 322283
|
| |
|
|
| |
llvm-svn: 322281
|
| |
|
|
|
|
| |
Primarily, this allows us to use the aggressive extraction mechanisms in combineExtractWithShuffle earlier and make use of UNDEF elements that may be lost during lowering.
llvm-svn: 322279
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
As RKSimon suggested in pr35820, in the case that Src is smaller in
bit-size than Indices, need to widen Src to avoid type mismatch.
Fixes pr35820
Reviewers: RKSimon, craig.topper
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41865
llvm-svn: 322272
|
| |
|
|
|
|
|
|
|
|
|
| |
necessary
Although the register scavenger can often find a spare register, an emergency
spill slot is needed to guarantee success. Reserve this slot in cases where
the function is known to have a large stack (meaning the scavenger may be
needed when forming stack addresses).
llvm-svn: 322269
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D41610
llvm-svn: 322267
|
| |
|
|
|
|
|
|
|
| |
Fail gracefully instead of crashing upon encountering
this type of relocation.
Differential revision: https://reviews.llvm.org/D41857
llvm-svn: 322266
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Patch [3/3] in a series to add operand constraint checks for SVE's predicated ADD/SUB.
Reviewers: rengolin, mcrosier, evandro, fhahn, echristo
Reviewed By: rengolin, fhahn
Subscribers: aemerson, javed.absar, tschuett, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D41447
llvm-svn: 322265
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- MSVC uses the none type for a variadic argument in CodeView
- Add a unit test
Reviewers: zturner, llvm-commits
Reviewed By: zturner
Differential Revision: https://reviews.llvm.org/D41931
llvm-svn: 322257
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This patch enables folding sin(x) / cos(x) -> tan(x), cos(x) / sin(x) -> 1 / tan(x) under -ffast-math flag
Reviewers: hfinkel, spatel
Reviewed By: spatel
Subscribers: andrew.w.kaylor, efriedma, scanon, llvm-commits
Differential Revision: https://reviews.llvm.org/D41286
llvm-svn: 322255
|
| |
|
|
|
|
|
|
| |
If the index is v2i64 we can use the scatter instruction that has v4i32/v4f32 data register, v2i64 index, and v2i1 mask. Similar was already done for gather.
Implement custom widening for v2i32 data to remove the code that reverses type legalization during lowering.
llvm-svn: 322254
|
| |
|
|
|
|
|
|
| |
Like rL321668 / rL321672, the planned optimizer change to
fix these will be in ValueTracking, but we can test the
changes cleanly here with AArch64 codegen.
llvm-svn: 322238
|
| |
|
|
|
|
|
|
|
|
|
|
| |
callframes"
Revert for now as the testcase is hitting a pre-existing verifier error
that manifest as a failure when expensive checks are enabled (or
-verify-machineinstrs) is used.
This reverts commit r322200.
llvm-svn: 322231
|
| |
|
|
| |
llvm-svn: 322225
|
| |
|
|
|
|
|
|
|
| |
Branch relaxation is needed to support branch displacements that overflow the
instruction's immediate field.
Differential Revision: https://reviews.llvm.org/D40830
llvm-svn: 322224
|
| |
|
|
|
|
|
|
| |
Make sure I really get back to the beahvior before my rewrite in r321035
which turned out not to be completely NFC as I changed the behavior for
the ios simulator environment.
llvm-svn: 322223
|
| |
|
|
|
|
|
|
|
| |
This is a prerequisite for the branch relaxation pass, and allows a number of
optimisation passes (e.g. BranchFolding and MachineBlockPlacement) to work.
Differential Revision: https://reviews.llvm.org/D40808
llvm-svn: 322222
|
| |
|
|
| |
llvm-svn: 322218
|
| |
|
|
| |
llvm-svn: 322217
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D40807
llvm-svn: 322216
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Includes support for expanding va_copy. Also adds support for using 'aligned'
registers when necessary for vararg calls, and ensure the frame pointer always
points to the bottom of the vararg spill region. This is necessary to ensure
that the saved return address and stack pointer are always available at fixed
known offsets of the frame pointer.
Differential Revision: https://reviews.llvm.org/D40805
llvm-svn: 322215
|
| |
|
|
|
|
|
|
|
|
| |
Currently we infer the scale at isel time by analyzing whether the base is a constant 0 or not. If it is we assume scale is 1, else we take it from the element size of the pass thru or stored value. This seems a little weird and I think it makes more sense to make it explicit in the DAG rather than doing tricky things in the backend.
Most of this patch is just making sure we copy the scale around everywhere.
Differential Revision: https://reviews.llvm.org/D40055
llvm-svn: 322210
|
| |
|
|
|
|
|
|
|
| |
ADRP instructions weren't being outlined because they're PC-relative and thus
fail the LR checks. This patch adds a special case for ADRPs to
getOutliningType to make sure that ADRPs can be outlined and updates the MIR
test.
llvm-svn: 322207
|
| |
|
|
|
|
|
|
| |
D41353 / D41233 are proposing to alter the shl/and canonicalization,
but I think that would just move an existing pattern-matching hole
to a different place.
llvm-svn: 322206
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Large callframes (calls with several hundreds or thousands or
parameters) could lead to situations in which the emergency spillslot is
out of range to be addressed relative to the stack pointer.
This commit forces the use of a frame pointer in the presence of large
callframes.
This commit does several things:
- Compute max callframe size at the end of instruction selection.
- Add mirFileLoaded target callback. Use it to compute the max callframe size
after loading a .mir file when the size wasn't specified in the file.
- Let TargetFrameLowering::hasFP() return true if there exists a
callframe > 255 bytes.
- Always place the emergency spillslot close to FP if we have a frame
pointer.
- Note that `useFPForScavengingIndex()` would previously return false
when a base pointer was available leading to the emergency spillslot
getting allocated late (that's the whole effect of this callback).
Which made no sense to me so I took this case out: Even though the
emergency spillslot is technically not referenced by FP in this case
we still want it allocated early.
Differential Revision: https://reviews.llvm.org/D40876
llvm-svn: 322200
|
| |
|
|
| |
llvm-svn: 322197
|
| |
|
|
|
|
| |
To be improved in a future patch
llvm-svn: 322192
|
| |
|
|
|
|
|
|
|
| |
See bug 35764: https://bugs.llvm.org/show_bug.cgi?id=35764
Differential Revision: https://reviews.llvm.org/D41614
Reviewers: vpykhtin, artem.tamazov, arsenm
llvm-svn: 322189
|
| |
|
|
| |
llvm-svn: 322182
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
After teaching InlineCost more about address spaces ()
another fault was detected in the inliner. If an argument has
the byval attribute the parameter might be copied to an alloca.
That part seems to work fine even if the argument has a different
address space than the alloca address space. However, if the
address spaces differ, then the inlined function still might
refer to the parameter using the original address space (the
inliner does not handle that situation very well).
This patch avoids the problem by simply disallowing inlining
when there are byval arguments with address space that differs
from the alloca address space.
I'm not really sure how to transform the code if we want to
get inlining for this situation. I assume that it never has
been working, and that the fixes in r321809 just exposed an
old problem.
Fault found by skatkov (Serguei Katkov). It is mentioned in
follow up comments to https://reviews.llvm.org/D40455.
Reviewers: skatkov
Reviewed By: skatkov
Subscribers: uabelho, eraman, llvm-commits, haicheng
Differential Revision: https://reviews.llvm.org/D41898
llvm-svn: 322181
|
| |
|
|
|
|
| |
Adds missing coverage for SHUFPD undef argument lowering, and also shows a missed opportunity to remove a unnecessary move compared to 02 shuffle mask.
llvm-svn: 322175
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This patch adds support for 'dup' (Scalar -> SVE) and its corresponding 'mov' alias.
Reviewers: fhahn, rengolin, evandro, echristo
Reviewed By: fhahn
Subscribers: aemerson, javed.absar, tschuett, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D41822
llvm-svn: 322172
|
| |
|
|
|
|
|
| |
G_FNEG is already handled by the TableGen'erated code. Just add a few
tests to make sure everything works as expected.
llvm-svn: 322170
|
| |
|
|
| |
llvm-svn: 322169
|