| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
| |
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
llvm-svn: 369013
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
xxspltib/vspltisb are 3 cycle PM instructions,
xxleqv is 2 cycle ALU instruction.
We should use xxleqv to set all one vectors.
Reviewers: hfinkel, nemanjai, steven.zhang
Subscribers: hiraditya, kbarton, MaskRay, shchenz, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65529
llvm-svn: 369006
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to allow any alignment at least 2, not just exactly 2, so that the big
endian loads and stores can be selected successfully. I've also added extra BE
testing for the load and store tests.
Thanks to Oliver for the report.
Differential Revision: https://reviews.llvm.org/D66222
llvm-svn: 368996
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the last step in an FP add reduction allows reassociation and doesn't care
about -0.0, then we are free to recognize that computation as a reduction
that may reorder the intermediate steps.
This is requested directly by PR42705:
https://bugs.llvm.org/show_bug.cgi?id=42705
and solves PR42947 (if horizontal math instructions are actually faster than
the alternative):
https://bugs.llvm.org/show_bug.cgi?id=42947
Differential Revision: https://reviews.llvm.org/D66236
llvm-svn: 368995
|
|
|
|
|
|
|
|
|
| |
Stack loads and stores were already working, but direct stores were not. This
adds the patterns for them, same as predicate loads.
Differential Revision: https://reviews.llvm.org/D66213
llvm-svn: 368988
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch changes the location of the frame-record (FP, LR) to the
bottom of the callee-saved area. According to the AAPCS the location of
the frame-record within the stackframe is unspecified (section 5.2.3 The
Frame Pointer), so the compiler should be free to choose a different
location.
The reason for changing the location of the frame-record is to prepare
the frame for allocating an SVE area below the callee-saves. This way the
compiler can use the VL-scaled addressing modes to directly access SVE
objects from the frame-pointer.
: :
| stack | | stack |
| args | | args |
+-------+ +-------+
| x30 | | x19 |
| x29 | | x20 |
FP -> |- - - -| | x21 |
| x19 | ==> | x22 |
| x20 | |- - - -|
| x21 | | x30 |
| x22 | | x29 |
+-------+ +-------+ <- FP
|///////| |///////| // realignment gap
|- - - -| |- - - -|
|spills/| |spills/|
| locals| | locals|
SP -> +-------+ +-------+ <- SP
Things to point out:
- The algorithm to find a paired register should be prevented from
accidentally pairing some callee-saved register with LR that is not
FP, since they should always be paired together when the frame
has a frame-record.
- For Darwin platforms the location of the frame-record is unchanged,
since the unwind encoding does not allow for encoding this position
dynamically and other tools currently depend on the former layout.
Reviewers: efriedma, rovka, rengolin, thegameg, greened, t.p.northover
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D65653
llvm-svn: 368987
|
|
|
|
|
|
|
|
|
| |
This adds patterns for selecting trunc instructions from full vectors to i1's
vectors.
Differential Revision: https://reviews.llvm.org/D66201
llvm-svn: 368981
|
|
|
|
|
|
|
|
|
| |
bitcasted from x86mmx to MOVQ2DQ.
We already had the pattern for just the scalar to vector and bitcast,
but not the case where we wanted zeroes in the high half of the xmm.
llvm-svn: 368972
|
|
|
|
|
|
|
|
|
| |
pattern.
This pattern will narrow the load so we should make sure its not
volatile.
llvm-svn: 368971
|
|
|
|
|
|
|
|
|
|
| |
conversion to MMX.
fp_to_sint is turned into X86cvttp2si during isel preprocessing.
The other redundant isel patterns were removed previously, but I
missed this one because its in the MMX td file.
llvm-svn: 368968
|
|
|
|
|
|
| |
The default legalization can take care of this.
llvm-svn: 368967
|
|
|
|
|
|
|
| |
The generic legalization handles this in the same way so just use
that.
llvm-svn: 368966
|
|
|
|
| |
llvm-svn: 368965
|
|
|
|
|
|
|
|
| |
If the width is 256 bits, then we must have AVX so the else here
was unnecessary. Once that's removed then the >= 256 bit code is
identical to the 128 bit code with a different VT so combine them.
llvm-svn: 368956
|
|
|
|
|
|
|
|
|
| |
Implement this single atomic load instruction so that we can compile stack
protector code.
Differential Revision: https://reviews.llvm.org/D66245
llvm-svn: 368923
|
|
|
|
|
|
|
|
| |
I'm planning on handling intrinsics that will benefit from checking
the address space enums. Don't bother moving the address collection
for now, since those won't need th enums.
llvm-svn: 368895
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fixes PR42973. Tests don't change because simd-arith.ll tests behavior
on unimplemented-simd128, which does not include any temporary
workarounds such as the one removed in this revision.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, dmgreen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66166
llvm-svn: 368868
|
|
|
|
|
|
|
|
| |
Improves the 8 byte case from PR42674.
Differential Revision: https://reviews.llvm.org/D66069
llvm-svn: 368864
|
|
|
|
|
|
|
|
|
| |
Address one left comment on https://reviews.llvm.org/D63547. A minor
change for assertion.
Differential Revision: https://reviews.llvm.org/D63547
llvm-svn: 368860
|
|
|
|
|
|
|
|
|
|
|
|
| |
128-bit inputs
Now that we legalize by widening, the element types here won't change. Previously these were modeled as the elements being widened and then the instruction might become an AND or SHL/ASHR pair. But now they'll become something like a ZERO_EXTEND_VECTOR_INREG/SIGN_EXTEND_VECTOR_INREG.
For AVX2, when the destination type is legal its clear the cost should be 1 since we have extend instructions that can produce 256 bit vectors from less than 128 bit vectors. I'm a little less sure about AVX1 costs, but I think the ones I changed were definitely too high, but they might still be too high.
Differential Revision: https://reviews.llvm.org/D66169
llvm-svn: 368858
|
|
|
|
| |
llvm-svn: 368857
|
|
|
|
|
|
| |
Rename one to XX3Form_SameOp, remove the other one.
llvm-svn: 368856
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch adds call lowering functionality to enable passing
parameters onto floating point registers when needed.
Differential Revision: https://reviews.llvm.org/D63654
llvm-svn: 368855
|
|
|
|
|
|
|
|
| |
The destinations should be FPRs (for now).
Differential Revision: https://reviews.llvm.org/D66184
llvm-svn: 368775
|
|
|
|
|
|
|
| |
This has no effect at the moment, but might matter if we try to
implement non-temporal loads in the future.
llvm-svn: 368770
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements global address lowering for 32/64 bit with small/large code models.
1.For 32bit large code model on AIX, there are newly added pseudo opcode LWZtocL & ADDIStocHA32, the support of which on MC layer will be
provided by future patches.
2.The default code model on AIX should be small code model.
3.Since AIX does not have medium code model, "report_fatal_error" when users specify it.
Differential Revision: https://reviews.llvm.org/D63547
llvm-svn: 368744
|
|
|
|
|
|
|
|
|
| |
That change (r363670) could leave a copy from vgpr to sgpr. Fixed.
Differential Revision: https://reviews.llvm.org/D66133
Change-Id: I00c3fe6fda2e8e1e36f53195b881b1449c777ea4
llvm-svn: 368736
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The MVE architecture has the idea of "beats", where a vector instruction can be
executed over several ticks of the architecture. This adds a similar system
into the Arm backend cost model, multiplying the cost of all vector
instructions by a factor.
This factor essentially becomes the expected difference between scalar code
and vector code, on average. MVE Vector instructions can also overlap so the a
true cost of them is often lower. But equally scalar instructions can in some
situations be dual issued, or have other optimisations such as unrolling or
make use of dsp instructions. The default is chosen as 2. This should not
prevent vectorisation is a most cases (as the vector instructions will still be
doing at least 4 times the work), but it will help prevent over vectorising in
cases where the benefits are less likely.
This adds things so far to the obvious places in ARMTargetTransformInfo, and
updates a few related costs like not treating float instructions as cost 2 just
because they are floats.
Differential Revision: https://reviews.llvm.org/D66005
llvm-svn: 368733
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
While D65962 is pending for review, I landed D65475 that added one more
use of `unsigned`. Changed it to `Register`.
Reviewers: dsanders
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66064
llvm-svn: 368727
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Trying again with the code changes (and not just the new test).
Summary:
This patch fixes the offsets of fields in the stack frame linkage save
area for AIX.
Reviewers: sfertile, hubert.reinterpretcast, jasonliu, Xiangling_L, xingxue, ZarkoCA, daltenty
Reviewed By: hubert.reinterpretcast
Subscribers: wuzish, nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64424
Patch by Chris Bowler!
llvm-svn: 368721
|
|
|
|
|
|
| |
Odd sized vectors aren't handled yet.
llvm-svn: 368713
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D65957
llvm-svn: 368712
|
|
|
|
|
|
|
|
|
| |
The APSR is encoded by setting bit 15 in the register list of the CLRM
instruction (cf. https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf).
Differential Revision: https://reviews.llvm.org/D65873
llvm-svn: 368711
|
|
|
|
| |
llvm-svn: 368709
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently shufflemasks get emitted as any other constant, and you end
up with a bunch of virtual registers of G_CONSTANT with a
G_BUILD_VECTOR. The AArch64 selector then asserts on anything that
doesn't fit this pattern. This isn't an ideal representation, and
should avoid legalization and have fewer opportunities for a
representational error.
Rather than invent a new shuffle mask operand type, similar to what
ShuffleVectorSDNode does, just track the original IR Constant mask
operand. I don't completely like the idea of adding another link to
the IR, but MIR is already quite dependent on IR constants already,
and this will allow sharing the shuffle mask utility functions with
the IR.
llvm-svn: 368704
|
|
|
|
|
|
|
|
|
|
| |
If the target shuffle mask is from a wider type, attempt to scale the mask so that the extraction can attempt to peek through.
Fixes the regression mentioned in rL368662
Reapplying this as rL368308 had to be reverted as part of rL368660 to revert rL368276
llvm-svn: 368663
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DemandedElts mask (reapplied)
If we don't demand all elements, then attempt to combine to a simpler shuffle.
At the moment we can only do this if Depth == 0 as combineX86ShufflesRecursively uses Depth to track whether the shuffle has really changed or not - we'll need to change this before we can properly start merging combineX86ShufflesRecursively into SimplifyDemandedVectorElts.
The insertps-combine.ll regression is because XFormVExtractWithShuffleIntoLoad can't see through shuffles of different widths - this will be fixed in a follow-up commit.
Reapplying this as rL368307 had to be reverted as part of rL368660 to revert rL368276
llvm-svn: 368662
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SimplifyMultipleUseDemandedBits for ISD::EXTRACT_VECTOR_ELT"
This introduced a false positive MemorySanitizer warning about use of
uninitialized memory in a vectorized crc function in Chromium. That suggests
maybe something is not right with this transformation. See
https://crbug.com/992853#c7 for a reproducer.
This also reverts the follow-up commits r368307 and r368308 which
depended on this.
> This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract.
>
> In particular this helps remove some unnecessary scalar->vector->scalar patterns.
>
> The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen are to be refactored in the near-term and isn't considered a major issue.
>
> Differential Revision: https://reviews.llvm.org/D65887
llvm-svn: 368660
|
|
|
|
|
|
|
|
|
|
|
|
| |
The legalizer would hit an assertion on PowerPC platform when truncating
a vector whose size is not power of 2. This patch is to add a check to
prevent vectors with such odd-size elements from being custom lowered.
Reviewed By: Hal Finkel
Differential Revision: https://reviews.llvm.org/D65261
llvm-svn: 368654
|
|
|
|
|
|
| |
SrcOp. NFC.
llvm-svn: 368653
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to be maintained.
Currently we can't keep any state in the selector object that we get from
subtarget. As a result we have to plumb through all our variables through
multiple functions. This change makes it non-const and adds a virtual init()
method to allow further state to be captured for each target.
AArch64 makes use of this in this patch to cache a call to hasFnAttribute()
which is expensive to call, and is used on each selection of G_BRCOND.
Differential Revision: https://reviews.llvm.org/D65984
llvm-svn: 368652
|
|
|
|
| |
llvm-svn: 368645
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D66117
llvm-svn: 368633
|
|
|
|
|
|
|
|
|
|
|
|
| |
r367088 made it so that funclets store XMM registers into their local
frame instead of storing them to the parent frame. However, that change
forgot to update the parent frame pointer offset for catch blocks. This
change does that.
Fixes crashes when an exception is rethrown in a catch block that saves
XMMs, as described in https://crbug.com/992860.
llvm-svn: 368631
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).
Depends on D65919
Reviewers: lenary
Subscribers: jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision for full review was: https://reviews.llvm.org/D65962
llvm-svn: 368629
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).
Manual fixups in:
AArch64InstrInfo.cpp - genFusedMultiply() now takes a Register* instead of unsigned*
AArch64LoadStoreOptimizer.cpp - Ternary operator was ambiguous between Register/MCRegister. Settled on Register
Depends on D65919
Reviewers: aemerson
Subscribers: jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision for full review was: https://reviews.llvm.org/D65962
llvm-svn: 368628
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).
Reviewers: aheejin
Subscribers: jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision for whole review: https://reviews.llvm.org/D65962
llvm-svn: 368627
|
|
|
|
|
|
|
|
|
|
|
| |
This is infrastructural, will be needed for future work.
For some reason it was only used in MIMG_NoSampler, while
needed everywere we use MIMGBaseOpcode if we want to use
predicates.
Differential Revision: https://reviews.llvm.org/D66115
llvm-svn: 368626
|
|
|
|
|
|
|
|
|
|
|
| |
without AVX512BW.
We need AVX512BW to be able to truncate an i16 vector. If we don't
have that we have to extend i16->i32, then trunc, i32->i8. But we
won't be able to remove the min/max if we do that. At least not
without more special handling.
llvm-svn: 368623
|
|
|
|
|
|
|
|
| |
All three 256->128 bit cases were already handled above.
Noticed while looking at the coverage report.
llvm-svn: 368609
|