| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a long-awaited follow-up suggested in D33578. Since then, we've picked up even more
opportunities for vector narrowing from changes like D53784, so there are a lot of test diffs.
Apart from 2-3 strange cases, these are all wins.
I've structured this to be no-functional-change-intended for any target except for x86
because I couldn't tell if AArch64, ARM, and AMDGPU would improve or not. All of those
targets have existing regression tests (4, 4, 10 files respectively) that would be
affected. Also, Hexagon overrides the shouldReduceLoadWidth() hook, but doesn't show
any regression test diffs. The trade-off is deciding if an extra vector load is better
than a single wide load + extract_subvector.
For x86, this is almost always better (on paper at least) because we often can fold
loads into subsequent ops and not increase the official instruction count. There's also
some unknown -- but potentially large -- benefit from using narrower vector ops if wide
ops are implemented with multiple uops and/or frequency throttling is avoided.
Differential Revision: https://reviews.llvm.org/D54073
llvm-svn: 346595
|
| |
|
|
| |
llvm-svn: 346592
|
| |
|
|
|
|
|
|
| |
No lit tests fail with this code removed.
This is a pre-commit for D54346.
llvm-svn: 346590
|
| |
|
|
|
|
| |
types (PR39615)
llvm-svn: 346589
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
There are two AGU units, and per 1cy, there can be either two loads,
or a load and a store; but not two stores, or two loads and a store.
Additionally, loads shouldn't affect the store scheduler and vice versa.
(but *should* affect the PdEX scheduler.)
Required rL346545.
Fixes https://bugs.llvm.org/show_bug.cgi?id=39465
llvm-svn: 346587
|
| |
|
|
|
|
|
|
| |
any_extend of the remainder from an 8-bit sdivrem.
The sdivrem will emit its own MOVSX to move %ah to the low byte of a register. By using a MOVSX for an any_extend this allows a post-isel peephole to merge them.
llvm-svn: 346581
|
| |
|
|
|
|
|
|
|
|
| |
directly using X86ISD::UNPCKL/X86ISD::UNPCKH.
This gives shuffle lowering the freedom to use zero_extend_vector_inreg for the unpckl shuffle. Shuffle combining usually makes this swap later, but not when AVX512 is enabled it seems.
While there also use DAG.getConstant to create a 0 vector instead of using the helper the forces a specific BUILD_VECTOR. I don't think that helper is usually needed. We're basically free to create a constant build_vector anytime and it will be legalized on its own.
llvm-svn: 346574
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D54362
llvm-svn: 346570
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for funclets in frame lowering and ISel
lowering. Together with D50288 and D50166, it enables C++ exception
handling.
Patch by Sanjin Sijaric, with some fixes by me.
Differential Revision: https://reviews.llvm.org/D51524
llvm-svn: 346568
|
| |
|
|
|
|
|
|
|
|
|
|
| |
LDRcp should be deleted when the dest register is dead in register
coalescing. Without MemOp, dead LDRcp will cause dead constant pool
value which references to non-existing label.
Patch by Yin Ma.
Differential Revision: https://reviews.llvm.org/D54173
llvm-svn: 346563
|
| |
|
|
|
|
|
|
|
|
| |
lowering to isel. Change to use vpmovzx instead of vpmovsx.
With avx512f but not avx512bw we need to extend to v16i32 then truncate that to to v16i8. Previously we emitted both nodes during lowering, but I'm trying to switch to using target independent nodes and with that switched the extend+truncate wou
This patch changes the implementation to what will be necessary with that patch which helps minimize test diffs.
llvm-svn: 346552
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: t.p.northover, SjoerdMeijer, kristof.beyls
Reviewed By: kristof.beyls
Subscribers: olista01, javed.absar, kristof.beyls, kristina, llvm-commits
Differential Revision: https://reviews.llvm.org/D53908
llvm-svn: 346546
|
| |
|
|
| |
llvm-svn: 346543
|
| |
|
|
|
|
|
|
| |
This makes X86ISD::VSEXT more similar to ISD::SIGN_EXTEND and ISD::ZERO_EXTEND.
I'm hoping to replace X86ISD::VSEXT/VZEXT with target independent nodes. Making the target specific nodes similar to the target independent nodes helps minimize test diffs in that patch.
llvm-svn: 346539
|
| |
|
|
|
|
| |
start of the source vector
llvm-svn: 346538
|
| |
|
|
| |
llvm-svn: 346537
|
| |
|
|
| |
llvm-svn: 346535
|
| |
|
|
|
|
|
|
|
|
|
| |
Eliminate the stack frame in functions with the noreturn nounwind
attributes, and when the noreturn-stack-elim target feature is
enabled. This reduces the code and stack space needed for noreturn
functions.
Differential Revision: https://reviews.llvm.org/D54210
llvm-svn: 346532
|
| |
|
|
|
|
|
|
| |
This only covers AMDGPU BE, hopefully all occurrences.
Differential Revision: https://reviews.llvm.org/D54235
llvm-svn: 346528
|
| |
|
|
|
|
|
|
| |
Both -fPIC and -G0 disable placement of globals in small data section,
but if a global has an explicit section assigmnent placing it in small
data, it should go there anyway.
llvm-svn: 346523
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently in llvm, CalleeSavedInfo can only assign a callee saved register to
stack frame index to be spilled in the prologue. We would like to enable
spilling gprs to vector registers. This patch adds the capability to spill to
other registers aside from just the stack. It also adds the changes for power9
to spill gprs to volatile vector registers when they are available.
This happens only for leaf functions when using the option
-ppc-enable-pe-vector-spills.
Differential Revision: https://reviews.llvm.org/D39386
llvm-svn: 346512
|
| |
|
|
|
|
|
|
|
| |
only debug directives are requested."
This reverts commit r345972. Need to update the description + possibly
to update the patch itself after discussion with Eric Christofer.
llvm-svn: 346508
|
| |
|
|
|
|
|
|
|
|
|
| |
A minor improvement of buildVector() that skips creating an
INSERT_VECTOR_ELT for a Value which has already been used for the
REPLICATE.
Review: Ulrich Weigand
https://reviews.llvm.org/D54315
llvm-svn: 346504
|
| |
|
|
|
|
|
|
|
| |
Now that we have mixed type sizes, i1 values need to be explicitly
handled as we want to avoid promoting these values.
Differential Revision: https://reviews.llvm.org/D54308
llvm-svn: 346499
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
I noticed that we weren't generating broadcasts as much I thought we would with
D54271, and this is part of the problem.
Widening the shuffle elements means adding bitcasts and hiding the relationship
between a splatted scalar and the vector. If we can form a broadcast, do that
before going through the rest of the shuffle lowering because broadcasts should
be cheap and can often be load-folded.
Differential Revision: https://reviews.llvm.org/D54280
llvm-svn: 346498
|
| |
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D53492
Patch by James Clarke.
llvm-svn: 346497
|
| |
|
|
|
|
|
|
| |
Legalize s64 G_CONSTANT using narrowScalar on MIPS 32.
Differential Revision: https://reviews.llvm.org/D54255
llvm-svn: 346495
|
| |
|
|
|
|
| |
This will be necessary for an update to D54267
llvm-svn: 346490
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
unbound CPUs for a target.
Summary:
This simplifies the code and moves everything to tablegen for consistency. This
also prepares the ground for adding issue counters.
Reviewers: gchatelet, john.brawn, jsji
Subscribers: nemanjai, mgorny, javed.absar, kbarton, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D54297
llvm-svn: 346489
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Starting from SNB, VZEROUPPER is handled by the renamer and uses no proc resources.
After HSW, it also has zero latency.
This fixes PR35606.
To reproduce:
Uops:
llvm-exegesis -mode=uops -opcode-name=VZEROUPPER
Latency:
echo -e '#LLVM-EXEGESIS-DEFREG XMM0 1\n#LLVM-EXEGESIS-DEFREG XMM1 1\nvzeroupper' | /tmp/llvm-exegesis -mode=latency -snippets-file=-
echo -e '#LLVM-EXEGESIS-DEFREG XMM0 1\n#LLVM-EXEGESIS-DEFREG XMM1 1\nvzeroupper\naddps %xmm0, %xmm1' | /tmp/llvm-exegesis -mode=latency -snippets-file=-
Reviewers: RKSimon, craig.topper, andreadb
Subscribers: gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D54107
llvm-svn: 346482
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, during the search, all values had to have the same
'TypeSize', which is equal to number of bits of the integer type of
the icmp operand. All values in the tree had to match this size;
meaning that, if we searched from i16, we wouldn't accept i8s. A
change in type size requires zext and truncs to perform the casts so,
to allow mixed narrow types, the handling of these instructions is
now slightly different:
- we allow casts if their result or operand is <= TypeSize.
- zexts are sinks if their result > TypeSize.
- truncs are still sinks if their operand == TypeSize.
- truncs are still sources if their result == TypeSize.
The transformation bails on finding an icmp that operates on data
smaller than the current TypeSize.
Differential Revision: https://reviews.llvm.org/D54108
llvm-svn: 346480
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A few code movement things:
- AreSymmetrical is now a method of BinOpChain.
- Created a lambda in CreateParallelMACPairs to reduce loop nesting.
- A Reduction object now gets pasted in a couple of places instead,
including CreateParallelMACPairs so it doesn't need to return a
value.
I've also added RecordSequentialLoads, which is run before the
transformation begins, and caches the interesting loads. This can then
be queried later instead of cross checking many load values.
Differential Revision: https://reviews.llvm.org/D54254
llvm-svn: 346479
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: rnk, mstorsjo, compnerd, efriedma, TomTan
Reviewed By: rnk
Subscribers: javed.absar, kristof.beyls, chrib, llvm-commits
Differential Revision: https://reviews.llvm.org/D54248
llvm-svn: 346469
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Depends on D54126.
Reviewers: aheejin, dschuff, aardappel
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D54138
llvm-svn: 346465
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Reorders the sections in the SIMD tablegen file to roughly match the
new opcode ordering. Depends on D54126.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D54134
llvm-svn: 346464
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: aheejin, dschuff, aardappel
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D54126
llvm-svn: 346463
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D53675
llvm-svn: 346462
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The pass incorrectly assumed if there's a longjmp declaration in the
module, there is also a setjmp function declaration. Fixed it, and now
the pass only converts longjmp and does not do any other transformation
when there's no setjmp declaration in the module.
Fixes PR39562.
Reviewers: jgravelle-google, sbc100
Subscribers: dschuff, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D54273
llvm-svn: 346445
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As discussed in D54073, we have a potential regression from more aggressive vector narrowing here, so let's try to avoid that by changing build-vector lowering slightly.
Insert-vector-element lowering always does this since there's no "pinsr" for ymm/zmm:
// If the vector is wider than 128 bits, extract the 128-bit subvector, insert
// into that, and then insert the subvector back into the result.
...but we can sometimes do better for insert-into-constant-vector by using shuffle lowering.
Differential Revision: https://reviews.llvm.org/D54271
llvm-svn: 346433
|
| |
|
|
|
|
|
|
|
|
|
| |
This commit broke the module buildbots.
Error:
lib/Target/MSP430/MSP430GenAsmMatcher.inc:1027:1: error: redundant
namespace 'llvm' [-Wmodules-import-nested-redundant]
^
llvm-svn: 346410
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was discovered in randomized testing that the SystemZ implementation of
shouldCoalesce() could be caused to crash when subreg liveness was
enabled. This was because an undef use of the virtual register was copied
outside current MBB at the point of shouldCoalesce() being called. For more
details, see https://bugs.llvm.org/show_bug.cgi?id=39276.
This patch changes the check for MBB locality from livein/liveout checks to
do checks for all instructions of both intervals being inside MBB. This
avoids the cases with dead defs / undef uses outside MBB, which are not
affecting liveness in/out of MBB.
The original test case included as a reduced .mir test case.
Review: Ulrich Weigand
https://reviews.llvm.org/D54197
llvm-svn: 346406
|
| |
|
|
|
|
|
|
|
|
| |
Generalize code in Thumb2InstrInfo::storeRegToStackSlot() and
loadRegToStackSlot() to allow the GPR class or any of its sub-classes
(including hGPR) to be stored/loaded by ARM::t2STRi12/ARM::t2LDRi12.
Differential Revision: https://reviews.llvm.org/D51927
llvm-svn: 346401
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: asl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54251
llvm-svn: 346391
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D53872
llvm-svn: 346384
|
| |
|
|
|
|
|
|
|
|
|
| |
Promote alloca can vectorize a small array by bitcasting it to a
vector type. Extend vectorization for the case when alloca is
already a vector type. We still want to replace GEPs with an
insert/extract element instructions in this case.
Differential Revision: https://reviews.llvm.org/D54219
llvm-svn: 346376
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change implements assembler parser, code emitter, ELF object writer
and disassembler for the MSP430 ISA. Also, more instruction forms are added
to the target description.
Reviewers: asl
Reviewed By: asl
Subscribers: pftbest, krisb, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D53661
llvm-svn: 346374
|
| |
|
|
|
|
|
|
| |
In this context, usesWindowsCFI() is basically the same thing as
isOSWindows(), but it makes the relevant property of the target
more explicit.
llvm-svn: 346366
|
| |
|
|
|
|
|
|
| |
This reverts commit r344696 for now (except for some test additions).
See https://bugs.freedesktop.org/show_bug.cgi?id=108611.
llvm-svn: 346364
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Remove redundant logic and simplify control flow.
Reviewers: msearles, rampitec, scott.linder, kanarayan
Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D54086
llvm-svn: 346363
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is not needed, because we don't actually insert relevant branches
for KILLs that late in the compilation flow.
Besides, this was always checking for the wrong kill opcode anyway...
Reviewers: msearles, rampitec, scott.linder, kanarayan
Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D54085
llvm-svn: 346362
|