| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
| |
Patch by Tom Stellard
llvm-svn: 326477
|
| |
|
|
|
|
| |
Patch by Tom Stellard
llvm-svn: 326472
|
| |
|
|
| |
llvm-svn: 326471
|
| |
|
|
|
|
| |
Patch by Tom Stellard
llvm-svn: 326470
|
| |
|
|
|
|
| |
Patch by Tom Stellard
llvm-svn: 326468
|
| |
|
|
| |
llvm-svn: 326466
|
| |
|
|
|
|
| |
Patch by Tom Stellard
llvm-svn: 326465
|
| |
|
|
|
|
| |
Patch by Tom Stellard
llvm-svn: 326464
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
when a BUILD_VECTOR is created out of a sequence of EXTRACT_VECTOR_ELT with a
specific pattern sequence, either <0, 2, 4, ...> or <1, 3, 5, ...>, replace the
BUILD_VECTOR with either vuzp1 or vuzp2.
With this patch LLVM generates the following code for the first function fun1 in the testcase:
adrp x8, .LCPI0_0
ldr q0, [x8, :lo12:.LCPI0_0]
tbl v0.16b, { v0.16b }, v0.16b
ext v1.16b, v0.16b, v0.16b, #8
uzp1 v0.8b, v0.8b, v1.8b
str d0, [x8]
ret
Without this patch LLVM currently generates this code:
adrp x8, .LCPI0_0
ldr q0, [x8, :lo12:.LCPI0_0]
tbl v0.16b, { v0.16b }, v0.16b
mov v1.16b, v0.16b
mov v1.b[1], v0.b[2]
mov v1.b[2], v0.b[4]
mov v1.b[3], v0.b[6]
mov v1.b[4], v0.b[8]
mov v1.b[5], v0.b[10]
mov v1.b[6], v0.b[12]
mov v1.b[7], v0.b[14]
str d1, [x8]
ret
llvm-svn: 326443
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Use the correct loop index varaible, ArgI, to retrieve attributes.
Reviewers: thanm, sanjoy, rnk
Reviewed By: rnk
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43832
llvm-svn: 326433
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently it's impossible to test InstructionSelect pass with MIR which
is considered illegal by the Legalizer in Assert builds. In early stages
of porting an existing backend from SelectionDAG ISel to GlobalISel,
however, we would have very basic CallLowering, Legalizer, and
RegBankSelect implementations, but rather functional Instruction Select
with quite a few patterns selectable due to the semi-automatic porting
process borrowing them from SelectionDAG ISel.
As we are trying to define legality as a property of being selectable by
the instruction selector, it would be nice to be able to easily check
what the selector can do in its current state w/o the legality check
provided by the Legalizer getting in the way.
It also seems beneficial to have a regression testing set up that would
not allow the selector to silently regress in its support of the MIR not
supported yet by the previous passes in the GlobalISel pipeline.
This commit adds -disable-gisel-legality-check command line option to
llc that disables those legality checks in RegBankSelect and
InstructionSelect passes.
It also adds quite a few MIR test cases for AArch64's Instruction
Selector. Every one of them would fail on the legality check at the
moment, but will select just fine if the check is disabled. Every test
MachineFunction is intended to exercise a specific selection rule and
that rule only, encoded in the MachineFunction's name by the rule's
number, ID, and index of its GIM_Try opcode in TableGen'erated
MatchTable (-optimize-match-table=false).
Reviewers: ab, dsanders, qcolombet, rovka
Reviewed By: bogner
Subscribers: kristof.beyls, volkan, aditya_nandakumar, aemerson,
rengolin, t.p.northover, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D42886
llvm-svn: 326396
|
| |
|
|
|
|
|
|
|
|
| |
node when there are no FMA instructions.
This would cause a 'cannot select' error at isel when we should have emitted a lib call and an xor.
Fixes PR36553.
llvm-svn: 326393
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
After D43914, loads from global variables in addrspace(1) happen with
ld.global. But since they're constants, even better would be to use
ld.global.nc, aka ldg.
Reviewers: tra
Subscribers: jholewinski, sanjoy, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D43915
llvm-svn: 326390
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NVPTXGenericToNVVM.
Summary:
NVPTXGenericToNVVM was using target-specific intrinsics to do address
space casts. Using the addrspacecast instruction is (a lot) simpler.
But it also has the advantage of being understandable to other passes.
In particular, InferAddrSpaces is able to understand these address space
casts and remove them in most cases.
Reviewers: tra
Subscribers: jholewinski, sanjoy, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D43914
llvm-svn: 326389
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Function::lookupIntrinsicID is somewhat forgiving as it comes to
overloaded intrinsics' names: it returns an ID as soon as the name
provided has a prefix that matches a registered intrinsic's name w/o
actually checking that the rest of the name encodes all the concrete arg
types, let alone that those types are compatible with the intrinsic's
definition.
That's probably fine and comes in handy in MIR serialization: we don't
care about IR types at MIR level and every intrinsic should be
selectable based on its ID and low-level types (LLTs) of its operands,
including the overloaded ones, so there is no point in serializing
mangled IR types as part of the intrinsic's name.
However, lookupIntrinsicID is somewhat inconsistent in its forgiveness:
if the name provided is actually an exact match, it will refuse to
return the ID if the intrinsic is overloaded. There is probably no
real reason for that and it renders MIRParser incapable to deserialize
MIR MIRPrinter serialized.
This commit fixes it.
Reviewers: rnk, aditya_nandakumar, qcolombet, thegameg, dsanders,
marcello.maggioni
Reviewed By: bogner
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D43267
llvm-svn: 326387
|
| |
|
|
|
|
| |
Add 64-bit cmpxchg8b tests
llvm-svn: 326380
|
| |
|
|
|
|
|
|
| |
and extending/truncating.
This is equivalent to what isel was doing anyway but by canonicalizing earlier we can remove some patterns.
llvm-svn: 326375
|
| |
|
|
|
|
|
|
| |
Matches what we already manage for unsigned saturation truncation stores
Differential Revision: https://reviews.llvm.org/D43629
llvm-svn: 326372
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For use by LLPC SPV_AMD_shader_ballot extension.
The v_writelane instruction was already implemented for use by SGPR
spilling, but I had to add an extra dummy operand tied to the
destination, to represent that all lanes except the selected one keep
the old value of the destination register.
.ll test changes were due to schedule changes caused by that new
operand.
Differential Revision: https://reviews.llvm.org/D42838
llvm-svn: 326353
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
FailedISel MachineFunction property is part of the CodeGen pipeline
state as much as every other property, notably, Legalized,
RegBankSelected, and Selected. Let's make that part of the state also
serializable / de-serializable, so if GlobalISel aborts on some of the
functions of a large module, but not the others, it could be easily seen
and the state of the pipeline could be maintained through llc's
invocations with -stop-after / -start-after.
To make MIR printable and generally to not to break it too much too
soon, this patch also defers cleaning up the vreg -> LLT map until
ResetMachineFunctionPass.
To make MIR with FailedISel: true also machine verifiable, machine
verifier is changed so it treats a MIR-module as non-regbankselected and
non-selected if there is FailedISel property set.
Reviewers: qcolombet, ab
Reviewed By: dsanders
Subscribers: javed.absar, rovka, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D42877
llvm-svn: 326343
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Emulated TLS is enabled by llc flag -emulated-tls,
which is passed by clang driver.
When llc is called explicitly or from other drivers like LTO,
missing -emulated-tls flag would generate wrong TLS code for targets
that supports only this mode.
Now use useEmulatedTLS() instead of Options.EmulatedTLS to decide whether
emulated TLS code should be generated.
Unit tests are modified to run with and without the -emulated-tls flag.
Differential Revision: https://reviews.llvm.org/D42999
llvm-svn: 326341
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Expressions of the form x < 0 ? 0 : x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves
In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or or while in ARM/thumb2 mode that has flexible second operand the shift can be folded into a single bic/or instructions. In most cases this results in smaller code and possibly less branches, and in no case larger than before.
Patch by Martin Svanfeldt
Reviewers: fhahn, pbarrio, rogfer01
Reviewed By: pbarrio, rogfer01
Subscribers: chrib, yroux, eugenis, efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D42574
llvm-svn: 326333
|
| |
|
|
|
|
|
|
|
|
|
| |
Add legalization/selection for x86/x86_64 and
corresponding tests.
Reviewed By: igorb
Differential Revision: https://reviews.llvm.org/D43622
llvm-svn: 326320
|
| |
|
|
|
|
|
|
|
|
|
| |
Add legalization/selection for x86/x86_64 and
corresponding tests.
Reviewed By: igorb
Differential Revision: https://reviews.llvm.org/D43617
llvm-svn: 326311
|
| |
|
|
| |
llvm-svn: 326309
|
| |
|
|
|
|
|
|
|
|
| |
need to guarantee zeroes in the upper bits of return.
An extract_element where the result type is larger than the scalar element type is semantically an any_extend of from the scalar element type to the result type. If we expect zeroes in the upper bits of the i8/i32 we need to mae sure those zeroes are explicit in the DAG.
For these cases the best way to accomplish this is use an insert_subvector to pad zeroes to the upper bits of the v1i1 first. We extend to either v16i1(for i32) or v8i1(for i8). Then bitcast that to a scalar and finish with a zero_extend up to i32 if necessary. We can't extend past v16i1 because that's the largest mask size on KNL. But isel is smarter enough to know that a zext of a bitcast from v16i1 to i16 can use a KMOVW instruction. The insert_subvectors will be dropped during isel because we can determine that the producing instruction already zeroed the upper bits of the k-register.
llvm-svn: 326308
|
| |
|
|
| |
llvm-svn: 326263
|
| |
|
|
|
|
|
|
|
| |
Absence of memory operands is treated as "aliasing everything", so
dropping them is sufficient.
Recommit r326256 with a fixed testcase.
llvm-svn: 326262
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
We were always setting the block alignment to 2 bytes in Thumb mode
and 4-bytes in ARM mode (r325754, and r325012), but this could cause
reducing the block alignment when it already had been aligned (e.g.
in Thumb mode when the block is a CPE that was already 4-byte aligned).
Patch by Momchil Velikov, I've only added a test.
Differential Revision: https://reviews.llvm.org/D43777
llvm-svn: 326232
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D43807
llvm-svn: 326226
|
| |
|
|
| |
llvm-svn: 326220
|
| |
|
|
|
|
|
|
| |
Re-enable commit r323991 now that r325931 has been committed to make
MachineOperand::isRenamable() check more conservative w.r.t. code
changes and opt-in on a per-target basis.
llvm-svn: 326208
|
| |
|
|
|
|
| |
SplitBinaryOpsAndApply
llvm-svn: 326189
|
| |
|
|
|
|
| |
NFC
llvm-svn: 326147
|
| |
|
|
|
|
| |
There's still some shortcoming in our ability to combine binops of constants with different sizes separated by an extend. I'll try to look at that next.
llvm-svn: 326128
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v32i1)) without AVX512 to prevent scalarization
Summary:
We have an early DAG combine to turn these patterns into MOVMSK, but that combine doesn't work if the vXi1 type has more elements than the widest legal vXi8 type. Type legalization will eventually split it down to v16i1 or v32i1 and then the bitcast gets legalized to a truncstore and a scalar load. The truncstore will get lowered to a series of extracts and bit math.
This patch adds a custom legalization to use a sign extend and MOVMSK instead. This prevents the eventual scalarization.
Reviewers: spatel, RKSimon, zvi
Reviewed By: RKSimon
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D43593
llvm-svn: 326119
|
| |
|
|
|
|
| |
SplitBinaryOpsAndApply
llvm-svn: 326104
|
| |
|
|
| |
llvm-svn: 326101
|
| |
|
|
|
|
| |
Cleanup check-prefixes to share more AVX/AVX512 codegen checks
llvm-svn: 326097
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In r322867, we introduced IsStandalone when printing MIR in -debug
output. The default behaviour for that was:
1) If any of MBB, MI, or MO are -debug-printed separately, don't omit any
redundant information.
2) When -debug-printing a MF entirely, don't print any redundant
information.
3) When printing MIR, don't print any redundant information.
I'd like to change 2) to:
2) When -debug-printing a MF entirely, don't omit any redundant information.
Differential Revision: https://reviews.llvm.org/D43337
llvm-svn: 326094
|
| |
|
|
|
|
| |
Fixes scary typo in a check that lost the end digit off a reg#...
llvm-svn: 326093
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
With OS type AMDPAL, the scratch descriptor is hardwired to be loaded
from offset 0 of the global information table, whose low pointer is
passed in s0. For a merge shader on gfx9+, it needs to be s8 instead, as
the hardware reserves s0-s7.
Reviewers: kzhuravl
Subscribers: arsenm, nhaehnle, dstuttard, llvm-commits, t-tye, yaxunl, wdng, kzhuravl
Differential Revision: https://reviews.llvm.org/D42203
llvm-svn: 326088
|
| |
|
|
|
|
|
|
|
|
| |
Enable multiple COPY hints to eliminate more COPYs during register allocation.
Note that this is something all targets should do, see
https://reviews.llvm.org/D38128.
Review: Robert Lytton
llvm-svn: 326069
|
| |
|
|
|
|
| |
256-bit operations.
llvm-svn: 326068
|
| |
|
|
|
|
| |
elements are.
llvm-svn: 326066
|
| |
|
|
|
|
| |
Which types are considered 'simple' is a function of the requirements of all targets that LLVM supports. That shouldn't directly affect what types we are able to handle. The remainder of this code checks that the number of elements is a power of 2 and takes care of splitting down to a legal size.
llvm-svn: 326063
|
| |
|
|
|
|
| |
ADD/SUB ops
llvm-svn: 326044
|
| |
|
|
|
|
| |
TRUNCATE ops
llvm-svn: 326043
|
| |
|
|
| |
llvm-svn: 326042
|
| |
|
|
|
|
| |
EVEX encoded instructions.
llvm-svn: 326041
|