| Commit message | Author | Age | Files | Lines |
Summary:
(Not so) boringly identical to pattern a (D62786)
Not yet sure how to deal with the last pattern c.
Reviewers: RKSimon, craig.topper, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62793
llvm-svn: 364418
|
Summary:
Finally tying up loose ends here.
The problem is quite simple:
If we have pattern `(x >> start) & (1 << nbits) - 1`,
and then truncate the result, that truncation will be propagated upwards,
into the `and`. And that isn't currently handled.
I'm only fixing pattern `a` here,
the same fix will be needed for patterns `b`/`c` too.
I *think* this isn't missing any extra legality checks,
since we only look past truncations. Similarly, I don't think
we can see any truncation there other than i64->i32.
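For reference, a minimal sketch of pattern `a` in C (function and variable names are illustrative, not from the patch):

    #include <stdint.h>

    // Extract 'nbits' bits of 'x' starting at bit 'start'.
    uint64_t pattern_a(uint64_t x, unsigned start, unsigned nbits) {
      return (x >> start) & ((1ULL << nbits) - 1);
    }

    // Truncating the result, e.g. (uint32_t)pattern_a(x, s, n), lets the
    // IR-level truncate be hoisted into the 'and', which the matcher must
    // now look through.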
Reviewers: craig.topper, RKSimon, spatel
Reviewed By: craig.topper
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62786
llvm-svn: 364417
|
It was failing with:
/b/s/w/ir/cache/builder/src/third_party/llvm/llvm/lib/Target/X86/X86ISelLowering.cpp:18772:66:
error: call of overloaded 'makeArrayRef(<brace-enclosed initializer list>)' is ambiguous
scaleShuffleMask<int>(Scale, makeArrayRef<int>({ 0, 2, 1, 3 }), Mask);
^
/b/s/w/ir/cache/builder/src/third_party/llvm/llvm/lib/Target/X86/X86ISelLowering.cpp:18772:66: note: candidates are:
In file included from /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/CodeGen/MachineFunction.h:20:0,
from /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/CodeGen/CallingConvLower.h:19,
from /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/lib/Target/X86/X86ISelLowering.h:17,
from /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/lib/Target/X86/X86ISelLowering.cpp:14:
/b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/ADT/ArrayRef.h:480:15:
note: llvm::ArrayRef<T> llvm::makeArrayRef(const std::vector<_RealType>&) [with T = int]
ArrayRef<T> makeArrayRef(const std::vector<T> &Vec) {
^
/b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/ADT/ArrayRef.h:485:37:
note: llvm::ArrayRef<T> llvm::makeArrayRef(const llvm::ArrayRef<T>&) [with T = int]
template <typename T> ArrayRef<T> makeArrayRef(const ArrayRef<T> &Vec) {
^
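A minimal sketch of one way to resolve the ambiguity (an assumed fix, for illustration only): bind the initializer list to a named array first, so the array overload of makeArrayRef is the unique match.

    // With a named array, makeArrayRef(const T (&Arr)[N]) applies and the
    // vector<T>/ArrayRef<T> overloads no longer compete.
    static const int Order[] = {0, 2, 1, 3};
    scaleShuffleMask<int>(Scale, makeArrayRef(Order), Mask);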
llvm-svn: 364414
|
extract_subvector(bitcast()) support
Ideally this needs to be a generic combine in DAGCombiner::visitEXTRACT_SUBVECTOR, but there are some nasty regressions on AArch64 due to NEON shuffles not handling bitcasts at all.
llvm-svn: 364407
|
Summary:
The getFixupKindContainerSizeBytes function returns the size of the
instruction containing a given fixup. Currently fixup_arm_pcrel_9 is
not handled in this function, which causes an assertion failure in
debug builds and incorrect codegen in release builds.
This patch fixes the problem.
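A sketch of the shape of the fix (the returned size is an assumption, based on Arm-mode instructions being 4 bytes wide):

    // Hypothetical reduction of the missing handling:
    unsigned getFixupKindContainerSizeBytes(unsigned Kind) {
      switch (Kind) {
      case ARM::fixup_arm_pcrel_9:
        return 4; // the fixup is encoded inside a 4-byte Arm instruction
      default:
        llvm_unreachable("Unknown fixup kind!");
      }
    }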
Reviewers: ostannard, simon_tatham
Reviewed By: ostannard
Subscribers: javed.absar, kristof.beyls, hiraditya, pbarrio, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63778
llvm-svn: 364404
|
This patch adds the PseudoCALLReg instruction which allows using an
explicit register operand as the destination for the return address.
GCC can successfully parse this form of the call instruction, which
would be used for calls to functions that do not use ra as the return
address register, such as the __riscv_save libcalls. This patch forms
the first part of an implementation of -msave-restore for RISC-V.
Differential Revision: https://reviews.llvm.org/D62685
llvm-svn: 364403
|
truncateVectorWithPACK is often used in conjunction with ComputeNumSignBits, which struggles when peeking through bitcasts.
This fix tries to avoid bitcast(shuffle(bitcast())) patterns in the 256-bit 64-bit sublane shuffles, so we can still see through them at least until lowering, when the shuffles will need to be bitcast to widen the shuffle type.
llvm-svn: 364401
|
Reviewers: gchatelet, echristo, spatel, atdt
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63769
llvm-svn: 364384
|
PPCMIPeephole::emitRLDICWhenLoweringJumpTables should return a bool
to indicate whether the optimization was actually performed.
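A sketch of the signature change (the body and the flag name here are assumed):

    // Report whether the peephole fired so the caller can propagate a
    // "changed" status.
    bool PPCMIPeephole::emitRLDICWhenLoweringJumpTables(MachineInstr &MI) {
      bool Simplified = false;
      // ... try to fold the rotate-and-mask pair into a single RLDIC,
      // setting Simplified = true on success ...
      return Simplified;
    }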
Differential Revision: https://reviews.llvm.org/D63801
llvm-svn: 364383
|
llvm-svn: 364376
|
This was just an omission in the back end. We have had the instructions for both
single and double precision for a few HW generations, but never got around to
legalizing these.
Differential revision: https://reviews.llvm.org/D63634
llvm-svn: 364373
|
llvm-svn: 364372
|
Summary:
`catch_all` is from the first version of the EH proposal and has now been
removed. There were no tests covering this, and thus no tests to remove
or fix.
Reviewers: aardappel
Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63737
llvm-svn: 364360
|
The same oddity was apparently copy-pasted between multiple targets.
llvm-svn: 364349
|
Original patch https://reviews.llvm.org/D63659 from
Steven Perron <stevenperron@google.com>
The pass AMDGPUUnifyDivergentExitNodes does not update the phi nodes in
the successors of the blocks it splits. This is fixed by calling
BasicBlock::splitBasicBlock to split the block instead of doing it
manually. This does extra work, because a new conditional branch is
created in BB and then immediately replaced, but I think the simplicity
is worth it. It also helps make the code more future-proof in case other
things need to be updated.
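Roughly, the change swaps the manual splitting for the utility call (a sketch, not the literal diff; the block name is made up):

    // splitBasicBlock moves everything from SplitPt onward into a new
    // block, branches to it, and keeps the phi nodes in the successors
    // consistent -- the step the manual version missed.
    BasicBlock *NewBlock = BB->splitBasicBlock(SplitPt, "UnifiedExit");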
llvm-svn: 364342
|
Differential Revision: https://reviews.llvm.org/D63780
llvm-svn: 364339
|
I believe these all get canonicalized to vzext_movl. The only case where that wasn't true was when the load was a loadi32 that was an extload aligned to 32 bits. But that was fixed in r364207.
Differential Revision: https://reviews.llvm.org/D63701
llvm-svn: 364337
|
Remove isel patterns for vzmovl+load
We currently have some isel patterns for treating vzmovl+load the same as vzload, but that shrinks the load, which we shouldn't do if the load is volatile.
Rather than adding isel checks for volatile, this patch removes the patterns and teaches DAG combine to merge them into vzload when it's legal to do so (see the sketch below).
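A sketch of the volatility guard the new combine needs (the surrounding combine is assumed, not quoted from the patch):

    // Only fold vzmovl+load into a vzload when shrinking the load is
    // safe: a volatile load must not be narrowed or deleted.
    if (auto *Ld = dyn_cast<LoadSDNode>(Src)) {
      if (!Ld->isVolatile()) {
        // ... build an X86ISD::VZEXT_LOAD and replace the old chain ...
      }
    }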
Differential Revision: https://reviews.llvm.org/D63665
llvm-svn: 364333
|
"To" selects an odd-numbered GPR, and "Te" an even one. There are some
8.1-M instructions that have one too few bits in their register fields
and require registers of particular parity, without necessarily using
a consecutive even/odd pair.
Also, the constraint letter "t" should select an MVE q-register, when
MVE is present. This didn't need any source changes, but some extra
tests have been added.
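A hedged usage illustration of the constraint letters from C (illustrative only; the empty asm template just keeps the example compilable):

    // 'Te' asks the compiler for an even-numbered GPR, 'To' for an
    // odd-numbered one.
    void parity_demo(int a, int b) {
      __asm__ volatile("" :: "Te"(a), "To"(b));
    }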
Reviewers: dmgreen, samparker, SjoerdMeijer
Subscribers: javed.absar, eraman, kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D60709
llvm-svn: 364331
|
A refactor in r364191 changed register types from an unsigned int to the
llvm::Register class. Adjust the AVR backend to this change.
This fixes build errors when building with the experimental AVR backend
enabled.
Differential Revision: https://reviews.llvm.org/D63776
llvm-svn: 364330
|
This provides the low-level support to start using MVE vector types in
LLVM IR, loading and storing them, passing them to __asm__ statements
containing hand-written MVE vector instructions, and *if* you have the
hard-float ABI turned on, using them as function parameters.
(In the soft-float ABI, vector types are passed in integer registers,
and combining all those 32-bit integers into a q-reg requires support
for selection DAG nodes like insert_vector_elt and build_vector which
aren't implemented yet for MVE. In fact I've also had to add
`arm_aapcs_vfpcc` to a couple of existing tests to avoid that
problem.)
Specifically, this commit adds support for:
* spills, reloads and register moves for MVE vector registers
* ditto for the VPT predication mask that lives in VPR.P0
* make all the MVE vector types legal in ISel, and provide selection
DAG patterns for BITCAST, LOAD and STORE (see the sketch after this
list)
* make loads and stores of scalar FP types conditional on
`hasFPRegs()` rather than `hasVFP2Base()`. As a result a few
existing tests needed their llc command lines updating to use
`-mattr=-fpregs` as their method of turning off all hardware FP
support.
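For flavor, a sketch of what marking the MVE vector types legal can look like in the ISelLowering constructor (the type list and register-class name are assumptions, not quoted from the patch):

    // Register the 128-bit MVE vector types with the Q-register class so
    // ISel accepts loads, stores and bitcasts of them.
    for (MVT VT : {MVT::v16i8, MVT::v8i16, MVT::v4i32, MVT::v2i64}) {
      addRegisterClass(VT, &ARM::MQPRRegClass);
      setOperationAction(ISD::LOAD, VT, Legal);
      setOperationAction(ISD::STORE, VT, Legal);
      setOperationAction(ISD::BITCAST, VT, Legal);
    }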
Reviewers: dmgreen, samparker, SjoerdMeijer
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60708
llvm-svn: 364329
|
Summary:
In Secure PLT ABI, -fpic is similar to -fPIC. The differences are that:
* -fpic stores the address of _GLOBAL_OFFSET_TABLE_ in r30, while -fPIC stores .got2+0x8000.
* -fpic uses an addend of 0 for R_PPC_PLTREL24, while -fPIC uses 0x8000.
Reviewers: hfinkel, jhibbits, joerg, nemanjai, spetrovic
Reviewed By: jhibbits
Subscribers: adalava, kbarton, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63563
llvm-svn: 364324
|
The expensive buildbots highlighted that the MIR tests were broken;
I've now updated them and added --verify-machineinstrs. This also
uncovered a couple of bugs in the backend pass, so these have also
been fixed.
llvm-svn: 364323
|
- `test/Misc/backend-resource-limit-diagnostics.cl` crashes, as a null
streamer is used.
llvm-svn: 364318
|
lowerShuffleAsSpecificZeroOrAnyExtend should be able to lower to ANY_EXTEND_VECTOR_INREG as well as ZERO_EXTEND_VECTOR_INREG.
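The generalization amounts to choosing the extend opcode by whether zeroing is actually required (a sketch with assumed variable names):

    // When the high bits may stay undefined, ANY_EXTEND_VECTOR_INREG is
    // enough; only force zeroing when the shuffle demands it.
    unsigned ExtOpc = AnyExt ? ISD::ANY_EXTEND_VECTOR_INREG
                             : ISD::ZERO_EXTEND_VECTOR_INREG;
    SDValue Ext = DAG.getNode(ExtOpc, DL, VT, InputV);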
llvm-svn: 364313
|
llvm-svn: 364312
|
llvm-svn: 364308
|
Including both 'case ARM_AM::uxtw' and 'default' in the getShiftOp
switch caused a buildbot to fail with
error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default]
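Reduced illustration of the warning (not the actual function body):

    // Once 'case ARM_AM::uxtw' made the switch cover every enumerator,
    // the 'default' became unreachable and the warning fired.
    switch (Op) {
    case ARM_AM::uxtw: /* handle uxtw */ break;
    // ... cases for every other enumerator ...
    default: // error under -Wcovered-switch-default
      break;
    }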
llvm-svn: 364300
|
A minor iteration on the MVE VPT Block pass to enable more efficient VPT Block
code generation: consecutive VPT predicated statements, predicated on the same
condition, will be placed within the same VPT Block. This essentially is also
an exercise to write some more tests for the next step, which should be more
generic, also merging instructions when they are not consecutive.
Differential Revision: https://reviews.llvm.org/D63711
llvm-svn: 364298
|
Summary:
The symbols use the processor-specific SHN_AMDGPU_LDS section index
introduced with a previous change. The linker is then expected to resolve
relocations, which are also emitted.
Initially disabled for HSA and PAL environments until they have caught up
in terms of linker and runtime loader.
Some notes:
- The llvm.amdgcn.groupstaticsize intrinsic can no longer be lowered
to a constant at compile time, which means some tests can no longer
be applied.
The current "solution" is a terrible hack, but the intrinsic isn't
used by Mesa, so we can keep it for now.
- We no longer know the full LDS size per kernel at compile time, which
means that we can no longer generate a relevant error message at
compile time. It would be possible to add a check for the size of
individual variables, but ultimately the linker will have to perform
the final check.
Change-Id: If66dbf33fccfbf3609aefefa2558ac0850d42275
Reviewers: arsenm, rampitec, t-tye, b-sumner, jsjodin
Subscribers: qcolombet, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61494
llvm-svn: 364297
|
Summary:
The directive defines a symbol as a group/local memory (LDS) symbol.
LDS symbols behave similarly to common symbols for the purposes of ELF,
using the processor-specific SHN_AMDGPU_LDS as section index.
It is the linker and/or runtime loader's job to "instantiate" LDS symbols
and resolve relocations that reference them.
It is not possible to initialize LDS memory (not even zero-initialize
as for .bss).
We want to be able to link together objects -- starting with relocatable
objects, but possibly expanding to shared objects in the future -- that
access LDS memory in a flexible way.
LDS memory is in an address space that is entirely separate from the
address space that contains the program image (code and normal data),
so having program segments for it doesn't really make sense.
Furthermore, we want to be able to compile multiple kernels in a
compilation unit which have disjoint use of LDS memory. In that case,
we may want to place LDS symbols differently for different kernels
to save memory (LDS memory is very limited and physically private to
each kernel invocation), so we can't simply place LDS symbols in a
.lds section.
Hence this solution where LDS symbols always stay undefined.
Change-Id: I08cbc37a7c0c32f53f7b6123aa0afc91dbc1748f
Reviewers: arsenm, rampitec, t-tye, b-sumner, jsjodin
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61493
llvm-svn: 364296
|
If an FP_EXTEND or FP_ROUND isel dag node converts directly between
f16 and f64 when the target CPU has no instruction to do it in one go,
it has to be done in two steps instead, going via f32.
Previously, this was done implicitly, because all such CPUs had the
storage-only implementation of f16 (i.e. the only thing you can do
with one at all is to convert it to/from f32). So isel would legalize
the f16 into an f32 as soon as it saw it, by inserting an fp16_to_fp
node (or vice versa), and then the fp_extend would already be f32->f64
rather than f16->f64.
But that technique can't support a target CPU which has full f16
support but _not_ f64, such as some variants of Arm v8.1-M. So now we
provide custom lowering for FP_EXTEND and FP_ROUND, which checks
support for f16 and f64 and decides on the best thing to do given the
combination of flags it gets back.
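The two-step strategy, sketched in DAG terms (variable names assumed):

    // No single f16 -> f64 instruction: extend to f32 first, then to f64.
    SDValue Tmp = DAG.getNode(ISD::FP_EXTEND, DL, MVT::f32, Src); // f16->f32
    return DAG.getNode(ISD::FP_EXTEND, DL, MVT::f64, Tmp);        // f32->f64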
Reviewers: dmgreen, samparker, SjoerdMeijer
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60692
llvm-svn: 364294
|
This final batch includes the tail-predicated versions of the
low-overhead loop instructions (LETP); the VPSEL instruction to select
between two vector registers based on the predicate mask without
having to open a VPT block; and VPNOT which complements the predicate
mask in place.
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62681
llvm-svn: 364292
|
This adds the rest of the vector memory access instructions. It
includes contiguous loads/stores, with an ordinary addressing mode
such as [r0,#offset] (plus writeback variants); gather loads and
scatter stores with a scalar base address register and a vector of
offsets from it (written [r0,q1] or similar); and gather/scatters with
a vector of base addresses (written [q0,#offset], again with
writeback). Additionally, some of the loads can widen each loaded
value into a larger vector lane, and the corresponding stores narrow
them again.
To implement these, we also have to add the addressing modes they
need. Also, in AsmParser, the `isMem` query function now has
subqueries `isGPRMem` and `isMVEMem`, according to which kind of base
register is used by a given memory access operand.
I've also had to add an extra check in `checkTargetMatchPredicate` in
the AsmParser, without which our last-minute check of `rGPR` register
operands against SP and PC was failing an assertion because Tablegen
had inserted an immediate 0 in place of one of a pair of tied register
operands. (This matches the way the corresponding check for `MCK_rGPR`
in `validateTargetOperandClass` is guarded.) Apparently the MVE load
instructions were the first to have ever triggered this assertion, but
I think only because they were the first to have a combination of the
usual Arm pre/post writeback system and the `rGPR` class in particular.
Reviewers: dmgreen, samparker, SjoerdMeijer, t.p.northover
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62680
llvm-svn: 364291
|
As pointed out in https://bugs.llvm.org/show_bug.cgi?id=41777,
we do not emit a vector select even when the code pretty much asks for one.
This patch changes that.
Differential revision: https://reviews.llvm.org/D61658
llvm-svn: 364289
|
Introduce three pseudo instructions to be used during DAG ISel to
represent v8.1-m low-overhead loops. One maps to set_loop_iterations
while loop_decrement_reg is lowered to two, so that we can separate
the decrement and branching operations. The pseudo instructions are
expanded pre-emission, where we can still decide whether we actually
want to generate a low-overhead loop, in a new pass:
ARMLowOverheadLoops. The pass currently bails, reverting to a sub,
icmp and br, in the cases where a call or stack spill/restore happens
between the decrement and branching instructions, or if the loop is
too large.
Differential Revision: https://reviews.llvm.org/D63476
llvm-svn: 364288
|
Split off from D60318.
llvm-svn: 364281
|
llvm-svn: 364262
|
llvm-svn: 364215
|
llvm-svn: 364214
|
Scalar extends to s64 can use S_BFE_{I64|U64}, but vector extends need
to extend to the 32-bit half, and then to 64.
I'm not sure what the line should be between what RegBankSelect
handles, and what instruction select does, but for now I'm erring on
the side of RegBankSelect for future post-RBS combines.
llvm-svn: 364212
|
Summary:
The LLVM disassembler assumes that the unused src0 operand of v_nop is
zero. Other tools can put another value in that field, which is still
valid. This commit fixes the LLVM disassembler to recognize such an
encoding as v_nop, in the same way as we already do for s_getpc.
Differential Revision: https://reviews.llvm.org/D63724
Change-Id: Iaf0363eae26ff92fc4ebc716216476adbff37a6f
llvm-svn: 364208
|
This applies when there are no zeroes in the vector we're building.
In LowerBuildVectorv16i8 we took care to use an any_extend if the first pair is in the lower 16 bits of the vector and no elements are 0, so bits [31:16] will be undefined. But we still emitted a vzext_movl to ensure that bits [127:32] are 0. If we don't need any zeroes, we should be consistent and make all of [127:16] undefined.
In LowerBuildVectorv8i16 we can just delete the vzext_movl code, because we only use the scalar_to_vector when there are no zeroes. So the vzext_movl is always unnecessary.
Found while investigating whether (vzext_movl (scalar_to_vector (loadi32))) patterns are necessary. At least one of the cases where they were necessary was when the loadi32 matched a 32-bit aligned 16-bit extload. It seemed weird that we required vzext_movl for that case.
Differential Revision: https://reviews.llvm.org/D63700
llvm-svn: 364207
|
This patch does a few things to start cleaning up the isFNEG function.
- Remove the Op0/Op1 peekThroughBitcast calls that seem unnecessary. getTargetConstantBitsFromNode has its own peekThroughBitcast inside, and we have a separate peekThroughBitcast on the return value.
- Add a check of the scalar size after the first peekThroughBitcast to ensure we haven't changed the element size and just done something like f32->i32 or f64->i64.
- Remove an unnecessary check that Op1's type is floating point after the peekThroughBitcast. We're just going to look for a bit pattern from a constant; we don't care about its type.
- Add VT checks in several places that consume the return value of isFNEG (see the sketch below). Due to the peekThroughBitcasts inside, the type of the return value isn't guaranteed, so it's not safe to use it to build other nodes without ensuring the type matches the type being used to build the node. We might be able to replace these checks with bitcasts instead, but I don't have a test case, so a bail-out check seemed better for now.
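A sketch of the kind of guard added at the call sites (the exact shape and isFNEG's signature are assumed):

    // After peeking through bitcasts, isFNEG's result may have a
    // different type than the node we want to build; bail if so.
    SDValue Arg = isFNEG(DAG, N);
    if (!Arg || Arg.getValueType() != VT)
      return SDValue();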
Differential Revision: https://reviews.llvm.org/D63683
llvm-svn: 364206
|
Try to fail for scc, since I don't think that should ever be produced.
llvm-svn: 364199
|
For some reason clang is happy with the conflict, but MSVC is not.
llvm-svn: 364196
|
llvm-svn: 364195
|
Force using Register.
One downside is that the generated register enums require explicit
conversion.
llvm-svn: 364194
|
Avoids using a plain unsigned for registers throughout codegen.
Doesn't attempt to change every register use, just something a little
more than the set needed to build after changing the return type of
MachineOperand::getReg().
llvm-svn: 364191
|
Summary:
Removing the unused variable AllSGPRSpilledToVGPRs in
SIFrameLowering::processFunctionBeforeFrameFinalized
to avoid
error: variable 'AllSGPRSpilledToVGPRs' set but not used
[-Werror=unused-but-set-variable]
Reviewers: arsenm, nhaehnle
Reviewed By: nhaehnle
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63721
llvm-svn: 364190