| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ARMDisassembler now depends on the banked register tables in ARMUtils, so the
LLVMBuild.txt needed updating to reflect this.
Original commit mesage:
[ARM] Fix disassembly of invalid banked register moves
When disassembling banked register move instructions, we don't have an
assembly syntax for the unallocated register numbers, so we have to
return Fail rather than SoftFail. Previously we were returning SoftFail,
then crashing in the InstPrinter as we have no way to represent these
encodings in an assembly string.
This also switches the decoder to use the table-generated list of banked
registers, removing the duplicated list of encodings.
Differential revision: https://reviews.llvm.org/D43066
llvm-svn: 324606
|
| |
|
|
|
|
|
|
| |
The broken bot (clang-ppc64le-linux-multistage) is doign a shared-object build,
so I guess using lookupBankedRegByEncoding in the disassembler is a layering
violation?
llvm-svn: 324604
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When disassembling banked register move instructions, we don't have an
assembly syntax for the unallocated register numbers, so we have to
return Fail rather than SoftFail. Previously we were returning SoftFail,
then crashing in the InstPrinter as we have no way to represent these
encodings in an assembly string.
This also switches the decoder to use the table-generated list of banked
registers, removing the duplicated list of encodings.
Differential revision: https://reviews.llvm.org/D43066
llvm-svn: 324600
|
| |
|
|
|
|
| |
@ctopper Can you check that the fix is correct ?
llvm-svn: 324586
|
| |
|
|
| |
llvm-svn: 324585
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instructions affected:
mthc1, mfhc1, add.d, sub.d, mul.d, div.d,
mov.d, neg.d, cvt.w.d, cvt.d.s, cvt.d.w, cvt.s.d
These instructions are now defined for
microMIPS32r3 + microMIPS32r6 in MicroMipsInstrFPU.td
since they shared their encoding with those already defined
in microMIPS32r6InstrInfo.td and have been therefore
removed from the latter file.
Some instructions present in MicroMipsInstrFPU.td which
did not have both AFGR64 and FGR64 variants defined have
been altered to do so.
Differential revision: https://reviews.llvm.org/D42738
llvm-svn: 324584
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were generating "fmov h0, wzr" instructions when FullFP16 is not enabled.
I've not added any tests, because the problem was visible in:
test/CodeGen/AArch64/arm64-zero-cycle-zeroing.ll,
which I had to change: I don't think Cyclone has FullFP16 enabled
by default, so it shouldn't be using this v8.2a instruction.
I've also removed these rdar tags, please shout if there are any objections.
Differential Revision: https://reviews.llvm.org/D43020
llvm-svn: 324581
|
| |
|
|
|
|
|
|
| |
compare of a bitcast from vXi1.
This should allow us to remove the kortest intrinsic from IR and use compare+bitcast+or in IR instead.
llvm-svn: 324580
|
| |
|
|
|
|
|
|
| |
The KTEST instruction sets the C flag if the result of anding both operands together is all 1s. We can use this to lower (icmp eq/ne (bitcast (vXi1 X), -1)
Differential Revision: https://reviews.llvm.org/D42772
llvm-svn: 324577
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
KTEST has weird flag behavior. The Z flag is set for all bits in the AND of the k-registers being 0, and the C flag is set for all bits being 1. All other flags are cleared.
We currently emit this instruction in EmitTEST and don't check the condition code. This can lead to strange things like using the S flag after a KTEST for a signed compare.
The domain reassignment pass can also transform TEST instructions into KTEST and is not protected against the flag usage either. For now I've disabled this part of the domain reassignment pass. I tried to comment out the checks in the mir test so that we could recover them later, but I couldn't figure out how to get that to work.
This patch moves the KTEST handling into LowerSETCC and now creates a ktest+x86setcc. I've chosen this approach because I'd like to add support for the C flag for all ones in a followup patch. To do that requires that I can rewrite the condition code going in the x86setcc to be different than the original SETCC condition code.
This fixes PR36182. I'll file a PR to fix domain reassignment once this goes in. Should this be merged to 6.0?
Reviewers: spatel, guyblank, RKSimon, zvi
Reviewed By: guyblank
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42770
llvm-svn: 324576
|
| |
|
|
| |
llvm-svn: 324565
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LowerSELECT_CC is not generating optimal Select_Ri pattern at the moment. It
is not guaranteed to place ConstantNode at RHS which would miss matching
Select_Ri.
A new testcase added into the existing select_ri.ll, also there is an
existing case in cmp.ll which would be improved to use Select_Ri after this
patch, it is adjusted accordingly.
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
llvm-svn: 324560
|
| |
|
|
|
|
|
| |
Defs of operands outside of the instruction's explicit defs need
to be checked.
llvm-svn: 324554
|
| |
|
|
| |
llvm-svn: 324550
|
| |
|
|
| |
llvm-svn: 324549
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The code reusing existing wait counts is incorrect since it keeps
adding new operands to an old instruction instead of replacing
the immediate. It was also effectively switched off by the condition
that wait count is not an AMDGPU::S_WAITCNT.
Also switched to BuildMI instead of creating instructions directly.
Differential Revision: https://reviews.llvm.org/D42997
llvm-svn: 324547
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
hit from IR but creates a minefield for MI passes.
The x86 backend has fairly powerful logic to try and fold loads that
feed register operands to instructions into a memory operand on the
instruction. This is almost always a good thing, but there are specific
relocated loads that are only allowed to appear in specific
instructions. Notably, R_X86_64_GOTTPOFF is only allowed in `movq` and
`addq`. This patch blocks folding of memory operands using this
relocation unless the target is in fact `addq`.
The particular relocation indicates why we simply don't hit this under
normal circumstances. This relocation is only used for TLS, and it gets
used in very specific ways in conjunction with %fs-relative addressing.
The result is that loads using this relocation are essentially never
eligible for folding into an instruction's memory operands. Unless, of
course, you have an MI pass that inserts usage of such a load. I have
exactly such an MI pass and was greeted by truly mysterious miscompiles
where the linker replaced my instruction with a completely garbage byte
sequence. Go team.
This is the only such relocation I'm aware of in x86, but there may be
others that need to be similarly restricted.
Fixes PR36165.
Differential Revision: https://reviews.llvm.org/D42732
llvm-svn: 324546
|
| |
|
|
|
|
|
|
| |
LowerSIGN_EXTEND/LowerZERO_EXTEND/LowerANY_EXTEND.
We were doing a lot of whitelisting of what we handle in these routines, but setOperationAction constrains what we can get here. So just add some asserts and prune the unreachable paths.
llvm-svn: 324538
|
| |
|
|
|
|
| |
have already been type legalized away. NFC
llvm-svn: 324536
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
KMOVQ on non-BWI targets
If we are saving/restoring k-registers, the default behavior of getMinimalRegisterClass will find the VK64 class with a spill size of 64 bits. This will cause the KMOVQ opcode to be used for save/restore. If we don't have have BWI instructions we need to constrain the class returned to give us VK16 with a 16-bit spill size. We can do this by passing the either v16i1 or v64i1 into getMinimalRegisterClass.
Also add asserts to make sure BWI is enabled anytime we use KMOVD/KMOVQ. These are what caught this bug.
Fixes PR36256
Differential Revision: https://reviews.llvm.org/D42989
llvm-svn: 324533
|
| |
|
|
|
|
|
|
| |
This reverts commit r324487.
It broke clang tests.
llvm-svn: 324494
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note: This is a candidate for LLVM 6.0, because it was planned to be
in that release but was delayed due to a long review period.
Merge conflict in release_60 - resolution:
Add "-p6:32:32" into the second (non-amdgiz) string.
Only scalar loads support 32-bit pointers. An address in a VGPR will
fail to compile. That's OK because the results of loads will only be used
in places where VGPRs are forbidden.
Updated AMDGPUAliasAnalysis and used SReg_64_XEXEC.
The tests cover all uses cases we need for Mesa.
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D41651
llvm-svn: 324487
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
I checked the AMD closed source compiler and the workaround is only
needed when x3 is emulated as x4, which we don't do in LLVM.
SMEM x3 opcodes don't exist, and instead there is a possibility to use x4
with the last component being unused. If the last component is out of
buffer bounds and falls on the next 4K page, the hw hangs.
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D42756
llvm-svn: 324486
|
| |
|
|
|
|
| |
SSE and shorter vector sizes will have to wait until we can add support for general SMIN/SMAX matching.
llvm-svn: 324485
|
| |
|
|
| |
llvm-svn: 324477
|
| |
|
|
|
|
|
|
| |
Both operand codes now work the same way in case of register or memory
operands. It print high-order or low-order word in a double-word
register or memory location.
llvm-svn: 324476
|
| |
|
|
|
|
|
|
|
| |
This is a follow up of r324321, adding a match pattern for mov with a FP16
immediate (also fixing operand vfp_f16imm that wasn't even compiling).
Differential Revision: https://reviews.llvm.org/D42973
llvm-svn: 324456
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
that happened to end up in GCC.
This is really unfortunate, as the names don't have much rhyme or reason
to them. Originally in the discussions it seemed fine to rely on aliases
to map different names to whatever external thunk code developers wished
to use but there are practical problems with that in the kernel it turns
out. And since we're discovering this practical problems late and since
GCC has already shipped a release with one set of names, we are forced,
yet again, to blindly match what is there.
Somewhat rushing this patch out for the Linux kernel folks to test and
so we can get it patched into our releases.
Differential Revision: https://reviews.llvm.org/D42998
llvm-svn: 324449
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D42152
llvm-svn: 324446
|
| |
|
|
|
|
|
|
|
|
|
|
| |
1. Run the memory legalizer prior to the waitcnt pass; keep the policy that the waitcnt pass does not remove any waitcnts within the incoming IR.
2. The waitcnt pass doesn't (yet) track waitcnts that exist prior to the waitcnt pass (it just skips over them); because the waitcnt pass is ignorant of them, it may insert a redundant waitcnt. To avoid this, check the prev instr. If it and the to-be-inserted waitcnt are the same, suppress the insertion. We keep the existing waitcnt under the assumption that whomever, e.g., the memory legalizer, inserted it knows what they were doing.
3. Follow-on work: teach the waitcnt pass to record the pre-existing waitcnts for better waitcnt production.
Differential Revision: https://reviews.llvm.org/D42854
llvm-svn: 324440
|
| |
|
|
| |
llvm-svn: 324431
|
| |
|
|
|
|
|
|
|
|
|
|
| |
cttz_zero_undef/ctlz_zero_undef if we can prove the input is never zero
X86 currently has a late DAG combine after cttz/ctlz are turned into BSR+BSF+CMOV to detect this and remove the CMOV. But we should be able to do this much earlier and avoid creating the cmov all together.
For the changed AMDGPU test case it appears that previously the i8 cttz was type legalized to i16 which introduced an OR with 256 in order to limit the result to 8 on the widened type. At this point the result is known to never be zero, but nothing checked that. Then operation legalization is told to promote all i16 cttz to i32. This introduces an extend and a truncate and another OR with 65536 to limit the result to 16. With the DAG combiner change we are able to prevent the creation of the second OR since the opcode will have been changed to cttz_zero_undef after the first OR. I the lack of the OR caused the instruction to change to v_ffbl_b32_sdwa
Differential Revision: https://reviews.llvm.org/D42985
llvm-svn: 324427
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Following up on the discussion from
http://lists.llvm.org/pipermail/llvm-dev/2017-April/112305.html, undef
values are now placed in the .bss as well as null values. This prevents
undef global values taking up potentially huge amounts of space in the
.data section.
The following two lines now both generate equivalent .bss data:
@vals1 = internal unnamed_addr global [20000000 x i32] zeroinitializer, align 4
@vals2 = internal unnamed_addr global [20000000 x i32] undef, align 4 ; previously unaccounted for
This is primarily motivated by the corresponding issue in the Rust
compiler (https://github.com/rust-lang/rust/issues/41315).
Differential Revision: https://reviews.llvm.org/D41705
Patch by varkor!
llvm-svn: 324424
|
| |
|
|
|
|
|
| |
Fix the modeling of long division and SIMD conversion from integer and
horizontal minimum and maximum.
llvm-svn: 324417
|
| |
|
|
| |
llvm-svn: 324392
|
| |
|
|
| |
llvm-svn: 324391
|
| |
|
|
|
|
|
|
|
| |
It was always using cmpxchg path and in rmw and cmpxchg instructions
are not distinguishable in the BE.
Differential Revision: https://reviews.llvm.org/D42976
llvm-svn: 324383
|
| |
|
|
|
|
|
| |
Additionally, verify that the register defined by the producer is a
32-bit register.
llvm-svn: 324381
|
| |
|
|
|
|
|
|
|
| |
This is a follow up of r324321, adding f16 <-> f32 and f16 <-> f64 conversion
match patterns.
Differential Revision: https://reviews.llvm.org/D42954
llvm-svn: 324360
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instruction Selection
Cleanup cycle/validity checks in ISel (IsLegalToFold,
HandleMergeInputChains) and X86 (isFusableLoadOpStore). Now do a full
search for cycles / dependencies pruning the search when topological
property of NodeId allows.
As part of this propogate the NodeId-based cutoffs to narrow
hasPreprocessorHelper searches.
Reviewers: craig.topper, bogner
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D41293
llvm-svn: 324359
|
| |
|
|
|
|
|
|
| |
Author: Bas Nieuwenhuizen
https://reviews.llvm.org/D42881
llvm-svn: 324353
|
| |
|
|
| |
llvm-svn: 324352
|
| |
|
|
|
|
|
|
| |
Vector pairs are legal types, but not every operation can work on pairs.
For those operations that are legal for single vectors, generate a concat
of their results on pair halves.
llvm-svn: 324350
|
| |
|
|
| |
llvm-svn: 324349
|
| |
|
|
|
|
|
|
|
|
| |
It was expanded directly into instructions earlier. That was to avoid
loads from a constant pool for a vector negation: "xor x, splat(i1 -1)".
Implement ISD opcodes QTRUE and QFALSE to denote logical vectors of
all true and all false values, and handle setcc with negations through
selection patterns.
llvm-svn: 324348
|
| |
|
|
|
|
| |
Followup to D42544 that matches PACKUSWB cases for non-AVX512, SSE and PACKUSDW cases will have to wait until we can add support for general SMIN/SMAX matching.
llvm-svn: 324347
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Now we generate PAL metadata for the amdpal os type, there is no need to
generate the .AMDGPU.config section.
Reviewers: arsenm, nhaehnle, dstuttard
Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D37760
Change-Id: I303c5fad66656ce97293da60621afac6595b4c18
llvm-svn: 324346
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Adds support for the SVE AND instruction with vector and logical-immediate operands, and their corresponding aliases.
Reviewers: fhahn, rengolin, samparker, echristo, aadg, kristof.beyls
Reviewed By: fhahn
Subscribers: aemerson, javed.absar, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D42295
llvm-svn: 324343
|
| |
|
|
|
|
| |
Followup to D42544 that matches PACKSSWB cases for non-AVX512, SSE and PACKSSDW cases will have to wait until we can add support for general SMIN/SMAX matching.
llvm-svn: 324339
|
| |
|
|
|
|
| |
This patch fixes up my previous commit (add initialization of local variables).
llvm-svn: 324336
|