| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch adds a possibility to make library calls on NVPTX.
An important thing about library functions - they must be defined within
the current module. This basically should guarantee that we produce a
valid PTX assembly (without calls to not defined functions). The one who
wants to use the libcalls is probably will have to link against
compiler-rt or any other implementation.
Currently, it's completely impossible to make library calls because of
error LLVM ERROR: Cannot select: i32 = ExternalSymbol '...'. But we can
lower ExternalSymbol to TargetExternalSymbol and verify if the function
definition is available.
Also, there was an issue with a DAG during legalisation. When we expand
instruction into libcall, the inner call-chain isn't being "integrated"
into outer chain. Since the last "data-flow" (call retval load) node is
located in call-chain earlier than CALLSEQ_END node, the latter becomes
a leaf and therefore a dead node (and is being removed quite fast).
Proposed here solution relies on another data-flow pseudo nodes
(ProxyReg) which purpose is only to keep CALLSEQ_END at legalisation and
instruction selection phases - we remove the pseudo instructions before
register scheduling phase.
Patch by Denys Zariaiev!
Differential Revision: https://reviews.llvm.org/D34708
llvm-svn: 350069
|
| |
|
|
|
|
|
|
|
|
| |
Add widen scalar for type index 1 (i1 condition) for G_SELECT.
Select G_SELECT for pointer, s32(integer) and smaller low level
types on MIPS32.
Differential Revision: https://reviews.llvm.org/D56001
llvm-svn: 350063
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch is to fix the bug imported by rL341634.
In above submit , the the return type of ISD::ADDE is
14224: SDVTList VTs = DAG.getVTList(MVT::i64, MVT::i64),
but in fact, the second return type of ISD::ADDE should be
MVT::Glue not MVT::i64.
Reviewed By: hfinkel
Differential Revision: https://reviews.llvm.org/D55977
llvm-svn: 350061
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is an alternative to what I attempted in D56057.
GetDemandedBits is a special version of SimplifyDemandedBits that allows simplifications even when the operand has other uses. GetDemandedBits will only do simplifications that allow a node to be bypassed. It won't create new nodes or alter any of the other users.
I had to add support for bypassing SIGN_EXTEND_INREG to GetDemandedBits.
Based on a patch that Simon Pilgrim sent me in email.
Fixes PR40142.
llvm-svn: 350059
|
| |
|
|
|
|
|
|
|
| |
Both of these places reference memset-like loops. Memset is precise.
Trying to keep these patches super small so they're easily post-commit
verifiable, as requested in D44748.
llvm-svn: 350044
|
| |
|
|
| |
llvm-svn: 350043
|
| |
|
|
|
|
|
|
| |
SRL/SHL+TEST to X86ISelDAGToDAG.
This cleans more code out of EmitTest.
llvm-svn: 350041
|
| |
|
|
|
|
|
|
| |
Remove the TESTmr isel patterns and add another postprocessing combine for TESTrr+ANDrm->TESTmr. We already have a postprocessing combine for TESTrr+ANDrr->TESTrr. With this we can give ANDN a chance to match first. And clean it up during post processing if we ended up with just a regular AND.
This is another step towards my plan to gut EmitTest and do more flag handling during isel matching or by using optimizeCompare.
llvm-svn: 350038
|
| |
|
|
|
|
| |
We won't end up using an ANDN instruction in this case so we should generate the same code we do for pre-BMI targets.
llvm-svn: 350018
|
| |
|
|
|
|
| |
instruction we use for sequentially consistent fence in 32-bit mode without SSE2.
llvm-svn: 350013
|
| |
|
|
|
|
|
|
|
|
|
| |
The missed load folding noticed in D55898 is visible independent of that change
either with an adjusted IR pattern to start or with AVX2/AVX512 (where the build
vector becomes a broadcast first; movddup is not produced until we get into isel
via tablegen patterns).
Differential Revision: https://reviews.llvm.org/D55936
llvm-svn: 350005
|
| |
|
|
|
|
|
|
| |
X86::AddrBaseReg/AddrIndexReg, etc. instead of hardcoded constants.
Makes the code a little more readable.
llvm-svn: 349983
|
| |
|
|
|
|
|
|
| |
NVPTXAsmPrinter::doInitialization() was creating an NVPTXSubtarget on
the stack. This object is huge, about 80kb. Also it's slow to create.
And it's all redundant; we have one in NVPTXTargetMachine anyway!
llvm-svn: 349982
|
| |
|
|
|
|
|
|
| |
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D56030
llvm-svn: 349975
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Added a pair of APIs for encoding/decoding the 3 components of a DWARF discriminator described in http://lists.llvm.org/pipermail/llvm-dev/2016-October/106532.html: the base discriminator, the duplication factor (useful in profile-guided optimization) and the copy index (used to identify copies of code in cases like loop unrolling)
The encoding packs 3 unsigned values in 32 bits. This CL addresses 2 issues:
- communicates overflow back to the user
- supports encoding all 3 components together. Current APIs assume a sequencing of events. For example, creating a new discriminator based on an existing one by changing the base discriminator was not supported.
Reviewers: davidxl, danielcdh, wmi, dblaikie
Reviewed By: dblaikie
Subscribers: zzheng, dmgreen, aprantl, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D55681
llvm-svn: 349973
|
| |
|
|
|
|
|
|
|
|
|
|
| |
turned the root nodes into one of the flag producing binops.
This fixes the patterns that have or/and as a root. 'and' is handled differently since thy usually have a CMP wrapped around them.
I had to look for uses of the CF flag because all these nodes have non-standard CF flag behavior. A real or/xor would always clear CF. In practice we shouldn't be using the CF flag from these nodes as far as I know.
Differential Revision: https://reviews.llvm.org/D55813
llvm-svn: 349962
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
sign flag is used.
The BEXTR instruction documents the SF bit as undefined.
The TBM BEXTR instruction has the same issue, but I'm not sure how to test it. With the control being an immediate we can determine the sign bit is 0 or the BEXTR would have been removed.
Fixes PR40060
Differential Revision: https://reviews.llvm.org/D55807
llvm-svn: 349956
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
negative in Indirect addressing.
Summary:
Don't peel of the offset if the resulting base could possibly be negative in Indirect addressing.
This is because the M0 field is of unsigned.
This patch achieves the similar goal as https://reviews.llvm.org/D55241, but keeps the optimization
if the base is known unsigned.
Reviewers:
arsemn
Differential Revision:
https://reviews.llvm.org/D55568
llvm-svn: 349951
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is admittedly a narrow fix for the problem:
https://bugs.llvm.org/show_bug.cgi?id=37502
...but as the XOP restriction shows, it's a maze to get this right.
In the motivating example, note that we have movddup before SSE4.1 and
again with AVX2. That's because insertps isn't available pre-SSE41 and
vbroadcast is (more generally) available with AVX2 (and the splat is
reduced to movddup via isel pattern).
Differential Revision: https://reviews.llvm.org/D55898
llvm-svn: 349937
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Fixes PR35023.
Reviewers: MatzeB, t.p.northover, sunfish, qcolombet, efriedma
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D55909
llvm-svn: 349935
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This adds support for widening G_FCEIL in LegalizerHelper and
AArch64LegalizerInfo. More specifically, it teaches the AArch64 legalizer to
widen G_FCEIL from a 16-bit float to a 32-bit float when the subtarget doesn't
support full FP 16.
This also updates AArch64/f16-instructions.ll to show that we perform the
correct transformation.
llvm-svn: 349927
|
| |
|
|
|
|
| |
Change order of conditions in predicate.
llvm-svn: 349918
|
| |
|
|
|
|
| |
Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version.
llvm-svn: 349915
|
| |
|
|
|
|
| |
Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version.
llvm-svn: 349914
|
| |
|
|
|
|
| |
Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version.
llvm-svn: 349912
|
| |
|
|
|
|
|
|
| |
value. NFCI.
Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version.
llvm-svn: 349911
|
| |
|
|
|
|
| |
Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version.
llvm-svn: 349909
|
| |
|
|
|
|
| |
Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version.
llvm-svn: 349908
|
| |
|
|
|
|
| |
Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version.
llvm-svn: 349906
|
| |
|
|
|
|
| |
Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version.
llvm-svn: 349905
|
| |
|
|
|
|
| |
Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version.
llvm-svn: 349903
|
| |
|
|
|
|
| |
Continues the work started by @bogner in rL340594 to remove uses of the old KnownBits output paramater version.
llvm-svn: 349902
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- When signing return addresses with -msign-return-address=<scope>{+<key>},
either the A key instructions or the B key instructions can be used. To
correctly authenticate the return address, the unwinder/debugger must know
which key was used to sign the return address.
- When and exception is thrown or a break point reached, it may be necessary to
unwind the stack. To accomplish this, the unwinder/debugger must be able to
first authenticate an the return address if it has been signed.
- To enable this, the augmentation string of CIEs has been extended to allow
inclusion of a 'B' character. Functions that are signed using the B key
variant of the instructions should have and FDE whose associated CIE has a 'B'
in the augmentation string.
- One must also be able to preserve these semantics when first stepping from a
high level language into assembly and then, as a second step, into an object
file. To achieve this, I have introduced a new assembly directive
'.cfi_b_key_frame ', that tells the assembler the current frame uses return
address signing with the B key.
- This ensures that the FDE is associated with a CIE that has 'B' in the
augmentation string.
Differential Revision: https://reviews.llvm.org/D51798
llvm-svn: 349895
|
| |
|
|
|
|
|
|
|
|
|
|
| |
intrinsics (llvm)
This auto upgrades the signed SSE saturated math intrinsics to SADD_SAT/SSUB_SAT generic intrinsics.
Clang counterpart: https://reviews.llvm.org/D55890
Differential Revision: https://reviews.llvm.org/D55894
llvm-svn: 349892
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D55956
llvm-svn: 349889
|
| |
|
|
| |
llvm-svn: 349882
|
| |
|
|
| |
llvm-svn: 349880
|
| |
|
|
|
|
|
|
|
|
| |
It seems better to avoid using the callback if possible since
there are coverage assertions which are disabled if this is used.
Also fix missing tests. Only test the legal cases since it seems
legalization for build_vector is quite lacking.
llvm-svn: 349878
|
| |
|
|
|
|
|
|
|
|
| |
X86ISelDAGToDAG.cpp to tranlate opcode to condition code using the helpers in X86InstrInfo.cpp.
This shortens the switches in X86ISelDAGToDAG.cpp to only need to check condition code instead of a list of opcodes.
This also fixes a bug where the memory forms of SETcc were missing from hasNoCarryFlagUses.
llvm-svn: 349868
|
| |
|
|
|
|
| |
Found while working on another patch
llvm-svn: 349867
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This saves materializing the immediate. The additional forms are less
common (they don't usually show up for bitfield insert/extract), but
they're still relevant.
I had to add a new target hook to prevent DAGCombine from reversing the
transform. That isn't the only possible way to solve the conflict, but
it seems straightforward enough.
Differential Revision: https://reviews.llvm.org/D55630
llvm-svn: 349857
|
| |
|
|
|
|
|
|
|
| |
If you don't do this, then if you hit a G_LOAD in getInstrMapping, you'll end
up with GPRs on the G_FCEIL instead of FPRs. This causes a fallback.
Add it to the switch, and add a test verifying that this happens.
llvm-svn: 349822
|
| |
|
|
|
|
|
|
|
|
|
| |
We have to treat constructs like this as if they were "symbolic", to use
the correct codepath to resolve them. This mostly only affects movz
etc. because the other uses of classifySymbolRef conservatively treat
everything that isn't a constant as if it were a symbol.
Differential Revision: https://reviews.llvm.org/D55906
llvm-svn: 349800
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This requires a bit more code than other fixups, to distingush between
abs_g0/abs_g1/etc. Actually, I think some of the other fixups are
missing some checks, but I won't try to address that here.
I haven't seen any real-world code that uses a construct like this, but
it clearly should work, and we're considering using it in the
implementation of localescape/localrecover on Windows (see
https://reviews.llvm.org/D53540). I've verified that binutils produces
the same code as llvm-mc for the testcase.
This currently doesn't include support for the *_s variants (that
requires a bit more work to set the opcode).
Differential Revision: https://reviews.llvm.org/D55896
llvm-svn: 349799
|
| |
|
|
|
|
|
|
|
|
|
|
| |
intrinsics (llvm)
This emits FSHL/FSHR generic intrinsics for the XOP VPROT and AVX512 VPROL/VPROR rotation intrinsics.
Clang counterpart: https://reviews.llvm.org/D55937
Differential Revision: https://reviews.llvm.org/D55938
llvm-svn: 349795
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Build llvm with assertion on, and then build bcc against this llvm.
Run any bcc tool with debug=8 (turning on -g for clang compilation),
you will get the following assertion errors,
/home/yhs/work/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:888:
void llvm::RuntimeDyldELF::resolveBPFRelocation(const llvm::SectionEntry&, uint64_t,
uint64_t, uint32_t, int64_t): Assertion `Value <= (4294967295U)' failed.
The .BTF.ext ELF section uses Fixup's to get the instruction
offsets. The data width of the Fixup is 4 bytes since we only need
the insn offset within the section.
This caused the above error though since R_BPF_64_32 expects
4-byte value and the Runtime Dyld tried to resolve the actual
insn address which is 8 bytes.
Actually the offset within the section is all what we need.
Therefore, there is no need to perform any kind of relocation
for .BTF.ext section and such relocation will actually cause
incorrect result.
This patch changed BPFELFObjectWriter::getRelocType() such that
for Fixup Kind FK_Data_4, if the relocation Target is a temporary
symbol, let us skip the relocation (ELF::R_BPF_NONE).
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 349778
|
| |
|
|
| |
llvm-svn: 349770
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a update to D43157 to correctly handle fixup_riscv_pcrel_lo12.
Notable changes:
Rebased onto trunk
Handle and test S-type
Test case pcrel-hilo.s is merged into relocations.s
D43157 description:
VK_RISCV_PCREL_LO has to be handled specially. The MCExpr inside is
actually the location of an auipc instruction with a VK_RISCV_PCREL_HI fixup
pointing to the real target.
Differential Revision: https://reviews.llvm.org/D54029
Patch by Chih-Mao Chen and Michael Spencer.
llvm-svn: 349764
|
| |
|
|
|
|
| |
As discussed on D55747, the expansion to (wider) shifts is better on all AVX512 cases, not just BWI.
llvm-svn: 349763
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
There are several vector instructions which may or may not set the
condition code register, depending on the value of an argument.
For codegen, we use two versions of the instruction, one that sets
CC and one that doesn't, which hard-code appropriate values of that
argument. But we also have a "generic" version of the instruction
that is used for the assembler/disassembler. These generic versions
should always be considered to clobber CC just to be safe.
llvm-svn: 349761
|