| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds intrinsics for the following:
- fabd, fadd, fsub & fsubr
- fmul, fmulx, fdiv & fdivr
- fmax, fmaxnm, fmin & fminnm
- fscale & ftsmul
Reviewers: huntergr, sdesmalen, dancgr
Reviewed By: sdesmalen
Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cameron.mcinally, cfe-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69657
|
|
|
|
|
|
| |
The default FP mode should really be a property of a specific
function, and not a subtarget. Introduce the necessary fields to the
SIMachineFunctionInfo to help move towards this goal.
|
|
|
|
|
|
|
|
|
|
| |
Update TargetTransformInfo to allow AVX1 to use YMM registers for memcmp.
This is a follow up to D68632 which enabled XOR compares which made this possible.
This also updates the memcmp-optsize.ll test unlike the first patch.
https://reviews.llvm.org/D69658
|
|
|
|
|
| |
For AMDGPU this is dependent on the FP mode, which should eventually
not be a property of the subtarget.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Introduces a clang builtins and LLVM intrinsics representing integer
min/max instructions. These instructions have not been merged to the
SIMD spec proposal yet, so they are currently opt-in only via builtins
and not produced by general pattern matching. If these instructions
are accepted into the spec proposal the builtins and intrinsics will
be replaced with normal pattern matching.
Defined in https://github.com/WebAssembly/simd/pull/27.
Reviewers: aheejin
Reviewed By: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D69696
|
|
|
|
|
| |
This reverts commit 92a25fbf11da51c8e3573b81a877d3b226990c07 and fixes
the ambiguous method call that was causing build failures.
|
|
|
|
|
|
|
| |
This reverts commit 2ab1b8c1ec452fb743f6cc5051e75a01039cabfe, it is
causing build failures on numerous bots, including
sanitizer-x86_64-linux-bootstrap-ubsan. My previous revert was for the
wrong commit.
|
|
|
|
|
|
| |
This reverts commit 11850a6305c5778b180243eb06aefe86762dd4ce, it was
causing build failures on numerous bots, including
sanitizer-x86_64-linux-bootstrap-ubsan.
|
|
|
|
|
|
|
| |
instructions"
This reverts commit a678677da498a45f59c16ee74fea438e34a801ce.
It broke CodeGen/ms-inline-asm.c on most bots.
|
|
|
|
|
|
|
|
|
| |
This patch adds flag "mayRaiseFPException" , FPCW and FPSW for X87 instructions which could raise
float exception.
Patch by LiuChen. With a couple small fixes from me.
Differential Revision: https://reviews.llvm.org/D68854
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fixes an ISel failure when a splatted load is used more than once. The
failure was due to the hacks we were doing in ISel lowering to
preserve the original load as the operand of a LOAD_SPLAT node. The
fix is to properly lower the splatted use of the load to a separate
LOAD_SPLAT node.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69640
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The SIMD spec does not include i64x2 comparisons, so they need to be
expanded. Using setOperationAction to expand them also causes f64x2
comparisons to be expanded, so setCondCodeAction needs to be used
instead. But since there are no legal condition codes, the legalizer
does not know how to expand the comparisons. We therefore manually
unroll the operation, taking care to fill each lane with -1 or 0
rather than 1 or 0 for consistency with the other vector comparisons.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69604
|
|
|
|
|
|
|
| |
selected for the FP stackifier.
We always expand these to libcalls so get rid of the last vestiges
of using the instructions.
|
|
|
|
| |
Fix the costs of `add` and `orr` with an immediate operand.
|
|
|
|
|
|
|
|
| |
with AVX1
Breaks build bots
Differential Revision: https://reviews.llvm.org/D69658
|
|
|
|
|
|
|
|
| |
Update TargetTransformInfo to allow AVX1 to use YMM registers for memcmp.
This is a follow up to D68632 which enabled XOR compares which made this possible.
https://reviews.llvm.org/D69658
|
|
|
|
|
|
| |
destination and source pair as a return value; NFC
This is breaking MSVC builds: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/20375
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds some extra patterns to select AArch64 Neon SQADD, UQADD, SQSUB
and UQSUB from the existing target independent sadd_sat, uadd_sat,
ssub_sat and usub_sat nodes.
It does not attempt to replace the existing int_aarch64_neon_uqadd
intrinsic nodes as they are apparently used for both scalar and vector,
and need to be legal on scalar types for some of the patterns to work.
The int_aarch64_neon_uqadd on scalar would move the two integers into
floating point registers, perform a Neon uqadd and move the value back.
I don't believe this is good idea for uadd_sat to do the same as the
scalar alternative is simpler (an adds with a csinv). For signed it may
be smaller, but I'm not sure about it being better.
So this just adds some extra patterns for the existing vector
instructions, matching on the _sat nodes.
Differential Revision: https://reviews.llvm.org/D69374
|
|
|
|
|
|
|
|
|
| |
For AMDGPU this depends on whether denormals are enabled in the
default FP mode for the function. Currently this is treated as a
subtarget feature, so FMAD is selectively legal based on that. I want
to move this out of the subtarget features so this can be controlled
with a denormal mode attribute. Additionally, this will allow folding
based on a future ftz fast math flag.
|
|
|
|
|
| |
These can be directly taken from the GlobalValue instead of going
through the type.
|
|
|
|
|
|
|
|
|
|
| |
Refactor usage of isCopyInstrImpl, isCopyInstr and isAddImmediate methods
to return optional machine operand pair of destination and source
registers.
Patch by Nikola Prica
Differential Revision: https://reviews.llvm.org/D69622
|
|
|
|
| |
KnownZero
|
|
|
|
|
|
|
|
|
| |
This adds a flag to LLVM and clang to always generate a .debug_frame
section, even if other debug information is not being generated. In
situations where .eh_frame would normally be emitted, both .debug_frame
and .eh_frame will be used.
Differential Revision: https://reviews.llvm.org/D67216
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add pattern matching for SVE vector instructions:
-- add, sub, and, or, xor instructions
-- sqadd, uqadd, sqsub, uqsub target-independent intrinsics
-- bic intrinsics
-- predicated add, sub, subr intrinsics
Patch Review: https://reviews.llvm.org/D69128
Patch authored by: dancgr (Danilo Carvalho Grael)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds MXCSR as a reserved physical register and models its use
by X86 SSE instructions. It also adds flag "mayRaiseFPException" for the
instructions that possibly can raise FP exception according to the
architecture definition.
Following what SystemZ and other targets does, only the current rounding
modes and the IEEE exception masks are modeled. *Changes* of the MXCSR
due to exceptions are not modeled.
Patch by Pengfei Wang
Differential Revision: https://reviews.llvm.org/D68121
|
|
|
|
|
|
|
|
|
|
| |
readlane and writelane instructions are not allowed to use m0 as the
data operand, so spilling them is tricky and would require an
intermediate SGPR to spill it. Constrain the virtual register class in
this caes to disallow the inline spiller from folding the m0 operand
directly into the spill instruction.
I copied this hack from AArch64 which has the same problem for $sp.
|
| |
|
|
|
|
|
|
| |
hardcode number of operands or position of the EFLAGS operand.
This makes the code immune to the MXCSR addition in D68121.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The scheduling definitions for ASIMD transpose and zipping overlapped with
others a few lines below. Somehow, they didn't raise errors before.
There seem to be other overlapping definitions. Somehow, they still don't
raise errors.
Differential revision: https://reviews.llvm.org/D68353
|
|
|
|
| |
getTargetShuffleAndZeroables
|
|
|
|
| |
helper. NFCI.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
VCCZBugHandledSet was used to make sure we don't apply the same
workaround more than once to a single cbranch instruction, but it's not
necessary because the workaround involves inserting an s_waitcnt
instruction, which is enough for subsequent iterations to detect that no
further workaround is necessary.
Also beef up the test case to check that the workaround was only applied
once. I have also manually verified that the test still passes even if I
hack the big do-while loop in runOnMachineFunction to run a minimum of
five iterations.
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69621
|
|
|
|
|
| |
It used to generate S2_tstbit_i with constant -33 which resulted in an assert.
The reason is log2_32 was called with 64bit value 0.
|
|
|
|
|
|
| |
Enable lowering of constant pool index, jump table index, and bloack address to MIR on AIX.
Differential Revision: https://reviews.llvm.org/D69264
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
During AArch64 frame lowering instructions to enable return address
signing are inserted into function if needed. Functions generated during
machine outlining don't run through target frame lowering and hence are
missing such instructions.
This patch introduces the following changes:
1. If not all functions that potentially participate in function outlining
agree on their return address signing scope and their return address
signing key, outlining is disabled for these functions.
2. If not all functions that potentially participate in function outlining
agree on their support for v8.3A features, outlining is disabled for
these functions.
2. If all candidate functions agree on the signing scope, signing key and
and their support for v8.3 features, the outlined function behaves as
if it had the same scope and key attributes and as if it would provide
the same v8.3A support as the original functions.
Reviewers: olista01, paquette, t.p.northover, ostannard
Reviewed By: ostannard
Subscribers: ostannard, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69097
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Extend the describeLoadedValue() with support for target specific ARM and
AArch64 instructions interpretation. The patch provides specialization for
ADD and SUB operations that include a register and an immediate/offset
operand. Some of the instructions can operate with global string addresses
or constant pool indexes but such cases are omitted since we currently lack
flexible support for processing such operands at DWARF production stage.
Patch by Nikola Prica
Differential Revision: https://reviews.llvm.org/D67556
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds support for codegen of masked stores, with non-truncating
and truncating variants.
Reviewers: huntergr, greened, dmgreen, rovka, sdesmalen
Reviewed By: dmgreen, sdesmalen
Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69378
|
|
|
|
|
|
| |
FSHL/FSHR support
Remove hard wired legality check.
|
|
|
|
| |
hardwiring to MVT::i8
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add intrinsics for the following:
- sxt[b|h|w] & uxt[b|h|w]
- cls & clz
- not & cnot
Reviewers: huntergr, sdesmalen, dancgr
Reviewed By: sdesmalen
Subscribers: cameron.mcinally, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69567
|
|
|
|
|
| |
This one should have been done in r363902 when hasReadVCCZBug was
introduced.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet, craig.topper, rnk
Reviewed By: rnk
Subscribers: rnk, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69034
|
|
|
|
|
|
|
|
|
|
|
|
| |
the match pattern
If the instruction have match pattern, llvm-tblgen will infer the sideeffect bit from the match pattern and it works well.
If not, the tblgen will set it as true that hurt the scheduling.
PowerPC has some instructions that didn't specify the match pattern(i.e. LXSD etc), which is manually selected post-ra according
to the register pressure. We need to clear the sideeffect flag for these instructions.
Differential Revision: https://reviews.llvm.org/D69232
|
|
|
|
|
|
|
|
|
|
| |
Teach combineVectorSizedSetCCEquality() to handle arbitrary memcmp
expansions but do not change any default policy for now.
This also fixes a bug in the memcmp expansion itself when large
displacements are needed.
https://reviews.llvm.org/D69507
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69581
|
|
|
|
|
|
|
|
|
|
| |
stores w/StoreSDNode) by default
Enable the new SelectionDAG representation for unordered loads and stores introduced in r371441 by default. As a reminder, the new lowering changes the representation of an unordered atomic load from an AtomicSDNode - which is essentially a black box which gets passed through without combines messing with it - to a LoadSDNode w/a atomic marker on the MMO. The later parallels the way we handle volatiles, and I've audited the code to ensure that every location which checks one checks the other.
This has been fairly heavily fuzzed, and I examined diffs in a reasonable large corpus of assembly by hand, so I'm reasonable sure this is correct for the common case. Late in the review for this, it was discovered that I hadn't correctly handled cases which could be legalized into CAS operations. This points out that there's a strong bias in the IR of the frontend I'm working with towards only legal atomics. If there are problems with this patch, the most likely area will be legalization.
Differential Revision: https://reviews.llvm.org/D69219
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
known zero.
This catches some cases. There are probably ways to improve this.
I tried doing it as a combine on the setcc, but that broke
some cases involving flag reuse in place of test.
I renamed the isX86CCUnsigned to isX86CCSigned and flipped its
polarity to make it consistent with the similar functions for
ISD::SETCC. This avoids calling EQ/NE as being signed or unsigned.
Fixes PR43823.
Differential Revision: https://reviews.llvm.org/D69499
|
|
|
|
|
|
|
|
|
| |
Adding patten matching for two SVE intrinsics: frecps and frsqrts.
Also added patterns for fsub and fmul - these SDNodes directly correspond
to machine instructions.
Review: https://reviews.llvm.org/D68476
Patch authored by mgudim (Mikhail Gudim).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
llvm/test/DebugInfo/MIR/X86/live-debug-values-reg-copy.mir failed with
EXPENSIVE_CHECKS enabled, causing the patch to be reverted in
rG2c496bb5309c972d59b11f05aee4782ddc087e71.
This patch relands the patch with a proper fix to the
live-debug-values-reg-copy.mir tests, by ensuring the MIR encodes the
callee-saves correctly so that the CalleeSaved info is taken from MIR
directly, rather than letting it be recalculated by the PEI pass. I've
done this by running `llc -stop-before=prologepilog` on the LLVM
IR as captured in the test files, adding the extra MOV instructions
that were manually added in the original test file, then running `llc
-run-pass=prologepilog` and finally re-added the comments for the MOV
instructions.
|