| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Add legalizer support for loads of v8s8 and update legalize-load-store.mir.
Differential Revision: https://reviews.llvm.org/D60877
llvm-svn: 358714
|
|
|
|
|
|
|
|
|
|
| |
Legalize things like i24 load/store by splitting them into smaller power of 2 operations.
This matches how SelectionDAG handles these operations.
Differential Revision: https://reviews.llvm.org/D59971
llvm-svn: 358613
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
function calls without used results (PR41259)
The original commit caused false positives from AddressSanitizer's
use-after-scope checks, which have now been fixed in r358478.
> The code was previously checking that candidates for sinking had exactly
> one use or were a store instruction (which can't have uses). This meant
> we could sink call instructions only if they had a use.
>
> That limitation seemed a bit arbitrary, so this patch changes it to
> "instruction has zero or one use" which seems more natural and removes
> the need to special-case stores.
>
> Differential revision: https://reviews.llvm.org/D59936
llvm-svn: 358483
|
|
|
|
|
|
|
|
| |
Since non-pow-2 types are going to get split up into multiple loads anyway,
don't do the [SZ]EXTLOAD combine for those and save us trouble later in
legalization.
llvm-svn: 358458
|
|
|
|
|
|
| |
This is need to pass future checks of perf branch_weights metadata.
llvm-svn: 358384
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
constants only.
Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't
be degraded.
This change also improves the IRTranslator so that in most places, but not all,
it creates constants using the MIRBuilder directly instead of first creating a
new destination vreg and then creating a constant. By doing this, the
buildConstant() method can just return the vreg of an existing G_CONSTANT
instead of having to create a COPY from it.
I measured a 0.2% improvement in compile time and a 0.9% improvement in code
size at -O0 ARM64.
Compile time:
Program base cse diff
test-suite...ark/tramp3d-v4/tramp3d-v4.test 9.04 9.12 0.8%
test-suite...Mark/mafft/pairlocalalign.test 2.68 2.66 -0.7%
test-suite...-typeset/consumer-typeset.test 5.53 5.51 -0.4%
test-suite :: CTMark/lencod/lencod.test 5.30 5.28 -0.3%
test-suite :: CTMark/Bullet/bullet.test 25.82 25.76 -0.2%
test-suite...:: CTMark/ClamAV/clamscan.test 6.92 6.90 -0.2%
test-suite...TMark/7zip/7zip-benchmark.test 34.24 34.17 -0.2%
test-suite :: CTMark/SPASS/SPASS.test 6.25 6.24 -0.1%
test-suite...:: CTMark/sqlite3/sqlite3.test 1.66 1.66 -0.1%
test-suite :: CTMark/kimwitu++/kc.test 13.61 13.60 -0.0%
Geomean difference -0.2%
Code size:
Program base cse diff
test-suite...-typeset/consumer-typeset.test 1315632 1266480 -3.7%
test-suite...:: CTMark/ClamAV/clamscan.test 1313892 1297508 -1.2%
test-suite :: CTMark/lencod/lencod.test 1439504 1423112 -1.1%
test-suite...TMark/7zip/7zip-benchmark.test 2936980 2904172 -1.1%
test-suite :: CTMark/Bullet/bullet.test 3478276 3445460 -0.9%
test-suite...ark/tramp3d-v4/tramp3d-v4.test 8082868 8033492 -0.6%
test-suite :: CTMark/kimwitu++/kc.test 3870380 3853972 -0.4%
test-suite :: CTMark/SPASS/SPASS.test 1434904 1434896 -0.0%
test-suite...Mark/mafft/pairlocalalign.test 764528 764528 0.0%
test-suite...:: CTMark/sqlite3/sqlite3.test 782092 782092 0.0%
Geomean difference -0.9%
Differential Revision: https://reviews.llvm.org/D60580
llvm-svn: 358369
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fix a crash.
This enables the simple copy combine that already exists in the CombinerHelper.
However, it exposed a bug in the GISelChangeObserver where it wouldn't clear a
set of MIs to process, and so would end up causing a crash when deleted MIs were
being added to the combiner worklist again.
Differential Revision: https://reviews.llvm.org/D60579
llvm-svn: 358318
|
|
|
|
|
|
|
|
| |
This crash was introduced in r358032 as we try to construct an EVT from an MVT
in order to find the register type for the calling conv. Fall back instead of
trying to do this with an invalid MVT coming from i256.
llvm-svn: 358314
|
|
|
|
|
|
|
|
|
|
|
| |
undef mask element.
If a shufflevector's mask vector has an element with "undef" then the generic
instruction defining that element register is a G_IMPLICT_DEF instead of G_CONSTANT.
This fixes the selector to handle this case, and for now assumes that undef just means
zero. In future we'll optimize this case properly.
llvm-svn: 358312
|
|
|
|
|
|
|
| |
Some of these were legalizing into smaller vector types unnecessarily,
others were simply not supported yet.
llvm-svn: 358223
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vectors of pointers.
Loads and store of values with type like <2 x p0> currently don't get imported
because SelectionDAG has no knowledge of pointer types. To leverage the existing
support for vector load/stores, we can bitcast the value to have s64 element
types instead. We do this as a custom legalization.
This patch also adds support for general loads of <2 x s64>, and relaxes some
type conditions on selecting G_BITCAST.
Differential Revision: https://reviews.llvm.org/D60534
llvm-svn: 358221
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Add lowering pattern for llvm.aarch64.neon.vcvtfxs2fp.f16.i64
Reviewers: pbarrio, DavidSpickett, LukeGeeson
Reviewed By: LukeGeeson
Subscribers: javed.absar, kristof.beyls, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60259
llvm-svn: 358171
|
|
|
|
|
|
| |
The existing isel support already works for p0 once the legalizer accepts it.
llvm-svn: 358144
|
|
|
|
| |
llvm-svn: 358143
|
|
|
|
| |
llvm-svn: 358142
|
|
|
|
| |
llvm-svn: 358109
|
|
|
|
|
|
|
|
| |
This patch teach getTestBitOperand to look through ANY_EXTENDs when the extended bits aren't used. The test case changed here is based what D60358 did to test16 in tbz-tbnz.ll. So this patch will avoid that regression.
Differential Revision: https://reviews.llvm.org/D60482
llvm-svn: 358108
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: The fp16 scalar version of facge and facgt requires a custom patter matching, as the result type is not the same width of the operands.
Reviewers: olista01, javed.absar, pbarrio
Reviewed By: javed.absar
Subscribers: kristof.beyls, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60212
llvm-svn: 358083
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The selection for G_ICMP is unfortunately not currently importable from SDAG
due to the use of custom SDNodes. To support this, this selection method has an
opcode table which has been generated by a script, indexed by various
instruction properties. Ideally in future we will have a GISel native selection
patterns that we can write in tablegen to improve on this.
For selection of some types we also need support for G_ASHR and G_SHL which are
generated as a result of legalization. This patch also adds support for them,
generating the same code as SelectionDAG currently does.
Differential Revision: https://reviews.llvm.org/D60436
llvm-svn: 358035
|
|
|
|
|
|
|
|
| |
Selection support will be coming in a later patch.
Differential Revision: https://reviews.llvm.org/D60435
llvm-svn: 358034
|
|
|
|
|
|
|
|
| |
This is needed for some future support for vector ICMP.
Differential Revision: https://reviews.llvm.org/D60433
llvm-svn: 358033
|
|
|
|
|
|
|
|
|
|
|
| |
required to be passed as different register types. E.g. <2 x i16> may need to
be passed as a larger <2 x i32> type, so formal arg lowering needs to be able
truncate it back. Likewise, when dealing with returns of these types, they need
to be widened in the appropriate way back.
Differential Revision: https://reviews.llvm.org/D60425
llvm-svn: 358032
|
|
|
|
|
|
|
|
| |
tbnz when the shift has been zero extended from i32 to i64. NFC
This pattern showed up in D60358 and it was suggested I had a test and fix that separately.
llvm-svn: 358030
|
|
|
|
|
|
|
|
|
|
| |
Second half of PR40800, this patch adds DAG undef handling to fcmp instructions to match the behavior in llvm::ConstantFoldCompareInstruction, this permits constant folding of vector comparisons where some elements had been reduced to UNDEF (by SimplifyDemandedVectorElts etc.).
This involves a lot of tweaking to reduced tests as bugpoint loves to reduce fcmp arguments to undef........
Differential Revision: https://reviews.llvm.org/D60006
llvm-svn: 357765
|
|
|
|
|
|
|
|
|
|
| |
function calls without used results (PR41259)'
This revision causes tests to fail under ASAN. Since the cause of the failures
is not clear (could be ASAN, could be a Clang bug, could be a bug in this
revision), the safest course of action seems to be to revert while investigating.
llvm-svn: 357667
|
|
|
|
|
|
|
|
|
| |
Same as G_EXP. Add a test, and update legalizer-info-validation.mir and
f16-instructions.ll.
Differential Revision: https://reviews.llvm.org/D60165
llvm-svn: 357605
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are 3 changes to make this correspond to the same transform in instcombine:
1. Remove the legality check - we can't create anything less legal than we started with.
2. Ease the use restriction, so we only bail out if both operands have >1 use.
3. Ease the use restriction for binops with a repeated operand (eg, mul x, x).
As discussed in D60150, there's a scalarization opportunity that will be made
easier by allowing this transform more generally.
llvm-svn: 357580
|
|
|
|
|
|
|
|
| |
Also update arm64-irtranslator.ll.
Differential Revision: https://reviews.llvm.org/D60140
llvm-svn: 357538
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds partial instruction selection support for llvm.aarch64.stlxr. It also
factors out selection for G_INTRINSIC_W_SIDE_EFFECTS into its own function. The
new function removes the restriction that the intrinsic ID on the
G_INTRINSIC_W_SIDE_EFFECTS be on operand 0.
Also add a test, and add a GISel line to arm64-ldxr-stxr.ll.
Differential Revision: https://reviews.llvm.org/D60100
llvm-svn: 357518
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are various places in LLVM where the definition of StackID is not
properly honoured, for example in PEI where objects with a StackID > 0 are
allocated on the default stack (StackID0). This patch enforces that PEI
only considers allocating objects to StackID 0.
Reviewers: arsenm, thegameg, MatzeB
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D60062
llvm-svn: 357460
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
used results (PR41259)
The code was previously checking that candidates for sinking had exactly
one use or were a store instruction (which can't have uses). This meant
we could sink call instructions only if they had a use.
That limitation seemed a bit arbitrary, so this patch changes it to
"instruction has zero or one use" which seems more natural and removes
the need to special-case stores.
Differential revision: https://reviews.llvm.org/D59936
llvm-svn: 357452
|
|
|
|
|
|
|
|
|
| |
This improves selection for vector stores into v2s64s. Before we just
scalarized them, but we can just use a STRQui instead.
Differential Revision: https://reviews.llvm.org/D60083
llvm-svn: 357432
|
|
|
|
|
|
|
|
|
| |
This adds support for v2s32 vector inserts, and updates the selection +
regbankselect tests for G_INSERT_VECTOR_ELT.
Differential Revision: https://reviews.llvm.org/D59910
llvm-svn: 357318
|
|
|
|
|
|
| |
Prep work for PR40800 (Add UNDEF handling to SelectionDAG::FoldSetCC)
llvm-svn: 357286
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cluster
In the example below, we would previously emit two range checks, one for cases
1--3 and one for 4--6. This patch makes us exploit the fact that the
fall-through is unreachable and only one range check is necessary.
switch i32 %i, label %default [
i32 1, label %bb1
i32 2, label %bb1
i32 3, label %bb1
i32 4, label %bb2
i32 5, label %bb2
i32 6, label %bb2
]
default: unreachable
llvm-svn: 357252
|
|
|
|
|
|
|
| |
This patch appears to trigger very large compile time increases in
halide builds.
llvm-svn: 357116
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instructions.
The artifact combiners push instructions which have been marked for deletion
onto an list for the legalizer to deal with on return. However, for trunc(ext)
combines the combiner routine recursively calls itself. When it does this the
dead instructions list may not be empty, and the other combiners don't expect
to be dealing with essentially invalid MIR (multiple vreg defs etc).
This change fixes it by ensuring that the dead instructions are processed on
entry into tryCombineInstruction.
As a result, this fix exposed a few places in tests where G_TRUNC instructions
were not being deleted even though they were dead.
Differential Revision: https://reviews.llvm.org/D59892
llvm-svn: 357101
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lowering switches with unreachable default"
Original commit by Ayonam Ray.
This commit adds a regression test for the issue discovered in the
previous commit: that the range check for the jump table can only be
omitted if the fall-through destination of the jump table is
unreachable, which isn't necessarily true just because the default of
the switch is unreachable.
This addresses the missing optimization in PR41242.
> During the lowering of a switch that would result in the generation of a
> jump table, a range check is performed before indexing into the jump
> table, for the switch value being outside the jump table range and a
> conditional branch is inserted to jump to the default block. In case the
> default block is unreachable, this conditional jump can be omitted. This
> patch implements omitting this conditional branch for unreachable
> defaults.
>
> Differential Revision: https://reviews.llvm.org/D52002
> Reviewers: Hans Wennborg, Eli Freidman, Roman Lebedev
llvm-svn: 357067
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Various SelectionDAG non-combine operations (e.g. the getNode smart
constructor and legalization) may leave dangling nodes by applying
optimizations or not fully pruning unused result values. This can
result in nodes that are never added to the worklist and therefore can
not be pruned.
Add a node inserter as the current node deleter to make sure such
nodes have the chance of being pruned.
Many minor changes, mostly positive.
llvm-svn: 356996
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is generally more readable due to the way the assembler aliases
work.
(This causes a lot of test changes, but it's not really as scary as it
looks at first glance; it's just mechanically changing a bunch of checks
for orr to check for mov instead.)
Differential Revision: https://reviews.llvm.org/D59720
llvm-svn: 356954
|
|
|
|
|
|
|
|
|
|
| |
First half of PR40800, this patch adds DAG undef handling to icmp instructions to match the behaviour in llvm::ConstantFoldCompareInstruction and SimplifyICmpInst, this permits constant folding of vector comparisons where some elements had been reduced to UNDEF (by SimplifyDemandedVectorElts etc.).
This involved a lot of tweaking to reduced tests as bugpoint loves to reduce icmp arguments to undef........
Differential Revision: https://reviews.llvm.org/D59363
llvm-svn: 356938
|
|
|
|
|
|
| |
Add Exynos M5 support and test cases.
llvm-svn: 356793
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the result of discussions on the list about how to deal with intrinsics
which require codegen to disambiguate them via only the integer/fp overloads.
It causes problems for GlobalISel as some of that information is lost during
translation, while with other operations like IR instructions the information is
encoded into the instruction opcode.
This patch changes clang to emit the new faddp intrinsic if the vector operands
to the builtin have FP element types. LLVM IR AutoUpgrade has been taught to
upgrade existing calls to aarch64.neon.addp with fp vector arguments, and
we remove the workarounds introduced for GlobalISel in r355865.
This is a more permanent solution to PR40968.
Differential Revision: https://reviews.llvm.org/D59655
llvm-svn: 356722
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The AArch64 test was broken since the result register already had a
set register class, so this test was a no-op. The mapping verify call
would fail because the result size is not the same as the inputs like
in a copy or phi.
The AMDGPU testcases are half broken and introduce illegal VGPR->SGPR
copies which need much more work to handle correctly (same for phis),
but add them as a baseline.
llvm-svn: 356713
|
|
|
|
|
|
|
|
|
|
|
| |
Added subtarget features for AArch64 to use TPIDR_EL[1|2|3] as the TLS base
register, rather than the default TPIDR_EL0.
Patch by Philip Derrin!
Differential revision: https://reviews.llvm.org/D54685
llvm-svn: 356657
|
|
|
|
|
|
|
|
|
| |
This adds pattern matching for the insert+shufflevector sequence so we can
generate dup instructions instead of the current TBL sequence.
Differential Revision: https://reviews.llvm.org/D59558
llvm-svn: 356526
|
|
|
|
| |
llvm-svn: 356525
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cost 0 anyway for free regs
The 2nd loop calculates spill costs but reports free registers as cost
0 anyway, so there is little benefit from having a separate early
loop.
Surprisingly this is not NFC, as many register are marked regDisabled
so the first loop often picks up later registers unnecessarily instead
of the first one available in the allocation order...
Patch by Matthias Braun
llvm-svn: 356499
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SelectionDAGBuilder::visitSelect
These changes are related to PR37743 and include:
SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node.
Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner.
Add promoting the integer ABS node in the LegalizeIntegerType.
Expand-based legalization of integer result for the ABS nodes.
Expand-based legalization of ABS vector operations.
Add some integer abs testcases for different typesizes for Thumb arch
Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to:
tmp = (SRA, Hi, 31)
Lo = (UADDO tmp, Lo)
Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1))
Lo = (XOR tmp, Lo)
The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given a ABS node, detect the following pattern:
(ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))).
Change integer abs testcases for codegen with the ABS node support for AArch64.
Indicate that the ABS is legal for the i64 type when the NEON is supported.
Change the integer abs testcases to show changing of codegen.
Add combine and legalization of ABS nodes for Thumb arch.
Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition.
For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743
Patch by: @ikulagin (Ivan Kulagin)
Differential Revision: https://reviews.llvm.org/D49837
llvm-svn: 356468
|
|
|
|
|
|
|
|
|
|
|
| |
It uses the generic AArch64_IMM::expandMOVImm to get the correct
number of instruction used in immediate materialization.
Reviewers: efriedma
Differential Revision: https://reviews.llvm.org/D58461
llvm-svn: 356391
|