| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
llvm-svn: 358564
|
|
|
|
|
|
|
|
| |
Since non-pow-2 types are going to get split up into multiple loads anyway,
don't do the [SZ]EXTLOAD combine for those and save us trouble later in
legalization.
llvm-svn: 358458
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
constants only.
Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't
be degraded.
This change also improves the IRTranslator so that in most places, but not all,
it creates constants using the MIRBuilder directly instead of first creating a
new destination vreg and then creating a constant. By doing this, the
buildConstant() method can just return the vreg of an existing G_CONSTANT
instead of having to create a COPY from it.
I measured a 0.2% improvement in compile time and a 0.9% improvement in code
size at -O0 ARM64.
Compile time:
Program base cse diff
test-suite...ark/tramp3d-v4/tramp3d-v4.test 9.04 9.12 0.8%
test-suite...Mark/mafft/pairlocalalign.test 2.68 2.66 -0.7%
test-suite...-typeset/consumer-typeset.test 5.53 5.51 -0.4%
test-suite :: CTMark/lencod/lencod.test 5.30 5.28 -0.3%
test-suite :: CTMark/Bullet/bullet.test 25.82 25.76 -0.2%
test-suite...:: CTMark/ClamAV/clamscan.test 6.92 6.90 -0.2%
test-suite...TMark/7zip/7zip-benchmark.test 34.24 34.17 -0.2%
test-suite :: CTMark/SPASS/SPASS.test 6.25 6.24 -0.1%
test-suite...:: CTMark/sqlite3/sqlite3.test 1.66 1.66 -0.1%
test-suite :: CTMark/kimwitu++/kc.test 13.61 13.60 -0.0%
Geomean difference -0.2%
Code size:
Program base cse diff
test-suite...-typeset/consumer-typeset.test 1315632 1266480 -3.7%
test-suite...:: CTMark/ClamAV/clamscan.test 1313892 1297508 -1.2%
test-suite :: CTMark/lencod/lencod.test 1439504 1423112 -1.1%
test-suite...TMark/7zip/7zip-benchmark.test 2936980 2904172 -1.1%
test-suite :: CTMark/Bullet/bullet.test 3478276 3445460 -0.9%
test-suite...ark/tramp3d-v4/tramp3d-v4.test 8082868 8033492 -0.6%
test-suite :: CTMark/kimwitu++/kc.test 3870380 3853972 -0.4%
test-suite :: CTMark/SPASS/SPASS.test 1434904 1434896 -0.0%
test-suite...Mark/mafft/pairlocalalign.test 764528 764528 0.0%
test-suite...:: CTMark/sqlite3/sqlite3.test 782092 782092 0.0%
Geomean difference -0.9%
Differential Revision: https://reviews.llvm.org/D60580
llvm-svn: 358369
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
their own CSE configs.
Because CodeGen can't depend on GlobalISel, we need a way to encapsulate the CSE
configs that can be passed between TargetPassConfig and the targets' custom
pass configs. This CSEConfigBase allows targets to create custom CSE configs
which is then used by the GISel passes for the CSEMIRBuilder.
This support will be used in a follow up commit to allow constant-only CSE for
-O0 compiles in D60580.
llvm-svn: 358368
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fix a crash.
This enables the simple copy combine that already exists in the CombinerHelper.
However, it exposed a bug in the GISelChangeObserver where it wouldn't clear a
set of MIs to process, and so would end up causing a crash when deleted MIs were
being added to the combiner worklist again.
Differential Revision: https://reviews.llvm.org/D60579
llvm-svn: 358318
|
|
|
|
|
|
|
|
| |
This crash was introduced in r358032 as we try to construct an EVT from an MVT
in order to find the register type for the calling conv. Fall back instead of
trying to do this with an invalid MVT coming from i256.
llvm-svn: 358314
|
|
|
|
| |
llvm-svn: 358277
|
|
|
|
|
|
| |
This reapplies rL358161. That commit inadvertently reverted an exegesis file to an old version.
llvm-svn: 358246
|
|
|
|
|
|
|
|
|
| |
This reverts commit rL358161.
This patch have broken the test:
llvm/test/tools/llvm-exegesis/X86/uops-CMOV16rm-noreg.s
llvm-svn: 358199
|
|
|
|
| |
llvm-svn: 358161
|
|
|
|
| |
llvm-svn: 358142
|
|
|
|
|
|
|
|
|
| |
Call lowering should use this directly instead of going through the
EVT version, but more work is needed to deal with this (mostly the
passing of the IR type pointer instead of the relevant properties in
ArgInfo).
llvm-svn: 358111
|
|
|
|
|
|
|
| |
Unlike the call handling, this wasn't checking for void results and
creating a register with the invalid LLT
llvm-svn: 358110
|
|
|
|
| |
llvm-svn: 358109
|
|
|
|
| |
llvm-svn: 358105
|
|
|
|
|
|
|
|
|
|
|
| |
required to be passed as different register types. E.g. <2 x i16> may need to
be passed as a larger <2 x i32> type, so formal arg lowering needs to be able
truncate it back. Likewise, when dealing with returns of these types, they need
to be widened in the appropriate way back.
Differential Revision: https://reviews.llvm.org/D60425
llvm-svn: 358032
|
|
|
|
|
|
|
| |
It's annoying to have to create an array of the result type,
particularly when you don't care about the size of the value.
llvm-svn: 357763
|
|
|
|
|
|
|
|
| |
Rename the functions that query the optimization kind attributes.
Differential revision: https://reviews.llvm.org/D60287
llvm-svn: 357731
|
|
|
|
|
|
|
|
|
| |
Create method `optForNone()` testing for the function level equivalent of
`-O0` and refactor appropriately.
Differential revision: https://reviews.llvm.org/D59852
llvm-svn: 357638
|
|
|
|
|
|
|
|
|
| |
Same as G_EXP. Add a test, and update legalizer-info-validation.mir and
f16-instructions.ll.
Differential Revision: https://reviews.llvm.org/D60165
llvm-svn: 357605
|
|
|
|
|
|
|
|
| |
Also update arm64-irtranslator.ll.
Differential Revision: https://reviews.llvm.org/D60140
llvm-svn: 357538
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instructions.
The artifact combiners push instructions which have been marked for deletion
onto an list for the legalizer to deal with on return. However, for trunc(ext)
combines the combiner routine recursively calls itself. When it does this the
dead instructions list may not be empty, and the other combiners don't expect
to be dealing with essentially invalid MIR (multiple vreg defs etc).
This change fixes it by ensuring that the dead instructions are processed on
entry into tryCombineInstruction.
As a result, this fix exposed a few places in tests where G_TRUNC instructions
were not being deleted even though they were dead.
Differential Revision: https://reviews.llvm.org/D59892
llvm-svn: 357101
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The AArch64 test was broken since the result register already had a
set register class, so this test was a no-op. The mapping verify call
would fail because the result size is not the same as the inputs like
in a copy or phi.
The AMDGPU testcases are half broken and introduce illegal VGPR->SGPR
copies which need much more work to handle correctly (same for phis),
but add them as a baseline.
llvm-svn: 356713
|
|
|
|
|
|
| |
Forgot to add a change to relax some asserts in r356396.
llvm-svn: 356411
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After review comments, it was preferred to not teach MachineIRBuilder about
non-generic instructions beyond using buildInstr().
For AArch64 I've changed the buildCopy() calls to buildInstr() + a
separate addReg() call.
This also relaxes the MachineIRBuilder's COPY checking more because it may
not always have a SrcOp given to it.
llvm-svn: 356396
|
|
|
|
| |
llvm-svn: 356309
|
|
|
|
|
|
|
|
|
|
|
|
| |
This relaxes some asserts about sizes, and adds an optional subreg parameter
to buildCopy().
Also update AArch64 instruction selector to use this in places where we
previously used MachineInstrBuilder manually.
Differential Revision: https://reviews.llvm.org/D59434
llvm-svn: 356304
|
|
|
|
|
|
|
|
|
|
|
| |
This is consistent with what SelectionDAG does and is much easier to
work with than the extract sequence with an artificial wide register.
For the AMDGPU control flow intrinsics, this was producing an s128 for
the i64, i1 tuple return. Any legalization that should apply to a real
s128 value would badly obscure the direct values that need to be seen.
llvm-svn: 356147
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
getConstantVRegVal used to only look for G_CONSTANT when looking at
unboxing the value of a vreg. However, constants are sometimes not
directly used and are hidden behind trunc, s|zext or copy chain of
computation.
In particular this may be introduced by the legalization process that
doesn't want to simplify these patterns because it can lead to infine
loop when legalizing a constant.
To circumvent that problem, add a new variant of getConstantVRegVal,
named getConstantVRegValWithLookThrough, that allow to look through
extensions.
Differential Revision: https://reviews.llvm.org/D59227
llvm-svn: 356116
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Overloaded intrinsics aren't necessarily safe for instruction selection. One
such intrinsic is aarch64.neon.addp.*.
This is a temporary workaround to ensure that we always fall back on that
intrinsic. Eventually this will be replaced with a proper solution.
https://bugs.llvm.org/show_bug.cgi?id=40968
Differential Revision: https://reviews.llvm.org/D59062
llvm-svn: 355865
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The control flow here cannot ever use the uninitialized value, but it's
too hard for the compiler to figure that out. Clang warns:
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2600:28: error: variable 'CarrySum' is used uninitialized whenever 'for' loop exits because its condition is false [-Werror,-Wsometimes-uninitialized]
for (unsigned i = 2; i < Factors.size(); ++i)
^~~~~~~~~~~~~~~~~~
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2604:26: note: uninitialized use occurs here
CarrySumPrevDstIdx = CarrySum;
^~~~~~~~
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2600:28: note: remove the condition if it is always true
for (unsigned i = 2; i < Factors.size(); ++i)
^~~~~~~~~~~~~~~~~~
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2583:22: note: initialize the variable 'CarrySum' to silence this warning
unsigned CarrySum;
^
= 0
llvm-svn: 355818
|
|
|
|
|
|
|
|
|
|
| |
NarrowScalar G_UMULH in LegalizerHelper
using multiplyRegisters helper function.
NarrowScalar G_UMULH for MIPS32.
Differential Revision: https://reviews.llvm.org/D58825
llvm-svn: 355815
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Narrow Scalar G_MUL for MIPS32.
Revisit NarrowScalar implementation in LegalizerHelper.
Introduce new helper function multiplyRegisters.
It performs generic multiplication of values held in multiple registers.
Generated instructions use only types NarrowTy and i1.
Destination can be same or two times size of the source.
Differential Revision: https://reviews.llvm.org/D58824
llvm-svn: 355814
|
|
|
|
| |
llvm-svn: 355048
|
|
|
|
| |
llvm-svn: 355047
|
|
|
|
|
|
|
|
|
| |
Lower G_UADDO.
Legalize G_UADDO for MIPS32
Differential Revision: https://reviews.llvm.org/D58671
llvm-svn: 354900
|
|
|
|
|
|
|
|
| |
Try to use concat_vectors. Also remove unnecessary assert on
pointers. Fixes asserting for <4 x s16> operations and 64-bit pointers
for AMDGPU.
llvm-svn: 354828
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For AMDGPU, if an operand requires an SGPR but is only available as a
VGPR, a loop needs to be introduced to execute the instruction with
each unique combination of values across all lanes. The rest of the
instructions in the block will be moved to a new block following the
loop. Check if the next instruction's parent changed, and update the
iterators and insertion block if this happened.
Tests will be included in a future patch.
llvm-svn: 354591
|
|
|
|
|
|
| |
Also complete the set of related operations.
llvm-svn: 354480
|
|
|
|
| |
llvm-svn: 354477
|
|
|
|
| |
llvm-svn: 354354
|
|
|
|
| |
llvm-svn: 354348
|
|
|
|
| |
llvm-svn: 354345
|
|
|
|
|
|
|
|
|
|
|
|
| |
Legalize/select llvm.ctlz.*
Add select-ctlz to show that we actually select them. Update arm64-clrsb.ll and
arm64-vclz.ll to show that we perform valid transformations in optimized builds,
and document where GISel can improve.
Differential Revision: https://reviews.llvm.org/D58155
llvm-svn: 354299
|
|
|
|
| |
llvm-svn: 354293
|
|
|
|
| |
llvm-svn: 354292
|
|
|
|
|
|
|
| |
Fixes cases with odd vectors that break into multiple requested size
pieces.
llvm-svn: 354280
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
https://reviews.llvm.org/D58073
Speed up insertion during the initial populating phase into the
GISelWorkList by deferring repeatedly resizing the DenseMap.
This results in ~10% improvement in the combiner passes, and
~3% speedup in the Legalizer.
reviewed by: aemerson.
llvm-svn: 354093
|
|
|
|
|
|
|
| |
This allows targets to specify the minimum alignment required for the
load/store.
llvm-svn: 354071
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Select G_BR and G_BRCOND for MIPS32.
Unconditional branch G_BR does not have register operand,
for that reason we only add tests.
Since conditional branch G_BRCOND compares register to zero on MIPS32,
explicit extension must be performed on i1 condition in order to set
high bits to appropriate value.
Differential Revision: https://reviews.llvm.org/D58182
llvm-svn: 354022
|