| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
llvm-svn: 358892
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
constants only.
Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't
be degraded.
This change also improves the IRTranslator so that in most places, but not all,
it creates constants using the MIRBuilder directly instead of first creating a
new destination vreg and then creating a constant. By doing this, the
buildConstant() method can just return the vreg of an existing G_CONSTANT
instead of having to create a COPY from it.
I measured a 0.2% improvement in compile time and a 0.9% improvement in code
size at -O0 ARM64.
Compile time:
Program base cse diff
test-suite...ark/tramp3d-v4/tramp3d-v4.test 9.04 9.12 0.8%
test-suite...Mark/mafft/pairlocalalign.test 2.68 2.66 -0.7%
test-suite...-typeset/consumer-typeset.test 5.53 5.51 -0.4%
test-suite :: CTMark/lencod/lencod.test 5.30 5.28 -0.3%
test-suite :: CTMark/Bullet/bullet.test 25.82 25.76 -0.2%
test-suite...:: CTMark/ClamAV/clamscan.test 6.92 6.90 -0.2%
test-suite...TMark/7zip/7zip-benchmark.test 34.24 34.17 -0.2%
test-suite :: CTMark/SPASS/SPASS.test 6.25 6.24 -0.1%
test-suite...:: CTMark/sqlite3/sqlite3.test 1.66 1.66 -0.1%
test-suite :: CTMark/kimwitu++/kc.test 13.61 13.60 -0.0%
Geomean difference -0.2%
Code size:
Program base cse diff
test-suite...-typeset/consumer-typeset.test 1315632 1266480 -3.7%
test-suite...:: CTMark/ClamAV/clamscan.test 1313892 1297508 -1.2%
test-suite :: CTMark/lencod/lencod.test 1439504 1423112 -1.1%
test-suite...TMark/7zip/7zip-benchmark.test 2936980 2904172 -1.1%
test-suite :: CTMark/Bullet/bullet.test 3478276 3445460 -0.9%
test-suite...ark/tramp3d-v4/tramp3d-v4.test 8082868 8033492 -0.6%
test-suite :: CTMark/kimwitu++/kc.test 3870380 3853972 -0.4%
test-suite :: CTMark/SPASS/SPASS.test 1434904 1434896 -0.0%
test-suite...Mark/mafft/pairlocalalign.test 764528 764528 0.0%
test-suite...:: CTMark/sqlite3/sqlite3.test 782092 782092 0.0%
Geomean difference -0.9%
Differential Revision: https://reviews.llvm.org/D60580
llvm-svn: 358369
|
|
|
|
| |
llvm-svn: 358109
|
|
|
|
| |
llvm-svn: 358105
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, jvesely, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, volkan, llvm-commits
Differential Revision: https://reviews.llvm.org/D57166
llvm-svn: 357964
|
|
|
|
| |
llvm-svn: 357762
|
|
|
|
|
|
|
|
| |
The register index can only really be an SGPR. Lie that a VGPR index
is legal, and then rewrite the instruction in a waterfall loop to
handle the index.
llvm-svn: 357235
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instructions.
The artifact combiners push instructions which have been marked for deletion
onto an list for the legalizer to deal with on return. However, for trunc(ext)
combines the combiner routine recursively calls itself. When it does this the
dead instructions list may not be empty, and the other combiners don't expect
to be dealing with essentially invalid MIR (multiple vreg defs etc).
This change fixes it by ensuring that the dead instructions are processed on
entry into tryCombineInstruction.
As a result, this fix exposed a few places in tests where G_TRUNC instructions
were not being deleted even though they were dead.
Differential Revision: https://reviews.llvm.org/D59892
llvm-svn: 357101
|
|
|
|
|
|
|
|
|
|
|
|
| |
The AMDGPU implementation of getReservedRegs depends on
MachineFunctionInfo fields that are parsed from the YAML section. This
was reserving the wrong register since it was setting the reserved
regs before parsing the correct one.
Some tests were relying on the default reserved set for the assumed
default calling convention.
llvm-svn: 357083
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The AArch64 test was broken since the result register already had a
set register class, so this test was a no-op. The mapping verify call
would fail because the result size is not the same as the inputs like
in a copy or phi.
The AMDGPU testcases are half broken and introduce illegal VGPR->SGPR
copies which need much more work to handle correctly (same for phis),
but add them as a baseline.
llvm-svn: 356713
|
|
|
|
|
|
|
|
|
| |
The constantness shouldn't change the register bank choice. We also
don't need to restrict this to only indexing VGPRs, since it's
possible to index SGPRs (but SelectionDAG made using this
difficult). Allow directly indexing SGPRs when appropriate.
llvm-svn: 356611
|
|
|
|
|
|
|
|
|
|
|
| |
This is consistent with what SelectionDAG does and is much easier to
work with than the extract sequence with an artificial wide register.
For the AMDGPU control flow intrinsics, this was producing an s128 for
the i64, i1 tuple return. Any legalization that should apply to a real
s128 value would badly obscure the direct values that need to be seen.
llvm-svn: 356147
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add support for immediate operand in S_ENDPGM
Change-Id: I0c56a076a10980f719fb2a8f16407e9c301013f6
Reviewers: alexshap
Subscribers: qcolombet, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, eraman, arphaman, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59213
llvm-svn: 355902
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Narrow Scalar G_MUL for MIPS32.
Revisit NarrowScalar implementation in LegalizerHelper.
Introduce new helper function multiplyRegisters.
It performs generic multiplication of values held in multiple registers.
Generated instructions use only types NarrowTy and i1.
Destination can be same or two times size of the source.
Differential Revision: https://reviews.llvm.org/D58824
llvm-svn: 355814
|
|
|
|
|
|
|
|
|
|
|
|
| |
Re-commit r344310.
Reviewers: arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D53116
llvm-svn: 355159
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D49714
llvm-svn: 355156
|
|
|
|
|
|
|
| |
Add baseline for future fixes. These mostly show how this is broken
and producing illegal situations.
llvm-svn: 355057
|
|
|
|
| |
llvm-svn: 355048
|
|
|
|
| |
llvm-svn: 355047
|
|
|
|
|
|
|
|
| |
Try to use concat_vectors. Also remove unnecessary assert on
pointers. Fixes asserting for <4 x s16> operations and 64-bit pointers
for AMDGPU.
llvm-svn: 354828
|
|
|
|
| |
llvm-svn: 354825
|
|
|
|
| |
llvm-svn: 354818
|
|
|
|
| |
llvm-svn: 354592
|
|
|
|
| |
llvm-svn: 354587
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: volkan, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D52922
llvm-svn: 354516
|
|
|
|
|
|
| |
Also complete the set of related operations.
llvm-svn: 354480
|
|
|
|
| |
llvm-svn: 354477
|
|
|
|
| |
llvm-svn: 354354
|
|
|
|
| |
llvm-svn: 354348
|
|
|
|
| |
llvm-svn: 354345
|
|
|
|
| |
llvm-svn: 354293
|
|
|
|
|
|
|
|
|
|
| |
This is basically a pointer typed add, so shouldn't be any different.
This was assuming everything was an SGPR, which is not true.
Also cleanup legality for GEP. I don't seem to be seeing the problem
the hack marking s64 as a legal pointer type the comment mentions.
llvm-svn: 354067
|
|
|
|
| |
llvm-svn: 354065
|
|
|
|
| |
llvm-svn: 353848
|
|
|
|
|
|
| |
We could deal with it, but there's no real point.
llvm-svn: 353845
|
|
|
|
| |
llvm-svn: 353759
|
|
|
|
| |
llvm-svn: 353754
|
|
|
|
| |
llvm-svn: 353719
|
|
|
|
| |
llvm-svn: 353559
|
|
|
|
|
|
|
|
| |
clampScalar doesn't do anything for non-power-of-2 in range.
There should probably be a combination rule to reduce the number
of matching rules.
llvm-svn: 353526
|
|
|
|
| |
llvm-svn: 353522
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make behavior of G_LOAD in widenScalar same as for G_ZEXTLOAD and
G_SEXTLOAD. That is perform widenScalarDst to size given by the target
and avoid additional checks in common code. Targets can reorder or add
additional rules in LegalizeRuleSet for the opcode to achieve desired
behavior.
Select extending load that does not have specified type of extension
into zero extending load.
Select truncating store that stores number of bytes indicated by size
in MachineMemoperand.
Differential Revision: https://reviews.llvm.org/D57454
llvm-svn: 353520
|
|
|
|
| |
llvm-svn: 353516
|
|
|
|
|
|
|
| |
Use a placeholder constant for now on targets
that need the load from the queue ptr.
llvm-svn: 353497
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is pretty much directly ported from SelectionDAG. Doesn't include
the shift by non-constant but known bits version, since there isn't a
globalisel version of computeKnownBits yet.
This shows a disadvantage of targets not specifically which type
should be used for the shift amount. If type 0 is legalized before
type 1, the operations on the shift amount type use the wider type
(which are also less likely to legalize). This can be avoided by
targets specifying legalization actions on type 1 earlier than for
type 0.
llvm-svn: 353455
|
|
|
|
| |
llvm-svn: 353452
|
|
|
|
|
|
|
| |
Since G_CONSTANT is illegal for vectors, this needs to check
what buildConstant will produce for a splat vector.
llvm-svn: 353449
|
|
|
|
| |
llvm-svn: 353443
|
|
|
|
| |
llvm-svn: 353438
|
|
|
|
| |
llvm-svn: 353436
|