| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Also requires making G_IMPLICIT_DEF of v2s32 legal.
Differential Revision: https://reviews.llvm.org/D72422
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
uadd.with.overflow.
Summary: AArch64 doesn't support uadd.with.overflow.i16 natively. This change adds a legalization rule to convert the 32bit add result to 16bit. This should fix PR43981.
Reviewers: arsenm, qcolombet, paquette, aemerson
Reviewed By: paquette
Subscribers: wdng, rovka, kristof.beyls, hiraditya, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71587
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
G_GEP is rather poorly named. It's a simple pointer+scalar addition and
doesn't support any of the complexities of getelementptr. I therefore
propose that we rename it. There's a G_PTR_MASK so let's follow that
convention and go with G_PTR_ADD
Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm
Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69734
|
|
|
|
|
|
| |
s16.
llvm-svn: 372812
|
|
|
|
| |
llvm-svn: 372465
|
|
|
|
|
|
|
|
|
| |
Now that we have the infrastructure to support s128 types as parameters
we can expand these to libcalls.
Differential Revision: https://reviews.llvm.org/D66185
llvm-svn: 370823
|
|
|
|
|
|
|
|
|
|
|
| |
This change moves the actual stack pointer manipulation into the legalizer,
available to targets via lower(). The codegen is slightly different because
we're using explicit masks instead of G_PTRMASK, and using G_SUB rather than
adding a negative amount via G_GEP.
Differential Revision: https://reviews.llvm.org/D66678
llvm-svn: 370104
|
|
|
|
|
|
|
|
| |
We do this by merging the source with the high bits set to 0.
Differential Revision: https://reviews.llvm.org/D66181
llvm-svn: 369480
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).
Manual fixups in:
AArch64InstrInfo.cpp - genFusedMultiply() now takes a Register* instead of unsigned*
AArch64LoadStoreOptimizer.cpp - Ternary operator was ambiguous between Register/MCRegister. Settled on Register
Depends on D65919
Reviewers: aemerson
Subscribers: jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision for full review was: https://reviews.llvm.org/D65962
llvm-svn: 368628
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Targets often have instructions that can sign-extend certain cases faster
than the equivalent shift-left/arithmetic-shift-right. Such cases can be
identified by matching a shift-left/shift-right pair but there are some
issues with this in the context of combines. For example, suppose you can
sign-extend 8-bit up to 32-bit with a target extend instruction.
%1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity)
%2:_(s32) = G_ASHR %1:_(s32), i32 24
%3:_(s32) = G_ASHR %2:_(s32), i32 1
would reasonably combine to:
%1:_(s32) = G_SHL %0:_(s32), i32 24
%2:_(s32) = G_ASHR %1:_(s32), i32 25
which no longer matches the special case. If your shifts and extend are
equal cost, this would break even as a pair of shifts but if your shift is
more expensive than the extend then it's cheaper as:
%2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8
%3:_(s32) = G_ASHR %2:_(s32), i32 1
It's possible to match the shift-pair in ISel and emit an extend and ashr.
However, this is far from the only way to break this shift pair and make
it hard to match the extends. Another example is that with the right
known-zeros, this:
%1:_(s32) = G_SHL %0:_(s32), i32 24
%2:_(s32) = G_ASHR %1:_(s32), i32 24
%3:_(s32) = G_MUL %2:_(s32), i32 2
can become:
%1:_(s32) = G_SHL %0:_(s32), i32 24
%2:_(s32) = G_ASHR %1:_(s32), i32 23
All upstream targets have been configured to lower it to the current
G_SHL,G_ASHR pair but will likely want to make it legal in some cases to
handle their faster cases.
To follow-up: Provide a way to legalize based on the constant. At the
moment, I'm thinking that the best way to achieve this is to provide the
MI in LegalityQuery but that opens the door to breaking core principles
of the legalizer (legality is not context sensitive). That said, it's
worth noting that looking at other instructions and acting on that
information doesn't violate this principle in itself. It's only a
violation if, at the end of legalization, a pass that checks legality
without being able to see the context would say an instruction might not be
legal. That's a fairly subtle distinction so to give a concrete example,
saying %2 in:
%1 = G_CONSTANT 16
%2 = G_SEXT_INREG %0, %1
is legal is in violation of that principle if the legality of %2 depends
on %1 being constant and/or being 16. However, legalizing to either:
%2 = G_SEXT_INREG %0, 16
or:
%1 = G_CONSTANT 16
%2:_(s32) = G_SHL %0, %1
%3:_(s32) = G_ASHR %2, %1
depending on whether %1 is constant and 16 does not violate that principle
since both outputs are genuinely legal.
Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm
Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61289
llvm-svn: 368487
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
operand (MO.isImm() == true)
Summary:
This is tested by D61289 but has been pulled into a separate patch at
a reviewers request.
Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm, rovka
Reviewed By: arsenm
Subscribers: javed.absar, hiraditya, wdng, kristof.beyls, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61321
llvm-svn: 368063
|
|
|
|
|
|
|
|
|
|
| |
stores""
This is an old commit that exposed a bug in the GISel importer, which caused
non-truncating stores to be selected for truncating store patterns. Now that's
been fixed in r367737 this can go back in.
llvm-svn: 367739
|
|
|
|
|
|
|
|
| |
We need this to narrow a sext to s128.
Differential Revision: https://reviews.llvm.org/D65357
llvm-svn: 367164
|
|
|
|
|
|
| |
Changes the order of legalization of G_ICMP suggested by Petar in D65079.
llvm-svn: 367060
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
or fp.
Throughout the legalizerinfo we currently make the assumption that the target
has neon and FP target features available. Fixing it will require a refactor of
the whole thing, so until then make sure we fall back.
Works around PR42734
Differential Revision: https://reviews.llvm.org/D65244
llvm-svn: 366957
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to be able to load and store s128 for memcpy inlining, where we want to
generate Q register mem ops. Making these legal also requires that we add some
support in other instructions. Regbankselect should also know about these since
they have no GPR register class that can hold them, so need special handling to
live on the FPR bank.
Differential Revision: https://reviews.llvm.org/D65166
llvm-svn: 366857
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
and legalize later.
I plan on adding memcpy optimizations in the GlobalISel pipeline, but we can't
do that unless we delay lowering to actual function calls. This patch changes
the translator to generate G_INTRINSIC_W_SIDE_EFFECTS for these functions, and
then have each target specify that using the new custom legalizer for intrinsics
hook that they want it expanded it a libcall.
Differential Revision: https://reviews.llvm.org/D64895
llvm-svn: 366516
|
|
|
|
| |
llvm-svn: 365343
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
immediate forms.
There are two main issues preventing us from generating immediate form shifts:
1) We have partial SelectionDAG imported support for G_ASHR and G_LSHR shift
immediate forms, but they currently don't work because the amount type is
expected to be an s64 constant, but we only legalize them to have homogenous
types.
To deal with this, first we introduce a custom legalizer to *only* custom legalize
s32 shifts which have a constant operand into a s64.
There is also an additional artifact combiner to fold zexts(g_constant) to a
larger G_CONSTANT if it's legal, a counterpart to the anyext version committed
in an earlier patch.
2) For G_SHL the importer can't cope with the pattern. For this I introduced an
early selection phase in the arm64 selector to select these forms manually
before the tablegen selector pessimizes it to a register-register variant.
Differential Revision: https://reviews.llvm.org/D63910
llvm-svn: 364994
|
|
|
|
|
|
|
|
|
|
| |
and G_BRJT ops.
With this we can now fully code generate jump tables, which is important for code size.
Differential Revision: https://reviews.llvm.org/D63223
llvm-svn: 364086
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We sometimes get poor code size because constants of types < 32b are legalized
as 32 bit G_CONSTANTs with a truncate to fit. This works but means that the
localizer can no longer sink them (although it's possible to extend it to do so).
On AArch64 however s8 and s16 constants can be selected in the same way as s32
constants, with a mov pseudo into a W register. If we make s8 and s16 constants
legal then we can avoid unnecessary truncates, they can be CSE'd, and the
localizer can sink them as normal.
There is a caveat: if the user of a smaller constant has to widen the sources,
we end up with an anyext of the smaller typed G_CONSTANT. This can cause
regressions because of the additional extend and missed pattern matching. To
remedy this, there's a new artifact combiner to generate the wider G_CONSTANT
if it's legal for the target.
Differential Revision: https://reviews.llvm.org/D63587
llvm-svn: 364075
|
|
|
|
|
|
|
|
| |
Although we had the support in the prelegalizer combiner to generate the
G_SEXTLOAD or G_ZEXTLOAD ops, the legalizer definitions for arm64 had them as
lowering back to separate ops.
llvm-svn: 362553
|
|
|
|
|
|
|
|
|
| |
There are instructions for these, so mark them as legal. Select the correct
instruction in AArch64InstructionSelector.cpp.
Update select-bswap.mir and arm64-rev.ll to reflect the changes.
llvm-svn: 359331
|
|
|
|
|
|
|
|
| |
This case was missing before, so we couldn't legalize it.
Add it to AArch64LegalizerInfo.cpp and update select-extract-vector-elt.mir.
llvm-svn: 359231
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a legalization rule for G_ZEXT, G_ANYEXT, and G_SEXT which allows
extends whenever the types will fit in registers (or the source is an s1).
Update tests. Add GISel checks throughout all of arm64-vabs.ll,
where we now select a good portion of the code. Add GISel checks to
arm64-subvector-extend.ll, which has a good number of vector extends in it.
Differential Revision: https://reviews.llvm.org/D60889
llvm-svn: 359222
|
|
|
|
|
|
|
|
|
| |
Add legalizer support for G_FNEARBYINT. It's the same as G_FCEIL etc.
Since the importer allows us to automatically select this after legalization,
also add tests for selection etc. Also update arm64-vfloatintrinsics.ll.
llvm-svn: 359204
|
|
|
|
|
|
|
| |
Add it to the same rule as G_FCEIL etc. Add a legalizer test, and add a missing
switch case to AArch64LegalizerInfo.cpp.
llvm-svn: 359033
|
|
|
|
|
|
|
|
|
| |
Same patch as G_FCEIL etc.
Add the missing switch case in widenScalar, add G_INTRINSIC_TRUNC to the correct
rule in AArch64LegalizerInfo.cpp, and add a test.
llvm-svn: 359021
|
|
|
|
|
|
|
|
|
| |
Same as G_FCEIL, G_FABS, etc. Just move it into that rule.
Add a legalizer test for G_FMA, which we didn't have before and update
arm64-vfloatintrinsics.ll.
llvm-svn: 359015
|
|
|
|
|
|
|
| |
The last attempt fixed gcc and consumer-typeset, but Obsequi seems to fail with
a different issue.
llvm-svn: 358829
|
|
|
|
|
|
|
|
|
| |
and stores""
We were shifting the wrong component of a split load when trying to combine them
back into a single value.
llvm-svn: 358800
|
|
|
|
|
|
|
|
|
|
| |
Exactly the same as G_FCEIL, G_FABS, etc.
Add tests for the fp16/nofp16 behaviour, update arm64-vfloatintrinsics, etc.
Differential Revision: https://reviews.llvm.org/D60895
llvm-svn: 358799
|
|
|
|
|
|
| |
This introduces some runtime failures which I'll need to investigate further.
llvm-svn: 358771
|
|
|
|
|
|
|
|
|
|
| |
This instruction is legalized in the same way as G_FSIN, G_FCOS, G_FLOG10, etc.
Update legalize-pow.mir and arm64-vfloatintrinsics.ll to reflect the change.
Differential Revision: https://reviews.llvm.org/D60218
llvm-svn: 358764
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds legalization for G_SEXT, G_ZEXT, and G_ANYEXT for v8s8s.
We were falling back on G_ZEXT in arm64-vabs.ll before, preventing us from
selecting the @llvm.aarch64.neon.sabd.v8i8 intrinsic.
This adds legalizer support for those 3, which gives us selection via the
importer. Update the relevant tests (legalize-ext.mir, select-int-ext.mir) and
add a GISel line to arm64-vabs.ll.
Differential Revision: https://reviews.llvm.org/D60881
llvm-svn: 358715
|
|
|
|
|
|
|
|
| |
Add legalizer support for loads of v8s8 and update legalize-load-store.mir.
Differential Revision: https://reviews.llvm.org/D60877
llvm-svn: 358714
|
|
|
|
|
|
|
|
|
|
| |
Legalize things like i24 load/store by splitting them into smaller power of 2 operations.
This matches how SelectionDAG handles these operations.
Differential Revision: https://reviews.llvm.org/D59971
llvm-svn: 358613
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
constants only.
Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't
be degraded.
This change also improves the IRTranslator so that in most places, but not all,
it creates constants using the MIRBuilder directly instead of first creating a
new destination vreg and then creating a constant. By doing this, the
buildConstant() method can just return the vreg of an existing G_CONSTANT
instead of having to create a COPY from it.
I measured a 0.2% improvement in compile time and a 0.9% improvement in code
size at -O0 ARM64.
Compile time:
Program base cse diff
test-suite...ark/tramp3d-v4/tramp3d-v4.test 9.04 9.12 0.8%
test-suite...Mark/mafft/pairlocalalign.test 2.68 2.66 -0.7%
test-suite...-typeset/consumer-typeset.test 5.53 5.51 -0.4%
test-suite :: CTMark/lencod/lencod.test 5.30 5.28 -0.3%
test-suite :: CTMark/Bullet/bullet.test 25.82 25.76 -0.2%
test-suite...:: CTMark/ClamAV/clamscan.test 6.92 6.90 -0.2%
test-suite...TMark/7zip/7zip-benchmark.test 34.24 34.17 -0.2%
test-suite :: CTMark/SPASS/SPASS.test 6.25 6.24 -0.1%
test-suite...:: CTMark/sqlite3/sqlite3.test 1.66 1.66 -0.1%
test-suite :: CTMark/kimwitu++/kc.test 13.61 13.60 -0.0%
Geomean difference -0.2%
Code size:
Program base cse diff
test-suite...-typeset/consumer-typeset.test 1315632 1266480 -3.7%
test-suite...:: CTMark/ClamAV/clamscan.test 1313892 1297508 -1.2%
test-suite :: CTMark/lencod/lencod.test 1439504 1423112 -1.1%
test-suite...TMark/7zip/7zip-benchmark.test 2936980 2904172 -1.1%
test-suite :: CTMark/Bullet/bullet.test 3478276 3445460 -0.9%
test-suite...ark/tramp3d-v4/tramp3d-v4.test 8082868 8033492 -0.6%
test-suite :: CTMark/kimwitu++/kc.test 3870380 3853972 -0.4%
test-suite :: CTMark/SPASS/SPASS.test 1434904 1434896 -0.0%
test-suite...Mark/mafft/pairlocalalign.test 764528 764528 0.0%
test-suite...:: CTMark/sqlite3/sqlite3.test 782092 782092 0.0%
Geomean difference -0.9%
Differential Revision: https://reviews.llvm.org/D60580
llvm-svn: 358369
|
|
|
|
|
|
|
| |
Some of these were legalizing into smaller vector types unnecessarily,
others were simply not supported yet.
llvm-svn: 358223
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vectors of pointers.
Loads and store of values with type like <2 x p0> currently don't get imported
because SelectionDAG has no knowledge of pointer types. To leverage the existing
support for vector load/stores, we can bitcast the value to have s64 element
types instead. We do this as a custom legalization.
This patch also adds support for general loads of <2 x s64>, and relaxes some
type conditions on selecting G_BITCAST.
Differential Revision: https://reviews.llvm.org/D60534
llvm-svn: 358221
|
|
|
|
|
|
| |
The existing isel support already works for p0 once the legalizer accepts it.
llvm-svn: 358144
|
|
|
|
| |
llvm-svn: 358143
|
|
|
|
| |
llvm-svn: 358142
|
|
|
|
|
|
|
|
| |
Selection support will be coming in a later patch.
Differential Revision: https://reviews.llvm.org/D60435
llvm-svn: 358034
|
|
|
|
|
|
|
|
| |
This is needed for some future support for vector ICMP.
Differential Revision: https://reviews.llvm.org/D60433
llvm-svn: 358033
|
|
|
|
|
|
|
|
|
| |
Same as G_EXP. Add a test, and update legalizer-info-validation.mir and
f16-instructions.ll.
Differential Revision: https://reviews.llvm.org/D60165
llvm-svn: 357605
|
|
|
|
|
|
|
|
|
| |
This improves selection for vector stores into v2s64s. Before we just
scalarized them, but we can just use a STRQui instead.
Differential Revision: https://reviews.llvm.org/D60083
llvm-svn: 357432
|
|
|
|
|
|
|
|
|
| |
This adds support for v2s32 vector inserts, and updates the selection +
regbankselect tests for G_INSERT_VECTOR_ELT.
Differential Revision: https://reviews.llvm.org/D59910
llvm-svn: 357318
|
|
|
|
| |
llvm-svn: 357108
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the result of discussions on the list about how to deal with intrinsics
which require codegen to disambiguate them via only the integer/fp overloads.
It causes problems for GlobalISel as some of that information is lost during
translation, while with other operations like IR instructions the information is
encoded into the instruction opcode.
This patch changes clang to emit the new faddp intrinsic if the vector operands
to the builtin have FP element types. LLVM IR AutoUpgrade has been taught to
upgrade existing calls to aarch64.neon.addp with fp vector arguments, and
we remove the workarounds introduced for GlobalISel in r355865.
This is a more permanent solution to PR40968.
Differential Revision: https://reviews.llvm.org/D59655
llvm-svn: 356722
|