| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
AMDGPU can't unambiguously go back from the selected instruction
register class to the register bank without knowing if this was used
in a boolean context.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
G_GEP is rather poorly named. It's a simple pointer+scalar addition and
doesn't support any of the complexities of getelementptr. I therefore
propose that we rename it. There's a G_PTR_MASK so let's follow that
convention and go with G_PTR_ADD
Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm
Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69734
|
|
|
|
|
|
|
|
| |
The destinations should be FPRs (for now).
Differential Revision: https://reviews.llvm.org/D66184
llvm-svn: 368775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).
Manual fixups in:
AArch64InstrInfo.cpp - genFusedMultiply() now takes a Register* instead of unsigned*
AArch64LoadStoreOptimizer.cpp - Ternary operator was ambiguous between Register/MCRegister. Settled on Register
Depends on D65919
Reviewers: aemerson
Subscribers: jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision for full review was: https://reviews.llvm.org/D65962
llvm-svn: 368628
|
|
|
|
|
|
| |
llvm::Register as started by r367614. NFC
llvm-svn: 367633
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to be able to load and store s128 for memcpy inlining, where we want to
generate Q register mem ops. Making these legal also requires that we add some
support in other instructions. Regbankselect should also know about these since
they have no GPR register class that can hold them, so need special handling to
live on the FPR bank.
Differential Revision: https://reviews.llvm.org/D65166
llvm-svn: 366857
|
|
|
|
|
|
|
|
| |
The condition can never be fed by FPRs, so it should always be on a GPR.
Differential Revision: https://reviews.llvm.org/D65157
llvm-svn: 366854
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
immediate forms.
There are two main issues preventing us from generating immediate form shifts:
1) We have partial SelectionDAG imported support for G_ASHR and G_LSHR shift
immediate forms, but they currently don't work because the amount type is
expected to be an s64 constant, but we only legalize them to have homogenous
types.
To deal with this, first we introduce a custom legalizer to *only* custom legalize
s32 shifts which have a constant operand into a s64.
There is also an additional artifact combiner to fold zexts(g_constant) to a
larger G_CONSTANT if it's legal, a counterpart to the anyext version committed
in an earlier patch.
2) For G_SHL the importer can't cope with the pattern. For this I introduced an
early selection phase in the arm64 selector to select these forms manually
before the tablegen selector pessimizes it to a register-register variant.
Differential Revision: https://reviews.llvm.org/D63910
llvm-svn: 364994
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In a few places in getInstrMapping, we check if use/def instructions for the
instruction we're mapping have floating point constraints.
We can improve this check and reduce the number of copies in GISel-compiled code
if we make a couple observations:
- For a def instruction, it only matters if the def instruction must always
output a value stored on a FPR
- For a use instruction, it only matters if the use instruction must always
only take in values stored in FPRs
This adds two new functions:
- onlyUsesFP
- onlyDefinesFP
Then we can use those when we're checking the uses/defs instead.
Without this patch, the load, unmerge, store, and select in the added test
would have unnecessary copies.
Differential Revision: https://reviews.llvm.org/D62426
llvm-svn: 361679
|
|
|
|
|
|
|
|
|
| |
Factor it out into a function, and replace places where we had the same check
with the new function.
Differential Revision: https://reviews.llvm.org/D62421
llvm-svn: 361677
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The fcsel and csel instructions differ in only the register banks they work on.
So, they're entirely interchangeable otherwise.
With this in mind, this does two things:
- Teach AArch64RegisterBankInfo to consider the inputs to G_SELECT as well as
the outputs.
- Teach it to choose the best register bank mapping based off the constraints
of the inputs and outputs.
The "best" in this case means the one that requires the smallest number of
copies to properly emit a fcsel/csel.
For example, if the inputs are all already going to be on FPRs, we should
emit a fcsel, even if the output is a GPR. This costs one copy to produce the
result, but saves us from copying the inputs into GPRs.
Also update the regbank-select.mir to check that we end up with the right
select instruction.
Differential Revision: https://reviews.llvm.org/D62267
llvm-svn: 361665
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This saves us some unnecessary copies.
If the inputs to a G_SELECT are floating point, we should use fcsel rather than
csel.
Changes here are...
- Teach selectCopy about s1-to-s1 copies across register banks.
- AArch64RegisterBankInfo about G_SELECT in general.
- Teach the instruction selector about the FCSEL instructions.
Also add two tests:
- select-select.mir to show that we get the expected FCSEL
- regbank-select.mir (unfortunately named) to show the register banks on
G_SELECT are properly preserved
And update fast-isel-select.ll to show that we do the same thing as other
instruction selectors in these cases.
llvm-svn: 359940
|
|
|
|
|
|
|
|
|
| |
Add legalizer support for G_FNEARBYINT. It's the same as G_FCEIL etc.
Since the importer allows us to automatically select this after legalization,
also add tests for selection etc. Also update arm64-vfloatintrinsics.ll.
llvm-svn: 359204
|
|
|
|
|
|
|
|
|
| |
Add G_INTRINSIC_ROUND to isPreISelGenericFloatingPointOpcode to ensure that its
input and output are assigned the correct register bank.
Add a regbankselect test to verify that we get what we expect here.
llvm-svn: 359044
|
|
|
|
|
|
|
|
| |
Add it to isPreISelGenericFloatingPointOpcode, and add a regbankselect test.
Update arm64-vfloatintrinsics.ll now that we can select it.
llvm-svn: 359022
|
|
|
|
|
|
|
|
| |
Noticed an unnecessary fallback in arm64-vmul caused by this.
Also add a regbankselect test for G_FMA.
llvm-svn: 359013
|
|
|
|
|
|
|
|
|
|
| |
Exactly the same as G_FCEIL, G_FABS, etc.
Add tests for the fp16/nofp16 behaviour, update arm64-vfloatintrinsics, etc.
Differential Revision: https://reviews.llvm.org/D60895
llvm-svn: 358799
|
|
|
|
|
|
|
|
|
| |
gpr sources.
Since we can't insert s16 gprs as we don't have 16 bit GPR registers, we need to
teach RBS to assign them to the FPR bank so our selector works.
llvm-svn: 356282
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for inserting elements into packed vectors. It also adds
two tests: one for selection, and one for regbank select.
Unpacked vectors will come in a follow-up.
Differential Revision: https://reviews.llvm.org/D59325
llvm-svn: 356182
|
|
|
|
|
|
|
|
|
| |
After r355865, we should be able to safely select G_EXTRACT_VECTOR_ELT without
running into any problematic intrinsics.
Also add a fix for lane copies, which don't support index 0.
llvm-svn: 355871
|
|
|
|
|
|
|
|
|
|
|
| |
This broke test-suite::aarch64_neon_intrinsics.test
Reverting while I look into it.
Example failure:
http://lab.llvm.org:8011/builders/clang-cmake-aarch64-quick/builds/17740
llvm-svn: 355408
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds instruction selection support for G_EXTRACT_VECTOR_ELT for cases
where the index is defined by a G_CONSTANT.
It also factos out the lane copy opcode selection part into its own function,
`getLaneCopyOpcode`. This is used by both `selectUnmergeValues` and
`selectExtractElt`.
Differential Revision: https://reviews.llvm.org/D58469
llvm-svn: 355344
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a follow up to D48580 and D48581 which allows reserving
arbitrary general purpose registers with the exception of registers
with special purpose (X8, X16-X18, X29, X30) and registers used by LLVM
(X0, X19). This change also generalizes some of the existing logic to
rely entirely on values generated from tablegen.
Differential Revision: https://reviews.llvm.org/D56305
llvm-svn: 353957
|
|
|
|
|
|
|
|
|
|
|
| |
This teaches the legalizer about G_FFLOOR, and lets us select G_FFLOOR in
AArch64.
It updates the existing floating point tests, and adds a select-floor.mir test.
Differential Revision: https://reviews.llvm.org/D57486
llvm-svn: 353722
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This teaches the legalizer to handle G_FEXP in AArch64. As a result, it also
allows us to select G_FEXP.
It...
- Updates the legalizer-info tests
- Adds a test for legalizing exp
- Updates the existing fp tests to show that we can now select G_FEXP
https://reviews.llvm.org/D57483
llvm-svn: 352692
|
|
|
|
|
|
|
|
|
| |
This adds instruction selection support for G_FABS in AArch64. It also updates
the existing basic FP tests, adds a selection test for G_FABS.
https://reviews.llvm.org/D57418
llvm-svn: 352684
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This teaches GlobalISel to emit a RTLib call for @llvm.log2 when it encounters
it.
It updates the existing floating point tests to show that we don't fall back on
the intrinsic, and select the correct instructions. It also adds a legalizer
test for G_FLOG2.
https://reviews.llvm.org/D57357
llvm-svn: 352673
|
|
|
|
|
|
|
|
|
|
| |
This teaches the legalizer about G_FSQRT in AArch64. Also adds a legalizer
test for G_FSQRT, a selection test for it, and updates existing floating point
tests.
https://reviews.llvm.org/D57361
llvm-svn: 352671
|
|
|
|
|
|
|
|
|
| |
This currently shows up as a selection fallback since the dest regs were given
GPR banks but the source was a vector FPR reg.
Differential Revision: https://reviews.llvm.org/D57408
llvm-svn: 352545
|
|
|
|
|
|
|
|
|
|
| |
This adds support for legalizing G_FLOG into a RTLib call.
It adds a legalizer test, and updates the existing floating point tests.
https://reviews.llvm.org/D57347
llvm-svn: 352429
|
|
|
|
|
|
|
|
|
|
| |
This adds instruction selection support for @llvm.log10 in AArch64. It teaches
GISel to lower it to a library call, updates the relevant tests, and adds a
legalizer test for log10.
https://reviews.llvm.org/D57341
llvm-svn: 352418
|
|
|
|
|
|
|
|
|
|
|
|
| |
This contains all of the legalizer changes from D57197 necessary to select
G_FCOS and G_FSIN. It also updates several existing IR tests in
test/CodeGen/AArch64 that verify that we correctly lower the G_FCOS and G_FSIN
instructions.
https://reviews.llvm.org/D57197
3/3
llvm-svn: 352402
|
|
|
|
| |
llvm-svn: 352340
|
|
|
|
|
|
| |
Some unrelated, but benign, test changes as well due to the test update script.
llvm-svn: 352337
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for vector @llvm.ceil intrinsics when full 16 bit
floating point support isn't available.
To do this, this patch...
- Implements basic isel for G_UNMERGE_VALUES
- Teaches the legalizer about 16 bit floats
- Teaches AArch64RegisterBankInfo to respect floating point registers on
G_BUILD_VECTOR and G_UNMERGE_VALUES
- Teaches selectCopy about 16-bit floating point vectors
It also adds
- A legalizer test for the 16-bit vector ceil which verifies that we create a
G_UNMERGE_VALUES and G_BUILD_VECTOR when full fp16 isn't supported
- An instruction selection test which makes sure we lower to G_FCEIL when
full fp16 is supported
- A test for selecting G_UNMERGE_VALUES
And also updates arm64-vfloatintrinsics.ll to show that the new ceiling types
work as expected.
https://reviews.llvm.org/D56682
llvm-svn: 352113
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each hwasan check requires emitting a small piece of code like this:
https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html#memory-accesses
The problem with this is that these code blocks typically bloat code
size significantly.
An obvious solution is to outline these blocks of code. In fact, this
has already been implemented under the -hwasan-instrument-with-calls
flag. However, as currently implemented this has a number of problems:
- The functions use the same calling convention as regular C functions.
This means that the backend must spill all temporary registers as
required by the platform's C calling convention, even though the
check only needs two registers on the hot path.
- The functions take the address to be checked in a fixed register,
which increases register pressure.
Both of these factors can diminish the code size effect and increase
the performance hit of -hwasan-instrument-with-calls.
The solution that this patch implements is to involve the aarch64
backend in outlining the checks. An intrinsic and pseudo-instruction
are created to represent a hwasan check. The pseudo-instruction
is register allocated like any other instruction, and we allow the
register allocator to select almost any register for the address to
check. A particular combination of (register selection, type of check)
triggers the creation in the backend of a function to handle the check
for specifically that pair. The resulting functions are deduplicated by
the linker. The pseudo-instruction (really the function) is specified
to preserve all registers except for the registers that the AAPCS
specifies may be clobbered by a call.
To measure the code size and performance effect of this change, I
took a number of measurements using Chromium for Android on aarch64,
comparing a browser with inlined checks (the baseline) against a
browser with outlined checks.
Code size: Size of .text decreases from 243897420 to 171619972 bytes,
or a 30% decrease.
Performance: Using Chromium's blink_perf.layout microbenchmarks I
measured a median performance regression of 6.24%.
The fact that a perf/size tradeoff is evident here suggests that
we might want to make the new behaviour conditional on -Os/-Oz.
But for now I've enabled it unconditionally, my reasoning being that
hwasan users typically expect a relatively large perf hit, and ~6%
isn't really adding much. We may want to revisit this decision in
the future, though.
I also tried experimenting with varying the number of registers
selectable by the hwasan check pseudo-instruction (which would result
in fewer variants being created), on the hypothesis that creating
fewer variants of the function would expose another perf/size tradeoff
by reducing icache pressure from the check functions at the cost of
register pressure. Although I did observe a code size increase with
fewer registers, I did not observe a strong correlation between the
number of registers and the performance of the resulting browser on the
microbenchmarks, so I conclude that we might as well use ~all registers
to get the maximum code size improvement. My results are below:
Regs | .text size | Perf hit
-----+------------+---------
~all | 171619972 | 6.24%
16 | 171765192 | 7.03%
8 | 172917788 | 5.82%
4 | 177054016 | 6.89%
Differential Revision: https://reviews.llvm.org/D56954
llvm-svn: 351920
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
|
|
|
|
|
|
|
|
|
| |
If you don't do this, then if you hit a G_LOAD in getInstrMapping, you'll end
up with GPRs on the G_FCEIL instead of FPRs. This causes a fallback.
Add it to the switch, and add a test verifying that this happens.
llvm-svn: 349822
|
|
|
|
|
|
|
|
|
| |
We used to detect loads feeding fp instructions, but we were
failing to take into account cases where this happens through copies.
For instance, loads can fed copies coming from the ABI lowering
of floating point arguments/results.
llvm-svn: 318589
|
|
|
|
|
|
|
|
|
| |
We used to detect that stores were fed by fp instructions, but we were
failing to take into account cases where this happens through copies.
For instance, stores can be fed by copies coming from the ABI lowering
of floating point arguments.
llvm-svn: 318588
|
|
|
|
|
|
|
|
| |
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).
llvm-svn: 318490
|
|
|
|
|
|
|
|
|
|
| |
This fixes http://llvm.org/PR32560. We were missing a description for
half floating point type and as a result were using the FPR 32 mapping.
Because of the size mismatch the generic code was complaining that the
default mapping is not appropriate. Fix the mapping description so that
the default mapping can be properly applied.
llvm-svn: 317287
|
|
|
|
|
|
| |
NFC.
llvm-svn: 317286
|
|
|
|
|
|
| |
Otherwise they are ambiguous in MIR.
llvm-svn: 316047
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for COPY
This reverts commit r315823, thus re-applying r315781.
Also make sure we don't use G_BITCAST mapping for non-generic registers.
Non-generic registers don't have a type but do have a reg bank.
Something the COPY mapping now how to deal with but the G_BITCAST
mapping don't.
-- Original Commit Message --
We use to resort on the generic implementation to get the mappings for
COPYs. The generic implementation resorts on table lookup and
dynamically allocated objects to get the valid mappings.
Given we already know how to map G_BITCAST and have the static mappings
for them, use that code path for COPY as well. This is much more
efficient.
Improve the compile time of RegBankSelect by up to 20%.
Note: When we eventually generate all the mappings via TableGen, we
wouldn't have to do that dance to shave compile time. The intent of this
change was to make sure that moving to static structure really pays off.
NFC.
llvm-svn: 315947
|
|
|
|
|
|
| |
Anything bigger than 64-bit just map to FPR.
llvm-svn: 315946
|
|
|
|
|
|
|
|
|
| |
COPY"
This reverts commit r315781, breaks:
http://green.lab.llvm.org/green/job/Compiler_Verifiers_GlobalISEL/9882
llvm-svn: 315823
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We use to resort on the generic implementation to get the mappings for
COPYs. The generic implementation resorts on table lookup and
dynamically allocated objects to get the valid mappings.
Given we already know how to map G_BITCAST and have the static mappings
for them, use that code path for COPY as well. This is much more
efficient.
Improve the compile time of RegBankSelect by up to 20%.
Note: When we eventually generate all the mappings via TableGen, we
wouldn't have to do that dance to shave compile time. The intent of this
change was to make sure that moving to static structure really pays off.
NFC.
llvm-svn: 315781
|
|
|
|
|
|
|
|
|
|
| |
G_PHI has the same semantics as PHI but also has types.
This lets us verify that the types in the G_PHI are consistent.
This also allows specifying legalization actions for G_PHIs.
https://reviews.llvm.org/D36990
llvm-svn: 311596
|
|
|
|
|
|
|
|
| |
With this change, the GlobalISel library gets always built. In
particular, this is not possible to opt GlobalISel out of the build
using the LLVM_BUILD_GLOBAL_ISEL variable any more.
llvm-svn: 309990
|