| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The MachineOutliner has a bunch of target hooks that will call llvm_unreachable
if the target doesn't implement them. Therefore, if you enable the outliner on
such a target, it'll just crash. It'd be much better if it'd just *not* run
the outliner at all in this case.
This commit adds a hook to TargetInstrInfo that returns false by default.
Targets that implement the hook make it return true. The outliner checks the
return value of this hook to decide whether or not to continue.
llvm-svn: 329220
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches.
Reviewers: t.p.northover, jmolloy, RKSimon, rengolin
Reviewed By: rengolin
Subscribers: dexonsmith, rengolin, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D44853
llvm-svn: 329216
|
|
|
|
|
|
|
|
|
|
|
| |
Makes it easier to see mistakes such as the one fixed in r329178 and makes
the different target CMakeLists more consistent.
Also remove some stale-looking comments from the Nios2 target cmakefile.
No intended behavior change.
llvm-svn: 329181
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D44573
llvm-svn: 329163
|
|
|
|
|
|
| |
Fix typo and simplify matching expression.
llvm-svn: 329130
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a hasRedZone() function to AArch64MachineFunctionInfo. It
returns true if the function is known to use a redzone, false if it is known
to not use a redzone, and no value otherwise.
This removes the requirement to pass -mno-red-zone when outlining for AArch64.
https://reviews.llvm.org/D45189
llvm-svn: 329120
|
|
|
|
|
|
|
|
| |
This register is reserved as a platform register on Fuchsia.
Differential Revision: https://reviews.llvm.org/D45105
llvm-svn: 328950
|
|
|
|
|
|
|
|
|
|
|
|
| |
CodeGen layer.
Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it.
The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly.
Differential Revision: https://reviews.llvm.org/D45017
llvm-svn: 328806
|
|
|
|
|
|
|
|
| |
dependency
Thanks to echristo for the pointers on direction.
llvm-svn: 328737
|
|
|
|
|
|
|
|
|
|
| |
for call outlining
This commit simplifies the call outlining logic by removing references to the
Function associated with the callee. To do this, it requires that valid
callee save info is available to the outliner.
llvm-svn: 328719
|
|
|
|
|
|
|
|
|
| |
If an ADRP appears with, say, a CPI operand, we shouldn't outline it.
This moves the check for unsafe operands so that it occurs before the special-case
for ADRPs. Also add a test for outlining ADRPs.
llvm-svn: 328674
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is a canonical way to teach objdump to print the target
symbols for branches when disassembling AArch64 code.
Reviewers: evandro, t.p.northover, espindola
Reviewed By: t.p.northover
Differential Revision: https://reviews.llvm.org/D44851
llvm-svn: 328638
|
|
|
|
|
|
| |
ValueTypes.h is implemented in IR already.
llvm-svn: 328397
|
|
|
|
|
|
|
|
|
| |
This is used by llvm tblgen as well as by LLVM Targets, so the only
common place is Support for now. (maybe we need another target for these
sorts of things - but for now I'm at least making them correct & we can
make them better if/when people have strong feelings)
llvm-svn: 328395
|
|
|
|
|
|
|
| |
It's implemented in Target & include from other Target headers, so the
header should be in Target.
llvm-svn: 328392
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Loads and stores can only shift the offset register by the size of the value
being loaded, but currently the DAGCombiner will reduce the width of the load
if it's followed by a trunc making it impossible to later combine the shift.
Solve this by implementing shouldReduceLoadWidth for the AArch64 backend and
make it prevent the width reduction if this is what would happen, though do
allow it if reducing the load width will let us eliminate a later sign or zero
extend.
Differential Revision: https://reviews.llvm.org/D44794
llvm-svn: 328321
|
|
|
|
|
|
|
|
|
| |
Patch by Simon Pilgrim <llvm-dev@redking.me.uk>
That is a slightly modified version of the AArch64 changes from
Simon's D44687 .
llvm-svn: 328303
|
|
|
|
|
|
| |
That removes some redundant recomputations from the passes pipeline.
llvm-svn: 328272
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This pass sinks COPY instructions into a successor block, if the COPY is not
used in the current block and the COPY is live-in to a single successor
(i.e., doesn't require the COPY to be duplicated). This avoids executing the
the copy on paths where their results aren't needed. This also exposes
additional opportunites for dead copy elimination and shrink wrapping.
These copies were either not handled by or are inserted after the MachineSink
pass. As an example of the former case, the MachineSink pass cannot sink
COPY instructions with allocatable source registers; for AArch64 these type
of copy instructions are frequently used to move function parameters (PhyReg)
into virtual registers in the entry block..
For the machine IR below, this pass will sink %w19 in the entry into its
successor (%bb.1) because %w19 is only live-in in %bb.1.
```
%bb.0:
%wzr = SUBSWri %w1, 1
%w19 = COPY %w0
Bcc 11, %bb.2
%bb.1:
Live Ins: %w19
BL @fun
%w0 = ADDWrr %w0, %w19
RET %w0
%bb.2:
%w0 = COPY %wzr
RET %w0
```
As we sink %w19 (CSR in AArch64) into %bb.1, the shrink-wrapping pass will be
able to see %bb.0 as a candidate.
With this change I observed 12% more shrink-wrapping candidate and 13% more dead copies deleted in spec2000/2006/2017 on AArch64.
Reviewers: qcolombet, MatzeB, thegameg, mcrosier, gberry, hfinkel, john.brawn, twoh, RKSimon, sebpop, kparzysz
Reviewed By: sebpop
Subscribers: evandro, sebpop, sfertile, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D41463
llvm-svn: 328237
|
|
|
|
|
|
| |
Fix typo in the number of integer dividers.
llvm-svn: 328027
|
|
|
|
| |
llvm-svn: 327982
|
|
|
|
|
|
|
|
|
|
|
| |
When outlining calls, the outliner needs to update CFI to ensure that, say,
exception handling works. This commit adds that functionality and adds a test
just for call outlining.
Call outlining stuff in machine-outliner.mir should be moved into
machine-outliner-calls.mir in a later commit.
llvm-svn: 327917
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This extends the use of this attribute on ARM and AArch64 from
SVN r325900 (where it was only checked for fixed stack
allocations on ARM/AArch64, but for all stack allocations on X86).
This also adds a testcase for the existing use of disabling the
fixed stack probe with the attribute on ARM and AArch64.
Differential Revision: https://reviews.llvm.org/D44291
llvm-svn: 327897
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The docs already claim that this happens, but so far it hasn't. As a
consequence, existing TableGen files get this wrong a lot, but luckily
the fixes are all reasonably straightforward.
To make this work with all the existing forms of self-references (since
the true type of a record is only built up over time), the lookup of
self-references in !cast is delayed until the final resolving step.
Change-Id: If5923a72a252ba2fbc81a889d59775df0ef31164
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D44475
llvm-svn: 327849
|
|
|
|
|
|
|
|
| |
FullInstRWOverlapCheck.
This fixes the errors found by the new check added in r327808.
llvm-svn: 327812
|
|
|
|
|
|
|
|
|
|
|
|
| |
InstRW, make sure we haven't already seen another InstRW containing this instruction on this CPU.
This is similar to the check later when we remap some of the instructions from one class to a new one. But if we reuse the class we don't get to do that check.
So many CPUs have violations of this check that I had to add a flag to the SchedMachineModel to allow it to be disabled. Hopefully we can get those cleaned up quickly and remove this flag.
A lot of the violations are due to overlapping regular expressions, but that's not the only kind of issue it found.
llvm-svn: 327808
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D44586
llvm-svn: 327779
|
|
|
|
|
|
|
|
|
| |
At the point the outliner runs, KILLs don't impact anything, but they're still
considered unique instructions. This commit makes them invisible like
DebugValues so that they can still be outlined without impacting outlining
decisions.
llvm-svn: 327760
|
|
|
|
|
|
|
|
|
|
| |
This patch provides an implementation of getArithmeticReductionCost for
AArch64. We can specialize the cost of add reductions since they are computed
using the 'addv' instruction.
Differential Revision: https://reviews.llvm.org/D44490
llvm-svn: 327702
|
|
|
|
|
|
| |
Fix typo.
llvm-svn: 327663
|
|
|
|
|
|
| |
Add special case for rotate right.
llvm-svn: 327662
|
|
|
|
|
|
| |
Increase the number of cheap as move cases of register reset.
llvm-svn: 327661
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Optionally allow the order of restoring the callee-saved registers in the
epilogue to be reversed.
The flag -reverse-csr-restore-seq generates the following code:
```
stp x26, x25, [sp, #-64]!
stp x24, x23, [sp, #16]
stp x22, x21, [sp, #32]
stp x20, x19, [sp, #48]
; [..]
ldp x24, x23, [sp, #16]
ldp x22, x21, [sp, #32]
ldp x20, x19, [sp, #48]
ldp x26, x25, [sp], #64
ret
```
Note how the CSRs are restored in the same order as they are saved.
One exception to this rule is the last `ldp`, which allows us to merge
the stack adjustment and the ldp into a post-index ldp. This is done by
first generating:
ldp x26, x27, [sp]
add sp, sp, #64
which gets merged by the arm64 load store optimizer into
ldp x26, x25, [sp], #64
The flag is disabled by default.
llvm-svn: 327569
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Merging:
* $x26, $x25 = frame-setup LDPXi $sp, 0
* $sp = frame-destroy ADDXri $sp, 64, 0
into an LDPXpost should preserve the flags from both instructions as
following:
* frame-setup frame-destroy LDPXpost
Differential Revision: https://reviews.llvm.org/D44446
llvm-svn: 327533
|
|
|
|
|
|
|
|
|
| |
Support for this relocation is missing in both LLD and GNU binutils
at the moment.
This reverts the ELF parts of SVN r327316.
llvm-svn: 327503
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D44355
llvm-svn: 327316
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D43971
llvm-svn: 327220
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fixes an UB caught by sanitizer. The shift amount might be larger than 32 so the operand should be 1ULL.
In this patch, we replace the original expression with existing API with uint64_t type.
Reviewers: eli.friedman, rengolin
Reviewed By: rengolin
Subscribers: rengolin, javed.absar, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D44234
llvm-svn: 326969
|
|
|
|
|
|
|
| |
With r249303 the expression evaluation should expand variables that
are not in sections (and so don't have an atom).
llvm-svn: 326966
|
|
|
|
|
|
|
|
|
|
| |
Since there is no instruction for integer vector division, factor in the
cost of singling out each element to be used with the scalar division
instruction.
Differential revision: https://reviews.llvm.org/D43974
llvm-svn: 326955
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The attached testcase started failing after the patch to define
isExtractSubvectorCheap with the following pattern mismatch:
ISEL: Starting pattern match
Initial Opcode index to 85068
Match failed at index 85076
LLVM ERROR: Cannot select: t47: v8i16 = insert_subvector undef:v8i16, t43, Constant:i64<0>
The code generated from llvm/lib/Target/AArch64/AArch64InstrInfo.td
def : Pat<(insert_subvector undef, (v4i16 FPR64:$src), (i32 0)),
(INSERT_SUBREG (v8i16 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
is in ninja/lib/Target/AArch64/AArch64GenDAGISel.inc
At the location of the error it is:
/* 85076*/ OPC_CheckChild2Type, MVT::i32,
And it failed to match the type of operand 2.
Adding another def-pat for i64 fixes the failed def-pat error:
def : Pat<(insert_subvector undef, (v4i16 FPR64:$src), (i64 0)),
(INSERT_SUBREG (v8i16 (IMPLICIT_DEF)), FPR64:$src, dsub)>;
llvm-svn: 326949
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Following the ARM-neon backend, define isExtractSubvectorCheap to return true
when extracting low and high part of a neon register.
The patch disables a test in llvm/test/CodeGen/AArch64/arm64-ext.ll This
testcase is fragile in the sense that it requires a BUILD_VECTOR to "survive"
all DAG transforms until ISelLowering. The testcase is supposed to check that
AArch64TargetLowering::ReconstructShuffle() works, and for that we need a
BUILD_VECTOR in ISelLowering. As we now transform the BUILD_VECTOR earlier into
an VEXT + vector_shuffle, we don't have the BUILD_VECTOR pattern when we get to
ISelLowering. As there is no way to disable the combiner to only exercise the
code in ISelLowering, the patch disables the testcase.
Differential revision: https://reviews.llvm.org/D43973
llvm-svn: 326811
|
|
|
|
|
|
|
|
|
| |
The error occurs when reading i16 elements (as in the testcase) from a v8i8
with a pattern of <0,2,4,6>. As all the data in the vector is accessed, the
operation is not a VUZP. The patch stops the pattern recognition of VUZP when
EXTRACT_VECTOR_ELT has a different element type than BUILD_VECTOR.
llvm-svn: 326722
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use the whole gammut of constant immediates available to set up a vector.
Instead of using, for example, `mov w0, #0xffff; dup v0.4s, w0`, which
transfers between register files, use the more efficient `movi v0.4s, #-1`
instead. Not limited to just a few values, but any immediate value that can
be encoded by all the variants of `FMOV`, `MOVI`, `MVNI`, thus eliminating
the need to there be patterns to optimize special cases.
Differential revision: https://reviews.llvm.org/D42133
llvm-svn: 326718
|
|
|
|
|
|
|
| |
Clean up a couple of functions in `AArch64TargetLowering` by removing
redundant statements.
llvm-svn: 326486
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D43288
llvm-svn: 326480
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
when a BUILD_VECTOR is created out of a sequence of EXTRACT_VECTOR_ELT with a
specific pattern sequence, either <0, 2, 4, ...> or <1, 3, 5, ...>, replace the
BUILD_VECTOR with either vuzp1 or vuzp2.
With this patch LLVM generates the following code for the first function fun1 in the testcase:
adrp x8, .LCPI0_0
ldr q0, [x8, :lo12:.LCPI0_0]
tbl v0.16b, { v0.16b }, v0.16b
ext v1.16b, v0.16b, v0.16b, #8
uzp1 v0.8b, v0.8b, v1.8b
str d0, [x8]
ret
Without this patch LLVM currently generates this code:
adrp x8, .LCPI0_0
ldr q0, [x8, :lo12:.LCPI0_0]
tbl v0.16b, { v0.16b }, v0.16b
mov v1.16b, v0.16b
mov v1.b[1], v0.b[2]
mov v1.b[2], v0.b[4]
mov v1.b[3], v0.b[6]
mov v1.b[4], v0.b[8]
mov v1.b[5], v0.b[10]
mov v1.b[6], v0.b[12]
mov v1.b[7], v0.b[14]
str d1, [x8]
ret
llvm-svn: 326443
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Emulated TLS is enabled by llc flag -emulated-tls,
which is passed by clang driver.
When llc is called explicitly or from other drivers like LTO,
missing -emulated-tls flag would generate wrong TLS code for targets
that supports only this mode.
Now use useEmulatedTLS() instead of Options.EmulatedTLS to decide whether
emulated TLS code should be generated.
Unit tests are modified to run with and without the -emulated-tls flag.
Differential Revision: https://reviews.llvm.org/D42999
llvm-svn: 326341
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we assert that only non target specific opcodes can have
missing RegisterClass constraints in the MCDesc. The backend can have
instructions with register operands but don't have RegisterClass
constraints (say using unknown_class) in which case the instruction
defining the register will constrain it.
Change the assert to only fire if a def has no regclass.
https://reviews.llvm.org/D43409
llvm-svn: 326142
|
|
|
|
|
|
|
|
|
| |
This feature enables the fusion of the comparison and the conditional select
instructions together.
Differential revision: https://reviews.llvm.org/D42392
llvm-svn: 325939
|