| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
The verifier began complaining about an undefined physical register in this
test. XFAILing for the purposes of getting a bot up while I look into it.
Failure:
http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-expensive/11385/
llvm-svn: 330493
|
| |
|
|
|
|
|
|
|
|
| |
First off, this is more correct than having the B. Second off, this was making
a bot upset. This fixes that.
Update the test to include -verify-machineinstrs as well to prevent stuff like
this slipping by non debug/assert builds in the future.
llvm-svn: 330459
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was originally committed at rL328921 and reverted at rL329920 to
investigate failures in Chrome. This time I've added to the ReleaseNotes
to warn users of the potential of exposing UB and let me repeat that
here for more exposure:
Optimization of floating-point casts is improved. This may cause surprising
results for code that is relying on undefined behavior. Code sanitizers can
be used to detect affected patterns such as this:
int main() {
float x = 4294967296.0f;
x = (float)((int)x);
printf("junk in the ftrunc: %f\n", x);
return 0;
}
$ clang -O1 ftrunc.c -fsanitize=undefined ; ./a.out
ftrunc.c:5:15: runtime error: 4.29497e+09 is outside the range of
representable values of type 'int'
junk in the ftrunc: 0.000000
Original commit message:
fptosi / fptoui round towards zero, and that's the same behavior as ISD::FTRUNC,
so replace a pair of casts with the equivalent node. We don't have to account for
special cases (NaN, INF) because out-of-range casts are undefined.
Differential Revision: https://reviews.llvm.org/D44909
llvm-svn: 330437
|
| |
|
|
|
|
| |
rdar://39454635
llvm-svn: 330276
|
| |
|
|
|
|
| |
Differential revision: https://reviews.llvm.org/D45727
llvm-svn: 330253
|
| |
|
|
|
|
|
| |
Debugability is more important than saving 4 bytes to let us to fall
through to nonense.
llvm-svn: 330073
|
| |
|
|
|
|
|
|
|
|
| |
addresses."
Caused a hang and eventually an assertion failure in LTO builds
of 7zip-benchmark on aarch64 iOS targets.
http://green.lab.llvm.org/green/job/lnt-ctmark-aarch64-O3-flto/2024/
llvm-svn: 330063
|
| |
|
|
|
|
|
|
|
|
|
| |
This is a code size win in code that takes offseted addresses
frequently, such as C++ constructors that typically need to compute
an offseted address of a vtable. This reduces the size of Chromium
for Android's .text section by 108KB.
Differential Revision: https://reviews.llvm.org/D45199
llvm-svn: 329956
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
AFI->setRedZone(false) was put in the wrong place before, and so it only fired
on functions that didn't have stack frames. This moves that to the top of
emitPrologue to make sure that every function without a redzone has it set
correctly.
This also adds a function representing one of the early exit cases (GHC calling
convention) to the MachineOutliner noredzone test to ensure that we can outline
from functions like these, where we never use a redzone.
llvm-svn: 329922
|
| |
|
|
|
|
|
| |
This change is exposing UB in source code - as was warned/predicted. :)
See D44909 for discussion. Reverting while we figure out how to fix things.
llvm-svn: 329920
|
| |
|
|
|
|
| |
"is is" -> "is", "if if" -> "if", "or or" -> "or"
llvm-svn: 329878
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is causing compilation timeouts on code with long sequences of
local values and calls (i.e. foo(1); foo(2); foo(3); ...). It turns out
that code coverage instrumentation is a great way to create sequences
like this, which how our users ran into the issue in practice.
Intel has a tool that detects these kinds of non-linear compile time
issues, and Andy Kaylor reported it as PR37010.
The current sinking code scans the whole basic block once per local
value sink, which happens before emitting each call. In theory, local
values should only be introduced to be used by instructions between the
current flush point and the last flush point, so we should only need to
scan those instructions.
llvm-svn: 329822
|
| |
|
|
|
|
| |
Forgot to add a test case in the previous commit.
llvm-svn: 329805
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When inserting MOVs to avoid Falkor HWPF collisions, the non-base
register operand of load instructions (e.g. a register offset) was not
being considered live, so it could potentially have been used as a
scratch register, clobbering the actual offset value.
Reviewers: mcrosier
Subscribers: rengolin, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D45502
llvm-svn: 329761
|
| |
|
|
|
|
| |
rdar://39175175
llvm-svn: 329743
|
| |
|
|
|
|
| |
Caused a build failure in check-tsan.
llvm-svn: 329718
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the presence of variable-sized stack objects, we always picked the
base pointer when resolving frame indices if it was available.
This makes us hit an assert where we can't reach the emergency spill
slot if it's too far away from the base pointer. Since on AArch64 we
decide to place the emergency spill slot at the top of the frame, it
makes more sense to use FP to access it.
The changes here don't affect only emergency spill slots but all the
frame indices. The goal here is to try to choose between FP, BP and SP
so that we minimize the offset and avoid scavenging, or worse, asserting
when trying to access a slot allocated by the scavenger.
Previously discussed here: https://reviews.llvm.org/D40876.
Differential Revision: https://reviews.llvm.org/D45358
llvm-svn: 329691
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a code size win in code that takes offseted addresses
frequently, such as C++ constructors that typically need to compute
an offseted address of a vtable. It reduces the size of Chromium for
Android's .text section by 46KB, or 56KB with ThinLTO (which exposes
more opportunities to use a direct access rather than a GOT access).
Because the addend range is limited in COFF and Mach-O, this is
enabled for ELF only.
Differential Revision: https://reviews.llvm.org/D45199
llvm-svn: 329611
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently MachineLoopInfo is used in only two places:
1) for computing IsBasicBlockInsideInnermostLoop field of MCCodePaddingContext, and it is never used.
2) in emitBasicBlockLoopComments, which is called only if `isVerbose()` is true.
Despite that, we currently have a dependency on MachineLoopInfo, which makes
pass manager to compute it and MachineDominator Tree. This patch removes the
use (1) and makes the use (2) lazy, thus avoiding some redundant
recomputations.
Reviewers: opaparo, gadi.haber, rafael, craig.topper, zvi
Subscribers: rengolin, javed.absar, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D44812
llvm-svn: 329542
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
In our real world application, we found the following optimization is missed in DAGCombiner
(zext (and/or/xor (shl/shr (load x), cst), cst)) -> (and/or/xor (shl/shr (zextload x), (zext cst)), (zext cst))
If the user of original zext is an add, it may enable further lea optimization on x86.
This patch add a new function CombineZExtLogicopShiftLoad to do this optimization.
Differential Revision: https://reviews.llvm.org/D44402
llvm-svn: 329516
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Should fix UBSan bot by also checking there's no "uwtable" attribute
before skipping. Otherwise the unwind table will be useless since its
moves expect CSRs to actually be preserved.
A noreturn nounwind function can be expected to never return in any way, and by
never returning it will also never have to restore any callee-saved registers
for its caller. This makes it possible to skip spills of those registers during
function entry, saving some stack space and time in the process. This is rather
useful for embedded targets with limited stack space.
Should fix PR9970.
Patch mostly by myeisha (pmb).
llvm-svn: 329494
|
| |
|
|
|
|
|
|
| |
Breaks ubsan test TestCases/Misc/missing_return.cpp on ARM
This reverts commit r329287
llvm-svn: 329486
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
A noreturn nounwind function can be expected to never return in any way, and by
never returning it will also never have to restore any callee-saved registers
for its caller. This makes it possible to skip spills of those registers during
function entry, saving some stack space and time in the process. This is rather
useful for embedded targets with limited stack space.
Should fix PR9970.
Patch by myeisha (pmb).
llvm-svn: 329287
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The implementation of shadow call stack on aarch64 is quite different to
the implementation on x86_64. Instead of reserving a segment register for
the shadow call stack, we reserve the platform register, x18. Any function
that spills lr to sp also spills it to the shadow call stack, a pointer to
which is stored in x18.
Differential Revision: https://reviews.llvm.org/D45239
llvm-svn: 329236
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D44573
llvm-svn: 329163
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a hasRedZone() function to AArch64MachineFunctionInfo. It
returns true if the function is known to use a redzone, false if it is known
to not use a redzone, and no value otherwise.
This removes the requirement to pass -mno-red-zone when outlining for AArch64.
https://reviews.llvm.org/D45189
llvm-svn: 329120
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The linkage type on outlined functions was private before. This meant that if
you set a breakpoint in an outlined function, the debugger wouldn't be able to
give a sane name to the outlined function.
This commit changes the linkage type to internal and updates any tests that
relied on the prefixes on the names of outlined functions.
llvm-svn: 329116
|
| |
|
|
|
|
|
|
| |
This register is reserved as a platform register on Fuchsia.
Differential Revision: https://reviews.llvm.org/D45105
llvm-svn: 328950
|
| |
|
|
|
|
|
|
|
|
| |
fptosi / fptoui round towards zero, and that's the same behavior as ISD::FTRUNC,
so replace a pair of casts with the equivalent node. We don't have to account for
special cases (NaN, INF) because out-of-range casts are undefined.
Differential Revision: https://reviews.llvm.org/D44909
llvm-svn: 328921
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code has bugs dealing with -0.0.
Since D44550 introduced FABS pattern folding in InstCombine,
this patch removes the now-redundant code that causes
https://bugs.llvm.org/show_bug.cgi?id=36600.
Patch by Mikhail Dvoretckii!
Differential Revision: https://reviews.llvm.org/D44683
llvm-svn: 328872
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MachineCopyPropagation::CopyPropagateBlock has a bunch of special
handling for COPY instructions. This handling assumes that COPY
instructions do not modify the source of the copy; this is wrong if
the COPY destination overlaps the source.
To fix the bug, check explicitly for this situation, and fall back to
the generic instruction handling.
This bug can't happen for most register classes because they don't
have this sort of overlap, but there are a few register classes
where this is possible. The testcase uses the AArch64 QQQQ register
class.
Differential Revision: https://reviews.llvm.org/D44911
llvm-svn: 328851
|
| |
|
|
|
|
|
|
|
|
| |
for call outlining
This commit simplifies the call outlining logic by removing references to the
Function associated with the callee. To do this, it requires that valid
callee save info is available to the outliner.
llvm-svn: 328719
|
| |
|
|
|
|
| |
As suggested in D44909.
llvm-svn: 328683
|
| |
|
|
|
|
|
|
|
| |
If an ADRP appears with, say, a CPI operand, we shouldn't outline it.
This moves the check for unsafe operands so that it occurs before the special-case
for ADRPs. Also add a test for outlining ADRPs.
llvm-svn: 328674
|
| |
|
|
|
|
|
|
|
|
|
| |
%tmp = bitcast i32* %arg to i8*
%tmp1 = getelementptr inbounds i8, i8* %tmp, i32 0
- %tmp2 = load i8, i8* %tmp, align 1
+ %tmp2 = load i8, i8* %tmp1, align 1
This doesn't change the semantics of the tests but makes use of %tmp1 which was originally intended.
llvm-svn: 328642
|
| |
|
|
|
|
|
|
|
| |
On Hexagon "x = y" is a syntax used in most instructions, and is not
treated as a directive.
Differential Revision: https://reviews.llvm.org/D44256
llvm-svn: 328635
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Loads and stores can only shift the offset register by the size of the value
being loaded, but currently the DAGCombiner will reduce the width of the load
if it's followed by a trunc making it impossible to later combine the shift.
Solve this by implementing shouldReduceLoadWidth for the AArch64 backend and
make it prevent the width reduction if this is what would happen, though do
allow it if reducing the load width will let us eliminate a later sign or zero
extend.
Differential Revision: https://reviews.llvm.org/D44794
llvm-svn: 328321
|
| |
|
|
|
|
|
| |
This was being masked because GISel is enabled by default for -O0 and
the abort was disabled. Modified test to explicitly enable abort.
llvm-svn: 328311
|
| |
|
|
|
|
| |
That removes some redundant recomputations from the passes pipeline.
llvm-svn: 328272
|
| |
|
|
|
|
| |
with expensive checks on.
llvm-svn: 328267
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This pass sinks COPY instructions into a successor block, if the COPY is not
used in the current block and the COPY is live-in to a single successor
(i.e., doesn't require the COPY to be duplicated). This avoids executing the
the copy on paths where their results aren't needed. This also exposes
additional opportunites for dead copy elimination and shrink wrapping.
These copies were either not handled by or are inserted after the MachineSink
pass. As an example of the former case, the MachineSink pass cannot sink
COPY instructions with allocatable source registers; for AArch64 these type
of copy instructions are frequently used to move function parameters (PhyReg)
into virtual registers in the entry block..
For the machine IR below, this pass will sink %w19 in the entry into its
successor (%bb.1) because %w19 is only live-in in %bb.1.
```
%bb.0:
%wzr = SUBSWri %w1, 1
%w19 = COPY %w0
Bcc 11, %bb.2
%bb.1:
Live Ins: %w19
BL @fun
%w0 = ADDWrr %w0, %w19
RET %w0
%bb.2:
%w0 = COPY %wzr
RET %w0
```
As we sink %w19 (CSR in AArch64) into %bb.1, the shrink-wrapping pass will be
able to see %bb.0 as a candidate.
With this change I observed 12% more shrink-wrapping candidate and 13% more dead copies deleted in spec2000/2006/2017 on AArch64.
Reviewers: qcolombet, MatzeB, thegameg, mcrosier, gberry, hfinkel, john.brawn, twoh, RKSimon, sebpop, kparzysz
Reviewed By: sebpop
Subscribers: evandro, sebpop, sfertile, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D41463
llvm-svn: 328237
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
https://reviews.llvm.org/D44762
Currently IRTranslator produces
%vreg17<def>(p0) = G_CONSTANT 0;
instead we should build
%vreg16(s64) = G_CONSTANT 0
%vreg17(p0) = G_INTTOPTR %vreg16
reviewed by @aemerson.
llvm-svn: 328218
|
| |
|
|
|
|
|
| |
This reverts r328159 because the two AArch64 tests fail on GreenDragon:
http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-expensive/11030/
llvm-svn: 328188
|
| |
|
|
|
|
|
| |
This is basically an extension of existing test
test/CodeGen/X86/O0-pipeline.ll introduced in r302608.
llvm-svn: 328159
|
| |
|
|
|
|
| |
https://reviews.llvm.org/D44591
llvm-svn: 328035
|
| |
|
|
| |
llvm-svn: 327995
|
| |
|
|
|
|
|
|
|
|
|
| |
When outlining calls, the outliner needs to update CFI to ensure that, say,
exception handling works. This commit adds that functionality and adds a test
just for call outlining.
Call outlining stuff in machine-outliner.mir should be moved into
machine-outliner-calls.mir in a later commit.
llvm-svn: 327917
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This extends the use of this attribute on ARM and AArch64 from
SVN r325900 (where it was only checked for fixed stack
allocations on ARM/AArch64, but for all stack allocations on X86).
This also adds a testcase for the existing use of disabling the
fixed stack probe with the attribute on ARM and AArch64.
Differential Revision: https://reviews.llvm.org/D44291
llvm-svn: 327897
|
| |
|
|
|
|
|
|
|
| |
At the point the outliner runs, KILLs don't impact anything, but they're still
considered unique instructions. This commit makes them invisible like
DebugValues so that they can still be outlined without impacting outlining
decisions.
llvm-svn: 327760
|
| |
|
|
|
|
|
|
|
| |
This is a follow up of the AArch64 FP16 intrinsics work;
the codegen tests had not been added yet.
Differential Revision: https://reviews.llvm.org/D44510
llvm-svn: 327624
|