| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The "dead" markings allow existing target-independent optimizations,
like MachineSink, to trigger more frequently. The CPSR defs would have
eventually been marked dead by LiveVariables, so this only affects
optimizations before regalloc.
The ARMBaseInstrInfo.cpp change is fixing a bug which is only visible
with this change: the transform adds a use to an otherwise dead def
of CPSR. This is covered by existing regression tests.
thumb2-tbh.ll breaks for Thumb1 due to MachineLICM changing the
generated code; I'll fix it in D53452.
Differential Revision: https://reviews.llvm.org/D53453
llvm-svn: 345420
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit bd7b44f35ee9fbe365eb25ce55437ea793b39346.
Reland r342994: disabled the optimization and explicitly enable it in test.
-mllvm -consthoist-min-num-to-rebase<unsigned>=0
[ConstHoist] Do not rebase single (or few) dependent constant
If an instance (InsertionPoint or IP) of Base constant A has only one or few
rebased constants depending on it, do NOT rebase. One extra ADD instruction is
required to materialize each rebased constant, assuming A and the rebased have
the same materialization cost.
Differential Revision: https://reviews.llvm.org/D52243
llvm-svn: 343053
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This caused a couple test failures on a bot:
CodeGen/X86/constant-hoisting-bfi.ll
Transforms/ConstantHoisting/X86/ehpad.ll
Example:
http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/53575/
llvm-svn: 343005
|
|
|
|
|
|
|
|
|
|
|
| |
If an instance (InsertionPoint or IP) of Base constant A has only one or few
rebased constants depending on it, do NOT rebase. One extra ADD instruction is
required to materialize each rebased constant, assuming A and the rebased have
the same materialization cost.
Differential Revision: https://reviews.llvm.org/D52243
llvm-svn: 342994
|
|
|
|
|
|
|
|
|
| |
A simple MOVS rd, imm8 can materialize [-128, 127] in signed i8 type or
[0, 255] in unsigned i8 type on Thumb1.
Differential Revision: https://reviews.llvm.org/D52257
llvm-svn: 342898
|
|
|
|
|
|
|
|
| |
If there is an unused def, this would previously
report that the register was live. Check for uses
first so that it is reported as dead if never used.
llvm-svn: 341027
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is no way in the universe, that doing a full-width division in
software will be faster than doing overflowing multiplication in
software in the first place, especially given that this same full-width
multiplication needs to be done anyway.
This patch replaces the previous implementation with a direct lowering
into an overflowing multiplication algorithm based on half-width
operations.
Correctness of the algorithm was verified by exhaustively checking the
output of this algorithm for overflowing multiplication of 16 bit
integers against an obviously correct widening multiplication. Baring
any oversights introduced by porting the algorithm to DAG, confidence in
correctness of this algorithm is extremely high.
Following table shows the change in both t = runtime and s = space. The
change is expressed as a multiplier of original, so anything under 1 is
“better” and anything above 1 is worse.
+-------+-----------+-----------+-------------+-------------+
| Arch | u64*u64 t | u64*u64 s | u128*u128 t | u128*u128 s |
+-------+-----------+-----------+-------------+-------------+
| X64 | - | - | ~0.5 | ~0.64 |
| i686 | ~0.5 | ~0.6666 | ~0.05 | ~0.9 |
| armv7 | - | ~0.75 | - | ~1.4 |
+-------+-----------+-----------+-------------+-------------+
Performance numbers have been collected by running overflowing
multiplication in a loop under `perf` on two x86_64 (one Intel Haswell,
other AMD Ryzen) based machines. Size numbers have been collected by
looking at the size of function containing an overflowing multiply in
a loop.
All in all, it can be seen that both performance and size has improved
except in the case of armv7 where code size has regressed for 128-bit
multiply. u128*u128 overflowing multiply on 32-bit platforms seem to
benefit from this change a lot, taking only 5% of the time compared to
original algorithm to calculate the same thing.
The final benefit of this change is that LLVM is now capable of lowering
the overflowing unsigned multiply for integers of any bit-width as long
as the target is capable of lowering regular multiplication for the same
bit-width. Previously, 128-bit overflowing multiply was the widest
possible.
Patch by Simonas Kazlauskas!
Differential Revision: https://reviews.llvm.org/D50310
llvm-svn: 339922
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM normally prefers to minimize the number of bits set in an AND
immediate, but that doesn't always match the available ARM instructions.
In Thumb1 mode, prefer uxtb or uxth where possible; otherwise, prefer
a two-instruction sequence movs+ands or movs+bics.
Some potential improvements outlined in
ARMTargetLowering::targetShrinkDemandedConstant, but seems to work
pretty well already.
The ARMISelDAGToDAG fix ensures we don't generate an invalid UBFX
instruction due to a larger-than-expected mask. (It's orthogonal, in
some sense, but as far as I can tell it's either impossible or nearly
impossible to reproduce the bug without this change.)
According to my testing, this seems to consistently improve codesize by
a small amount by forming bic more often for ISD::AND with an immediate.
Differential Revision: https://reviews.llvm.org/D50030
llvm-svn: 339472
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Saves materializing the immediate for the "ands".
Corresponding patterns exist for lsrs+lsls, but that seems less common
in practice.
Now implemented as a DAGCombine.
Differential Revision: https://reviews.llvm.org/D49585
llvm-svn: 337945
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is marginally helpful for removing redundant extensions, and the
code is easier to read, so it seems like an all-around win. In the new
test i8-phi-ext.ll, we used to emit an AssertSext i8; now we emit an
AssertZext i2, which allows the extension of the return value to be
eliminated.
Differential Revision: https://reviews.llvm.org/D49004
llvm-svn: 336868
|
|
|
|
|
|
|
|
|
|
| |
Even if a comparison isn't legal, we should try to prefer constants
which can be materialized with a two-instruction sequence. (Thinking
about it a bit more, there might be some more clever sequence we could
generate for certain comparisons invoving powers of two, but I'm not
sure exactly what that would look like.)
llvm-svn: 335003
|
|
|
|
|
|
|
| |
When the result of masking is truncated to i16, we should try to use
"bic" instead of "and".
llvm-svn: 335001
|
|
|
|
|
|
|
|
|
|
| |
It looks like this got left in by accident in r289794; I can't think of
any reason this check would be necessary. (Maybe it was meant to be a
check that the AND has one use? But we check that a few lines earlier.)
Differential Revision: https://reviews.llvm.org/D47921
llvm-svn: 334322
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Should fix UBSan bot by also checking there's no "uwtable" attribute
before skipping. Otherwise the unwind table will be useless since its
moves expect CSRs to actually be preserved.
A noreturn nounwind function can be expected to never return in any way, and by
never returning it will also never have to restore any callee-saved registers
for its caller. This makes it possible to skip spills of those registers during
function entry, saving some stack space and time in the process. This is rather
useful for embedded targets with limited stack space.
Should fix PR9970.
Patch mostly by myeisha (pmb).
llvm-svn: 329494
|
|
|
|
|
|
|
|
| |
Breaks ubsan test TestCases/Misc/missing_return.cpp on ARM
This reverts commit r329287
llvm-svn: 329486
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A noreturn nounwind function can be expected to never return in any way, and by
never returning it will also never have to restore any callee-saved registers
for its caller. This makes it possible to skip spills of those registers during
function entry, saving some stack space and time in the process. This is rather
useful for embedded targets with limited stack space.
Should fix PR9970.
Patch by myeisha (pmb).
llvm-svn: 329287
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes PR36658, "Constant pool entry out of range!" in Thumb1 mode.
In ARMConstantIslands::optimizeThumb2JumpTables() in Thumb1 mode,
adjustBBOffsetsAfter() is not calculating postOffset correctly by
properly accounting for the padding that is required for the constant pool
that immediately follows the jump table branch instruction.
Reviewers: t.p.northover, eli.friedman
Reviewed By: t.p.northover
Subscribers: chrib, tstellar, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D44709
llvm-svn: 328341
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When an Armv6m function dynamically re-aligns the stack, access to incoming
stack arguments (and to stack area, allocated for register varargs) is done via
SP, which is incorrect, as the SP is offset by an unknown amount relative to the
value of SP upon function entry.
This patch fixes it, by making access to "fixed" frame objects be done via FP
when the function needs stack re-alignment. It also changes the access to
"fixed" frame objects be done via FP (instead of using R6/BP) also for the case
when the stack frame contains variable sized objects. This should allow more
objects to fit within the immediate offset of the load instruction.
All of the above via a small refactoring to reuse the existing
`ARMFrameLowering::ResolveFrameIndexReference.`
Differential Revision: https://reviews.llvm.org/D43566
llvm-svn: 326584
|
|
|
|
|
|
|
|
| |
Re-enable commit r323991 now that r325931 has been committed to make
MachineOperand::isRenamable() check more conservative w.r.t. code
changes and opt-in on a per-target basis.
llvm-svn: 326208
|
|
|
|
|
|
|
|
|
|
| |
Fixup to rL325573 for large xor constants.
Thanks to Eli Friedman for the catch.
Differential revision: https://reviews.llvm.org/D43549
llvm-svn: 325761
|
|
|
|
|
|
|
|
|
|
| |
We can always convert xor %a, -1 into MVN, even in thumb 1 where the -1
would not otherwise be considered a cheap constant. This prevents the
-1's from being pulled out into constants and potentially hoisted.
Differential Revision: https://reviews.llvm.org/D43451
llvm-svn: 325573
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r323991.
This commit breaks target that don't model all the register constraints
in TableGen. So far the workaround was to set the
hasExtraXXXRegAllocReq, but it proves that it doesn't cover all the
cases.
For instance, when mutating an instruction (like in the lowering of
COPYs) the isRenamable flag is not properly updated. The same problem
will happen when attaching machine operand from one instruction to
another.
Geoff Berry is working on a fix in https://reviews.llvm.org/D43042.
llvm-svn: 325421
|
|
|
|
|
|
|
|
|
|
| |
Enable multiple COPY hints to eliminate more COPYs during register allocation.
Note that this is something all targets should do, see
https://reviews.llvm.org/D38128.
Review: Eli Friedman
llvm-svn: 325327
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch combines some cases of ARMISD::CMOV for integers that arise in comparisons of the form
a != b ? x : 0
a == b ? 0 : x
and that currently (e.g. in Thumb1) are emitted as branches.
Differential Revision: https://reviews.llvm.org/D34515
llvm-svn: 325323
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
See D42509 for the original version of this.
Basically, there are two significant changes to behavior here:
- addLiveOuts always adds all pristine registers (even if a block has
no successors).
- addLiveOuts and addLiveOutsNoPristines always add all callee-saved
registers for return blocks (including conditional return blocks).
I cleaned up the functions a bit to make it clear these properties hold.
Differential Revision: https://reviews.llvm.org/D42655
llvm-svn: 324422
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change extends MachineCopyPropagation to do COPY source forwarding
and adds an additional run of the pass to the default pass pipeline just
after register allocation.
This version of this patch uses the newly added
MachineOperand::isRenamable bit to avoid forwarding registers is such a
way as to violate constraints that aren't captured in the
Machine IR (e.g. ABI or ISA constraints).
This change is a continuation of the work started in D30751.
Reviewers: qcolombet, javed.absar, MatzeB, jonpa, tstellar
Subscribers: tpr, mgorny, mcrosier, nhaehnle, nemanjai, jyknight, hfinkel, arsenm, inouehrs, eraman, sdardis, guyblank, fedor.sergeev, aheejin, dschuff, jfb, myatsina, llvm-commits
Differential Revision: https://reviews.llvm.org/D41835
llvm-svn: 323991
|
|
|
|
|
|
|
|
|
|
| |
bit-operations"
Miscompiles code. Testcase pending.
This reverts commit r323869.
llvm-svn: 323929
|
|
|
|
|
|
|
|
|
|
|
|
| |
Discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html
In preparation for adding support for named vregs we are changing the sigil for
physical registers in MIR to '$' from '%'. This will prevent name clashes of
named physical register with named vregs.
llvm-svn: 323922
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Expressions of the form x < 0 ? 0 : x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves
In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or or while in ARM/thumb2 mode that has flexible second operand the shift can be folded into a single bic/or instructions. In most cases this results in smaller code and possibly less branches, and in no case larger than before.
Patch by Marten Svanfeldt.
Reviewers: fhahn, pbarrio
Reviewed By: pbarrio
Subscribers: efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D42574
llvm-svn: 323869
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GPR.
In Thumb 1, with the new ADDCARRY / SUBCARRY the scheduler may need to do
copies CPSR ↔ GPR but not all Thumb1 targets implement them.
The schedule can attempt, before attempting a copy, to clone the instructions
but it does not currently do that for nodes with input glue. In this patch we
introduce a target-hook to let the hook decide if a glued machinenode is still
eligible for copying. In this case these are ARM::tADCS and ARM::tSBCS .
As a follow-up of this change we should actually implement the copies for the
Thumb1 targets that do implement them and restrict the hook to the targets that
can't really do such copy as these clones are not ideal.
This change fixes PR35836.
Differential Revision: https://reviews.llvm.org/D42051
llvm-svn: 323857
|
|
|
|
|
|
|
| |
I have a fix for the issue (https://reviews.llvm.org/D42655) but
it's taking a while to get reviewed, so reverting in the meantime.
llvm-svn: 323841
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
One common source of blocks with no successors is calls to noreturn
functions; we want to preserve pristine registers in case they throw an
exception.
The whole pristine register thing is messy (we should really prefer to
explicitly model registers), but this fills a hole in the model for now.
Fixes https://bugs.llvm.org/show_bug.cgi?id=36073.
Differential Revision: https://reviews.llvm.org/D42509
llvm-svn: 323559
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
attributes (Step 1)
Summary:
This is a resurrection of work first proposed and discussed in Aug 2015:
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
and initially landed (but then backed out) in Nov 2015:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.
This change is the first in a series that allows source and dest to each
have their own alignments by using the alignment attribute on their arguments.
In this change we:
1) Remove the alignment argument.
2) Add alignment attributes to the source & dest arguments. We, temporarily,
require that the alignments for source & dest be equal.
For example, code which used to read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)
Downstream users may have to update their lit tests that check for
@llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script
may help with updating the majority of your tests, but it does not catch all possible
patterns so some manual checking and updating will be required.
s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g
The remaining changes in the series will:
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments.
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use
getDestAlignment() and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
Reviewers: pete, hfinkel, lhames, reames, bollu
Reviewed By: reames
Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits
Differential Revision: https://reviews.llvm.org/D41675
llvm-svn: 322965
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch allows `r7` to be used, regardless of its use as a frame pointer, as
a temporary register when popping `lr`, and also falls back to using a high
temporary register if, for some reason, we weren't able to find a suitable low
one.
Differential revision: https://reviews.llvm.org/D40961
Fixes https://bugs.llvm.org/show_bug.cgi?id=35481
llvm-svn: 321989
|
|
|
|
|
|
|
|
| |
This is NFC and a preparatory step for D34515.
Differential Revision: https://reviews.llvm.org/D41122
llvm-svn: 320795
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
output
As part of the unification of the debug format and the MIR format,
always use `printReg` to print all kinds of registers.
Updated the tests using '_' instead of '%noreg' until we decide which
one we want to be the default one.
Differential Revision: https://reviews.llvm.org/D40421
llvm-svn: 319445
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The commit https://reviews.llvm.org/rL318143 computes incorrectly to offset to
restore LR from.
The number of tPOP operands is 2 (condition) + 2 (implicit def and use of SP) +
count of the popped registers. We need to load LR from just past the last
register, hence the correct offset should be either getNumOperands() - 4 and
getNumExplicitOperands() - 2 (multiplied by 4).
Differential revision: https://reviews.llvm.org/D40305
llvm-svn: 319014
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we emit a tail call for Armv8-M, but then discover that the caller needs to
save/restore `LR`, we convert the tail call to an ordinary one, since restoring
`LR` takes extra instructions, which may negate the benefits of the tail
call. If the callee, however, takes stack arguments, this conversion is
incorrect, since nothing has been done to pass the stack arguments.
Thus the patch reverts https://reviews.llvm.org/rL294000
Also, we improve the instruction sequence for popping `LR` in the case when we
couldn't immediately find a scratch low register, but we can use as a temporary
one of the callee-saved low registers and restore `LR` before popping other
callee-saves.
Differential Revision: https://reviews.llvm.org/D39599
llvm-svn: 318143
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements dynamic stack (re-)alignment for 16-bit Thumb. When
targeting processors, which support only the 16-bit Thumb instruction set
the compiler ignores the alignment attributes of automatic variables and may
silently generate incorrect code.
Differential revision: https://reviews.llvm.org/D38143
llvm-svn: 316289
|
|
|
|
|
|
|
|
|
|
| |
forwarding""
This reverts commit r314729.
Another bug has been encountered in an out-of-tree target reported by Quentin.
llvm-svn: 314814
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issues addressed since original review:
- Avoid bug in regalloc greedy/machine verifier when forwarding to use
in an instruction that re-defines the same virtual register.
- Fixed bug when forwarding to use in EarlyClobber instruction slot.
- Fixed incorrect forwarding to register definitions that showed up in
explicit_uses() iterator (e.g. in INLINEASM).
- Moved removal of dead instructions found by
LiveIntervals::shrinkToUses() outside of loop iterating over
instructions to avoid instructions being deleted while pointed to by
iterator.
- Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
- The pass no longer forwards COPYs to physical register uses, since
doing so can break code that implicitly relies on the physical
register number of the use.
- The pass no longer forwards COPYs to undef uses, since doing so
can break the machine verifier by creating LiveRanges that don't
end on a use (since the undef operand is not considered a use).
[MachineCopyPropagation] Extend pass to do COPY source forwarding
This change extends MachineCopyPropagation to do COPY source forwarding.
This change also extends the MachineCopyPropagation pass to be able to
be run during register allocation, after physical registers have been
assigned, but before the virtual registers have been re-written, which
allows it to remove virtual register COPY LiveIntervals that become dead
through the forwarding of all of their uses.
llvm-svn: 314729
|
|
|
|
|
|
|
|
|
|
| |
LR is an untypical callee saved register in that it is restored into a
different register (PC) and thus does not live-out of the return block.
This case requires the `Restored` flag in CalleeSavedInfo to be cleared.
This fixes a number of cases where this wasn't handled correctly yet.
llvm-svn: 314471
|
|
|
|
|
|
|
|
|
|
| |
forwarding""
This crashes on boringSSL on PPC (will send reduced testcase)
This reverts commit r312328.
llvm-svn: 312490
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issues addressed since original review:
- Moved removal of dead instructions found by
LiveIntervals::shrinkToUses() outside of loop iterating over
instructions to avoid instructions being deleted while pointed to by
iterator.
- Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
- The pass no longer forwards COPYs to physical register uses, since
doing so can break code that implicitly relies on the physical
register number of the use.
- The pass no longer forwards COPYs to undef uses, since doing so
can break the machine verifier by creating LiveRanges that don't
end on a use (since the undef operand is not considered a use).
[MachineCopyPropagation] Extend pass to do COPY source forwarding
This change extends MachineCopyPropagation to do COPY source forwarding.
This change also extends the MachineCopyPropagation pass to be able to
be run during register allocation, after physical registers have been
assigned, but before the virtual registers have been re-written, which
allows it to remove virtual register COPY LiveIntervals that become dead
through the forwarding of all of their uses.
llvm-svn: 312328
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
source forwarding""
It caused PR34387: Assertion failed: (RegNo < NumRegs && "Attempting to access record for invalid register number!")
> Issues identified by buildbots addressed since original review:
> - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
> - The pass no longer forwards COPYs to physical register uses, since
> doing so can break code that implicitly relies on the physical
> register number of the use.
> - The pass no longer forwards COPYs to undef uses, since doing so
> can break the machine verifier by creating LiveRanges that don't
> end on a use (since the undef operand is not considered a use).
>
> [MachineCopyPropagation] Extend pass to do COPY source forwarding
>
> This change extends MachineCopyPropagation to do COPY source forwarding.
>
> This change also extends the MachineCopyPropagation pass to be able to
> be run during register allocation, after physical registers have been
> assigned, but before the virtual registers have been re-written, which
> allows it to remove virtual register COPY LiveIntervals that become dead
> through the forwarding of all of their uses.
llvm-svn: 312178
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issues identified by buildbots addressed since original review:
- Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
- The pass no longer forwards COPYs to physical register uses, since
doing so can break code that implicitly relies on the physical
register number of the use.
- The pass no longer forwards COPYs to undef uses, since doing so
can break the machine verifier by creating LiveRanges that don't
end on a use (since the undef operand is not considered a use).
[MachineCopyPropagation] Extend pass to do COPY source forwarding
This change extends MachineCopyPropagation to do COPY source forwarding.
This change also extends the MachineCopyPropagation pass to be able to
be run during register allocation, after physical registers have been
assigned, but before the virtual registers have been re-written, which
allows it to remove virtual register COPY LiveIntervals that become dead
through the forwarding of all of their uses.
llvm-svn: 312154
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
accepting illegal modes
ARMTargetLowering::isLegalAddressingMode can accept illegal addressing modes
for the Thumb1 target. This causes generation of redundant code and affects
performance.
This fixes PR34106: https://bugs.llvm.org/show_bug.cgi?id=34106
Differential Revision: https://reviews.llvm.org/D36467
llvm-svn: 311649
|
|
|
|
|
|
|
|
|
|
|
| |
round 2
This reverts commit r311135.
sanitizer-x86_64-linux-android buildbot is timing out with just this
patch applied.
llvm-svn: 311142
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
forwarding"
Two issues identified by buildbots were addressed:
- The pass no longer forwards COPYs to physical register uses, since
doing so can break code that implicitly relies on the physical
register number of the use.
- The pass no longer forwards COPYs to undef uses, since doing so
can break the machine verifier by creating LiveRanges that don't
end on a use (since the undef operand is not considered a use).
[MachineCopyPropagation] Extend pass to do COPY source forwarding
This change extends MachineCopyPropagation to do COPY source forwarding.
This change also extends the MachineCopyPropagation pass to be able to
be run during register allocation, after physical registers have been
assigned, but before the virtual registers have been re-written, which
allows it to remove virtual register COPY LiveIntervals that become dead
through the forwarding of all of their uses.
Reviewers: qcolombet, javed.absar, MatzeB, jonpa
Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny
Differential Revision: https://reviews.llvm.org/D30751
llvm-svn: 311135
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r311038.
Several buildbots are breaking, and at least one appears to be due to
the forwarding of physical regs enabled by this change. Reverting while
I investigate further.
llvm-svn: 311062
|