| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current implementation of ThumbRegisterInfo::saveScavengerRegister
is bad for two reasons: one, it's buggy, and two, it blocks using R12
for other optimizations. So this patch gets rid of it, and adds the
necessary support for using an ordinary emergency spill slot on Thumb1.
(Specifically, I think saveScavengerRegister was broken by r305625, and
nobody noticed for two years because the codepath is almost never used.
The new code will also probably not be used much, but it now has better
tests, and if we fail to emit a necessary emergency spill slot we get a
reasonable error message instead of a miscompile.)
A rough outline of the changes in the patch:
1. Gets rid of ThumbRegisterInfo::saveScavengerRegister.
2. Modifies ARMFrameLowering::determineCalleeSaves to allocate an
emergency spill slot for Thumb1.
3. Implements useFPForScavengingIndex, so the emergency spill slot isn't
placed at a negative offset from FP on Thumb1.
4. Modifies the heuristics for allocating an emergency spill slot to
support Thumb1. This includes fixing ExtraCSSpill so we don't try to
use "lr" as a substitute for allocating an emergency spill slot.
5. Allocates a base pointer in more cases, so the emergency spill slot
is always accessible.
6. Modifies ARMFrameLowering::ResolveFrameIndexReference to compute the
right offset in the new cases where we're forcing a base pointer.
7. Ensures we never generate a load or store with an offset outside of
its frame object. This makes the heuristics more straightforward.
8. Changes Thumb1 prologue and epilogue emission so it never uses
register scavenging.
Some of the changes to the emergency spill slot heuristics in
determineCalleeSaves affect ARM/Thumb2; hopefully, they should allow
the compiler to avoid allocating an emergency spill slot in cases
where it isn't necessary. The rest of the changes should only affect
Thumb1.
Differential Revision: https://reviews.llvm.org/D63677
llvm-svn: 364490
|
| |
|
|
| |
llvm-svn: 363538
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch changes MIR stack-id from an integer to an enum,
and adds printing/parsing support for this in MIR files. The default
stack-id '0' is now renamed to 'default'.
This should make MIR tests that have stack objects with different stack-ids
more descriptive. It also clarifies code operating on StackID.
Reviewers: arsenm, thegameg, qcolombet
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D60137
llvm-svn: 363533
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
of loop
Current findBestLoopTop can find and move one kind of block to top, a latch block has one successor. Another common case is:
* a latch block
* it has two successors, one is loop header, another is exit
* it has more than one predecessors
If it is below one of its predecessors P, only P can fall through to it, all other predecessors need a jump to it, and another conditional jump to loop header. If it is moved before loop header, all its predecessors jump to it, then fall through to loop header. So all its predecessors except P can reduce one taken branch.
Differential Revision: https://reviews.llvm.org/D43256
llvm-svn: 363471
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the stack
Summary:
Relate bug: https://bugs.llvm.org/show_bug.cgi?id=37472
The shrink wrapping pass prematurally restores the stack, at a point where the stack might still be accessed.
Taking an exception can cause the stack to be corrupted.
As a first approach, this patch is overly conservative, assuming that any instruction that may load or store could access
the stack.
Reviewers: dmgreen, qcolombet
Reviewed By: qcolombet
Subscribers: simpal01, efriedma, eli.friedman, javed.absar, llvm-commits, eugenis, chill, carwil, thegameg
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63152
llvm-svn: 363265
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This extends Krzysztof Parzyszek's X86-specific solution
(https://reviews.llvm.org/D60208) to the generic code pointed out by
James Y Knight.
Reviewers: kparzysz, craig.topper, nickdesaulniers
Subscribers: efriedma, sdardis, nemanjai, javed.absar, eraman, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, llvm-commits, srhines, void, nickdesaulniers, jyknight
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60224
llvm-svn: 361404
|
| |
|
|
|
|
|
|
| |
This adds a simple extra test for constant hoisting to show it's
usefulness with constant addresses like those seen in memory
mapped registers in embedded systems.
llvm-svn: 358114
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
There's an existing optimization for x != C, but somehow it was missing
a special case for 0.
While I'm here, also cleaned up the code/comments a bit: the second
value produced by the MERGE_VALUES was actually dead, since a CMOV only
produces one result.
Differential Revision: https://reviews.llvm.org/D59616
llvm-svn: 357437
|
| |
|
|
|
|
| |
Don't expand ISD::ABS node if its legal.
llvm-svn: 356661
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This takes sequences like "mov r4, sp; str r0, [r4]", and optimizes them
to something like "str r0, [sp]".
For regular stack variables, this optimization was already implemented:
we lower loads and stores using frame indexes, which are expanded later.
However, when constructing a call frame for a call with more than four
arguments, the existing optimization doesn't apply. We need to use
stores which are actually relative to the current value of sp, and don't
have an associated frame index.
This patch adds a special case to handle that construct. At the DAG
level, this is an ISD::STORE where the address is a CopyFromReg from SP
(plus a small constant offset).
This applies only to Thumb1: in Thumb2 or ARM mode, a regular store
instruction can access SP directly, so the COPY gets eliminated by
existing code.
The change to ARMDAGToDAGISel::SelectThumbAddrModeSP is a related
cleanup: we shouldn't pretend that it can select anything other than
frame indexes.
Differential Revision: https://reviews.llvm.org/D59568
llvm-svn: 356601
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change does two things. One, it ensures compilation will abort
instead of miscompiling if ARMFrameLowering::determineCalleeSaves
chooses not to save LR in a case where it's necessary. Two, it changes
the way we estimate the size of a function to be more conservative in
the presence of constant pool entries and jump tables.
EstimateFunctionSizeInBytes probably still isn't really conservative
enough, but I'm not sure how we can come up with a reliable estimate
before constant islands runs.
Differential Revision: https://reviews.llvm.org/D59439
llvm-svn: 356527
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
user later.
Summary:
A number of optimizations are inhibited by single-use TokenFactors not
being merged into the TokenFactor using it. This makes we consider if
we can do the merge immediately.
Most tests changes here are due to the change in visitation causing
minor reorderings and associated reassociation of paired memory
operations.
CodeGen tests with non-reordering changes:
X86/aligned-variadic.ll -- memory-based add folded into stored leaq
value.
X86/constant-combiners.ll -- Optimizes out overlap between stores.
X86/pr40631_deadstore_elision -- folds constant byte store into
preceding quad word constant store.
Reviewers: RKSimon, craig.topper, spatel, efriedma, courbet
Reviewed By: courbet
Subscribers: dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, eraman, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59260
llvm-svn: 356068
|
| |
|
|
|
|
|
|
|
| |
Targets can potentially emit more efficient code if they know address
computations never overflow. For example ILP32 code on AArch64 (which only has
64-bit address computation) can ignore the possibility of overflow with this
extra information.
llvm-svn: 355926
|
| |
|
|
|
|
|
|
| |
Committed on behalf of @ikulagin (Ivan Kulagin)
Differential Revision: https://reviews.llvm.org/D52138
llvm-svn: 355197
|
| |
|
|
|
|
|
|
|
|
|
| |
This adds a few extra Thumb1 opcodes to improve the peephole opimisers
ability to remove redundant cmp instructions. tADC and tSBC require
a small fixup to prevent MOVS being moved past the instruction, giving
the wrong flags.
Differential Revision: https://reviews.llvm.org/D58281
llvm-svn: 354791
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This adds a number of missing Thumb1 opcodes so that the peephole optimiser can
remove redundant CMP instructions.
Reapplying this after the first attempt broke non-thumb1 code as the t2ADDri
instruction can be used with frame indices. In thumb1 we use tADDframe.
Differential Revision: https://reviews.llvm.org/D57833
llvm-svn: 354667
|
| |
|
|
|
|
|
|
|
| |
optimisation of CMPs
I believe it's causing bootstrap failures for A32 code. I'll take a look at
what's wrong.
llvm-svn: 354569
|
| |
|
|
|
|
|
|
|
| |
This adds a number of missing Thumb1 opcodes so that the peephole optimiser can
remove redundant CMP instructions.
Differential Revision: https://reviews.llvm.org/D57833
llvm-svn: 354564
|
| |
|
|
|
|
|
|
|
|
| |
We may leave behind incorrect dead flags on instructions that are CSE'd. Make
sure we remove the dead flags on physical registers to prevent other incorrect
code motion.
Differential Revision: https://reviews.llvm.org/D58115
llvm-svn: 354443
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The SMULO/UMULO DAG nodes, when not directly supported by the target,
expand to a multiplication twice as wide. In case that the resulting
type is not legal, the legalizer cannot directly call the intrinsic
with the wide arguments; instead, it "pre-lowers" them by splitting
them in halves.
rL283203 made sure that on big endian targets, the legalizer passes
the argument halves in the correct order. It did not do the same
for the return value halves because the existing code used a hack;
it put an illegal type into DAG and hoped that nothing would break
and it would be correctly lowered elsewhere.
rL307207 fixed this, handling return value halves similar to how
argument handles are handled, but did not take big-endian targets
into account.
This commit fixes the expansion on big-endian targets, such as
the out-of-tree OR1K target.
Reviewers: eli.friedman, vadimcn
Subscribers: george-hopkins, efriedma, llvm-commits
Differential Revision: https://reviews.llvm.org/D45355
llvm-svn: 353854
|
| |
|
|
|
|
|
|
|
| |
We need to clear the kill flags on both SingleValReg and OldReg, to ensure they remain
conservatively correct.
Differential Revision: https://reviews.llvm.org/D58114
llvm-svn: 353847
|
| |
|
|
|
|
|
|
|
|
| |
This prevents Constant Hoisting from pulling the constant out of the block,
allowing us to still produce LDRH/UXTH nodes. LDRB/UXTB (255) is already cheap
by the default getIntImmCost, but I've added it for clarity.
Differential Revision: https://reviews.llvm.org/D57671
llvm-svn: 353040
|
| |
|
|
| |
llvm-svn: 353039
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch may seem familiar... but my previous patch handled the
equivalent lsls+and, not this case. Usually instcombine puts the
"and" after the shift, so this case doesn't come up. However, if the
shift comes out of a GEP, it won't get canonicalized by instcombine,
and DAGCombine doesn't have an equivalent transform.
This also modifies isDesirableToCommuteWithShift to suppress DAGCombine
transforms which would make the overall code worse.
I'm not really happy adding a bunch of code to handle this, but it would
probably be tricky to substantially improve the behavior of DAGCombine
here.
Differential Revision: https://reviews.llvm.org/D56032
llvm-svn: 351776
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Part of the effort to refactoring frame pointer code generation. We used
to use two function attributes "no-frame-pointer-elim" and
"no-frame-pointer-elim-non-leaf" to represent three kinds of frame
pointer usage: (all) frames use frame pointer, (non-leaf) frames use
frame pointer, (none) frame use frame pointer. This CL makes the idea
explicit by using only one enum function attribute "frame-pointer"
Option "-frame-pointer=" replaces "-disable-fp-elim" for tools such as
llc.
"no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf" are still
supported for easy migration to "frame-pointer".
tests are mostly updated with
// replace command line args ‘-disable-fp-elim=false’ with ‘-frame-pointer=none’
grep -iIrnl '\-disable-fp-elim=false' * | xargs sed -i '' -e "s/-disable-fp-elim=false/-frame-pointer=none/g"
// replace command line args ‘-disable-fp-elim’ with ‘-frame-pointer=all’
grep -iIrnl '\-disable-fp-elim' * | xargs sed -i '' -e "s/-disable-fp-elim/-frame-pointer=all/g"
Patch by Yuanfang Chen (tabloid.adroit)!
Differential Revision: https://reviews.llvm.org/D56351
llvm-svn: 351049
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This saves materializing the immediate. The additional forms are less
common (they don't usually show up for bitfield insert/extract), but
they're still relevant.
I had to add a new target hook to prevent DAGCombine from reversing the
transform. That isn't the only possible way to solve the conflict, but
it seems straightforward enough.
Differential Revision: https://reviews.llvm.org/D55630
llvm-svn: 349857
|
| |
|
|
|
|
|
|
|
|
|
| |
It costs nothing to spill an IMPLICIT_DEF value (the only spill code that's
generated is a KILL of the value), so when creating split constraints if the
live-out value is IMPLICIT_DEF the exit constraint should be DontCare instead
of PrefReg.
Differential Revision: https://reviews.llvm.org/D55652
llvm-svn: 349151
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The DAGCombiner tries to SimplifySelectCC as follows:
select_cc(x, y, 16, 0, cc) -> shl(zext(set_cc(x, y, cc)), 4)
It can't cope with the situation of reordered operands:
select_cc(x, y, 0, 16, cc)
In that case we just need to swap the operands and invert the Condition Code:
select_cc(x, y, 16, 0, ~cc)
Differential Revision: https://reviews.llvm.org/D53236
llvm-svn: 346484
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The lowering was missing live-ins in certain cases, like a sequence of
multiple tMOVCCr_pseudo instructions. This would lead to a verifier
failure, and on pre-v6 Thumb CPSR would be incorrectly clobbered.
For reasons I don't completely understand, it's hard to get a sequence
of multiple tMOVCCr_pseudo instructions; the issue only seems to show up
with 64-bit comparisons where the result is zero-extended. I added some
extra testcases in case that changes in the future. Probably some
optimization opportunities here if anyone is interested. (@test_slt_not
is the case that was getting miscompiled.)
The code to check the liveness of CPSR was stolen from
X86ISelLowering.cpp; maybe it could be refactored into common helper,
but I have no idea where to put it.
Differential Revision: https://reviews.llvm.org/D54192
llvm-svn: 346355
|
| |
|
|
|
|
|
|
|
| |
Shows up rarely for 64-bit arithmetic, more frequently for the compare
patterns added in r325323.
Differential Revision: https://reviews.llvm.org/D53848
llvm-svn: 345782
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The "dead" markings allow existing target-independent optimizations,
like MachineSink, to trigger more frequently. The CPSR defs would have
eventually been marked dead by LiveVariables, so this only affects
optimizations before regalloc.
The ARMBaseInstrInfo.cpp change is fixing a bug which is only visible
with this change: the transform adds a use to an otherwise dead def
of CPSR. This is covered by existing regression tests.
thumb2-tbh.ll breaks for Thumb1 due to MachineLICM changing the
generated code; I'll fix it in D53452.
Differential Revision: https://reviews.llvm.org/D53453
llvm-svn: 345420
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit bd7b44f35ee9fbe365eb25ce55437ea793b39346.
Reland r342994: disabled the optimization and explicitly enable it in test.
-mllvm -consthoist-min-num-to-rebase<unsigned>=0
[ConstHoist] Do not rebase single (or few) dependent constant
If an instance (InsertionPoint or IP) of Base constant A has only one or few
rebased constants depending on it, do NOT rebase. One extra ADD instruction is
required to materialize each rebased constant, assuming A and the rebased have
the same materialization cost.
Differential Revision: https://reviews.llvm.org/D52243
llvm-svn: 343053
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This caused a couple test failures on a bot:
CodeGen/X86/constant-hoisting-bfi.ll
Transforms/ConstantHoisting/X86/ehpad.ll
Example:
http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/53575/
llvm-svn: 343005
|
| |
|
|
|
|
|
|
|
|
|
| |
If an instance (InsertionPoint or IP) of Base constant A has only one or few
rebased constants depending on it, do NOT rebase. One extra ADD instruction is
required to materialize each rebased constant, assuming A and the rebased have
the same materialization cost.
Differential Revision: https://reviews.llvm.org/D52243
llvm-svn: 342994
|
| |
|
|
|
|
|
|
|
| |
A simple MOVS rd, imm8 can materialize [-128, 127] in signed i8 type or
[0, 255] in unsigned i8 type on Thumb1.
Differential Revision: https://reviews.llvm.org/D52257
llvm-svn: 342898
|
| |
|
|
|
|
|
|
| |
If there is an unused def, this would previously
report that the register was live. Check for uses
first so that it is reported as dead if never used.
llvm-svn: 341027
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is no way in the universe, that doing a full-width division in
software will be faster than doing overflowing multiplication in
software in the first place, especially given that this same full-width
multiplication needs to be done anyway.
This patch replaces the previous implementation with a direct lowering
into an overflowing multiplication algorithm based on half-width
operations.
Correctness of the algorithm was verified by exhaustively checking the
output of this algorithm for overflowing multiplication of 16 bit
integers against an obviously correct widening multiplication. Baring
any oversights introduced by porting the algorithm to DAG, confidence in
correctness of this algorithm is extremely high.
Following table shows the change in both t = runtime and s = space. The
change is expressed as a multiplier of original, so anything under 1 is
“better” and anything above 1 is worse.
+-------+-----------+-----------+-------------+-------------+
| Arch | u64*u64 t | u64*u64 s | u128*u128 t | u128*u128 s |
+-------+-----------+-----------+-------------+-------------+
| X64 | - | - | ~0.5 | ~0.64 |
| i686 | ~0.5 | ~0.6666 | ~0.05 | ~0.9 |
| armv7 | - | ~0.75 | - | ~1.4 |
+-------+-----------+-----------+-------------+-------------+
Performance numbers have been collected by running overflowing
multiplication in a loop under `perf` on two x86_64 (one Intel Haswell,
other AMD Ryzen) based machines. Size numbers have been collected by
looking at the size of function containing an overflowing multiply in
a loop.
All in all, it can be seen that both performance and size has improved
except in the case of armv7 where code size has regressed for 128-bit
multiply. u128*u128 overflowing multiply on 32-bit platforms seem to
benefit from this change a lot, taking only 5% of the time compared to
original algorithm to calculate the same thing.
The final benefit of this change is that LLVM is now capable of lowering
the overflowing unsigned multiply for integers of any bit-width as long
as the target is capable of lowering regular multiplication for the same
bit-width. Previously, 128-bit overflowing multiply was the widest
possible.
Patch by Simonas Kazlauskas!
Differential Revision: https://reviews.llvm.org/D50310
llvm-svn: 339922
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM normally prefers to minimize the number of bits set in an AND
immediate, but that doesn't always match the available ARM instructions.
In Thumb1 mode, prefer uxtb or uxth where possible; otherwise, prefer
a two-instruction sequence movs+ands or movs+bics.
Some potential improvements outlined in
ARMTargetLowering::targetShrinkDemandedConstant, but seems to work
pretty well already.
The ARMISelDAGToDAG fix ensures we don't generate an invalid UBFX
instruction due to a larger-than-expected mask. (It's orthogonal, in
some sense, but as far as I can tell it's either impossible or nearly
impossible to reproduce the bug without this change.)
According to my testing, this seems to consistently improve codesize by
a small amount by forming bic more often for ISD::AND with an immediate.
Differential Revision: https://reviews.llvm.org/D50030
llvm-svn: 339472
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Saves materializing the immediate for the "ands".
Corresponding patterns exist for lsrs+lsls, but that seems less common
in practice.
Now implemented as a DAGCombine.
Differential Revision: https://reviews.llvm.org/D49585
llvm-svn: 337945
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This is marginally helpful for removing redundant extensions, and the
code is easier to read, so it seems like an all-around win. In the new
test i8-phi-ext.ll, we used to emit an AssertSext i8; now we emit an
AssertZext i2, which allows the extension of the return value to be
eliminated.
Differential Revision: https://reviews.llvm.org/D49004
llvm-svn: 336868
|
| |
|
|
|
|
|
|
|
|
| |
Even if a comparison isn't legal, we should try to prefer constants
which can be materialized with a two-instruction sequence. (Thinking
about it a bit more, there might be some more clever sequence we could
generate for certain comparisons invoving powers of two, but I'm not
sure exactly what that would look like.)
llvm-svn: 335003
|
| |
|
|
|
|
|
| |
When the result of masking is truncated to i16, we should try to use
"bic" instead of "and".
llvm-svn: 335001
|
| |
|
|
|
|
|
|
|
|
| |
It looks like this got left in by accident in r289794; I can't think of
any reason this check would be necessary. (Maybe it was meant to be a
check that the AND has one use? But we check that a few lines earlier.)
Differential Revision: https://reviews.llvm.org/D47921
llvm-svn: 334322
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Should fix UBSan bot by also checking there's no "uwtable" attribute
before skipping. Otherwise the unwind table will be useless since its
moves expect CSRs to actually be preserved.
A noreturn nounwind function can be expected to never return in any way, and by
never returning it will also never have to restore any callee-saved registers
for its caller. This makes it possible to skip spills of those registers during
function entry, saving some stack space and time in the process. This is rather
useful for embedded targets with limited stack space.
Should fix PR9970.
Patch mostly by myeisha (pmb).
llvm-svn: 329494
|
| |
|
|
|
|
|
|
| |
Breaks ubsan test TestCases/Misc/missing_return.cpp on ARM
This reverts commit r329287
llvm-svn: 329486
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
A noreturn nounwind function can be expected to never return in any way, and by
never returning it will also never have to restore any callee-saved registers
for its caller. This makes it possible to skip spills of those registers during
function entry, saving some stack space and time in the process. This is rather
useful for embedded targets with limited stack space.
Should fix PR9970.
Patch by myeisha (pmb).
llvm-svn: 329287
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes PR36658, "Constant pool entry out of range!" in Thumb1 mode.
In ARMConstantIslands::optimizeThumb2JumpTables() in Thumb1 mode,
adjustBBOffsetsAfter() is not calculating postOffset correctly by
properly accounting for the padding that is required for the constant pool
that immediately follows the jump table branch instruction.
Reviewers: t.p.northover, eli.friedman
Reviewed By: t.p.northover
Subscribers: chrib, tstellar, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D44709
llvm-svn: 328341
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When an Armv6m function dynamically re-aligns the stack, access to incoming
stack arguments (and to stack area, allocated for register varargs) is done via
SP, which is incorrect, as the SP is offset by an unknown amount relative to the
value of SP upon function entry.
This patch fixes it, by making access to "fixed" frame objects be done via FP
when the function needs stack re-alignment. It also changes the access to
"fixed" frame objects be done via FP (instead of using R6/BP) also for the case
when the stack frame contains variable sized objects. This should allow more
objects to fit within the immediate offset of the load instruction.
All of the above via a small refactoring to reuse the existing
`ARMFrameLowering::ResolveFrameIndexReference.`
Differential Revision: https://reviews.llvm.org/D43566
llvm-svn: 326584
|
| |
|
|
|
|
|
|
| |
Re-enable commit r323991 now that r325931 has been committed to make
MachineOperand::isRenamable() check more conservative w.r.t. code
changes and opt-in on a per-target basis.
llvm-svn: 326208
|
| |
|
|
|
|
|
|
|
|
| |
Fixup to rL325573 for large xor constants.
Thanks to Eli Friedman for the catch.
Differential revision: https://reviews.llvm.org/D43549
llvm-svn: 325761
|