| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
COFF doesn't have sections with mergeable contents. Instead, each
constant pool entry ends up in a COMDAT section. The linker, when
choosing between COMDAT sections, doesn't choose the max alignment of
the two sections. You just get whatever alignment was on the section.
If one constant needed a higher alignment in one object file from
another one, then we will get into trouble if the linker chooses the
lower alignment one.
Instead, lets promote the alignment of the constant pool entry to make
sure we don't use an under aligned constant with an instruction which
assumed otherwise.
This fixes PR26680.
llvm-svn: 261462
|
| |
|
|
|
|
| |
the lowest vector element
llvm-svn: 261460
|
| |
|
|
|
|
|
| |
The register coloring pass may also need to be involved in order to
optimally sort registers.
llvm-svn: 261458
|
| |
|
|
| |
llvm-svn: 261457
|
| |
|
|
|
|
|
|
| |
The stack pointer is bumped when there is a frame pointer or when there
are static-size objects, but was only getting written back when there
were static-size objects.
llvm-svn: 261453
|
| |
|
|
|
|
|
|
| |
The instructions are the same, but fewer locals are used.
Differential Revision: http://reviews.llvm.org/D17428
llvm-svn: 261452
|
| |
|
|
|
|
|
|
|
| |
TailDuplicate can run on either on SSA code or non-SSA code, as indicated to
it by MRI->isSSA() ("PreRegAlloc" here). TailDuplicate does extra work to
preserve SSA invariants when it duplicates code. This patch makes it skip
some of this extra work in the case where the code is not in SSA form.
llvm-svn: 261450
|
| |
|
|
|
|
|
|
| |
The patch has a necessary call to a function inside an assert. Which is fine
when you have asserts turned on. Not so much when they're off. Sorry about
the regression.
llvm-svn: 261447
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch corresponds to review:
http://reviews.llvm.org/D17294
It ensures that whatever block we are emitting the prologue/epilogue into, we
have the necessary scratch registers. It takes away the hard-coded register
numbers for use as scratch registers as registers that are guaranteed to be
available in the function prologue/epilogue are not guaranteed to be available
within the function body. Since we shrink-wrap, the prologue/epilogue may end
up in the function body.
llvm-svn: 261441
|
| |
|
|
| |
llvm-svn: 261437
|
| |
|
|
|
|
|
|
| |
(PR26667)
Fixed a bug introduced by D16683 when a binary shuffle is simplified to a unary shuffle (with undef/zero sentinel mask indices) - if this resulted in only the second input being used combineX86ShuffleChain failed to take this into account and still referenced the first input.
llvm-svn: 261434
|
| |
|
|
|
|
| |
First small step towards fixing PR26667 - we need to ensure that combineX86ShuffleChain only gets called with a valid shuffle input node (a similar issue was found in D17041).
llvm-svn: 261433
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the algorithm easily degrades into quadratic memory and time complexity.
The easiest example is a long chain of BBs that don't otherwise use a
location. The caching will add an entry for every intermediate block and
limiting the number of results doesn't help as no results are produced
until a definition is found.
Introduce a limit similar to the existing instructions-per-block limit.
This limit counts the total number of blocks checked. If the limit is
reached, entries are considered unknown. The initial value is 1000,
which avoids regressions for normal sized functions while still
limiting edge cases to reasnable memory consumption and execution time.
Differential Revision: http://reviews.llvm.org/D16123
llvm-svn: 261430
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D16877
llvm-svn: 261429
|
| |
|
|
|
|
|
|
|
|
| |
Handle address displacement operands of a type other than Immediate or Global in LEAs and load/stores.
Ref: https://llvm.org/bugs/show_bug.cgi?id=26575
Differential Revision: http://reviews.llvm.org/D17374
llvm-svn: 261428
|
| |
|
|
|
|
| |
No functional change intended.
llvm-svn: 261427
|
| |
|
|
|
|
|
|
| |
No functional change intended. Copying small (<= 64 bits) APInts isn't
expensive but bloats code by generating the slow path everywhere. Moving
doesn't care about the size of the value.
llvm-svn: 261426
|
| |
|
|
|
|
|
| |
This has no observable behavior change, it just makes the state
insertion pass look a little more like normal passes.
llvm-svn: 261420
|
| |
|
|
| |
llvm-svn: 261417
|
| |
|
|
| |
llvm-svn: 261411
|
| |
|
|
|
|
| |
registry and test it.
llvm-svn: 261410
|
| |
|
|
| |
llvm-svn: 261409
|
| |
|
|
| |
llvm-svn: 261408
|
| |
|
|
|
|
|
|
| |
This avoids unnecessarily passing them around when calling helper
functions. It may also be slightly faster to call clear() on the
datastructures instead of freshly initializing them for each block.
llvm-svn: 261407
|
| |
|
|
| |
llvm-svn: 261406
|
| |
|
|
|
|
| |
'impossible' condition
llvm-svn: 261405
|
| |
|
|
|
|
|
|
|
| |
tests over to exercise this code.
This uncovered a few missing bits here and there in the analysis, but
nothing interesting.
llvm-svn: 261404
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
it to actually test the new pass manager AA wiring.
This patch was extracted from the (somewhat too large) D12357 and
rebosed on top of the slightly different design of the new pass manager
AA wiring that I just landed. With this we can start testing the AA in
a thorough way with the new pass manager.
Some minor cleanups to the code in the pass was necessitated here, but
otherwise it is a very minimal change.
Differential Revision: http://reviews.llvm.org/D17372
llvm-svn: 261403
|
| |
|
|
|
|
|
| |
It reads odd since most other places name a `SCEV *` as `S`. Pure
renaming change.
llvm-svn: 261393
|
| |
|
|
|
|
| |
`{A, B}` reads cleaner than `std::make_pair(A, B)`.
llvm-svn: 261392
|
| |
|
|
|
|
|
|
|
|
| |
Cleanuppads may be merged together if one is the only predecessor of the
other in which case a simple transform can be performed: replace the
a cleanupret with a branch and remove an unnecessary cleanuppad.
Differential Revision: http://reviews.llvm.org/D17459
llvm-svn: 261390
|
| |
|
|
|
|
|
|
|
|
| |
TLSADDR nodes are lowered into actuall calls inside MC. In order to prevent
shrink-wrapping from pushing prologue/epilogue past them (which result
in TLS variables being accessed before the stack frame is set up), we
put markers, so that the stack gets adjusted properly.
Thanks to Quentin Colombet for guidance/help on how to fix this problem!
llvm-svn: 261387
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Instead of trying to replace SMRD instructions with a VGPR base pointer
with an equivalent MUBUF instruction, we now copy the base pointer to
SGPRs using v_readfirstlane.
This is safe to do, because any load selected as an SMRD instruction
has been proven to have a uniform base pointer, so each thread in the
wave will have the same pointer value in VGPRs.
This will fix some errors on VI from trying to replace SMRD instructions
with addr64-enabled MUBUF instructions that don't exist.
Reviewers: arsenm, cfang, nhaehnle
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D17305
llvm-svn: 261385
|
| |
|
|
|
|
| |
PR26485
llvm-svn: 261384
|
| |
|
|
|
|
| |
These are supposed to be file checksum table offsets, not file ids.
llvm-svn: 261379
|
| |
|
|
|
|
| |
I stumbled upon this while debugging a lowering bug.
llvm-svn: 261371
|
| |
|
|
| |
llvm-svn: 261370
|
| |
|
|
|
|
|
|
| |
calculator by ignoring specific instructions."
It caused PR26509.
llvm-svn: 261368
|
| |
|
|
|
|
| |
Turns out the new nop sequences aren't actually nops on x86_64 (PR26554).
llvm-svn: 261365
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When optimizing for size, sqrt calls can be incorrectly selected as
AVX512 VSQRT instructions. This is because X86InstrAVX512.td has a
`Requires<[OptForSize]>` in its `avx512_sqrt_scalar` multiclass
definition. Even if the target does not support AVX512, the class can
apparently still be chosen, leading to an incorrect selection of
`vsqrtss`.
In PR26625, this lead to an assertion: Reg >= X86::FP0 && Reg <=
X86::FP6 && "Expected FP register!", because the `vsqrtss` instruction
requires an XMM register, which is not available on i686 CPUs.
Reviewers: grosbach, resistor, joker.eph
Subscribers: spatel, emaste, llvm-commits
Differential Revision: http://reviews.llvm.org/D17414
llvm-svn: 261360
|
| |
|
|
|
|
| |
Use auto, bring file up to coding standards etc.
llvm-svn: 261358
|
| |
|
|
| |
llvm-svn: 261354
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
wrapping.
Summary: See bug https://llvm.org/bugs/show_bug.cgi?id=26642
Reviewers: qcolombet, t.p.northover
Subscribers: aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D17350
llvm-svn: 261349
|
| |
|
|
|
|
|
|
|
|
| |
Now that we don't always add an element to AllocatedStackSlots if we
don't find a pre-existing unallocated stack slot, bumping
StatepointMaxSlotsRequired to `NumSlots + 1` is not correct. Instead
bump the statistic near the push_back, to
Builder.FuncInfo.StatepointStackSlots.size().
llvm-svn: 261348
|
| |
|
|
|
|
|
|
| |
The check on MFI->getObjectSize() has to be on the FrameIndex, not on
the index of the FrameIndex in AllocatedStackSlots. Weirdly, the tests
I added in rL261336 didn't catch this.
llvm-svn: 261347
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch enables the vectorization of first-order recurrences. A first-order
recurrence is a non-reduction recurrence relation in which the value of the
recurrence in the current loop iteration equals a value defined in the previous
iteration. The load PRE of the GVN pass often creates these recurrences by
hoisting loads from within loops.
In this patch, we add a new recurrence kind for first-order phi nodes and
attempt to vectorize them if possible. Vectorization is performed by shuffling
the values for the current and previous iterations. The vectorization cost
estimate is updated to account for the added shuffle instruction.
Contributed-by: Matthew Simpson and Chad Rosier <mcrosier@codeaurora.org>
Differential Revision: http://reviews.llvm.org/D16197
llvm-svn: 261346
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NFCI. They key motivation here is that I'd like to use
SmallBitVector::all() in a later change. Also, using a bit vector here
seemed better in general.
The only interesting change here is that in the failure case of
allocateStackSlot, we no longer (the equivalent of) push_back(true) to
AllocatedStackSlots. As far as I can tell, this is fine, since we'd
never re-use those slots in the same StatepointLoweringState instance.
Technically there was no need to change the operator[] type accesses to
set() and test(), but I thought it'd be nice to make it obvious that
we're using something other than a std::vector like thing.
llvm-svn: 261337
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
allocateStackSlot did not consider the size of the value to be spilled
before deciding to re-use a spill slot. This was originally okay (since
originally we'd only ever spill pointers), but it became not okay when
we changed our scheme to directly spill vectors of pointers.
While this change fixes the bug pointed out, it has two performance
caveats:
- It matches spill slot and spillee size exactly, while in theory we
can spill, e.g., an 8 byte pointer into a 16 byte slot. This is
slightly complicated to fix since in the stackmaps section, we report
the size of the spill slot as the size of the "indirect value"; and
if they're no longer equivalent, we'll have to keep track of the
(indirect) value size separately from the stack slot size.
- It will "spuriously run out" of reusable slots, since we now have an
second check in the search loop in addition to the availablity
check (e.g. you had two free scalar slots, and you first ask for a
vector slot followed by a scalar slot). I'll fix this in a later
commit.
llvm-svn: 261336
|
| |
|
|
|
|
|
|
|
| |
This removes the unusual loop structure in allocateStackSlot in favor of
something more straightforward. I've also removed the cautionary
comment in the function, which I suspect is historical cruft now, and
confuses more than it enlightens.
llvm-svn: 261335
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
If we don't have the first and last access of an interleaved load group,
the first and last wide load in the loop can do an out of bounds
access. Even though we discard results from speculative loads,
this can cause problems, since it can technically generate page faults
(or worse).
We now discard interleaved load groups that don't have the first and
load in the group.
Reviewers: hfinkel, rengolin
Subscribers: rengolin, llvm-commits, mzolotukhin, anemet
Differential Revision: http://reviews.llvm.org/D17332
llvm-svn: 261331
|