| Commit message | Author | Age | Files | Lines |
| |
matching tables. Shaves about 5100 bytes from the X86 matcher table. NFC
llvm-svn: 262815
| |
Differential Revision: http://reviews.llvm.org/D17868
llvm-svn: 262774
| |
Rematerializing and merging into a bigger register class at the same
time requires the subregister range lane masks to be remapped to the
new register class.
This fixes http://llvm.org/PR26805
llvm-svn: 262768
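For illustration, a hedged sketch of the remapping step using the
existing TargetRegisterInfo hook (the helper and its surroundings are
assumptions, not the actual patch):

    #include "llvm/CodeGen/LiveInterval.h"
    #include "llvm/Target/TargetRegisterInfo.h"
    using namespace llvm;

    // Illustrative only: express each subregister range's lane mask in
    // terms of the bigger register class by composing it through SubIdx.
    static void remapLaneMasks(LiveInterval &LI, unsigned SubIdx,
                               const TargetRegisterInfo &TRI) {
      for (LiveInterval::SubRange &SR : LI.subranges())
        SR.LaneMask = TRI.composeSubRegIndexLaneMask(SubIdx, SR.LaneMask);
    }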
| |
Copy coalescing with subregister liveness enabled can reveal undef uses.
Previously this was only checked for the SrcReg in updateRegDefsUses(),
but we need to check DstReg as well.
llvm-svn: 262767
| |
llvm-svn: 262766
| |
The divrem combine assumed the type of the div/rem was simple, which
isn't necessarily true. This probably worked fine until r250825, since
it only saw legal types, but it now breaks when run as a
pre-type-legalization combine.
This fixes PR26835.
Differential Revision: http://reviews.llvm.org/D17878
llvm-svn: 262746
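A minimal sketch of the required guard, assuming the combine wants
MVT-based queries (function name hypothetical):

    #include "llvm/CodeGen/SelectionDAG.h"
    using namespace llvm;

    // Illustrative: pre-type-legalization combines can see types like
    // i128 that have no simple MVT, so bail out before getSimpleVT().
    static SDValue visitDivRemSketch(SDNode *N) {
      EVT VT = N->getValueType(0);
      if (!VT.isSimple())
        return SDValue(); // leave non-simple types to the legalizer
      MVT SimpleVT = VT.getSimpleVT();
      (void)SimpleVT; // ...combine using SimpleVT-based target queries...
      return SDValue();
    }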
| |
When div+rem calls on the same arguments are found, the ARM back-end
merges the two calls into one __aeabi_divmod call for values of up to
32 bits. However, for 64-bit values, which also have a lib call
(__aeabi_ldivmod), it wasn't merging the calls, and thus called
__aeabi_ldivmod twice and spilled the temporary results, which
generated pretty bad code.
This patch legalises 64-bit lib calls for divmod, so that now all the
spilling and the second call are gone. It also relaxes the DivRem
combiner a bit on the legal-type check: it was already checking for
isLegalOrCustom on every value, so the extra check for isTypeLegal was
redundant.
This is a second attempt, creating TLI.isOperationCustom, like
isOperationExpand, to make sure we only emit valid types or the ones
that were explicitly marked as custom. Now passing check-all and the
test-suite on x86, ARM and AArch64.
This patch fixes PR17193 (and a long-standing FIXME in the tests).
llvm-svn: 262738
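The new predicate plausibly mirrors the existing isOperationExpand
helper in TargetLowering; a sketch of that shape (verify against the
actual tree before relying on it):

    // Sketch: true only when the target explicitly marked this
    // (operation, type) pair as Custom in its action tables.
    bool isOperationCustom(unsigned Op, EVT VT) const {
      return getOperationAction(Op, VT) == Custom;
    }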
| |
Part of D15390.
llvm-svn: 262719
| |
llvm-svn: 262702
| |
Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns.
Differential Revision: http://reviews.llvm.org/D17691
llvm-svn: 262599
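For reference, a hedged sketch of emitting the in-register form that
this combine prefers over any-extend-plus-mask (helper name and setting
are assumptions):

    #include "llvm/CodeGen/SelectionDAG.h"
    using namespace llvm;

    // Illustrative: widen the low lanes of Src in one node instead of
    // ANY_EXTEND followed by an AND with a lane mask.
    static SDValue emitZExtInReg(SelectionDAG &DAG, const SDLoc &DL,
                                 EVT WideVT, SDValue Src) {
      return DAG.getNode(ISD::ZERO_EXTEND_VECTOR_INREG, DL, WideVT, Src);
    }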
| |
This reverts commit r262507, which broke some ARM buildbots.
llvm-svn: 262594
| |
Summary:
Removing MMOs is no longer our preferred behavior.
Reviewers: mcrosier, reames
Differential Revision: http://reviews.llvm.org/D17668
llvm-svn: 262580
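In practice that means carrying memory operands over to a replacement
instruction rather than dropping them; a small sketch against the
MachineInstr API of this era (helper is illustrative):

    #include "llvm/CodeGen/MachineInstr.h"
    using namespace llvm;

    // Illustrative: copy the original instruction's MMOs to its
    // replacement so aliasing/volatility information survives instead
    // of pessimizing later passes.
    static void preserveMMOs(MachineInstr &NewMI, const MachineInstr &MI) {
      NewMI.setMemRefs(MI.memoperands_begin(), MI.memoperands_end());
    }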
| |
Was discussed as part of http://reviews.llvm.org/D17830
llvm-svn: 262571
| |
If we have a loop with a rarely taken path, we will prune that from the
blocks which get added as part of the loop chain. The problem is that we
weren't then recognizing the loop chain as schedulable when considering
the preheader while forming the function chain. We'd then fall through
to various non-predecessors before finally scheduling the loop chain (as
if the CFG were unnatural). The net result was that there could be lots
of garbage between a loop preheader and the loop, even though we could
have directly fallen into the loop. It also meant we separated hot code
with regions of colder code.
The particular reason for the rejection of the loop chain was that we
were scanning the predecessors of the header, seeing the backedge, and
believing that it was a globally more important predecessor (true), but
forgetting to account for the fact that the backedge predecessor was
already part of the existing loop chain (oops!).
Differential Revision: http://reviews.llvm.org/D17830
llvm-svn: 262547
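A sketch of the missing accounting, using the pass's BlockToChain map
(names borrowed from MachineBlockPlacement; the real patch may differ):

    // Illustrative: a header predecessor that is already inside the
    // loop's own chain (e.g. the backedge source) should not block
    // scheduling the chain when laying out from the preheader.
    static bool predAlreadyInChain(
        const MachineBasicBlock *Pred, const BlockChain &LoopChain,
        const DenseMap<const MachineBasicBlock *, BlockChain *>
            &BlockToChain) {
      return BlockToChain.lookup(Pred) == &LoopChain;
    }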
| |
Catch objects with a displacement of zero do not initialize a catch
object. The displacement is relative to %rsp at the end of the
function's prologue for x86_64 targets.
If we place an object at the top of the stack, we will end up with a
displacement of zero, resulting in our catch object remaining
uninitialized.
Address this by creating our catch objects as fixed objects. We will
ensure that the UnwindHelp object is created after the catch objects so
that no catch object will have a displacement of zero.
Differential Revision: http://reviews.llvm.org/D17823
llvm-svn: 262546
| |
llvm-svn: 262531
| |
llvm-svn: 262522
| |
When div+rem calls on the same arguments are found, the ARM back-end
merges the two calls into one __aeabi_divmod call for values of up to
32 bits. However, for 64-bit values, which also have a lib call
(__aeabi_ldivmod), it wasn't merging the calls, and thus called
__aeabi_ldivmod twice and spilled the temporary results, which
generated pretty bad code.
This patch legalises 64-bit lib calls for divmod, so that now all the
spilling and the second call are gone. It also relaxes the DivRem
combiner a bit on the legal-type check: it was already checking for
isLegalOrCustom on every value, so the extra check for isTypeLegal was
redundant.
This patch fixes PR17193 (and a long-standing FIXME in the tests).
llvm-svn: 262507
| |
The placement new calls here were all calling the allocation function
in RecyclingAllocator/Recycler for SDNode, instead of the function for
the specific subclass we were constructing.
Since this particular allocator always overallocates, it more or less
worked, but it would hide what we're actually doing from any memory
tools. Also, if you tried to change this allocator to something like a
BumpPtrAllocator or MallocAllocator, the compiler would crash horribly
all the time.
Part of llvm.org/PR26808.
llvm-svn: 262500
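The mistake generalizes beyond SDNode; a self-contained sketch of the
bug and the fix, with hypothetical types and plain malloc standing in
for the recycling allocator:

    #include <cstdlib>
    #include <new>

    struct Base { int Common; };
    struct Derived : Base { long Extra[4]; };

    Derived *makeDerived() {
      // Buggy pattern: allocate sizeof(Base), then construct a Derived
      // in that memory. An overallocating recycler happens to tolerate
      // this; a plain allocator (or a memory tool) will not.
      //   void *Mem = std::malloc(sizeof(Base));    // wrong size
      void *Mem = std::malloc(sizeof(Derived));      // fix: subclass size
      return new (Mem) Derived();                    // placement new
    }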
| |
llvm-svn: 262446
| |
On AMDGPU, where i64 operations are often bitcast to v2i32 and back,
this pattern shows up regularly and breaks some expected combines on
i64, such as load width reduction.
This fixes some test failures in a future commit when i64 loads
are changed to promote.
llvm-svn: 262397
| |
This reverts commit r262316.
It seems that my change breaks an out-of-tree chromium buildbot, so
I'm reverting this in order to investigate the situation further.
llvm-svn: 262387
| |
Summary:
Calls sometimes need to be convergent. This is already handled at the
LLVM IR level, but it also needs to be handled at the MI level.
Ideally we'd propagate convergence from instructions, down through the
selection DAG, and into MIs. But this is Hard, and would affect
optimizations in the SDNodes -- right now only SDNodes with two operands have
any flags at all.
Instead, here's a much simpler hack: Add new opcodes for NVPTX for
convergent calls, and generate these when lowering convergent LLVM
calls.
Reviewers: jholewinski
Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits
Differential Revision: http://reviews.llvm.org/D17423
llvm-svn: 262373
| |
This reduces the number of bitcast nodes and generally cleans up the
DAG when bitcasting between integers and vectors everywhere.
llvm-svn: 262358
| |
llvm-svn: 262344
| |
Summary:
This patch modifies the existing comparison, branch, conditional-move
and select patterns, and adds new ones where needed. Also, the updated
SLT{u,i,iu} set of instructions generates a GPR-width result.
The majority of the code changes in the Mips back-end fix the wrong
assumption that SETCC nodes always produce an i32 value.
The changes in the common code path account for the fact that in 64-bit
MIPS targets, i1 is promoted to i32 instead of i64.
Reviewers: dsanders
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D10970
llvm-svn: 262316
| |
This fixes regressions exposed in existing AMDGPU tests in a
future commit when all loads are custom lowered.
llvm-svn: 262299
| |
The CatchObjOffset is relative to the end of the EH registration node
for 32-bit x86 WinEH targets. A special sentinel value, 0, is used to
indicate that no catch object should be initialized.
This means that a catch object allocated immediately before the
registration node would be assigned a CatchObjOffset of 0, leading the
runtime to believe that a catch object should not be initialized.
To handle this, allocate the registration node prior to any other frame
object. This will ensure that catch objects will not be allocated
before the registration node.
This fixes PR26757.
Differential Revision: http://reviews.llvm.org/D17689
llvm-svn: 262294
| |
llvm-svn: 262265
| |
When a variable is described by a single DBG_VALUE instruction we can
often use a more efficient inline DW_AT_location instead of using a
location list.
This commit makes the heuristic that decides when to apply this
optimization stricter by also verifying that the DBG_VALUE is live at the
entry of the function (instead of just checking that it is valid until
the end of the function).
<rdar://problem/24611008>
llvm-svn: 262247
| |
llvm-svn: 262236
| |
Part of PR26753.
llvm-svn: 262154
| |
InlineSpiller::rematerializeFor() never uses its parameter as an
iterator, so take it by reference instead. This removes an implicit
conversion from MachineBasicBlock::iterator to MachineInstr*.
llvm-svn: 262152
| |
These parameters aren't expected to be null, so take them by reference.
llvm-svn: 262151
| |
Change MachineInstr API to prefer MachineInstr& over MachineInstr*
whenever the parameter is expected to be non-null. Slowly inching
toward being able to fix PR26753.
llvm-svn: 262149
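The pattern repeated throughout these cleanups, as a before/after
sketch (hypothetical function; the point is that a reference rules out
null and kills the implicit iterator conversion):

    #include "llvm/CodeGen/MachineInstr.h"
    using namespace llvm;

    // Before: a pointer invites null checks and converts implicitly
    // from MachineBasicBlock::iterator (the PR26753 hazard):
    //   void processInstr(MachineInstr *MI);
    // After: callers write processInstr(*I), making the conversion
    // explicit at the call site.
    static void processInstr(MachineInstr &MI) {
      (void)MI; // body elided
    }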
| |
In the case where op = add, y = base_ptr, and x = offset, this
transform:
(op y, (op x, c1)) -> (op (op x, y), c1)
breaks the canonical form of add by putting the base pointer in the
second operand and the offset in the first.
This fix is important for the R600 target, because for some address
spaces the base pointer and the offset are stored in separate register
classes. The old pattern caused the ISel code for matching addressing
modes to put the base pointer and offset in the wrong register classes,
which required non-trivial code transformations to fix.
llvm-svn: 262148
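Concretely, with Base as y and Off as x, a hedged illustration of the
operand order at stake (not the actual DAGCombiner code):

    #include "llvm/CodeGen/SelectionDAG.h"
    using namespace llvm;

    // Illustrative: reassociate (add Base, (add Off, C1)) while keeping
    // the canonical "base + offset" shape, so addressing-mode matching
    // still finds the base pointer on the left of the inner add.
    static SDValue reassociateAdd(SelectionDAG &DAG, const SDLoc &DL,
                                  EVT VT, SDValue Base, SDValue Off,
                                  SDValue C1) {
      // Old result: (add (add Off, Base), C1) -- base in second operand.
      SDValue Inner = DAG.getNode(ISD::ADD, DL, VT, Base, Off);
      return DAG.getNode(ISD::ADD, DL, VT, Inner, C1);
    }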
| |
Also update HashEndOfMBB to take MachineBasicBlock&.
llvm-svn: 262146
| |
Take parameters as MachineInstr& instead of MachineInstr* in
AntiDepBreaker API, since these are required to be non-null. No
functionality change intended. Looking toward PR26753.
llvm-svn: 262145
| |
This already assumes a valid MI, since it dereferences the MI in an
assertion before checking for null. Add an explicit assert.
llvm-svn: 262144
| |
In all but one case, change the DFAPacketizer API to take MachineInstr&
instead of MachineInstr*. In DFAPacketizer::endPacket(), take
MachineBasicBlock::iterator. Besides cleaning up the API, this is in
search of PR26753.
llvm-svn: 262142
| |
Update APIs in MachineInstrBundle.h to take and return MachineInstr&
instead of MachineInstr* when the instruction cannot be null. Besides
being a nice cleanup, this is tacking toward a fix for PR26753.
llvm-svn: 262141
| |
This is OK for +0 since compares to +/-0 give the same result.
llvm-svn: 262125
| |
Take MachineInstr by reference instead of by pointer in SlotIndexes and
the SlotIndex wrappers in LiveIntervals. The MachineInstrs here are
never null, so this cleans up the API a bit. It also incidentally
removes a few implicit conversions from MachineInstrBundleIterator to
MachineInstr* (see PR26753).
At a couple of call sites it was convenient to convert to a range-based
for loop over MachineBasicBlock::instr_begin/instr_end, so I added
MachineBasicBlock::instrs.
llvm-svn: 262115
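A small usage sketch of the new accessor (assuming the instrs() name
from the message):

    #include "llvm/CodeGen/MachineBasicBlock.h"
    using namespace llvm;

    // Illustrative: visit every instruction, including those inside
    // bundles, via the range accessor instead of raw
    // instr_begin()/instr_end() iterator pairs.
    static unsigned countInstrs(MachineBasicBlock &MBB) {
      unsigned N = 0;
      for (MachineInstr &MI : MBB.instrs()) {
        (void)MI;
        ++N;
      }
      return N;
    }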
| |
llvm-svn: 262096
| |
assertion failure on AArch64.
llvm-svn: 262091
| |
llvm-svn: 262061
| |
own method. NFC
Summary: This is extracted from D17555
Reviewers: davidxl, reames, sanjoy, MatzeB, pete
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D17580
llvm-svn: 262058
| |
Fixes PR26733
llvm-svn: 262057
| |
MBB slot index intervals are half open, not closed. getMBBEndIndex()
returns the slot index of the start of the next block in layout order.
Placing a register mask there is incorrect if the successor of the
funclet return is not laid out after the return. Clang generates IR for
catch bodies before generating the following normal code, so we never
noticed this issue until the D frontend authors filed a bug about it.
Instead, we can put the clobber mask on the last instruction of the
funclet return block. We still aren't using a register mask operand on
the CATCHRET instruction because it would cause PEI to spill all CSRs,
including XMM regs, in the prologue.
Fixes PR26679.
llvm-svn: 262035
| |
Differential Revision: http://reviews.llvm.org/D17475
llvm-svn: 261966