DominanceFrontier. NFC
Differential Revision: http://reviews.llvm.org/D17570
llvm-svn: 261883
Differential Revision: http://reviews.llvm.org/D17537
llvm-svn: 261882
Summary:
Both the hardware and LLVM have changed since 2012.
Now, the load-based heuristic doesn't show big differences any more on OoO cores.
There are no notable regressions or improvements on spec2000/2006 (Cortex-A57, Core i5).
Reviewers: spatel, zansari
Differential Revision: http://reviews.llvm.org/D16836
llvm-svn: 261809
(This is the second attempt to commit this patch, after fixing PR26652 & PR26653).
This patch detects vector reductions before instruction selection. Vector
reductions are vectorized reduction operations, and for such operations we have
freedom to reorganize the elements of the result as long as the reduction of them
stays unchanged. This will enable some reduction pattern recognition during
instruction combine such as SAD/dot-product on X86. A flag is added to
SDNodeFlags to mark those vector reduction nodes to be checked during instruction
combine.
To detect those vector reductions, we search def-use chains starting from the
given instruction, and check if all uses fall into two categories:
1. Reduction with another vector.
2. Reduction on all elements.
where case 2 is detected by recognizing the pattern that the loop vectorizer
generates to reduce all elements in the vector outside of the loop, which
includes several ShuffleVector and one ExtractElement instructions.
Differential revision: http://reviews.llvm.org/D15250
llvm-svn: 261804
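As a minimal illustration (not part of this commit), this is the kind of source loop whose vectorized form ends in the reduce-all-elements pattern described above; SAD (sum of absolute differences) is one of the X86 patterns the commit mentions:

  #include <cstdlib>

  // After loop vectorization, the final horizontal sum of the vector
  // accumulator is the shufflevector/extractelement sequence that this
  // detection recognizes.
  int sad(const unsigned char *A, const unsigned char *B, int N) {
    int Sum = 0;
    for (int I = 0; I < N; ++I)
      Sum += std::abs(A[I] - B[I]);
    return Sum;
  }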
This fixes bugs in copy elimination code in LLVM. It slightly changes the
semantics of clearRegisterKills(). This is appropriate because:
- Users in lib/CodeGen/MachineCopyPropagation.cpp and
lib/Target/AArch64/AArch64RedundantCopyElimination.cpp and
lib/Target/SystemZ/SystemZElimCompare.cpp are incorrect without it
(see included testcase).
- All other users in LLVM are unaffected (they pass TRI==nullptr)
- (Kill flags are optional anyway so removing too many shouldn't hurt.)
Differential Revision: http://reviews.llvm.org/D17554
llvm-svn: 261763
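A hedged sketch of a typical call site for the API in question (the helper and the instruction range are illustrative, not taken from the patch):

  #include "llvm/ADT/iterator_range.h"
  #include "llvm/CodeGen/MachineBasicBlock.h"
  #include "llvm/CodeGen/MachineInstr.h"
  using namespace llvm;

  // Once a copied register is known to stay live past an old kill point,
  // drop the kill flags on the instructions in between. Passing a non-null
  // TRI is the case whose semantics the change above adjusts.
  static void dropStaleKills(MachineBasicBlock::iterator Begin,
                             MachineBasicBlock::iterator End,
                             unsigned Reg, const TargetRegisterInfo *TRI) {
    for (MachineInstr &MI : make_range(Begin, End))
      MI.clearRegisterKills(Reg, TRI);
  }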
This is a part of the refactoring to unify the isSafeToLoadUnconditionally and
isDereferenceablePointer functions. In a subsequent change I'm going to
eliminate isDereferenceableAndAlignedPointer from the Loads API, leaving
isSafeToLoadSpeculatively as the only function to check whether a load
instruction can be speculated.
Reviewed By: hfinkel
Differential Revision: http://reviews.llvm.org/D16180
llvm-svn: 261736
debug info."
This and the corresponding Clang change caused PR26715.
llvm-svn: 261671
Differential Revision: http://reviews.llvm.org/D15976
llvm-svn: 261633
We had the right logic for the nested cleanuppad case but omitted it for
catchswitches.
llvm-svn: 261615
Summary: If a function is hot, put it in the text.hot section.
Reviewers: davidxl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D17532
llvm-svn: 261607
Change TargetInstrInfo API to take `MachineInstr&` instead of
`MachineInstr*` in the functions related to predicated instructions
(I'll try to come back later and get some of the rest). All of these
functions require non-null parameters already, so references are more
clear. As a bonus, this happens to factor away a host of implicit
iterator => pointer conversions.
No functionality change intended.
llvm-svn: 261605
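A sketch of how call sites look with the reference-taking API (illustrative; isPredicated is one of the predication-related hooks, and header paths are as in current LLVM):

  #include "llvm/CodeGen/MachineBasicBlock.h"
  #include "llvm/CodeGen/TargetInstrInfo.h"
  using namespace llvm;

  static bool endsInPredicatedInst(const MachineBasicBlock &MBB,
                                   const TargetInstrInfo &TII) {
    if (MBB.empty())
      return false;
    // Previously this would have been TII.isPredicated(&MBB.back());
    // the reference parameter makes the non-null contract explicit.
    return TII.isPredicated(MBB.back());
  }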
This reverts commit r261582, since this bot has been broken for four
hours:
http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/19399/
llvm-svn: 261604
Summary: Fixing buildbot failure introduced by http://reviews.llvm.org/D17460
Reviewers: davidxl, hans
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D17524
llvm-svn: 261588
We supported creating mergeable constant pool entries for smaller
constants but not for 32-byte AVX constants.
llvm-svn: 261584
Summary: If a function is hot, put it in the text.hot section.
Reviewers: davidxl
Subscribers: eraman, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D17460
llvm-svn: 261582
This was causing assertions later from using the wrong pointer
size with LDS operations. getOptimalMemOpType should also have
address space arguments later.
This avoids assertions in existing tests exposed by
a future commit.
llvm-svn: 261580
DMB instructions can be expensive, so it's best to avoid them if possible. In
atomicrmw operations there will always be an attempted store so a release
barrier is always needed, but in the cmpxchg case we can delay the DMB until we
know we'll definitely try to perform a store (and so need release semantics).
In the strong cmpxchg case this isn't quite free: we must duplicate the LDREX
instructions to skip the barrier on subsequent iterations. The basic outline
becomes:
    ldrex rOld, [rAddr]
    cmp rOld, rDesired
    bne Ldone
    dmb
Lloop:
    strex rRes, rNew, [rAddr]
    cbz rRes, Ldone
    ldrex rOld, [rAddr]
    cmp rOld, rDesired
    beq Lloop
Ldone:
So we'll skip this version for strong operations in "minsize" functions.
llvm-svn: 261568
This reverts commit r261504, since it's not obvious the new name is
better:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160222/334298.html
I'll recommit if we get consensus that it's the right direction.
llvm-svn: 261567
MIs in ifcnv."
This reverts r261543. Accidental commit (not LGTM'ed).
llvm-svn: 261547
Summary:
Also add a comment briefly explaining what ifcnv is.
No functional changes.
Reviewers: resistor
Subscribers: echristo, tra, llvm-commits
Differential Revision: http://reviews.llvm.org/D17430
llvm-svn: 261543
Reviewers: rnk
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D17466
llvm-svn: 261541
Summary:
Convergent instrs shouldn't be made control-dependent on other values,
but this is basically the whole point of tail duplication. So just bail
if we see a convergent instruction.
Reviewers: iteratee
Subscribers: jholewinski, jhen, hfinkel, tra, jingyue, llvm-commits
Differential Revision: http://reviews.llvm.org/D17320
llvm-svn: 261540
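A minimal sketch of the bail-out described above (not the actual TailDuplicator code):

  #include "llvm/CodeGen/MachineBasicBlock.h"
  #include "llvm/CodeGen/MachineInstr.h"
  using namespace llvm;

  // Refuse to duplicate a block containing a convergent instruction, since
  // duplication would add new control dependences.
  static bool safeToTailDuplicate(const MachineBasicBlock &MBB) {
    for (const MachineInstr &MI : MBB)
      if (MI.isConvergent())
        return false;
    return true;
  }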
This reverts commit r261510, effectively reapplying r261509. The
original commit missed a caller in AArch64ConditionalCompares.
Original commit message:
Pass non-null arguments by reference in MachineTraceMetrics::Trace,
simplifying future work to remove implicit iterator => pointer
conversions.
llvm-svn: 261511
This reverts commit r261509. I'm not sure how this compiled locally,
but something was out of whack.
llvm-svn: 261510
Pass non-null arguments by reference in MachineTraceMetrics::Trace,
simplifying future work to remove implicit iterator => pointer
conversions.
llvm-svn: 261509
llvm-svn: 261508
Delete MachineInstr::getIterator(), since the term "iterator" is
overloaded when talking about MachineInstr.
- Downcast to ilist_node in iplist::getNextNode() and getPrevNode() so
that ilist_node::getIterator() is still available.
- Add it back as MachineInstr::getInstrIterator(). This matches the
naming in MachineBasicBlock.
- Add MachineInstr::getBundleIterator(). This is explicitly called
"bundle" (not matching MachineBasicBlock) to disintinguish it clearly
from ilist_node::getIterator().
- Update all calls. Some of these I switched to `auto` to remove
boiler-plate, since the new name is clear about the type.
There was one call I updated that looked fishy, but it wasn't clear what
the right answer was. This was in X86FrameLowering::inlineStackProbe(),
added in r252578 in lib/Target/X86/X86FrameLowering.cpp. I opted to
leave the behaviour unchanged, but I'll reply to the original commit on
the list in a moment.
llvm-svn: 261504
I missed == and != when I removed implicit conversions between iterators
and pointers in r252380 since they were defined outside ilist_iterator.
Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
This commit removes all uses of these operators. (I'll delete the
operators themselves in a separate commit so that it can be easily
reverted if necessary.)
There should be NFC here.
llvm-svn: 261498
Stop using `getNodePtrUnchecked()` when building IR. Eventually a
dereference will be required to get at the downcast node, since the
iterator will only store an `ilist_node_base` of some sort.
This should have no functionality change for now, but is a path towards
removing some more UB from ilist.
llvm-svn: 261495
`ilist_iterator<NodeTy>::getNodePtrUnchecked()` is documented as being
for internal use only, but CodeGenPrepare was using it anyway. This
code relies on pulling out the `Value*` pointer even after the lifetime
of the iterator is over. But having this pointer available in
ilist_iterator depends on UB in the first place.
Instead, safely pull out the `Value*` when the iterator is alive and
stop using the internal-only API.
There should be no functionality change here.
llvm-svn: 261493
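A small sketch of the safe pattern described above (illustrative, not the actual CodeGenPrepare code):

  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/Instruction.h"
  #include "llvm/IR/Value.h"
  using namespace llvm;

  // Dereference while the iterator is alive; the resulting Value* can then
  // outlive the iterator without touching iterator internals.
  static Value *captureCurrent(BasicBlock::iterator It) {
    Value *V = &*It;
    return V;
  }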
COFF doesn't have sections with mergeable contents. Instead, each
constant pool entry ends up in a COMDAT section. The linker, when
choosing between COMDAT sections, doesn't choose the max alignment of
the two sections. You just get whatever alignment was on the section.
If one object file needs a higher alignment for a constant than another
object file does, then we will get into trouble if the linker chooses the
lower-alignment section.
Instead, let's promote the alignment of the constant pool entry to make
sure we don't use an under-aligned constant with an instruction which
assumed otherwise.
This fixes PR26680.
llvm-svn: 261462
TailDuplicate can run on either SSA code or non-SSA code, as indicated to
it by MRI->isSSA() ("PreRegAlloc" here). TailDuplicate does extra work to
preserve SSA invariants when it duplicates code. This patch makes it skip
some of this extra work in the case where the code is not in SSA form.
llvm-svn: 261450
llvm-svn: 261437
llvm-svn: 261408
This avoids unnecessarily passing them around when calling helper
functions. It may also be slightly faster to call clear() on the
data structures instead of freshly initializing them for each block.
llvm-svn: 261407
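A hedged sketch of the clear()-and-reuse idea (the container and names are made up for illustration, not taken from the pass):

  #include "llvm/ADT/DenseSet.h"
  #include "llvm/CodeGen/MachineBasicBlock.h"
  #include "llvm/CodeGen/MachineFunction.h"
  using namespace llvm;

  static void processBlocks(MachineFunction &MF) {
    DenseSet<unsigned> Seen;              // hoisted out of the loop
    for (MachineBasicBlock &MBB : MF) {
      Seen.clear();                       // reuse the allocated storage
      (void)MBB;                          // per-block work would go here
    }
  }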
llvm-svn: 261406
'impossible' condition
llvm-svn: 261405
PR26485
llvm-svn: 261384
Use auto, bring file up to coding standards etc.
llvm-svn: 261358
Now that we don't always add an element to AllocatedStackSlots if we
don't find a pre-existing unallocated stack slot, bumping
StatepointMaxSlotsRequired to `NumSlots + 1` is not correct. Instead
bump the statistic near the push_back, to
Builder.FuncInfo.StatepointStackSlots.size().
llvm-svn: 261348
The check on MFI->getObjectSize() has to be on the FrameIndex, not on
the index of the FrameIndex in AllocatedStackSlots. Weirdly, the tests
I added in rL261336 didn't catch this.
llvm-svn: 261347
NFCI. The key motivation here is that I'd like to use
SmallBitVector::all() in a later change. Also, using a bit vector here
seemed better in general.
The only interesting change here is that in the failure case of
allocateStackSlot, we no longer (the equivalent of) push_back(true) to
AllocatedStackSlots. As far as I can tell, this is fine, since we'd
never re-use those slots in the same StatepointLoweringState instance.
Technically there was no need to change the operator[] type accesses to
set() and test(), but I thought it'd be nice to make it obvious that
we're using something other than a std::vector-like thing.
llvm-svn: 261337
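An illustrative use of the data structure mentioned above (slot counts and indices are made up; SmallBitVector::all() is the query the author says motivates the switch):

  #include "llvm/ADT/SmallBitVector.h"

  void example() {
    llvm::SmallBitVector AllocatedStackSlots(8); // 8 slots, all initially free
    AllocatedStackSlots.set(3);                  // mark slot 3 as allocated
    if (!AllocatedStackSlots.test(5)) {
      // slot 5 is still free and could be reused
    }
    bool NoneLeft = AllocatedStackSlots.all();   // every slot already taken?
    (void)NoneLeft;
  }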
allocateStackSlot did not consider the size of the value to be spilled
before deciding to re-use a spill slot. This was originally okay (since
originally we'd only ever spill pointers), but it became not okay when
we changed our scheme to directly spill vectors of pointers.
While this change fixes the bug pointed out, it has two performance
caveats:
- It matches spill slot and spillee size exactly, while in theory we
can spill, e.g., an 8 byte pointer into a 16 byte slot. This is
slightly complicated to fix since in the stackmaps section, we report
the size of the spill slot as the size of the "indirect value"; and
if they're no longer equivalent, we'll have to keep track of the
(indirect) value size separately from the stack slot size.
- It will "spuriously run out" of reusable slots, since we now have a
second check in the search loop in addition to the availability
check (e.g. you had two free scalar slots, and you first ask for a
vector slot followed by a scalar slot). I'll fix this in a later
commit.
llvm-svn: 261336
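A simplified sketch of the exact-size reuse test this change introduces (the helper and its parameters are hypothetical, not the actual statepoint lowering code):

  #include "llvm/CodeGen/MachineFrameInfo.h"
  using namespace llvm;

  // Only reuse a free spill slot whose size matches the spillee exactly.
  static bool canReuseSlot(const MachineFrameInfo &MFI, int FrameIndex,
                           int64_t SpillSizeInBytes, bool SlotIsFree) {
    return SlotIsFree && MFI.getObjectSize(FrameIndex) == SpillSizeInBytes;
  }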
This removes the unusual loop structure in allocateStackSlot in favor of
something more straightforward. I've also removed the cautionary
comment in the function, which I suspect is historical cruft now, and
confuses more than it enlightens.
llvm-svn: 261335
llvm-svn: 261308
No functional change is intended.
llvm-svn: 261307
llvm-svn: 261306
Certain optimization passes (like globaldce) can prune function
declarations that SjLjEHPrepare assumed would exist when its
runOnFunction ran.
This fixes PR26669.
llvm-svn: 261303
Summary:
Without this, this command
$ llvm-run llc -stop-after machine-cp -o - <( echo '' )
outputs an error, because we close stdout twice -- once when closing the
file opened for "-o", and again when closing outs().
Also clarify in the outs() definition that you can't ever call it if you
want to open your own raw_fd_ostream on stdout.
Reviewers: jroelofs, tstellarAMD
Subscribers: jholewinski, qcolombet, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D17422
llvm-svn: 261286
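A sketch of the underlying rule (illustrative, not the llc code): per the message above, destroying outs() closes stdout, so any second stream on file descriptor 1 must not take ownership of it.

  #include "llvm/Support/raw_ostream.h"
  using namespace llvm;

  void writeToStdout() {
    // shouldClose=false: do not close stdout here; outs() already owns it.
    raw_fd_ostream Out(/*fd=*/1, /*shouldClose=*/false);
    Out << "only one owner of stdout may close it\n";
  }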
Today, we do not allow cmpxchg operations with pointer arguments. We require
the frontend to insert ptrtoint casts and do the cmpxchg in integers. While
correct, this is problematic from a couple of perspectives:
1) It makes the IR harder to analyse (for instance, it makes capture tracking
overly conservative)
2) It pushes work onto the frontend authors for no real gain
This patch implements the simplest form of IR support. As we did with
floating point loads and stores, we teach AtomicExpand to convert back to the
old representation. This prevents us from needing to change all backends in a
single lockstep change. Over time, we can migrate each backend to natively
selecting the pointer type. In the meantime, we get the advantages of a
cleaner IR representation without waiting for the backend changes.
Differential Revision: http://reviews.llvm.org/D17413
llvm-svn: 261281
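For illustration, the source-level shape of the operation in question: an atomic compare-and-swap whose expected/new values are pointers. With this change a frontend can emit the cmpxchg on the pointer type directly rather than wrapping it in ptrtoint/inttoptr casts (the types and names here are invented):

  #include <atomic>

  struct Node { Node *Next; };

  bool tryPush(std::atomic<Node *> &Head, Node *Expected, Node *New) {
    return Head.compare_exchange_strong(Expected, New);
  }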