| Commit message | Author | Age | Files | Lines |
|
TargetInstrInfo:
Change produceSameValue() to take MachineRegisterInfo as an optional argument.
When in SSA form, targets can use it to perform more aggressive equality analysis.
Machine LICM:
1. Eliminate isLoadFromConstantMemory; use MI.isInvariantLoad instead.
2. Fix a bug which prevented CSE of instructions which are not re-materializable.
3. Use the improved form of produceSameValue.
ARM:
1. Teach ARM produceSameValue to look past some PIC labels.
2. Look for operands from different loads of different constant pool entries
which have the same values.
3. Re-implement PIC GA materialization using movw + movt. Combine the pair with
an "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible
to re-materialize the instruction, allows machine LICM to hoist the set of
instructions out of the loop, and makes it possible to CSE them. It's a bit
hacky, but it significantly improves code quality.
4. Some minor bug fixes as well.
With the fixes, using movw + movt to materialize GAs significantly outperforms the
load-from-constant-pool method. 186.crafty and 255.vortex improved > 20%, 254.gap
and 176.gcc ~10%.
llvm-svn: 123905
|
Added a check for already live regs before claiming HighRegPressure.
Fixed a few cases of checking the wrong number of successors.
Added some tracing until these heuristics are better understood.
llvm-svn: 123892
|
The live range may have been deleted earlier because of rematerialization.
llvm-svn: 123891
|
register coalescing.
llvm-svn: 123890
|
llvm-svn: 123872
|
llvm-svn: 123866
|
with an invalid type then split the result and perform the overflow check
normally.
Fixes the 32-bit parts of rdar://8622122 and rdar://8774702.
llvm-svn: 123864
|
llvm-svn: 123862
|
llvm-svn: 123859
|
interval after an instruction. The leaveIntvAfter() method only adds liveness
from the instruction's boundary index to the inserted copy.
Ideally, SplitKit should be smarter about this, perhaps by combining useIntv()
and leaveIntvAfter() into one method that guarantees continuity.
llvm-svn: 123858
|
llvm-svn: 123856
|
Region splitting includes loop splitting as a subset, and it is more generic.
The splitting heuristics for variables that are live in more than one block are
now:
1. Try to create a region that covers multiple basic blocks.
2. Try to create a new live range for each block with multiple uses.
3. Spill.
Steps 2 and 3 are similar to what the standard spiller is doing.
llvm-svn: 123853
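The three-step fallback described above can be sketched as a simple decision function. This is only an illustration of the ordering, not the allocator's actual code: the LiveRangeSummary struct, its fields, and chooseSplit() are hypothetical names invented for this example.

```cpp
// Hypothetical summary of a live range that is live in more than one block.
struct LiveRangeSummary {
  bool HasProfitableRegion;    // assumption: a multi-block region was found
  int BlocksWithMultipleUses;  // assumption: count of blocks with >1 use
};

enum class SplitAction { Region, PerBlock, Spill };

// Try the heuristics in the order listed in the commit message.
SplitAction chooseSplit(const LiveRangeSummary &LR) {
  // 1. Prefer a region that covers multiple basic blocks.
  if (LR.HasProfitableRegion)
    return SplitAction::Region;
  // 2. Otherwise create a new live range for each block with multiple uses.
  if (LR.BlocksWithMultipleUses > 0)
    return SplitAction::PerBlock;
  // 3. Nothing left to split: spill.
  return SplitAction::Spill;
}
```

Each step is attempted only when the previous one does not apply, so spilling remains the last resort.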
|
Analyze the live range's behavior entering and leaving basic blocks. Compute an
interference pattern for each allocation candidate, and use SpillPlacement to
find an optimal region where that register can be live.
This code is still not enabled.
llvm-svn: 123774
|
llvm-svn: 123707
|
ranges, add legalizer support for nested calls. Necessary for ARM
byval support. Radar 7662569.
llvm-svn: 123704
|
llvm-svn: 123664
|
This shaves off 4 popcounts from the hacked 186.crafty source.
This is enabled even when a native popcount instruction is available. The
combined code is one operation longer but it should be faster nevertheless.
llvm-svn: 123621
|
multi-instruction sequences like calls. Many thanks to Jakob for
finding a testcase.
llvm-svn: 123559
|
llvm-svn: 123549
|
http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel
In a silly microbenchmark on a 65 nm Core 2 this is 1.5x faster than the old
code in 32-bit mode and about 2x faster in 64-bit mode. It's also a lot shorter,
especially when counting a 64-bit population on a 32-bit target.
I hope this is fast enough to replace Kernighan-style counting loops even when
the input is rather sparse.
llvm-svn: 123547
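For reference, the parallel bit-counting technique from the linked page can be written in plain C++ roughly as follows. This is a minimal sketch of the technique itself, not the code LLVM emits; the function names popcount32, popcount64, and popcountLoop are made up for this example, and popcountLoop is the Kernighan-style loop mentioned above, included only for comparison.

```cpp
#include <cstdint>

// Parallel population count for a 32-bit value.
inline std::uint32_t popcount32(std::uint32_t v) {
  v = v - ((v >> 1) & 0x55555555u);                  // sums of 2-bit fields
  v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u);  // sums of 4-bit fields
  v = (v + (v >> 4)) & 0x0F0F0F0Fu;                  // sums of 8-bit fields
  return (v * 0x01010101u) >> 24;                    // add the four bytes
}

// The same steps widened to 64 bits.
inline std::uint32_t popcount64(std::uint64_t v) {
  v = v - ((v >> 1) & 0x5555555555555555ull);
  v = (v & 0x3333333333333333ull) + ((v >> 2) & 0x3333333333333333ull);
  v = (v + (v >> 4)) & 0x0F0F0F0F0F0F0F0Full;
  return static_cast<std::uint32_t>((v * 0x0101010101010101ull) >> 56);
}

// Kernighan-style loop: one iteration per set bit.
inline std::uint32_t popcountLoop(std::uint64_t v) {
  std::uint32_t n = 0;
  for (; v != 0; v &= v - 1)  // clear the lowest set bit
    ++n;
  return n;
}
```

The parallel version runs a fixed number of operations regardless of the input, while the loop's cost grows with the number of set bits, which is why sparse inputs are the interesting comparison.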
|
llvm-svn: 123491
|
comments.
llvm-svn: 123479
|
description emission. Currently all the backends use table-based stuff.
llvm-svn: 123476
|
llvm-svn: 123474
|
llvm-svn: 123473
|
disabled in this checkin. Sorry for the large diffs due to
refactoring. New functionality is all guarded by EnableSchedCycles.
Scheduling the isel DAG is inherently imprecise, but we give it a best
effort:
- Added MayReduceRegPressure to allow stalled nodes in the queue only
if there is a regpressure need.
- Added BUHasStall to allow checking for either dependence stalls due to
latency or resource stalls due to pipeline hazards.
- Added BUCompareLatency to encapsulate and standardize the heuristics
for minimizing stall cycles (vs. reducing register pressure).
- Modified the bottom-up heuristic (now in BUCompareLatency) to
prioritize nodes by their depth rather than height. As long as it
doesn't stall, height is irrelevant. Depth represents the critical
path to the DAG root.
- Added hybrid_ls_rr_sort::isReady to filter stalled nodes before
adding them to the available queue.
Related Cleanup: most of the register reduction routines do not need
to be templates.
llvm-svn: 123468
|
This time let's rephrase to trick gcc-4.3 into not miscompiling.
llvm-svn: 123432
|
llvm-svn: 123423
|
they should go *before* the new instruction not after it.
llvm-svn: 123420
|
Fix some callers to better deal with debug values.
llvm-svn: 123419
|
This approach also works when the terminator doesn't have a slot index. (Which
can happen??)
llvm-svn: 123413
|
llvm-svn: 123400
|
llvm-svn: 123399
|
happy.
llvm-svn: 123389
|
It will still return an iterator that points to the first terminator or end(),
but there may be DBG_VALUE instructions following the first terminator.
llvm-svn: 123384
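A toy model may help show what this means for callers. The sketch below does not use the real MachineBasicBlock API; the Kind enum, the Block alias, and both functions are hypothetical stand-ins chosen for this example. It assumes only what the message states: the returned iterator points at the first terminator or end(), and debug values may appear after that point.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy stand-in for instruction kinds in a basic block.
enum class Kind { Normal, DebugValue, Terminator };
using Block = std::vector<Kind>;

// Returns an iterator to the first terminator, or end() if there is none.
Block::const_iterator firstTerminator(const Block &B) {
  return std::find(B.begin(), B.end(), Kind::Terminator);
}

// Counts terminators from the first one onward. Callers cannot assume the
// tail holds only terminators, so debug values are skipped by counting
// terminators explicitly instead of measuring the distance to end().
std::size_t countTerminators(const Block &B) {
  return static_cast<std::size_t>(
      std::count(firstTerminator(B), B.end(), Kind::Terminator));
}
```

A caller that instead computed the distance from the first terminator to end() would overcount whenever a debug value sits between or after the terminators.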
|
llvm-svn: 123352
|
llvm-svn: 123351
|
further on the associated testcase before aborting.
llvm-svn: 123346
|
llvm-svn: 123342
|
after all.
llvm-svn: 123339
|
llvm-svn: 123338
|
Make sure we don't crash in that case, but simply turn them into %noreg instead.
llvm-svn: 123335
|
It was leaving dangling pointers in the slot index maps.
llvm-svn: 123334
|
llvm-svn: 123333
|
The slot indexes must be monotonically increasing through the function.
llvm-svn: 123324
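The invariant stated above is easy to express as a check. The sketch below is a standalone illustration that models slot indexes as plain unsigned integers in function order; it does not use the real SlotIndexes class, the name isStrictlyIncreasing is invented here, and treating "monotonically increasing" as strictly increasing is an assumption of this example.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Returns true when every index is greater than the one before it.
// adjacent_find looks for the first neighbouring pair (a, b) with a >= b,
// which is exactly a violation of the ordering.
bool isStrictlyIncreasing(const std::vector<unsigned> &Indexes) {
  return std::adjacent_find(Indexes.begin(), Indexes.end(),
                            std::greater_equal<unsigned>()) == Indexes.end();
}
```

Empty and single-element sequences pass trivially, since they contain no neighbouring pair to compare.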
|
llvm-svn: 123322
|
llvm-svn: 123290
|
llvm-svn: 123282
|
For one, MachineBasicBlock::getFirstTerminator() doesn't understand what is
happening, and it also makes sense to have all control flow run through the
DBG_VALUE.
llvm-svn: 123277
|
This is not yet completely enabled.
llvm-svn: 123274
|