Commit message | Author | Age | Files | Lines
consistent with the way it's generally done in other places.
llvm-svn: 60439
- Incorporate Tilmann Scheller's ISD::TRUNCATE custom lowering patch
- Update SPU calling convention info, even if it's not used yet (but it may be
at some point)
- Ensure that any-extended f32 loads are custom lowered, especially when
they're promoted for use in printf.
llvm-svn: 60438
llvm-svn: 60434
splitting.
llvm-svn: 60433
llvm-svn: 60432
llvm-svn: 60431
llvm-svn: 60429
llvm-svn: 60409
straightforward implementation. This does not require any extra
alias analysis queries beyond what we already do for non-local loads.
Some programs really really like load PRE. For example, SPASS triggers
this ~1000 times, ~300 times in 255.vortex, and ~1500 times on 403.gcc.
The biggest limitation to the implementation is that it does not split
critical edges. This is a huge killer on many programs and should be
addressed after the initial patch is enabled by default.
The implementation of this should incidentally speed up rejection of
non-local loads because it avoids creating the repl densemap in cases
when it won't be used for fully redundant loads.
This is currently disabled by default.
Before I turn this on, I need to fix a couple of miscompilations in
the testsuite, look at compile time performance numbers, and look at
perf impact. This is pretty close to ready though.
llvm-svn: 60408
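At the source level, the load PRE transformation above can be sketched like this (a minimal illustration with made-up function names, not code from the patch):

```cpp
// Before PRE: the load of *p at the merge point is available on the taken
// path (it was already loaded there) but not on the fall-through path, so
// it is only partially redundant and plain GVN cannot remove it.
int before_pre(const int *p, bool cond) {
    int v = 0;
    if (cond)
        v = *p;       // load #1
    return v + *p;    // load #2: redundant whenever cond is true
}

// After PRE: a load is inserted on the path where the value was missing,
// making it fully available at the merge point, so the second load goes away.
int after_pre(const int *p, bool cond) {
    int v = 0;
    int avail;
    if (cond) {
        avail = *p;   // original load
        v = avail;
    } else {
        avail = *p;   // load inserted by PRE
    }
    return v + avail; // no load here any more
}
```

Inserting the extra load on the `else` path is exactly the step that requires splitting critical edges in the general case, which is why the commit calls that limitation out.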
llvm-svn: 60407
llvm-svn: 60406
llvm-svn: 60405
llvm-svn: 60404
llvm-svn: 60403
llvm-svn: 60402
llvm-svn: 60401
constant. If X is a constant, then this is folded elsewhere.
- Added a note to Target/README.txt to indicate that we'd like to implement
this when we're able.
llvm-svn: 60399
llvm-svn: 60398
- No need to do a swap on a canonicalized pattern.
No functionality change.
llvm-svn: 60397
llvm-svn: 60395
a new value numbering set after splitting a critical edge. This increases
the number of instances of PRE on 403.gcc from ~60 to ~570.
llvm-svn: 60393
llvm-svn: 60392
llvm-svn: 60391
- LowerXADDO lowers [SU]ADDO into an ADD with an implicit EFLAGS define. The
EFLAGS value is fed into a SETCC node carrying the condition code COND_O or
COND_C, depending on the type of ADDO requested.
- LowerBRCOND now recognizes when its condition comes from a SETCC node with
COND_O or COND_C set.
llvm-svn: 60388
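The [SU]ADDO semantics correspond to the add-with-overflow checks exposed by the GCC/Clang overflow builtins (which postdate this commit); a sketch of the two flavors:

```cpp
// ISD::SADDO computes a sum plus a signed-overflow flag; per this patch it
// lowers on x86 to an ADD defining EFLAGS followed by a SETCC on COND_O.
bool sadd_overflows(int a, int b, int *sum) {
    return __builtin_sadd_overflow(a, b, sum);
}

// ISD::UADDO is the unsigned variant: the flag is the carry bit, so the
// SETCC uses COND_C instead.
bool uadd_carries(unsigned a, unsigned b, unsigned *sum) {
    return __builtin_uadd_overflow(a, b, sum);
}
```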
llvm-svn: 60385
llvm-svn: 60383
- Add support for seto, setno, setc, and setnc instructions.
llvm-svn: 60382
types.
llvm-svn: 60381
figuring out the base of the IV. This produces better
code in the example. (Addresses use (IV) instead of
(BASE,IV) - a significant improvement on low-register
machines like x86).
llvm-svn: 60374
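The addressing-mode difference can be sketched in C++ (hypothetical functions, not from the patch); the second loop is the strength-reduced shape that needs only the induction register:

```cpp
// Indexed form: each access is base[i], i.e. an x86 address of the form
// (BASE,IV) that keeps both a base register and an index register live
// inside the loop.
int sum_indexed(const int *base, int n) {
    int s = 0;
    for (int i = 0; i < n; ++i)
        s += base[i];
    return s;
}

// Strength-reduced form: the pointer itself is the induction variable, so
// each address is just (IV) and only one register is live in the loop.
int sum_strength_reduced(const int *base, int n) {
    int s = 0;
    for (const int *p = base, *e = base + n; p != e; ++p)
        s += *p;
    return s;
}
```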
llvm-svn: 60370
llvm-svn: 60369
integer is "minint".
llvm-svn: 60366
- Fix v2[if]64 vector insertion code before IBM files a bug report.
- Ensure that zero (0) offsets relative to $sp don't trip an assert
(add $sp, 0 gets legalized to $sp alone, tripping an assert)
- Shuffle masks passed to SPUISD::SHUFB are now v16i8 or v4i32
llvm-svn: 60358
MERGE_VALUES node with only one operand, so get
rid of special code that only existed to handle
that possibility.
llvm-svn: 60349
ReplaceNodeResults: rather than returning a node which
must have the same number of results as the original
node (which means mucking around with MERGE_VALUES,
and which is also easy to get wrong since SelectionDAG
folding may mean you don't get the node you expect),
return the results in a vector.
llvm-svn: 60348
don't have overlapping bits.
llvm-svn: 60344
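The underlying identity is easy to check: when two operands share no set bits, the addition produces no carries, so it equals the bitwise or (a small illustration, not LLVM code):

```cpp
#include <cstdint>

// If a & b == 0, no bit position receives two 1s, so adding cannot carry
// and a + b == (a | b) == (a ^ b). Knowing the operands' known bits don't
// overlap lets the combiner treat the add as an or.
bool bits_disjoint(uint32_t a, uint32_t b) { return (a & b) == 0; }
```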
llvm-svn: 60343
llvm-svn: 60341
fiddling with constants unless we have to.
llvm-svn: 60340
instead of throughout it.
llvm-svn: 60339
that it isn't reallocated all the time. This is a tiny speedup for
GVN: 3.90->3.88s
llvm-svn: 60338
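The same reuse trick in miniature (an illustrative sketch, not GVN's actual code): hoist a temporary container out of the loop and clear() it each iteration so its allocation is kept.

```cpp
#include <vector>

// The vector is constructed once outside the loop; clear() empties it but
// keeps its allocated capacity, so iterations after the first allocate
// nothing. Declaring it inside the loop would reallocate every time.
int count_evens_per_row(const std::vector<std::vector<int>> &rows) {
    int total = 0;
    std::vector<int> evens;            // hoisted out of the loop
    for (const auto &row : rows) {
        evens.clear();                 // reuse, don't reallocate
        for (int v : row)
            if (v % 2 == 0)
                evens.push_back(v);
        total += static_cast<int>(evens.size());
    }
    return total;
}
```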
llvm-svn: 60337
instead of std::sort. This shrinks the release-asserts LSR.o file
by 1100 bytes of code on my system.
We should start using array_pod_sort where possible.
llvm-svn: 60335
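array_pod_sort's size win comes from funneling every POD sort through libc's single, non-templated qsort instead of instantiating a fresh std::sort per element type; a minimal sketch of the idea (not LLVM's implementation):

```cpp
#include <cstdlib>

// One comparator per element type, but only one copy of the sorting code
// (libc's qsort) ends up in the binary, no matter how many types are sorted.
static int compare_ints(const void *a, const void *b) {
    int lhs = *static_cast<const int *>(a);
    int rhs = *static_cast<const int *>(b);
    return (lhs > rhs) - (lhs < rhs);
}

void pod_sort(int *begin, int *end) {
    qsort(begin, static_cast<size_t>(end - begin), sizeof(int), compare_ints);
}
```

The trade-off is that qsort calls the comparator through a function pointer, so this is usually a little slower than an inlined std::sort, which is why it is reserved for POD element types where code size matters more.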
This is a lot cheaper and conceptually simpler.
llvm-svn: 60332
DeadInsts ivar, just use it directly.
llvm-svn: 60330
buggy rewrite, this notifies ScalarEvolution of a pending instruction
about to be removed and then erases it, instead of erasing it then
notifying.
llvm-svn: 60329
xor in testcase (or is a substring).
llvm-svn: 60328
new instructions it simplifies. Because we're threading jumps on edges
with constants coming in from PHIs, we are inherently exposing a lot more
constants to the new block. Folding them and deleting dead conditions
allows the cost model in jump threading to be more accurate as it iterates.
llvm-svn: 60327
prevents the passmgr from adding yet-another domtree invocation
for Verifier if there is already one live.
llvm-svn: 60326
instead of using FoldPHIArgBinOpIntoPHI. In addition to being more
obvious, this also fixes a problem where instcombine wouldn't merge two
phis that had different variable indices. This prevented instcombine
from factoring big chunks of code in 403.gcc. For example:
insn_cuid.exit:
- %tmp336 = load i32** @uid_cuid, align 4
- %tmp337 = getelementptr %struct.rtx_def* %insn_addr.0.ph.i, i32 0, i32 3
- %tmp338 = bitcast [1 x %struct.rtunion]* %tmp337 to i32*
- %tmp339 = load i32* %tmp338, align 4
- %tmp340 = getelementptr i32* %tmp336, i32 %tmp339
br label %bb62
bb61:
- %tmp341 = load i32** @uid_cuid, align 4
- %tmp342 = getelementptr %struct.rtx_def* %insn, i32 0, i32 3
- %tmp343 = bitcast [1 x %struct.rtunion]* %tmp342 to i32*
- %tmp344 = load i32* %tmp343, align 4
- %tmp345 = getelementptr i32* %tmp341, i32 %tmp344
br label %bb62
bb62:
- %iftmp.62.0.in = phi i32* [ %tmp345, %bb61 ], [ %tmp340, %insn_cuid.exit ]
+ %insn.pn2 = phi %struct.rtx_def* [ %insn, %bb61 ], [ %insn_addr.0.ph.i, %insn_cuid.exit ]
+ %tmp344.pn.in.in = getelementptr %struct.rtx_def* %insn.pn2, i32 0, i32 3
+ %tmp344.pn.in = bitcast [1 x %struct.rtunion]* %tmp344.pn.in.in to i32*
+ %tmp341.pn = load i32** @uid_cuid
+ %tmp344.pn = load i32* %tmp344.pn.in
+ %iftmp.62.0.in = getelementptr i32* %tmp341.pn, i32 %tmp344.pn
%iftmp.62.0 = load i32* %iftmp.62.0.in
llvm-svn: 60325