summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Rename TargetSubtarget to TargetSubtargetInfo for consistency.Evan Cheng2011-07-011-2/+2
| | | | llvm-svn: 134259
* Sink SubtargetFeature and TargetInstrItineraries (renamed ↵Evan Cheng2011-06-291-0/+1
| | | | | | MCInstrItineraries) into MC. llvm-svn: 134049
* - Rename TargetInstrDesc, TargetOperandInfo to MCInstrDesc and MCOperandInfo andEvan Cheng2011-06-281-7/+7
| | | | | | | | sink them into MC layer. - Added MCInstrInfo, which captures the tablegen generated static data. Chang TargetInstrInfo so it's based off MCInstrInfo. llvm-svn: 134021
* pre-RA-sched: Cleanup register pressure tracking.Andrew Trick2011-06-271-9/+1
| | | | | | | | Removed the check that peeks past EXTRA_SUBREG, which I don't think makes sense any more. Intead treat it as a normal register def. No significant affect on x86 or ARM benchmarks. llvm-svn: 133917
* The scheduler needs to be aware on the existence of untyped nodes when it ↵Owen Anderson2011-06-241-1/+2
| | | | | | performs type propagation for EXTRACT_SUBREG. llvm-svn: 133838
* Don't allocate empty read-only SmallVectors during SelectionDAG deallocation.Benjamin Kramer2011-06-181-1/+1
| | | | llvm-svn: 133348
* Added -stress-sched flag in the Asserts build.Andrew Trick2011-06-151-1/+1
| | | | | | Added a test case for handling physreg aliases during pre-RA-sched. llvm-svn: 133063
* Be careful about scheduling nodes above previous calls. It increase usages ofEvan Cheng2011-04-261-0/+19
| | | | | | | | | | | | more callee-saved registers and introduce copies. Only allows it if scheduling a node above calls would end up lessen register pressure. Call operands also has added ABI restrictions for register allocation, so be extra careful with hoisting them above calls. rdar://9329627 llvm-svn: 130245
* Fix a ton of comment typos found by codespell. Patch byChris Lattner2011-04-151-1/+1
| | | | | | Luis Felipe Strano Moraes! llvm-svn: 129558
* In the pre-RA scheduler, maintain cmp+br proximity.Andrew Trick2011-04-141-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | This is done by pushing physical register definitions close to their use, which happens to handle flag definitions if they're not glued to the branch. This seems to be generally a good thing though, so I didn't need to add a target hook yet. The primary motivation is to generate code closer to what people expect and rule out missed opportunity from enabling macro-op fusion. As a side benefit, we get several 2-5% gains on x86 benchmarks. There is one regression: SingleSource/Benchmarks/Shootout/lists slows down be -10%. But this is an independent scheduler bug that will be tracked separately. See rdar://problem/9283108. Incidentally, pre-RA scheduling is only half the solution. Fixing the later passes is tracked by: <rdar://problem/8932804> [pre-RA-sched] on x86, attempt to schedule CMP/TEST adjacent with condition jump Fixes: <rdar://problem/9262453> Scheduler unnecessary break of cmp/jump fusion llvm-svn: 129508
* Recommit r129383. PreRA scheduler heuristic fixes: VRegCycle, TokenFactor ↵Andrew Trick2011-04-131-46/+14
| | | | | | | | | | | | | | | | | | | | | latency. Additional fixes: Do something reasonable for subtargets with generic itineraries by handle node latency the same as for an empty itinerary. Now nodes default to unit latency unless an itinerary explicitly specifies a zero cycle stage or it is a TokenFactor chain. Original fixes: UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make the ndoe latency adjustments work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible. llvm-svn: 129421
* Revert 129383. It causes some targets to hit a scheduler assert.Andrew Trick2011-04-121-7/+46
| | | | llvm-svn: 129385
* PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency.Andrew Trick2011-04-121-46/+7
| | | | | | | | | | | | UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make these heuristic adjustments to node latency work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible. llvm-svn: 129383
* Added a check in the preRA scheduler for potential interference on aAndrew Trick2011-04-071-0/+46
| | | | | | | | | induction variable. The preRA scheduler is unaware of induction vars, so we look for potential "virtual register cycles" instead. Fixes <rdar://problem/8946719> Bad scheduling prevents coalescing llvm-svn: 129100
* Improve pre-RA-sched register pressure tracking for duplicate operands.Andrew Trick2011-03-091-1/+5
| | | | | | This helps cases like 2008-07-19-movups-spills.ll, but doesn't have an obvious impact on benchmarks llvm-svn: 127347
* Fix some latent bugs if the nodes are unschedulable. We'd gotten awayEric Christopher2011-03-081-0/+4
| | | | | | | | | | | | with this before since none of the register tracking or nightly tests had unschedulable nodes. This should probably be refixed with a special default Node that just returns some "don't touch me" values. Fixes PR9427 llvm-svn: 127263
* Fix for -sched-high-latency-cycles in sched=list-ilp mode.Andrew Trick2011-03-051-1/+3
| | | | llvm-svn: 127071
* Increased the register pressure limit on x86_64 from 8 to 12Andrew Trick2011-03-051-1/+13
| | | | | | | | | | | | | | | | | | | | | | | regs. This is the only change in this checkin that may affects the default scheduler. With better register tracking and heuristics, it doesn't make sense to artificially lower the register limit so much. Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to give the scheduler a way to account for div and sqrt on targets that don't have an itinerary. It is currently defaults to 10 (the actual number doesn't matter much), but only takes effect on non-default schedulers: list-hybrid and list-ilp. Added several heuristics that can be individually disabled for the non-default sched=list-ilp mode. This helps us determine how much better we can do on a given benchmark than the default scheduler. Certain compute intensive loops run much faster in this mode with the right set of heuristics, and it doesn't seem to have much negative impact elsewhere. Not all of the heuristics are needed, but we still need to experiment to decide which should be disabled by default for sched=list-ilp. llvm-svn: 127067
* Introducing a new method of tracking register pressure. We can'tAndrew Trick2011-02-041-1/+73
| | | | | | | | | | | | | | | precisely track pressure on a selection DAG, but we can at least keep it balanced. This design accounts for various interesting aspects of selection DAGS: register and subregister copies, glued nodes, dead nodes, unused registers, etc. Added SUnit::NumRegDefsLeft and ScheduleDAGSDNodes::RegDefIter. Note: I disabled PrescheduleNodesWithMultipleUses when register pressure is enabled, based on no evidence other than I don't think it makes sense to have both enabled. llvm-svn: 124853
* whitespaceAndrew Trick2011-02-031-18/+18
| | | | llvm-svn: 124827
* Reapply 124301Devang Patel2011-01-271-1/+5
| | | | llvm-svn: 124339
* Revert 124301.Devang Patel2011-01-261-5/+1
| | | | llvm-svn: 124327
* Process valid SDDbgValues even if the node does not have any order assigned.Devang Patel2011-01-261-1/+5
| | | | llvm-svn: 124301
* Refactor.Devang Patel2011-01-261-19/+30
| | | | llvm-svn: 124300
* This assertion is too restrictive, it does not apply for dangling dbg value ↵Devang Patel2011-01-251-8/+0
| | | | | | nodes (nodes where dbg.value intrinsic preceds use of the value). llvm-svn: 124202
* flags -> glue for selectiondagChris Lattner2010-12-231-49/+48
| | | | llvm-svn: 122509
* rename MVT::Flag to MVT::Glue. "Flag" is a terrible name forChris Lattner2010-12-211-7/+7
| | | | | | | something that just glues two nodes together, even if it is sometimes used for flags. llvm-svn: 122310
* Two sets of changes. Sorry they are intermingled.Evan Cheng2010-11-031-4/+7
| | | | | | | | | | | | | 1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to "optimize for latency". Call instructions don't have the right latency and this is more likely to use introduce spills. 2. Fix if-converter cost function. For ARM, it should use instruction latencies, not # of micro-ops since multi-latency instructions is completely executed even when the predicate is false. Also, some instruction will be "slower" when they are predicated due to the register def becoming implicit input. rdar://8598427 llvm-svn: 118135
* Avoiding overly aggressive latency scheduling. If the two nodes share anEvan Cheng2010-10-291-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | operand and one of them has a single use that is a live out copy, favor the one that is live out. Otherwise it will be difficult to eliminate the copy if the instruction is a loop induction variable update. e.g. BB: sub r1, r3, #1 str r0, [r2, r3] mov r3, r1 cmp bne BB => BB: str r0, [r2, r3] sub r3, r3, #1 cmp bne BB This fixed the recent 256.bzip2 regression. llvm-svn: 117675
* Re-commit 117518 and 117519 now that ARM MC test failures are out of the way.Evan Cheng2010-10-281-0/+3
| | | | llvm-svn: 117531
* Revert 117518 and 117519 for now. They changed scheduling and cause MC tests ↵Evan Cheng2010-10-281-3/+0
| | | | | | to fail. Ugh. llvm-svn: 117520
* Fix a major bug in operand latency computation. The use index must be adjustedEvan Cheng2010-10-281-0/+3
| | | | | | by the number of defs first for it to match the instruction itinerary. llvm-svn: 117518
* - Add TargetInstrInfo::getOperandLatency() to compute operand latencies. ThisEvan Cheng2010-10-061-19/+1
| | | | | | | | | | | | | allow target to correctly compute latency for cases where static scheduling itineraries isn't sufficient. e.g. variable_ops instructions such as ARM::ldm. This also allows target without scheduling itineraries to compute operand latencies. e.g. X86 can return (approximated) latencies for high latency instructions such as division. - Compute operand latencies for those defined by load multiple instructions, e.g. ldm and those used by store multiple instructions, e.g. stm. llvm-svn: 115755
* Model Cortex-a9 load to SUB, RSB, ADD, ADC, SBC, RSC, CMN, MVN, or CMPEvan Cheng2010-09-291-17/+17
| | | | | | pipeline forwarding path. llvm-svn: 115098
* Teach if-converter to be more careful with predicating instructions that wouldEvan Cheng2010-09-101-9/+7
| | | | | | | | | | | take multiple cycles to decode. For the current if-converter clients (actually only ARM), the instructions that are predicated on false are not nops. They would still take machine cycles to decode. Micro-coded instructions such as LDM / STM can potentially take multiple cycles to decode. If-converter should take treat them as non-micro-coded simple instructions. llvm-svn: 113570
* Add missing null check reported by Amaury Pouly.Evan Cheng2010-08-101-2/+3
| | | | llvm-svn: 110649
* Fix a bug in the code which re-inserts DBG_VALUE nodes after scheduling;Dan Gohman2010-07-101-1/+3
| | | | | | | | if a block is split (by a custom inserter), the insert point may be in a different block than it was originally. This fixes 32-bit llvm-gcc bootstrap builds, and I haven't been able to reproduce it otherwise. llvm-svn: 108060
* Reapply bottom-up fast-isel, with several fixes for x86-32:Dan Gohman2010-07-101-8/+5
| | | | | | | | | - Check getBytesToPopOnReturn(). - Eschew ST0 and ST1 for return values. - Fix the PIC base register initialization so that it doesn't ever fail to end up the top of the entry block. llvm-svn: 108039
* --- Reverse-merging r107947 into '.':Bob Wilson2010-07-091-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | U utils/TableGen/FastISelEmitter.cpp --- Reverse-merging r107943 into '.': U test/CodeGen/X86/fast-isel.ll U test/CodeGen/X86/fast-isel-loads.ll U include/llvm/Target/TargetLowering.h U include/llvm/Support/PassNameParser.h U include/llvm/CodeGen/FunctionLoweringInfo.h U include/llvm/CodeGen/CallingConvLower.h U include/llvm/CodeGen/FastISel.h U include/llvm/CodeGen/SelectionDAGISel.h U lib/CodeGen/LLVMTargetMachine.cpp U lib/CodeGen/CallingConvLower.cpp U lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp U lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp U lib/CodeGen/SelectionDAG/FastISel.cpp U lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp U lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp U lib/CodeGen/SelectionDAG/InstrEmitter.cpp U lib/CodeGen/SelectionDAG/TargetLowering.cpp U lib/Target/XCore/XCoreISelLowering.cpp U lib/Target/XCore/XCoreISelLowering.h U lib/Target/X86/X86ISelLowering.cpp U lib/Target/X86/X86FastISel.cpp U lib/Target/X86/X86ISelLowering.h llvm-svn: 107987
* Re-apply bottom-up fast-isel, with fixes. Be very careful to avoid emittingDan Gohman2010-07-091-8/+5
| | | | | | a DBG_VALUE after a terminator, or emitting any instructions before an EH_LABEL. llvm-svn: 107943
* grammar tweak in comment.Jim Grosbach2010-06-301-1/+1
| | | | llvm-svn: 107321
* Add a VT argument to getMinimalPhysRegClass and replace the copy related usesRafael Espindola2010-06-291-1/+1
| | | | | | | | | of getPhysicalRegisterRegClass with it. If we want to make a copy (or estimate its cost), it is better to use the smallest class as more efficient operations might be possible. llvm-svn: 107140
* Remove variables which are assigned to but for which the valueDuncan Sands2010-06-251-8/+1
| | | | | | is not used. Spotted by gcc-4.6. llvm-svn: 106854
* It's possible that a flag is added to the SDNode that points back to theBill Wendling2010-06-241-11/+19
| | | | | | | original SDNode. This is badness. Also, this function allows one SDNode to point multiple flags to another SDNode. Badness as well. llvm-svn: 106793
* MorphNodeTo doesn't preserve the memory operands. Because we're morphing a nodeBill Wendling2010-06-231-0/+21
| | | | | | | into the same node, but with different non-memory operands, we need to replace the memory operands after it's finished morphing. llvm-svn: 106643
* Use A.append(...) instead of A.insert(A.end(), ...) when A is aDan Gohman2010-06-211-1/+1
| | | | | | SmallVector, and other SmallVector simplifications. llvm-svn: 106452
* Code refactoring, no functionality changes.Evan Cheng2010-06-101-82/+83
| | | | llvm-svn: 105775
* Fix some latency computation bugs: if the use is not a machine opcode do not ↵Evan Cheng2010-05-281-6/+15
| | | | | | just return zero. llvm-svn: 105061
* Allow targets more controls on what nodes are scheduled by reg pressure, ↵Evan Cheng2010-05-201-0/+20
| | | | | | what for latency in hybrid mode. llvm-svn: 104293
* Add a hybrid bottom up scheduler that reduce register usage while avoidingEvan Cheng2010-05-201-1/+36
| | | | | | | | | pipeline stall. It's useful for targets like ARM cortex-a8. NEON has a lot of long latency instructions so a strict register pressure reduction scheduler does not work well. Early experiments show this speeds up some NEON loops by over 30%. llvm-svn: 104216
OpenPOWER on IntegriCloud