summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* misched preparation: rename core scheduler methods for consistency.Andrew Trick2012-03-071-10/+10
| | | | | | | We had half the API with one convention, half with another. Now was a good time to clean it up. llvm-svn: 152255
* misched preparation: clarify ScheduleDAG and ScheduleDAGInstrs roles.Andrew Trick2012-03-071-7/+13
| | | | | | | | | | | | | | | | | | | ScheduleDAG is responsible for the DAG: SUnits and SDeps. It provides target hooks for latency computation. ScheduleDAGInstrs extends ScheduleDAG and defines the current scheduling region in terms of MachineInstr iterators. It has access to the target's scheduling itinerary data. ScheduleDAGInstrs provides the logic for building the ScheduleDAG for the sequence of MachineInstrs in the current region. Target's can implement highly custom schedulers by extending this class. ScheduleDAGPostRATDList provides the driver and diagnostics for current postRA scheduling. It maintains a current Sequence of scheduled machine instructions and logic for splicing them into the block. During scheduling, it uses the ScheduleHazardRecognizer provided by the target. Specific changes: - Removed driver code from ScheduleDAG. clearDAG is the only interface needed. - Added enterRegion/exitRegion hooks to ScheduleDAGInstrs to delimit the scope of each scheduling region and associated DAG. They should be used to setup and cleanup any region-specific state in addition to the DAG itself. This is necessary because we reuse the same ScheduleDAG object for the entire function. The target may extend these hooks to do things at regions boundaries, like bundle terminators. The hooks are called even if we decide not to schedule the region. So all instructions in a block are "covered" by these calls. - Added ScheduleDAGInstrs::begin()/end() public API. - Moved Sequence into the driver layer, which is specific to the scheduling algorithm. llvm-svn: 152208
* misched preparation: modularize schedule emission.Andrew Trick2012-03-071-3/+43
| | | | | | ScheduleDAG has nothing to do with how the instructions are scheduled. llvm-svn: 152206
* misched preparation: modularize schedule printing.Andrew Trick2012-03-071-0/+9
| | | | | | ScheduleDAG will not refer to the scheduled instruction sequence. llvm-svn: 152205
* misched preparation: modularize schedule verification.Andrew Trick2012-03-071-0/+15
| | | | | | ScheduleDAG will not refer to the scheduled instruction sequence. llvm-svn: 152204
* Cleanup in preparation for misched: Move DAG visualization logic.Andrew Trick2012-03-071-0/+5
| | | | | | Soon, ScheduleDAG will not refer to the BB. llvm-svn: 152177
* Rename TargetSubtarget to TargetSubtargetInfo for consistency.Evan Cheng2011-07-011-2/+2
| | | | llvm-svn: 134259
* Sink SubtargetFeature and TargetInstrItineraries (renamed ↵Evan Cheng2011-06-291-0/+1
| | | | | | MCInstrItineraries) into MC. llvm-svn: 134049
* - Rename TargetInstrDesc, TargetOperandInfo to MCInstrDesc and MCOperandInfo andEvan Cheng2011-06-281-7/+7
| | | | | | | | sink them into MC layer. - Added MCInstrInfo, which captures the tablegen generated static data. Chang TargetInstrInfo so it's based off MCInstrInfo. llvm-svn: 134021
* pre-RA-sched: Cleanup register pressure tracking.Andrew Trick2011-06-271-9/+1
| | | | | | | | Removed the check that peeks past EXTRA_SUBREG, which I don't think makes sense any more. Intead treat it as a normal register def. No significant affect on x86 or ARM benchmarks. llvm-svn: 133917
* The scheduler needs to be aware on the existence of untyped nodes when it ↵Owen Anderson2011-06-241-1/+2
| | | | | | performs type propagation for EXTRACT_SUBREG. llvm-svn: 133838
* Don't allocate empty read-only SmallVectors during SelectionDAG deallocation.Benjamin Kramer2011-06-181-1/+1
| | | | llvm-svn: 133348
* Added -stress-sched flag in the Asserts build.Andrew Trick2011-06-151-1/+1
| | | | | | Added a test case for handling physreg aliases during pre-RA-sched. llvm-svn: 133063
* Be careful about scheduling nodes above previous calls. It increase usages ofEvan Cheng2011-04-261-0/+19
| | | | | | | | | | | | more callee-saved registers and introduce copies. Only allows it if scheduling a node above calls would end up lessen register pressure. Call operands also has added ABI restrictions for register allocation, so be extra careful with hoisting them above calls. rdar://9329627 llvm-svn: 130245
* Fix a ton of comment typos found by codespell. Patch byChris Lattner2011-04-151-1/+1
| | | | | | Luis Felipe Strano Moraes! llvm-svn: 129558
* In the pre-RA scheduler, maintain cmp+br proximity.Andrew Trick2011-04-141-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | This is done by pushing physical register definitions close to their use, which happens to handle flag definitions if they're not glued to the branch. This seems to be generally a good thing though, so I didn't need to add a target hook yet. The primary motivation is to generate code closer to what people expect and rule out missed opportunity from enabling macro-op fusion. As a side benefit, we get several 2-5% gains on x86 benchmarks. There is one regression: SingleSource/Benchmarks/Shootout/lists slows down be -10%. But this is an independent scheduler bug that will be tracked separately. See rdar://problem/9283108. Incidentally, pre-RA scheduling is only half the solution. Fixing the later passes is tracked by: <rdar://problem/8932804> [pre-RA-sched] on x86, attempt to schedule CMP/TEST adjacent with condition jump Fixes: <rdar://problem/9262453> Scheduler unnecessary break of cmp/jump fusion llvm-svn: 129508
* Recommit r129383. PreRA scheduler heuristic fixes: VRegCycle, TokenFactor ↵Andrew Trick2011-04-131-46/+14
| | | | | | | | | | | | | | | | | | | | | latency. Additional fixes: Do something reasonable for subtargets with generic itineraries by handle node latency the same as for an empty itinerary. Now nodes default to unit latency unless an itinerary explicitly specifies a zero cycle stage or it is a TokenFactor chain. Original fixes: UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make the ndoe latency adjustments work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible. llvm-svn: 129421
* Revert 129383. It causes some targets to hit a scheduler assert.Andrew Trick2011-04-121-7/+46
| | | | llvm-svn: 129385
* PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency.Andrew Trick2011-04-121-46/+7
| | | | | | | | | | | | UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make these heuristic adjustments to node latency work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible. llvm-svn: 129383
* Added a check in the preRA scheduler for potential interference on aAndrew Trick2011-04-071-0/+46
| | | | | | | | | induction variable. The preRA scheduler is unaware of induction vars, so we look for potential "virtual register cycles" instead. Fixes <rdar://problem/8946719> Bad scheduling prevents coalescing llvm-svn: 129100
* Improve pre-RA-sched register pressure tracking for duplicate operands.Andrew Trick2011-03-091-1/+5
| | | | | | This helps cases like 2008-07-19-movups-spills.ll, but doesn't have an obvious impact on benchmarks llvm-svn: 127347
* Fix some latent bugs if the nodes are unschedulable. We'd gotten awayEric Christopher2011-03-081-0/+4
| | | | | | | | | | | | with this before since none of the register tracking or nightly tests had unschedulable nodes. This should probably be refixed with a special default Node that just returns some "don't touch me" values. Fixes PR9427 llvm-svn: 127263
* Fix for -sched-high-latency-cycles in sched=list-ilp mode.Andrew Trick2011-03-051-1/+3
| | | | llvm-svn: 127071
* Increased the register pressure limit on x86_64 from 8 to 12Andrew Trick2011-03-051-1/+13
| | | | | | | | | | | | | | | | | | | | | | | regs. This is the only change in this checkin that may affects the default scheduler. With better register tracking and heuristics, it doesn't make sense to artificially lower the register limit so much. Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to give the scheduler a way to account for div and sqrt on targets that don't have an itinerary. It is currently defaults to 10 (the actual number doesn't matter much), but only takes effect on non-default schedulers: list-hybrid and list-ilp. Added several heuristics that can be individually disabled for the non-default sched=list-ilp mode. This helps us determine how much better we can do on a given benchmark than the default scheduler. Certain compute intensive loops run much faster in this mode with the right set of heuristics, and it doesn't seem to have much negative impact elsewhere. Not all of the heuristics are needed, but we still need to experiment to decide which should be disabled by default for sched=list-ilp. llvm-svn: 127067
* Introducing a new method of tracking register pressure. We can'tAndrew Trick2011-02-041-1/+73
| | | | | | | | | | | | | | | precisely track pressure on a selection DAG, but we can at least keep it balanced. This design accounts for various interesting aspects of selection DAGS: register and subregister copies, glued nodes, dead nodes, unused registers, etc. Added SUnit::NumRegDefsLeft and ScheduleDAGSDNodes::RegDefIter. Note: I disabled PrescheduleNodesWithMultipleUses when register pressure is enabled, based on no evidence other than I don't think it makes sense to have both enabled. llvm-svn: 124853
* whitespaceAndrew Trick2011-02-031-18/+18
| | | | llvm-svn: 124827
* Reapply 124301Devang Patel2011-01-271-1/+5
| | | | llvm-svn: 124339
* Revert 124301.Devang Patel2011-01-261-5/+1
| | | | llvm-svn: 124327
* Process valid SDDbgValues even if the node does not have any order assigned.Devang Patel2011-01-261-1/+5
| | | | llvm-svn: 124301
* Refactor.Devang Patel2011-01-261-19/+30
| | | | llvm-svn: 124300
* This assertion is too restrictive, it does not apply for dangling dbg value ↵Devang Patel2011-01-251-8/+0
| | | | | | nodes (nodes where dbg.value intrinsic preceds use of the value). llvm-svn: 124202
* flags -> glue for selectiondagChris Lattner2010-12-231-49/+48
| | | | llvm-svn: 122509
* rename MVT::Flag to MVT::Glue. "Flag" is a terrible name forChris Lattner2010-12-211-7/+7
| | | | | | | something that just glues two nodes together, even if it is sometimes used for flags. llvm-svn: 122310
* Two sets of changes. Sorry they are intermingled.Evan Cheng2010-11-031-4/+7
| | | | | | | | | | | | | 1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to "optimize for latency". Call instructions don't have the right latency and this is more likely to use introduce spills. 2. Fix if-converter cost function. For ARM, it should use instruction latencies, not # of micro-ops since multi-latency instructions is completely executed even when the predicate is false. Also, some instruction will be "slower" when they are predicated due to the register def becoming implicit input. rdar://8598427 llvm-svn: 118135
* Avoiding overly aggressive latency scheduling. If the two nodes share anEvan Cheng2010-10-291-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | operand and one of them has a single use that is a live out copy, favor the one that is live out. Otherwise it will be difficult to eliminate the copy if the instruction is a loop induction variable update. e.g. BB: sub r1, r3, #1 str r0, [r2, r3] mov r3, r1 cmp bne BB => BB: str r0, [r2, r3] sub r3, r3, #1 cmp bne BB This fixed the recent 256.bzip2 regression. llvm-svn: 117675
* Re-commit 117518 and 117519 now that ARM MC test failures are out of the way.Evan Cheng2010-10-281-0/+3
| | | | llvm-svn: 117531
* Revert 117518 and 117519 for now. They changed scheduling and cause MC tests ↵Evan Cheng2010-10-281-3/+0
| | | | | | to fail. Ugh. llvm-svn: 117520
* Fix a major bug in operand latency computation. The use index must be adjustedEvan Cheng2010-10-281-0/+3
| | | | | | by the number of defs first for it to match the instruction itinerary. llvm-svn: 117518
* - Add TargetInstrInfo::getOperandLatency() to compute operand latencies. ThisEvan Cheng2010-10-061-19/+1
| | | | | | | | | | | | | allow target to correctly compute latency for cases where static scheduling itineraries isn't sufficient. e.g. variable_ops instructions such as ARM::ldm. This also allows target without scheduling itineraries to compute operand latencies. e.g. X86 can return (approximated) latencies for high latency instructions such as division. - Compute operand latencies for those defined by load multiple instructions, e.g. ldm and those used by store multiple instructions, e.g. stm. llvm-svn: 115755
* Model Cortex-a9 load to SUB, RSB, ADD, ADC, SBC, RSC, CMN, MVN, or CMPEvan Cheng2010-09-291-17/+17
| | | | | | pipeline forwarding path. llvm-svn: 115098
* Teach if-converter to be more careful with predicating instructions that wouldEvan Cheng2010-09-101-9/+7
| | | | | | | | | | | take multiple cycles to decode. For the current if-converter clients (actually only ARM), the instructions that are predicated on false are not nops. They would still take machine cycles to decode. Micro-coded instructions such as LDM / STM can potentially take multiple cycles to decode. If-converter should take treat them as non-micro-coded simple instructions. llvm-svn: 113570
* Add missing null check reported by Amaury Pouly.Evan Cheng2010-08-101-2/+3
| | | | llvm-svn: 110649
* Fix a bug in the code which re-inserts DBG_VALUE nodes after scheduling;Dan Gohman2010-07-101-1/+3
| | | | | | | | if a block is split (by a custom inserter), the insert point may be in a different block than it was originally. This fixes 32-bit llvm-gcc bootstrap builds, and I haven't been able to reproduce it otherwise. llvm-svn: 108060
* Reapply bottom-up fast-isel, with several fixes for x86-32:Dan Gohman2010-07-101-8/+5
| | | | | | | | | - Check getBytesToPopOnReturn(). - Eschew ST0 and ST1 for return values. - Fix the PIC base register initialization so that it doesn't ever fail to end up the top of the entry block. llvm-svn: 108039
* --- Reverse-merging r107947 into '.':Bob Wilson2010-07-091-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | U utils/TableGen/FastISelEmitter.cpp --- Reverse-merging r107943 into '.': U test/CodeGen/X86/fast-isel.ll U test/CodeGen/X86/fast-isel-loads.ll U include/llvm/Target/TargetLowering.h U include/llvm/Support/PassNameParser.h U include/llvm/CodeGen/FunctionLoweringInfo.h U include/llvm/CodeGen/CallingConvLower.h U include/llvm/CodeGen/FastISel.h U include/llvm/CodeGen/SelectionDAGISel.h U lib/CodeGen/LLVMTargetMachine.cpp U lib/CodeGen/CallingConvLower.cpp U lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp U lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp U lib/CodeGen/SelectionDAG/FastISel.cpp U lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp U lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp U lib/CodeGen/SelectionDAG/InstrEmitter.cpp U lib/CodeGen/SelectionDAG/TargetLowering.cpp U lib/Target/XCore/XCoreISelLowering.cpp U lib/Target/XCore/XCoreISelLowering.h U lib/Target/X86/X86ISelLowering.cpp U lib/Target/X86/X86FastISel.cpp U lib/Target/X86/X86ISelLowering.h llvm-svn: 107987
* Re-apply bottom-up fast-isel, with fixes. Be very careful to avoid emittingDan Gohman2010-07-091-8/+5
| | | | | | a DBG_VALUE after a terminator, or emitting any instructions before an EH_LABEL. llvm-svn: 107943
* grammar tweak in comment.Jim Grosbach2010-06-301-1/+1
| | | | llvm-svn: 107321
* Add a VT argument to getMinimalPhysRegClass and replace the copy related usesRafael Espindola2010-06-291-1/+1
| | | | | | | | | of getPhysicalRegisterRegClass with it. If we want to make a copy (or estimate its cost), it is better to use the smallest class as more efficient operations might be possible. llvm-svn: 107140
* Remove variables which are assigned to but for which the valueDuncan Sands2010-06-251-8/+1
| | | | | | is not used. Spotted by gcc-4.6. llvm-svn: 106854
* It's possible that a flag is added to the SDNode that points back to theBill Wendling2010-06-241-11/+19
| | | | | | | original SDNode. This is badness. Also, this function allows one SDNode to point multiple flags to another SDNode. Badness as well. llvm-svn: 106793
OpenPOWER on IntegriCloud