summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Change TargetLowering::getRepRegClassFor to take an MVT, instead ofPatrik Hagglund2012-12-131-1/+1
| | | | | | | | EVT. Accordingly, change RegDefIter to contain MVTs instead of EVTs. llvm-svn: 170140
* Revert EVT->MVT changes, r169836-169851, due to buildbot failures.Patrik Hagglund2012-12-111-1/+1
| | | | llvm-svn: 169854
* Change TargetLowering::getRepRegClassFor to take an MVT, instead ofPatrik Hagglund2012-12-111-1/+1
| | | | | | | | EVT. Accordingly, change RegDefIter to contain MVTs instead of EVTs. llvm-svn: 169838
* Use the new script to sort the includes of every file under lib.Chandler Carruth2012-12-031-10/+10
| | | | | | | | | | | | | | | | | Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131
* ScheduleDAG interface. Added OrderKind to distinguish nonregister dependencies.Andrew Trick2012-11-061-5/+6
| | | | | | | This is in preparation for adding "weak" DAG edges, but generally simplifies the design. llvm-svn: 167435
* Add a really faster pre-RA scheduler (-pre-RA-sched=linearize). It doesn't useEvan Cheng2012-10-171-2/+1
| | | | | | | | | | | | | | any scheduling heuristics nor does it build up any scheduling data structure that other heuristics use. It essentially linearize by doing a DFA walk but it does handle glues correctly. IMPORTANT: it probably can't handle all the physical register dependencies so it's not suitable for x86. It also doesn't deal with dbg_value nodes right now so it's definitely is still WIP. rdar://12474515 llvm-svn: 166122
* Release build: guard dump functions withManman Ren2012-09-111-2/+2
| | | | | | | | "#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)" No functional change. Update r163339. llvm-svn: 163653
* Release build: guard dump functions with "ifndef NDEBUG"Manman Ren2012-09-061-0/+4
| | | | | | No functional change. llvm-svn: 163339
* Reapply 155668: Fix the SD scheduler to avoid gluing the same node twice.Andrew Trick2012-04-281-26/+58
| | | | | | | | | | This time, also fix the caller of AddGlue to properly handle incomplete chains. AddGlue had failure modes, but shamefully hid them from its caller. It's luck ran out. Fixes rdar://11314175: BuildSchedUnits assert. llvm-svn: 155749
* Temporarily revert r155668: Fix the SD scheduler to avoid gluing.Andrew Trick2012-04-271-4/+2
| | | | | | This definitely caused regression with ARM -mno-thumb. llvm-svn: 155743
* Fix the SD scheduler to avoid gluing the same node twice.Andrew Trick2012-04-261-3/+5
| | | | | | | | | | | DAGCombine strangeness may result in multiple loads from the same offset. They both may try to glue themselves to another load. We could insist that the redundant loads glue themselves to each other, but the beter fix is to bail out from bad gluing at the time we detect it. Fixes rdar://11314175: BuildSchedUnits assert. llvm-svn: 155668
* Insert the debugging instructions in one fell-swoop so that it doesn't call theBill Wendling2012-03-141-7/+8
| | | | | | | expensive "getFirstTerminator" call. This reduces the time of compilation in PR12258 from >10 minutes to < 10 seconds. llvm-svn: 152704
* misched preparation: rename core scheduler methods for consistency.Andrew Trick2012-03-071-10/+10
| | | | | | | We had half the API with one convention, half with another. Now was a good time to clean it up. llvm-svn: 152255
* misched preparation: clarify ScheduleDAG and ScheduleDAGInstrs roles.Andrew Trick2012-03-071-7/+13
| | | | | | | | | | | | | | | | | | | ScheduleDAG is responsible for the DAG: SUnits and SDeps. It provides target hooks for latency computation. ScheduleDAGInstrs extends ScheduleDAG and defines the current scheduling region in terms of MachineInstr iterators. It has access to the target's scheduling itinerary data. ScheduleDAGInstrs provides the logic for building the ScheduleDAG for the sequence of MachineInstrs in the current region. Target's can implement highly custom schedulers by extending this class. ScheduleDAGPostRATDList provides the driver and diagnostics for current postRA scheduling. It maintains a current Sequence of scheduled machine instructions and logic for splicing them into the block. During scheduling, it uses the ScheduleHazardRecognizer provided by the target. Specific changes: - Removed driver code from ScheduleDAG. clearDAG is the only interface needed. - Added enterRegion/exitRegion hooks to ScheduleDAGInstrs to delimit the scope of each scheduling region and associated DAG. They should be used to setup and cleanup any region-specific state in addition to the DAG itself. This is necessary because we reuse the same ScheduleDAG object for the entire function. The target may extend these hooks to do things at regions boundaries, like bundle terminators. The hooks are called even if we decide not to schedule the region. So all instructions in a block are "covered" by these calls. - Added ScheduleDAGInstrs::begin()/end() public API. - Moved Sequence into the driver layer, which is specific to the scheduling algorithm. llvm-svn: 152208
* misched preparation: modularize schedule emission.Andrew Trick2012-03-071-3/+43
| | | | | | ScheduleDAG has nothing to do with how the instructions are scheduled. llvm-svn: 152206
* misched preparation: modularize schedule printing.Andrew Trick2012-03-071-0/+9
| | | | | | ScheduleDAG will not refer to the scheduled instruction sequence. llvm-svn: 152205
* misched preparation: modularize schedule verification.Andrew Trick2012-03-071-0/+15
| | | | | | ScheduleDAG will not refer to the scheduled instruction sequence. llvm-svn: 152204
* Cleanup in preparation for misched: Move DAG visualization logic.Andrew Trick2012-03-071-0/+5
| | | | | | Soon, ScheduleDAG will not refer to the BB. llvm-svn: 152177
* Rename TargetSubtarget to TargetSubtargetInfo for consistency.Evan Cheng2011-07-011-2/+2
| | | | llvm-svn: 134259
* Sink SubtargetFeature and TargetInstrItineraries (renamed ↵Evan Cheng2011-06-291-0/+1
| | | | | | MCInstrItineraries) into MC. llvm-svn: 134049
* - Rename TargetInstrDesc, TargetOperandInfo to MCInstrDesc and MCOperandInfo andEvan Cheng2011-06-281-7/+7
| | | | | | | | sink them into MC layer. - Added MCInstrInfo, which captures the tablegen generated static data. Chang TargetInstrInfo so it's based off MCInstrInfo. llvm-svn: 134021
* pre-RA-sched: Cleanup register pressure tracking.Andrew Trick2011-06-271-9/+1
| | | | | | | | Removed the check that peeks past EXTRA_SUBREG, which I don't think makes sense any more. Intead treat it as a normal register def. No significant affect on x86 or ARM benchmarks. llvm-svn: 133917
* The scheduler needs to be aware on the existence of untyped nodes when it ↵Owen Anderson2011-06-241-1/+2
| | | | | | performs type propagation for EXTRACT_SUBREG. llvm-svn: 133838
* Don't allocate empty read-only SmallVectors during SelectionDAG deallocation.Benjamin Kramer2011-06-181-1/+1
| | | | llvm-svn: 133348
* Added -stress-sched flag in the Asserts build.Andrew Trick2011-06-151-1/+1
| | | | | | Added a test case for handling physreg aliases during pre-RA-sched. llvm-svn: 133063
* Be careful about scheduling nodes above previous calls. It increase usages ofEvan Cheng2011-04-261-0/+19
| | | | | | | | | | | | more callee-saved registers and introduce copies. Only allows it if scheduling a node above calls would end up lessen register pressure. Call operands also has added ABI restrictions for register allocation, so be extra careful with hoisting them above calls. rdar://9329627 llvm-svn: 130245
* Fix a ton of comment typos found by codespell. Patch byChris Lattner2011-04-151-1/+1
| | | | | | Luis Felipe Strano Moraes! llvm-svn: 129558
* In the pre-RA scheduler, maintain cmp+br proximity.Andrew Trick2011-04-141-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | This is done by pushing physical register definitions close to their use, which happens to handle flag definitions if they're not glued to the branch. This seems to be generally a good thing though, so I didn't need to add a target hook yet. The primary motivation is to generate code closer to what people expect and rule out missed opportunity from enabling macro-op fusion. As a side benefit, we get several 2-5% gains on x86 benchmarks. There is one regression: SingleSource/Benchmarks/Shootout/lists slows down be -10%. But this is an independent scheduler bug that will be tracked separately. See rdar://problem/9283108. Incidentally, pre-RA scheduling is only half the solution. Fixing the later passes is tracked by: <rdar://problem/8932804> [pre-RA-sched] on x86, attempt to schedule CMP/TEST adjacent with condition jump Fixes: <rdar://problem/9262453> Scheduler unnecessary break of cmp/jump fusion llvm-svn: 129508
* Recommit r129383. PreRA scheduler heuristic fixes: VRegCycle, TokenFactor ↵Andrew Trick2011-04-131-46/+14
| | | | | | | | | | | | | | | | | | | | | latency. Additional fixes: Do something reasonable for subtargets with generic itineraries by handle node latency the same as for an empty itinerary. Now nodes default to unit latency unless an itinerary explicitly specifies a zero cycle stage or it is a TokenFactor chain. Original fixes: UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make the ndoe latency adjustments work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible. llvm-svn: 129421
* Revert 129383. It causes some targets to hit a scheduler assert.Andrew Trick2011-04-121-7/+46
| | | | llvm-svn: 129385
* PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency.Andrew Trick2011-04-121-46/+7
| | | | | | | | | | | | UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make these heuristic adjustments to node latency work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible. llvm-svn: 129383
* Added a check in the preRA scheduler for potential interference on aAndrew Trick2011-04-071-0/+46
| | | | | | | | | induction variable. The preRA scheduler is unaware of induction vars, so we look for potential "virtual register cycles" instead. Fixes <rdar://problem/8946719> Bad scheduling prevents coalescing llvm-svn: 129100
* Improve pre-RA-sched register pressure tracking for duplicate operands.Andrew Trick2011-03-091-1/+5
| | | | | | This helps cases like 2008-07-19-movups-spills.ll, but doesn't have an obvious impact on benchmarks llvm-svn: 127347
* Fix some latent bugs if the nodes are unschedulable. We'd gotten awayEric Christopher2011-03-081-0/+4
| | | | | | | | | | | | with this before since none of the register tracking or nightly tests had unschedulable nodes. This should probably be refixed with a special default Node that just returns some "don't touch me" values. Fixes PR9427 llvm-svn: 127263
* Fix for -sched-high-latency-cycles in sched=list-ilp mode.Andrew Trick2011-03-051-1/+3
| | | | llvm-svn: 127071
* Increased the register pressure limit on x86_64 from 8 to 12Andrew Trick2011-03-051-1/+13
| | | | | | | | | | | | | | | | | | | | | | | regs. This is the only change in this checkin that may affects the default scheduler. With better register tracking and heuristics, it doesn't make sense to artificially lower the register limit so much. Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to give the scheduler a way to account for div and sqrt on targets that don't have an itinerary. It is currently defaults to 10 (the actual number doesn't matter much), but only takes effect on non-default schedulers: list-hybrid and list-ilp. Added several heuristics that can be individually disabled for the non-default sched=list-ilp mode. This helps us determine how much better we can do on a given benchmark than the default scheduler. Certain compute intensive loops run much faster in this mode with the right set of heuristics, and it doesn't seem to have much negative impact elsewhere. Not all of the heuristics are needed, but we still need to experiment to decide which should be disabled by default for sched=list-ilp. llvm-svn: 127067
* Introducing a new method of tracking register pressure. We can'tAndrew Trick2011-02-041-1/+73
| | | | | | | | | | | | | | | precisely track pressure on a selection DAG, but we can at least keep it balanced. This design accounts for various interesting aspects of selection DAGS: register and subregister copies, glued nodes, dead nodes, unused registers, etc. Added SUnit::NumRegDefsLeft and ScheduleDAGSDNodes::RegDefIter. Note: I disabled PrescheduleNodesWithMultipleUses when register pressure is enabled, based on no evidence other than I don't think it makes sense to have both enabled. llvm-svn: 124853
* whitespaceAndrew Trick2011-02-031-18/+18
| | | | llvm-svn: 124827
* Reapply 124301Devang Patel2011-01-271-1/+5
| | | | llvm-svn: 124339
* Revert 124301.Devang Patel2011-01-261-5/+1
| | | | llvm-svn: 124327
* Process valid SDDbgValues even if the node does not have any order assigned.Devang Patel2011-01-261-1/+5
| | | | llvm-svn: 124301
* Refactor.Devang Patel2011-01-261-19/+30
| | | | llvm-svn: 124300
* This assertion is too restrictive, it does not apply for dangling dbg value ↵Devang Patel2011-01-251-8/+0
| | | | | | nodes (nodes where dbg.value intrinsic preceds use of the value). llvm-svn: 124202
* flags -> glue for selectiondagChris Lattner2010-12-231-49/+48
| | | | llvm-svn: 122509
* rename MVT::Flag to MVT::Glue. "Flag" is a terrible name forChris Lattner2010-12-211-7/+7
| | | | | | | something that just glues two nodes together, even if it is sometimes used for flags. llvm-svn: 122310
* Two sets of changes. Sorry they are intermingled.Evan Cheng2010-11-031-4/+7
| | | | | | | | | | | | | 1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to "optimize for latency". Call instructions don't have the right latency and this is more likely to use introduce spills. 2. Fix if-converter cost function. For ARM, it should use instruction latencies, not # of micro-ops since multi-latency instructions is completely executed even when the predicate is false. Also, some instruction will be "slower" when they are predicated due to the register def becoming implicit input. rdar://8598427 llvm-svn: 118135
* Avoiding overly aggressive latency scheduling. If the two nodes share anEvan Cheng2010-10-291-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | operand and one of them has a single use that is a live out copy, favor the one that is live out. Otherwise it will be difficult to eliminate the copy if the instruction is a loop induction variable update. e.g. BB: sub r1, r3, #1 str r0, [r2, r3] mov r3, r1 cmp bne BB => BB: str r0, [r2, r3] sub r3, r3, #1 cmp bne BB This fixed the recent 256.bzip2 regression. llvm-svn: 117675
* Re-commit 117518 and 117519 now that ARM MC test failures are out of the way.Evan Cheng2010-10-281-0/+3
| | | | llvm-svn: 117531
* Revert 117518 and 117519 for now. They changed scheduling and cause MC tests ↵Evan Cheng2010-10-281-3/+0
| | | | | | to fail. Ugh. llvm-svn: 117520
* Fix a major bug in operand latency computation. The use index must be adjustedEvan Cheng2010-10-281-0/+3
| | | | | | by the number of defs first for it to match the instruction itinerary. llvm-svn: 117518
OpenPOWER on IntegriCloud