summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Temporarily workaround JM/lencod miscompile (SIGSEGV).Andrew Trick2011-01-241-0/+2
| | | | | | rdar://problem/8893967 llvm-svn: 124137
* Enable support for precise scheduling of the instruction selectionAndrew Trick2011-01-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | DAG. Disable using "-disable-sched-cycles". For ARM, this enables a framework for modeling the cpu pipeline and counting stalls. It also activates several heuristics to drive scheduling based on the model. Scheduling is inherently imprecise at this stage, and until spilling is improved it may defeat attempts to schedule. However, this framework provides greater control over tuning codegen. Although the flag is not target-specific, it should have very little affect on the default scheduler used by x86. The only two changes that affect x86 are: - scheduling a high-latency operation bumps the current cycle so independent operations can have their latency covered. i.e. two independent 4 cycle operations can produce results in 4 cycles, not 8 cycles. - Two operations with equal register pressure impact and no latency-based stalls on their uses will be prioritized by depth before height (height is irrelevant if no stalls occur in the schedule below this point). llvm-svn: 123971
* Convert -enable-sched-cycles and -enable-sched-hazard to -disableAndrew Trick2011-01-211-29/+31
| | | | | | | | | | | flags. They are still not enable in this revision. Added TargetInstrInfo::isZeroCost() to fix a fundamental problem with the scheduler's model of operand latency in the selection DAG. Generalized unit tests to work with sched-cycles. llvm-svn: 123969
* Selection DAG scheduler register pressure heuristic fixes.Andrew Trick2011-01-201-8/+27
| | | | | | | | Added a check for already live regs before claiming HighRegPressure. Fixed a few cases of checking the wrong number of successors. Added some tracing until these heuristics are better understood. llvm-svn: 123892
* Support for precise scheduling of the instruction selection DAG,Andrew Trick2011-01-141-537/+663
| | | | | | | | | | | | | | | | | | | | | | | | | disabled in this checkin. Sorry for the large diffs due to refactoring. New functionality is all guarded by EnableSchedCycles. Scheduling the isel DAG is inherently imprecise, but we give it a best effort: - Added MayReduceRegPressure to allow stalled nodes in the queue only if there is a regpressure need. - Added BUHasStall to allow checking for either dependence stalls due to latency or resource stalls due to pipeline hazards. - Added BUCompareLatency to encapsulate and standardize the heuristics for minimizing stall cycles (vs. reducing register pressure). - Modified the bottom-up heuristic (now in BUCompareLatency) to prioritize nodes by their depth rather than height. As long as it doesn't stall, height is irrelevant. Depth represents the critical path to the DAG root. - Added hybrid_ls_rr_sort::isReady to filter stalled nodes before adding them to the available queue. Related Cleanup: most of the register reduction routines do not need to be templates. llvm-svn: 123468
* Minor cleanup related to my latest scheduler changes.Andrew Trick2010-12-241-3/+5
| | | | llvm-svn: 122545
* Fix a few cases where the scheduler is not checking for phys reg copies. The ↵Andrew Trick2010-12-241-3/+10
| | | | | | scheduling node may have a NULL DAG node, yuck. llvm-svn: 122544
* Various bits of framework needed for precise machine-level selectionAndrew Trick2010-12-241-60/+367
| | | | | | | | | | | | | | | | | | | | | | | | | | DAG scheduling during isel. Most new functionality is currently guarded by -enable-sched-cycles and -enable-sched-hazard. Added InstrItineraryData::IssueWidth field, currently derived from ARM itineraries, but could be initialized differently on other targets. Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is active, and if so how many cycles of state it holds. Added SchedulingPriorityQueue::HasReadyFilter to allowing gating entry into the scheduler's available queue. ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to get information about it's SUnits, provides RecedeCycle for bottom-up scheduling, correctly computes scoreboard depth, tracks IssueCount, and considers potential stall cycles when checking for hazards. ScheduleDAGRRList now models machine cycles and hazards (under flags). It tracks MinAvailableCycle, drives the hazard recognizer and priority queue's ready filter, manages a new PendingQueue, properly accounts for stall cycles, etc. llvm-svn: 122541
* flags -> glue for selectiondagChris Lattner2010-12-231-6/+6
| | | | llvm-svn: 122509
* Reorganize ListScheduleBottomUp in preparation for modeling machine cycles ↵Andrew Trick2010-12-231-130/+153
| | | | | | and instruction issue. llvm-svn: 122491
* Converted LiveRegCycles to LiveRegGens. It's easier to work with and allows ↵Andrew Trick2010-12-231-17/+18
| | | | | | multiple nodes per cycle. llvm-svn: 122474
* In CheckForLiveRegDef use TRI->getOverlaps.Andrew Trick2010-12-231-6/+9
| | | | llvm-svn: 122473
* Fixes PR8823: add-with-overflow-128.llAndrew Trick2010-12-231-12/+33
| | | | | | | | In the bottom-up selection DAG scheduling, handle two-address instructions that read/write unspillable registers. Treat the entire chain of two-address nodes as a single live range. llvm-svn: 122472
* In DelayForLiveRegsBottomUp, handle instructions that read and writeAndrew Trick2010-12-211-15/+4
| | | | | | | the same physical register. Simplifies the fix from the previous checkin r122211. llvm-svn: 122370
* whitespaceAndrew Trick2010-12-211-42/+42
| | | | llvm-svn: 122368
* rename MVT::Flag to MVT::Glue. "Flag" is a terrible name forChris Lattner2010-12-211-5/+5
| | | | | | | something that just glues two nodes together, even if it is sometimes used for flags. llvm-svn: 122310
* Fix a bug in the scheduler's handling of "unspillable" vregs.Chris Lattner2010-12-201-1/+14
| | | | | | | | | | | | | | | | | | Imagine we see: EFLAGS = inst1 EFLAGS = inst2 FLAGS gpr = inst3 EFLAGS Previously, we would refuse to schedule inst2 because it clobbers the EFLAGS of the predecessor. However, it also uses the EFLAGS of the predecessor, so it is safe to emit. SDep edges ensure that the right order happens already anyway. This fixes 2 testsuite crashes with the X86 patch I'm going to commit next. llvm-svn: 122211
* the result of CheckForLiveRegDef is dead, remove it.Chris Lattner2010-12-201-12/+8
| | | | llvm-svn: 122209
* Two sets of changes. Sorry they are intermingled.Evan Cheng2010-11-031-0/+8
| | | | | | | | | | | | | 1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to "optimize for latency". Call instructions don't have the right latency and this is more likely to use introduce spills. 2. Fix if-converter cost function. For ARM, it should use instruction latencies, not # of micro-ops since multi-latency instructions is completely executed even when the predicate is false. Also, some instruction will be "slower" when they are predicated due to the register def becoming implicit input. rdar://8598427 llvm-svn: 118135
* Avoiding overly aggressive latency scheduling. If the two nodes share anEvan Cheng2010-10-291-24/+69
| | | | | | | | | | | | | | | | | | | | | | | | | operand and one of them has a single use that is a live out copy, favor the one that is live out. Otherwise it will be difficult to eliminate the copy if the instruction is a loop induction variable update. e.g. BB: sub r1, r3, #1 str r0, [r2, r3] mov r3, r1 cmp bne BB => BB: str r0, [r2, r3] sub r3, r3, #1 cmp bne BB This fixed the recent 256.bzip2 regression. llvm-svn: 117675
* The "excess register pressure" returned by HighRegPressure() is not accurate ↵Evan Cheng2010-07-261-41/+20
| | | | | | enough to factor into scheduling priority. Eliminate it and add early exits to speed up scheduling. llvm-svn: 109449
* Pacify gcc-4.5 which wrongly thinks that RExcess (passed as the Excess ↵Duncan Sands2010-07-261-1/+2
| | | | | | | | parameter) may be used uninitialized in the callers of HighRegPressure. llvm-svn: 109393
* Add comments.Evan Cheng2010-07-251-4/+16
| | | | llvm-svn: 109383
* Fix crashes when scheduling a CopyToReg node -- getMachineOpcode asserts onBob Wilson2010-07-251-2/+2
| | | | | | those. Radar 8231572. llvm-svn: 109367
* Add an ILP scheduler. This is a register pressure aware scheduler that'sEvan Cheng2010-07-241-10/+72
| | | | | | | | | | | | appropriate for targets without detailed instruction iterineries. The scheduler schedules for increased instruction level parallelism in low register pressure situation; it schedules to reduce register pressure when the register pressure becomes high. On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2 by 16%. llvm-svn: 109300
* - Allow target to specify when is register pressure "too high". In most cases,Evan Cheng2010-07-231-56/+124
| | | | | | | | | | | | | it's too late to start backing off aggressive latency scheduling when most of the registers are in use so the threshold should be a bit tighter. - Correctly handle live out's and extract_subreg etc. - Enable register pressure aware scheduling by default for hybrid scheduler. For ARM, this is almost always a win on # of instructions. It's runtime neutral for most of the tests. But for some kernels with high register pressure it can be a huge win. e.g. 464.h264ref reduced number of spills by 54 and sped up by 20%. llvm-svn: 109279
* Re-apply r109079 with fix.Evan Cheng2010-07-221-28/+26
| | | | llvm-svn: 109083
* Revert r109079, which broke a lot of CodeGen tests.Owen Anderson2010-07-221-25/+27
| | | | llvm-svn: 109082
* Initialize RegLimit only when register pressure is being tracked.Evan Cheng2010-07-221-27/+25
| | | | llvm-svn: 109079
* More register pressure aware scheduling work.Evan Cheng2010-07-211-81/+84
| | | | llvm-svn: 109064
* Teach bottom up pre-ra scheduler to track register pressure. Work in progress.Evan Cheng2010-07-211-15/+229
| | | | llvm-svn: 108991
* Add a VT argument to getMinimalPhysRegClass and replace the copy related usesRafael Espindola2010-06-291-1/+1
| | | | | | | | | of getPhysicalRegisterRegClass with it. If we want to make a copy (or estimate its cost), it is better to use the smallest class as more efficient operations might be possible. llvm-svn: 107140
* Use `llvm::next' instead of `next' to make VC++ 2010 happy.Oscar Fuentes2010-05-301-1/+1
| | | | llvm-svn: 105168
* Fix some latency computation bugs: if the use is not a machine opcode do not ↵Evan Cheng2010-05-281-1/+12
| | | | | | just return zero. llvm-svn: 105061
* Eliminate the use of PriorityQueue and just use a std::vector,Dan Gohman2010-05-261-7/+18
| | | | | | | | implementing pop with a linear search for a "best" element. The priority queue was a neat idea, but in practice the comparison functions depend on dynamic information. llvm-svn: 104718
* Delete an unused function.Dan Gohman2010-05-261-2/+0
| | | | llvm-svn: 104716
* Change push_all to a non-virtual function and implement it in theDan Gohman2010-05-261-5/+0
| | | | | | base class, since all the implementations are the same. llvm-svn: 104659
* Rename -pre-RA-sched=hybrid to -pre-RA-sched=list-hybrid.Evan Cheng2010-05-211-1/+1
| | | | llvm-svn: 104306
* Allow targets more controls on what nodes are scheduled by reg pressure, ↵Evan Cheng2010-05-201-2/+4
| | | | | | what for latency in hybrid mode. llvm-svn: 104293
* Add a hybrid bottom up scheduler that reduce register usage while avoidingEvan Cheng2010-05-201-25/+92
| | | | | | | | | pipeline stall. It's useful for targets like ARM cortex-a8. NEON has a lot of long latency instructions so a strict register pressure reduction scheduler does not work well. Early experiments show this speeds up some NEON loops by over 30%. llvm-svn: 104216
* Three changes:Chris Lattner2010-04-071-5/+7
| | | | | | | | | | | | | | | 1. Introduce some enums and accessors in the InlineAsm class that eliminate a ton of magic numbers when handling inline asm SDNode. 2. Add a new MDNodeSDNode selection dag node type that holds a MDNode (shocking!) 3. Add a new argument to ISD::INLINEASM nodes that hold !srcloc metadata, propagating it to the instruction emitter, which drops it. No functionality change. llvm-svn: 100605
* move target-independent opcodes out of TargetInstrInfoChris Lattner2010-02-091-7/+7
| | | | | | | | | into TargetOpcodes.h. #include the new TargetOpcodes.h into MachineInstr. Add new inline accessors (like isPHI()) to MachineInstr, and start using them throughout the codebase. llvm-svn: 95687
* When the scheduler unfold a load folding instruction it move some of the ↵Evan Cheng2010-02-051-2/+10
| | | | | | | | predecessors to the unfolded load. It decides what gets moved to the load by checking whether the new load is using the predecessor as an operand. The check neglects the cases whether the predecessor is a flagged scheduling unit. rdar://7604000 llvm-svn: 95339
* Remove the '-disable-scheduling' flag and replace it with the 'source' option ofBill Wendling2010-01-231-14/+56
| | | | | | | | | the '-pre-RA-sched' flag. It actually makes more sense to do it this way. Also, keep track of the SDNode ordering by default. Eventually, we would like to make this ordering a way to break a "tie" in the scheduler. However, doing that now breaks the "CodeGen/X86/abi-isel.ll" test for 32-bit Linux. llvm-svn: 94308
* The previous code could potentially cause a cycle. Allow ordering w.r.t. a 0 ↵Bill Wendling2010-01-061-2/+2
| | | | | | order. llvm-svn: 92810
* Only check the ordering if there is an ordering for each nodes.Bill Wendling2010-01-061-2/+2
| | | | llvm-svn: 92807
* Add a semi-primitive form of scheduling via the "SDNode ordering" to theBill Wendling2010-01-051-0/+12
| | | | | | bottom-up scheduler. We prefer the lower order number. llvm-svn: 92806
* Change errs() to dbgs().David Greene2010-01-051-14/+14
| | | | llvm-svn: 92576
* Remove includes of Support/Compiler.h that are no longer needed after theNick Lewycky2009-10-251-1/+0
| | | | | | VISIBILITY_HIDDEN removal. llvm-svn: 85043
* Remove VISIBILITY_HIDDEN from class/struct found inside anonymous namespaces.Nick Lewycky2009-10-251-3/+2
| | | | | | | Chris claims we should never have visibility_hidden inside any .cpp file but that's still not true even after this commit. llvm-svn: 85042
OpenPOWER on IntegriCloud