path: root/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
Commit message (Author, Age, Files, Lines)
...
* Speculatively disable Dan's commits 143177 and 143179 to see if (Duncan Sands, 2011-10-28, 1 file, -134/+0)
  it fixes the dragonegg self-host (it looks like gcc is miscompiled).
  Original commit messages:
  Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier.
  Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively.
  Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler.
  Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future.
  This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management.
  Delete #if 0 code accidentally left in.
  llvm-svn: 143188
* Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW (Dan Gohman, 2011-10-28, 1 file, -0/+134)
  on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier.
  Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively.
  Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler.
  Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future.
  This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management.
  llvm-svn: 143177
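As a rough illustration of the last technical point in the entry above (modelling callseq_begin/end pairs as a physical register definition/use so that calling sequences cannot overlap), the sketch below checks a linear schedule against that rule. It is a toy model and not LLVM code: `Op`, `hasInvalidCallSeqs`, and the single `Live` flag standing in for the physical register are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical node kinds; real SelectionDAG nodes carry far more state.
enum class Op { CallSeqBegin, CallSeqEnd, Other };

// Treat the calling sequence as one pseudo physical register: CallSeqBegin
// defines it and CallSeqEnd is its last use. A second definition while the
// register is still live means two calling sequences overlap.
bool hasInvalidCallSeqs(const std::vector<Op> &Order) {
  bool Live = false; // is the pseudo-register currently live?
  for (Op O : Order) {
    if (O == Op::CallSeqBegin) {
      if (Live)
        return true; // begin inside an already open sequence
      Live = true;
    } else if (O == Op::CallSeqEnd) {
      if (!Live)
        return true; // end without a matching begin
      Live = false;
    }
  }
  return Live; // an unterminated sequence is reported as invalid too
}
```

Under this model, two back-to-back sequences are accepted, while a begin that appears before the previous end is rejected, which is the same kind of conflict a scheduler sees when two definitions of one physical register would be live at once.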
* Change this overloaded use of Sched::Latency to be an overloaded (Dan Gohman, 2011-10-24, 1 file, -4/+4)
  use of Sched::ILP instead, as Sched::Latency is going away.
  llvm-svn: 142813
* Remove a now dead function, fixing -Wunused-function warnings from (Chandler Carruth, 2011-10-21, 1 file, -20/+0)
  Clang.
  llvm-svn: 142631
* Delete the list-tdrr scheduler. Top-down schedulers are going away (Dan Gohman, 2011-10-20, 1 file, -203/+11)
  because they don't support physical register dependencies.
  llvm-svn: 142620
* PreRA scheduler should avoid cloning compares. (Andrew Trick, 2011-09-01, 1 file, -1/+35)
  Added canClobberReachingPhysRegUse() to handle a particular pattern in which a two-address instruction could be forced to interfere with EFLAGS, causing a compare to be unnecessarily cloned.
  Fixes rdar://problem/5875261
  llvm-svn: 138924
* - Rename TargetInstrDesc, TargetOperandInfo to MCInstrDesc and MCOperandInfo and (Evan Cheng, 2011-06-28, 1 file, -20/+20)
  sink them into MC layer.
  - Added MCInstrInfo, which captures the tablegen generated static data. Change TargetInstrInfo so it's based off MCInstrInfo.
  llvm-svn: 134021
* More refactoring. Move getRegClass from TargetOperandInfo to TargetInstrInfo. (Evan Cheng, 2011-06-27, 1 file, -1/+1)
  llvm-svn: 133944
* pre-RA-sched: Cleanup register pressure tracking. (Andrew Trick, 2011-06-27, 1 file, -7/+3)
  Removed the check that peeks past EXTRA_SUBREG, which I don't think makes sense any more. Instead treat it as a normal register def. No significant effect on x86 or ARM benchmarks.
  llvm-svn: 133917
* Distinguish early clobber output operands from clobbered registers. (Jakob Stoklund Olesen, 2011-06-27, 1 file, -1/+2)
  Both become <earlyclobber> defs on the INLINEASM MachineInstr, but we now use two different asm operand kinds.
  The new Kind_Clobber is treated identically to the old Kind_RegDefEarlyClobber for now, but x87 floating point stack inline assembly does care about the difference.
  This will pop a register off the stack:
    asm("fstp %st" : : "t"(x) : "st");
  While this will pop the input and push an output:
    asm("fst %st" : "=&t"(r) : "t"(x));
  We need to know if ST0 was a clobber or an output operand, and we can't depend on <dead> flags for that.
  llvm-svn: 133902
* Fix some trailing issues from my introduction of MVT::untyped and its use (Owen Anderson, 2011-06-21, 1 file, -1/+11)
  for REGISTER_SEQUENCE.
  llvm-svn: 133567
* Remove unused but set variables. (Benjamin Kramer, 2011-06-18, 1 file, -2/+0)
  llvm-svn: 133347
* Add a new MVT::untyped. This will be used in future work for modelling ISA (Owen Anderson, 2011-06-15, 1 file, -9/+39)
  features like register pairs and lists with "interesting" constraints (such as ARM NEON contiguous register lists or even-odd paired registers). We need to be able to generate these instructions (often from intrinsics), but don't want to have to assign a legal type to them. Instead, we'll use an "untyped" edge to bypass the type-checking and simply ensure that the register classes match.
  llvm-svn: 133106
* Added -stress-sched flag in the Asserts build. (Andrew Trick, 2011-06-15, 1 file, -14/+42)
  Added a test case for handling physreg aliases during pre-RA-sched.
  llvm-svn: 133063
* Remove a temporary test case probe in CheckForLiveRegDef. (Andrew Trick, 2011-06-08, 1 file, -1/+0)
  llvm-svn: 132751
* Fix a merge bug in preRAsched for handling physreg aliases. (Andrew Trick, 2011-06-07, 1 file, -4/+6)
  I've been sitting on this long enough trying to find a test case. I think the fix should go in now, but I'll keep working on the test case.
  llvm-svn: 132701
* Be careful about scheduling nodes above previous calls. It increases usage of (Evan Cheng, 2011-04-26, 1 file, -1/+42)
  more callee-saved registers and introduces copies. Only allow it if scheduling a node above calls would end up lessening register pressure.
  Call operands also have added ABI restrictions for register allocation, so be extra careful with hoisting them above calls.
  rdar://9329627
  llvm-svn: 130245
* Fix typo (Evan Cheng, 2011-04-26, 1 file, -1/+1)
  llvm-svn: 130190
* In the pre-RA scheduler, maintain cmp+br proximity. (Andrew Trick, 2011-04-14, 1 file, -13/+53)
  This is done by pushing physical register definitions close to their use, which happens to handle flag definitions if they're not glued to the branch. This seems to be generally a good thing though, so I didn't need to add a target hook yet.
  The primary motivation is to generate code closer to what people expect and rule out missed opportunity from enabling macro-op fusion. As a side benefit, we get several 2-5% gains on x86 benchmarks. There is one regression: SingleSource/Benchmarks/Shootout/lists slows down by 10%. But this is an independent scheduler bug that will be tracked separately. See rdar://problem/9283108.
  Incidentally, pre-RA scheduling is only half the solution. Fixing the later passes is tracked by: <rdar://problem/8932804> [pre-RA-sched] on x86, attempt to schedule CMP/TEST adjacent with condition jump
  Fixes: <rdar://problem/9262453> Scheduler unnecessary break of cmp/jump fusion
  llvm-svn: 129508
* Recommit r129383. PreRA scheduler heuristic fixes: VRegCycle, TokenFactor (Andrew Trick, 2011-04-13, 1 file, -110/+176)
  latency.
  Additional fixes: Do something reasonable for subtargets with generic itineraries by handling node latency the same as for an empty itinerary. Now nodes default to unit latency unless an itinerary explicitly specifies a zero cycle stage or it is a TokenFactor chain.
  Original fixes: UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make the node latency adjustments work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible.
  llvm-svn: 129421
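The default-latency rule stated in the entry above (unit latency unless an itinerary explicitly says otherwise or the node is a TokenFactor) might be sketched as follows. This is only an illustration of the rule as worded in the commit message: `NodeInfo` and `defaultNodeLatency` are hypothetical names, and the real scheduler derives this information from SDNodes and InstrItineraryData rather than from a plain struct.

```cpp
#include <cassert>

// Hypothetical summary of what the scheduler knows about one node.
struct NodeInfo {
  bool IsTokenFactor;        // node is a TokenFactor chain node
  bool HasItineraryLatency;  // an itinerary explicitly specifies a latency
  unsigned ItineraryLatency; // only meaningful when HasItineraryLatency is set
};

// Assumed reading of the commit message: TokenFactor gets zero latency,
// an explicit itinerary value (including zero) is honoured, and everything
// else falls back to unit latency.
unsigned defaultNodeLatency(const NodeInfo &N) {
  if (N.IsTokenFactor)
    return 0;
  if (N.HasItineraryLatency)
    return N.ItineraryLatency;
  return 1;
}
```

The point of the fallback is that a subtarget with a generic or empty itinerary still gets a usable, uniform latency instead of an arbitrary one.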
* Revert 129383. It causes some targets to hit a scheduler assert. (Andrew Trick, 2011-04-12, 1 file, -177/+111)
  llvm-svn: 129385
* PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency. (Andrew Trick, 2011-04-12, 1 file, -111/+177)
  UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make these heuristic adjustments to node latency work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible.
  llvm-svn: 129383
* Added a check in the preRA scheduler for potential interference on an (Andrew Trick, 2011-04-07, 1 file, -4/+55)
  induction variable. The preRA scheduler is unaware of induction vars, so we look for potential "virtual register cycles" instead.
  Fixes <rdar://problem/8946719> Bad scheduling prevents coalescing
  llvm-svn: 129100
* Fix for -pre-RA-sched=source. (Andrew Trick, 2011-03-25, 1 file, -0/+2)
  Yet another case of unchecked NULL node (for physreg copy). May fix PR9509.
  llvm-svn: 128266
* Ensure that def-side physreg copies are scheduled above any other uses (Andrew Trick, 2011-03-23, 1 file, -0/+9)
  so the scheduler can't create new interferences on the copies themselves. Prior to this fix the scheduler could get stuck in a loop creating copies.
  Fixes PR9509.
  llvm-svn: 128164
* whitespace (Andrew Trick, 2011-03-23, 1 file, -2/+2)
  llvm-svn: 128163
* Grammar-o. (Eric Christopher, 2011-03-21, 1 file, -1/+1)
  llvm-svn: 128004
* Re-commit 127368 and 127371. They are exonerated. (Evan Cheng, 2011-03-10, 1 file, -5/+11)
  llvm-svn: 127380
* Revert 127368 and 127371 for now. (Evan Cheng, 2011-03-09, 1 file, -11/+5)
  llvm-svn: 127376
* Change the definition of TargetRegisterInfo::getCrossCopyRegClass to be more (Evan Cheng, 2011-03-09, 1 file, -5/+11)
  flexible. If it returns a register class that's different from the input, then that's the register class used for cross-register class copies. If it returns a register class that's the same as the input, then no cross-register class copies are needed (normal copies would do). If it returns null, then it's not at all possible to copy registers of the specified register class.
  llvm-svn: 127368
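The three-way contract described in the entry above can be summarised in a few lines. The sketch below is a stand-alone illustration, not the LLVM API: `RegClass`, `CopyKind`, and `classifyCopy` are hypothetical names, and `Returned` stands for whatever a target's getCrossCopyRegClass hook would hand back for `Input`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a register class; not LLVM's TargetRegisterClass.
struct RegClass {
  std::string Name;
};

enum class CopyKind { Impossible, Normal, CrossClass };

// Interpret the hook's result as the commit message describes it:
//   null            -> registers of this class cannot be copied at all
//   same as input   -> a normal copy is enough
//   different class -> copy through the returned class
CopyKind classifyCopy(const RegClass *Input, const RegClass *Returned) {
  assert(Input && "input register class must be non-null");
  if (Returned == nullptr)
    return CopyKind::Impossible;
  if (Returned == Input)
    return CopyKind::Normal;
  return CopyKind::CrossClass;
}
```

Comparing by pointer identity is an assumption made for this sketch; it mirrors the wording "the same as the input" but says nothing about how a real target compares classes.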
* Fix typo, make helper static. (Benjamin Kramer, 2011-03-09, 1 file, -3/+3)
  llvm-svn: 127335
* Fix some latent bugs if the nodes are unschedulable. We'd gotten away (Eric Christopher, 2011-03-08, 1 file, -1/+6)
  with this before since none of the register tracking or nightly tests had unschedulable nodes.
  This should probably be refixed with a special default Node that just returns some "don't touch me" values.
  Fixes PR9427
  llvm-svn: 127263
* Further improvements to pre-RA-sched=list-ilp. (Andrew Trick, 2011-03-08, 1 file, -17/+62)
  This change uses the MaxReorderWindow for both height and depth, which tends to limit the negative effects of high register pressure.
  llvm-svn: 127203
* Move getRegPressureLimit() from TargetLoweringInfo to TargetRegisterInfo. (Cameron Zwarich, 2011-03-07, 1 file, -1/+1)
  llvm-svn: 127175
* Typo. (Eric Christopher, 2011-03-06, 1 file, -1/+1)
  llvm-svn: 127131
* Disable a couple of experimental heuristics to get the best results from the (Andrew Trick, 2011-03-06, 1 file, -2/+2)
  current implementation of -pre-RA-sched=list-ilp.
  llvm-svn: 127113
* Be explicit with abs(). Visual Studio workaround. (Andrew Trick, 2011-03-05, 1 file, -4/+6)
  llvm-svn: 127075
* Missing comment. (Andrew Trick, 2011-03-05, 1 file, -0/+2)
  llvm-svn: 127068
* Increased the register pressure limit on x86_64 from 8 to 12 (Andrew Trick, 2011-03-05, 1 file, -22/+143)
  regs. This is the only change in this checkin that may affect the default scheduler. With better register tracking and heuristics, it doesn't make sense to artificially lower the register limit so much.
  Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to give the scheduler a way to account for div and sqrt on targets that don't have an itinerary. It currently defaults to 10 (the actual number doesn't matter much), but only takes effect on non-default schedulers: list-hybrid and list-ilp.
  Added several heuristics that can be individually disabled for the non-default sched=list-ilp mode. This helps us determine how much better we can do on a given benchmark than the default scheduler. Certain compute intensive loops run much faster in this mode with the right set of heuristics, and it doesn't seem to have much negative impact elsewhere. Not all of the heuristics are needed, but we still need to experiment to decide which should be disabled by default for sched=list-ilp.
  llvm-svn: 127067
* Minor pre-RA-sched fixes and cleanup. (Andrew Trick, 2011-03-04, 1 file, -7/+15)
  Fix the PendingQueue, then disable it because it's not required for the current schedulers' heuristics. Fix the logic for the unused list-ilp scheduler.
  llvm-svn: 126981
* Introducing a new method of tracking register pressure. We can't (Andrew Trick, 2011-02-04, 1 file, -111/+62)
  precisely track pressure on a selection DAG, but we can at least keep it balanced. This design accounts for various interesting aspects of selection DAGs: register and subregister copies, glued nodes, dead nodes, unused registers, etc.
  Added SUnit::NumRegDefsLeft and ScheduleDAGSDNodes::RegDefIter.
  Note: I disabled PrescheduleNodesWithMultipleUses when register pressure is enabled, based on no evidence other than I don't think it makes sense to have both enabled.
  llvm-svn: 124853
* Remove a temporary workaround for a lencod miscompile. Depends on the fix in (Andrew Trick, 2011-01-27, 1 file, -2/+0)
  r124442.
  llvm-svn: 124443
* Temporarily workaround JM/lencod miscompile (SIGSEGV). (Andrew Trick, 2011-01-24, 1 file, -0/+2)
  rdar://problem/8893967
  llvm-svn: 124137
* Enable support for precise scheduling of the instruction selection (Andrew Trick, 2011-01-21, 1 file, -1/+1)
  DAG. Disable using "-disable-sched-cycles".
  For ARM, this enables a framework for modeling the cpu pipeline and counting stalls. It also activates several heuristics to drive scheduling based on the model. Scheduling is inherently imprecise at this stage, and until spilling is improved it may defeat attempts to schedule. However, this framework provides greater control over tuning codegen.
  Although the flag is not target-specific, it should have very little effect on the default scheduler used by x86. The only two changes that affect x86 are:
  - scheduling a high-latency operation bumps the current cycle so independent operations can have their latency covered. i.e. two independent 4 cycle operations can produce results in 4 cycles, not 8 cycles.
  - Two operations with equal register pressure impact and no latency-based stalls on their uses will be prioritized by depth before height (height is irrelevant if no stalls occur in the schedule below this point).
  llvm-svn: 123971
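The arithmetic behind the "4 cycles, not 8 cycles" remark in the entry above can be spelled out with a toy cost model. This is a sketch under a strong simplifying assumption that is not taken from the commit: independent operations may all issue in the same cycle, so covered latency is the maximum rather than the sum. `serializedCycles` and `overlappedCycles` are hypothetical helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Cost if each operation waits for the previous one to finish:
// latencies simply add up.
unsigned serializedCycles(const std::vector<unsigned> &Latencies) {
  return std::accumulate(Latencies.begin(), Latencies.end(), 0u);
}

// Cost if independent operations overlap completely (assumes no issue-width
// or resource limit): the longest latency covers all the others.
unsigned overlappedCycles(const std::vector<unsigned> &Latencies) {
  if (Latencies.empty())
    return 0;
  return *std::max_element(Latencies.begin(), Latencies.end());
}
```

For two independent operations of latency 4, `serializedCycles` gives 8 and `overlappedCycles` gives 4, which matches the figures quoted in the commit message.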
* Convert -enable-sched-cycles and -enable-sched-hazard to -disable (Andrew Trick, 2011-01-21, 1 file, -29/+31)
  flags. They are still not enabled in this revision.
  Added TargetInstrInfo::isZeroCost() to fix a fundamental problem with the scheduler's model of operand latency in the selection DAG.
  Generalized unit tests to work with sched-cycles.
  llvm-svn: 123969
* Selection DAG scheduler register pressure heuristic fixes. (Andrew Trick, 2011-01-20, 1 file, -8/+27)
  Added a check for already live regs before claiming HighRegPressure. Fixed a few cases of checking the wrong number of successors. Added some tracing until these heuristics are better understood.
  llvm-svn: 123892
* Support for precise scheduling of the instruction selection DAG, (Andrew Trick, 2011-01-14, 1 file, -537/+663)
  disabled in this checkin. Sorry for the large diffs due to refactoring. New functionality is all guarded by EnableSchedCycles.
  Scheduling the isel DAG is inherently imprecise, but we give it a best effort:
  - Added MayReduceRegPressure to allow stalled nodes in the queue only if there is a regpressure need.
  - Added BUHasStall to allow checking for either dependence stalls due to latency or resource stalls due to pipeline hazards.
  - Added BUCompareLatency to encapsulate and standardize the heuristics for minimizing stall cycles (vs. reducing register pressure).
  - Modified the bottom-up heuristic (now in BUCompareLatency) to prioritize nodes by their depth rather than height. As long as it doesn't stall, height is irrelevant. Depth represents the critical path to the DAG root.
  - Added hybrid_ls_rr_sort::isReady to filter stalled nodes before adding them to the available queue.
  Related Cleanup: most of the register reduction routines do not need to be templates.
  llvm-svn: 123468
* Minor cleanup related to my latest scheduler changes. (Andrew Trick, 2010-12-24, 1 file, -3/+5)
  llvm-svn: 122545
* Fix a few cases where the scheduler is not checking for phys reg copies. The (Andrew Trick, 2010-12-24, 1 file, -3/+10)
  scheduling node may have a NULL DAG node, yuck.
  llvm-svn: 122544
* Various bits of framework needed for precise machine-level selection (Andrew Trick, 2010-12-24, 1 file, -60/+367)
  DAG scheduling during isel. Most new functionality is currently guarded by -enable-sched-cycles and -enable-sched-hazard.
  Added InstrItineraryData::IssueWidth field, currently derived from ARM itineraries, but could be initialized differently on other targets.
  Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is active, and if so how many cycles of state it holds.
  Added SchedulingPriorityQueue::HasReadyFilter to allow gating entry into the scheduler's available queue.
  ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to get information about its SUnits, provides RecedeCycle for bottom-up scheduling, correctly computes scoreboard depth, tracks IssueCount, and considers potential stall cycles when checking for hazards.
  ScheduleDAGRRList now models machine cycles and hazards (under flags). It tracks MinAvailableCycle, drives the hazard recognizer and priority queue's ready filter, manages a new PendingQueue, properly accounts for stall cycles, etc.
  llvm-svn: 122541