bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Be careful about scheduling nodes above previous calls. It increases usage of	Evan Cheng	2011-04-26	1	-1/+42
\| \| \| \| \| \| \| \| \| \| \| \|	more callee-saved registers and introduces copies. Only allow it if scheduling a node above calls would end up lessening register pressure. Call operands also have added ABI restrictions for register allocation, so be extra careful with hoisting them above calls. rdar://9329627 llvm-svn: 130245
*	Fix typo	Evan Cheng	2011-04-26	1	-1/+1
\| \| \| \|	llvm-svn: 130190
*	In the pre-RA scheduler, maintain cmp+br proximity.	Andrew Trick	2011-04-14	1	-13/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is done by pushing physical register definitions close to their use, which happens to handle flag definitions if they're not glued to the branch. This seems to be generally a good thing though, so I didn't need to add a target hook yet. The primary motivation is to generate code closer to what people expect and rule out missed opportunity from enabling macro-op fusion. As a side benefit, we get several 2-5% gains on x86 benchmarks. There is one regression: SingleSource/Benchmarks/Shootout/lists slows down by 10%. But this is an independent scheduler bug that will be tracked separately. See rdar://problem/9283108. Incidentally, pre-RA scheduling is only half the solution. Fixing the later passes is tracked by: <rdar://problem/8932804> [pre-RA-sched] on x86, attempt to schedule CMP/TEST adjacent with condition jump Fixes: <rdar://problem/9262453> Scheduler unnecessary break of cmp/jump fusion llvm-svn: 129508
*	Recommit r129383. PreRA scheduler heuristic fixes: VRegCycle, TokenFactor	Andrew Trick	2011-04-13	1	-110/+176
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	latency. Additional fixes: Do something reasonable for subtargets with generic itineraries by handling node latency the same as for an empty itinerary. Now nodes default to unit latency unless an itinerary explicitly specifies a zero cycle stage or it is a TokenFactor chain. Original fixes: UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make the node latency adjustments work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible. llvm-svn: 129421
*	Revert 129383. It causes some targets to hit a scheduler assert.	Andrew Trick	2011-04-12	1	-177/+111
\| \| \| \|	llvm-svn: 129385
*	PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency.	Andrew Trick	2011-04-12	1	-111/+177
\| \| \| \| \| \| \| \| \| \| \| \|	UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make these heuristic adjustments to node latency work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible. llvm-svn: 129383
*	Added a check in the preRA scheduler for potential interference on a	Andrew Trick	2011-04-07	1	-4/+55
\| \| \| \| \| \| \| \| \|	induction variable. The preRA scheduler is unaware of induction vars, so we look for potential "virtual register cycles" instead. Fixes <rdar://problem/8946719> Bad scheduling prevents coalescing llvm-svn: 129100
*	Fix for -pre-RA-sched=source.	Andrew Trick	2011-03-25	1	-0/+2
\| \| \| \| \| \| \|	Yet another case of unchecked NULL node (for physreg copy). May fix PR9509. llvm-svn: 128266
*	Ensure that def-side physreg copies are scheduled above any other uses	Andrew Trick	2011-03-23	1	-0/+9
\| \| \| \| \| \| \| \| \|	so the scheduler can't create new interferences on the copies themselves. Prior to this fix the scheduler could get stuck in a loop creating copies. Fixes PR9509. llvm-svn: 128164
*	whitespace	Andrew Trick	2011-03-23	1	-2/+2
\| \| \| \|	llvm-svn: 128163
*	Grammar-o.	Eric Christopher	2011-03-21	1	-1/+1
\| \| \| \|	llvm-svn: 128004
*	Re-commit 127368 and 127371. They are exonerated.	Evan Cheng	2011-03-10	1	-5/+11
\| \| \| \|	llvm-svn: 127380
*	Revert 127368 and 127371 for now.	Evan Cheng	2011-03-09	1	-11/+5
\| \| \| \|	llvm-svn: 127376
*	Change the definition of TargetRegisterInfo::getCrossCopyRegClass to be more	Evan Cheng	2011-03-09	1	-5/+11
\| \| \| \| \| \| \| \| \| \| \| \| \|	flexible. If it returns a register class that's different from the input, then that's the register class used for cross-register class copies. If it returns a register class that's the same as the input, then no cross- register class copies are needed (normal copies would do). If it returns null, then it's not at all possible to copy registers of the specified register class. llvm-svn: 127368
*	Fix typo, make helper static.	Benjamin Kramer	2011-03-09	1	-3/+3
\| \| \| \|	llvm-svn: 127335
*	Fix some latent bugs if the nodes are unschedulable. We'd gotten away	Eric Christopher	2011-03-08	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \|	with this before since none of the register tracking or nightly tests had unschedulable nodes. This should probably be refixed with a special default Node that just returns some "don't touch me" values. Fixes PR9427 llvm-svn: 127263
*	Further improvements to pre-RA-sched=list-ilp.	Andrew Trick	2011-03-08	1	-17/+62
\| \| \| \| \| \| \|	This change uses the MaxReorderWindow for both height and depth, which tends to limit the negative effects of high register pressure. llvm-svn: 127203
*	Move getRegPressureLimit() from TargetLoweringInfo to TargetRegisterInfo.	Cameron Zwarich	2011-03-07	1	-1/+1
\| \| \| \|	llvm-svn: 127175
*	Typo.	Eric Christopher	2011-03-06	1	-1/+1
\| \| \| \|	llvm-svn: 127131
*	Disable a couple of experimental heuristics to get the best results from the	Andrew Trick	2011-03-06	1	-2/+2
\| \| \| \| \| \|	current implementation of -pre-RA-sched=list-ilp. llvm-svn: 127113
*	Be explicit with abs(). Visual Studio workaround.	Andrew Trick	2011-03-05	1	-4/+6
\| \| \| \|	llvm-svn: 127075
*	Missing comment.	Andrew Trick	2011-03-05	1	-0/+2
\| \| \| \|	llvm-svn: 127068
*	Increased the register pressure limit on x86_64 from 8 to 12	Andrew Trick	2011-03-05	1	-22/+143
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	regs. This is the only change in this checkin that may affect the default scheduler. With better register tracking and heuristics, it doesn't make sense to artificially lower the register limit so much. Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to give the scheduler a way to account for div and sqrt on targets that don't have an itinerary. It currently defaults to 10 (the actual number doesn't matter much), but only takes effect on non-default schedulers: list-hybrid and list-ilp. Added several heuristics that can be individually disabled for the non-default sched=list-ilp mode. This helps us determine how much better we can do on a given benchmark than the default scheduler. Certain compute intensive loops run much faster in this mode with the right set of heuristics, and it doesn't seem to have much negative impact elsewhere. Not all of the heuristics are needed, but we still need to experiment to decide which should be disabled by default for sched=list-ilp. llvm-svn: 127067
*	Minor pre-RA-sched fixes and cleanup.	Andrew Trick	2011-03-04	1	-7/+15
\| \| \| \| \| \| \| \|	Fix the PendingQueue, then disable it because it's not required for the current schedulers' heuristics. Fix the logic for the unused list-ilp scheduler. llvm-svn: 126981
*	Introducing a new method of tracking register pressure. We can't	Andrew Trick	2011-02-04	1	-111/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	precisely track pressure on a selection DAG, but we can at least keep it balanced. This design accounts for various interesting aspects of selection DAGs: register and subregister copies, glued nodes, dead nodes, unused registers, etc. Added SUnit::NumRegDefsLeft and ScheduleDAGSDNodes::RegDefIter. Note: I disabled PrescheduleNodesWithMultipleUses when register pressure is enabled, based on no evidence other than I don't think it makes sense to have both enabled. llvm-svn: 124853
*	Remove a temporary workaround for a lencod miscompile. Depends on the fix in	Andrew Trick	2011-01-27	1	-2/+0
\| \| \| \| \| \|	r124442. llvm-svn: 124443
*	Temporarily workaround JM/lencod miscompile (SIGSEGV).	Andrew Trick	2011-01-24	1	-0/+2
\| \| \| \| \| \|	rdar://problem/8893967 llvm-svn: 124137
*	Enable support for precise scheduling of the instruction selection	Andrew Trick	2011-01-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DAG. Disable using "-disable-sched-cycles". For ARM, this enables a framework for modeling the cpu pipeline and counting stalls. It also activates several heuristics to drive scheduling based on the model. Scheduling is inherently imprecise at this stage, and until spilling is improved it may defeat attempts to schedule. However, this framework provides greater control over tuning codegen. Although the flag is not target-specific, it should have very little effect on the default scheduler used by x86. The only two changes that affect x86 are: - scheduling a high-latency operation bumps the current cycle so independent operations can have their latency covered. i.e. two independent 4 cycle operations can produce results in 4 cycles, not 8 cycles. - Two operations with equal register pressure impact and no latency-based stalls on their uses will be prioritized by depth before height (height is irrelevant if no stalls occur in the schedule below this point). llvm-svn: 123971
*	Convert -enable-sched-cycles and -enable-sched-hazard to -disable	Andrew Trick	2011-01-21	1	-29/+31
\| \| \| \| \| \| \| \| \| \| \|	flags. They are still not enabled in this revision. Added TargetInstrInfo::isZeroCost() to fix a fundamental problem with the scheduler's model of operand latency in the selection DAG. Generalized unit tests to work with sched-cycles. llvm-svn: 123969
*	Selection DAG scheduler register pressure heuristic fixes.	Andrew Trick	2011-01-20	1	-8/+27
\| \| \| \| \| \| \| \|	Added a check for already live regs before claiming HighRegPressure. Fixed a few cases of checking the wrong number of successors. Added some tracing until these heuristics are better understood. llvm-svn: 123892
*	Support for precise scheduling of the instruction selection DAG,	Andrew Trick	2011-01-14	1	-537/+663
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	disabled in this checkin. Sorry for the large diffs due to refactoring. New functionality is all guarded by EnableSchedCycles. Scheduling the isel DAG is inherently imprecise, but we give it a best effort: - Added MayReduceRegPressure to allow stalled nodes in the queue only if there is a regpressure need. - Added BUHasStall to allow checking for either dependence stalls due to latency or resource stalls due to pipeline hazards. - Added BUCompareLatency to encapsulate and standardize the heuristics for minimizing stall cycles (vs. reducing register pressure). - Modified the bottom-up heuristic (now in BUCompareLatency) to prioritize nodes by their depth rather than height. As long as it doesn't stall, height is irrelevant. Depth represents the critical path to the DAG root. - Added hybrid_ls_rr_sort::isReady to filter stalled nodes before adding them to the available queue. Related Cleanup: most of the register reduction routines do not need to be templates. llvm-svn: 123468
*	Minor cleanup related to my latest scheduler changes.	Andrew Trick	2010-12-24	1	-3/+5
\| \| \| \|	llvm-svn: 122545
*	Fix a few cases where the scheduler is not checking for phys reg copies. The	Andrew Trick	2010-12-24	1	-3/+10
\| \| \| \| \| \|	scheduling node may have a NULL DAG node, yuck. llvm-svn: 122544
*	Various bits of framework needed for precise machine-level selection	Andrew Trick	2010-12-24	1	-60/+367
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DAG scheduling during isel. Most new functionality is currently guarded by -enable-sched-cycles and -enable-sched-hazard. Added InstrItineraryData::IssueWidth field, currently derived from ARM itineraries, but could be initialized differently on other targets. Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is active, and if so how many cycles of state it holds. Added SchedulingPriorityQueue::HasReadyFilter to allow gating entry into the scheduler's available queue. ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to get information about its SUnits, provides RecedeCycle for bottom-up scheduling, correctly computes scoreboard depth, tracks IssueCount, and considers potential stall cycles when checking for hazards. ScheduleDAGRRList now models machine cycles and hazards (under flags). It tracks MinAvailableCycle, drives the hazard recognizer and priority queue's ready filter, manages a new PendingQueue, properly accounts for stall cycles, etc. llvm-svn: 122541
*	flags -> glue for selectiondag	Chris Lattner	2010-12-23	1	-6/+6
\| \| \| \|	llvm-svn: 122509
*	Reorganize ListScheduleBottomUp in preparation for modeling machine cycles	Andrew Trick	2010-12-23	1	-130/+153
\| \| \| \| \| \|	and instruction issue. llvm-svn: 122491
*	Converted LiveRegCycles to LiveRegGens. It's easier to work with and allows	Andrew Trick	2010-12-23	1	-17/+18
\| \| \| \| \| \|	multiple nodes per cycle. llvm-svn: 122474
*	In CheckForLiveRegDef use TRI->getOverlaps.	Andrew Trick	2010-12-23	1	-6/+9
\| \| \| \|	llvm-svn: 122473
*	Fixes PR8823: add-with-overflow-128.ll	Andrew Trick	2010-12-23	1	-12/+33
\| \| \| \| \| \| \| \|	In the bottom-up selection DAG scheduling, handle two-address instructions that read/write unspillable registers. Treat the entire chain of two-address nodes as a single live range. llvm-svn: 122472
*	In DelayForLiveRegsBottomUp, handle instructions that read and write	Andrew Trick	2010-12-21	1	-15/+4
\| \| \| \| \| \| \|	the same physical register. Simplifies the fix from the previous checkin r122211. llvm-svn: 122370
*	whitespace	Andrew Trick	2010-12-21	1	-42/+42
\| \| \| \|	llvm-svn: 122368
*	rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for	Chris Lattner	2010-12-21	1	-5/+5
\| \| \| \| \| \| \|	something that just glues two nodes together, even if it is sometimes used for flags. llvm-svn: 122310
*	Fix a bug in the scheduler's handling of "unspillable" vregs.	Chris Lattner	2010-12-20	1	-1/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Imagine we see: EFLAGS = inst1; EFLAGS = inst2 FLAGS; gpr = inst3 EFLAGS. Previously, we would refuse to schedule inst2 because it clobbers the EFLAGS of the predecessor. However, it also uses the EFLAGS of the predecessor, so it is safe to emit. SDep edges ensure that the right order happens already anyway. This fixes 2 testsuite crashes with the X86 patch I'm going to commit next. llvm-svn: 122211
*	the result of CheckForLiveRegDef is dead, remove it.	Chris Lattner	2010-12-20	1	-12/+8
\| \| \| \|	llvm-svn: 122209
*	Two sets of changes. Sorry they are intermingled.	Evan Cheng	2010-11-03	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to "optimize for latency". Call instructions don't have the right latency and this is more likely to introduce spills. 2. Fix if-converter cost function. For ARM, it should use instruction latencies, not # of micro-ops since multi-latency instructions are completely executed even when the predicate is false. Also, some instructions will be "slower" when they are predicated due to the register def becoming implicit input. rdar://8598427 llvm-svn: 118135
*	Avoiding overly aggressive latency scheduling. If the two nodes share an	Evan Cheng	2010-10-29	1	-24/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	operand and one of them has a single use that is a live out copy, favor the one that is live out. Otherwise it will be difficult to eliminate the copy if the instruction is a loop induction variable update. e.g. BB: sub r1, r3, #1; str r0, [r2, r3]; mov r3, r1; cmp; bne BB => BB: str r0, [r2, r3]; sub r3, r3, #1; cmp; bne BB. This fixed the recent 256.bzip2 regression. llvm-svn: 117675
*	The "excess register pressure" returned by HighRegPressure() is not accurate	Evan Cheng	2010-07-26	1	-41/+20
\| \| \| \| \| \|	enough to factor into scheduling priority. Eliminate it and add early exits to speed up scheduling. llvm-svn: 109449
*	Pacify gcc-4.5 which wrongly thinks that RExcess (passed as the Excess	Duncan Sands	2010-07-26	1	-1/+2
\| \| \| \| \| \| \| \|	parameter) may be used uninitialized in the callers of HighRegPressure. llvm-svn: 109393
*	Add comments.	Evan Cheng	2010-07-25	1	-4/+16
\| \| \| \|	llvm-svn: 109383
*	Fix crashes when scheduling a CopyToReg node -- getMachineOpcode asserts on	Bob Wilson	2010-07-25	1	-2/+2
\| \| \| \| \| \|	those. Radar 8231572. llvm-svn: 109367
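
The r127368 entry above describes a three-way contract for the return value of TargetRegisterInfo::getCrossCopyRegClass. The following is a minimal sketch of how a caller might interpret that contract, not the LLVM implementation: `RegClass`, `CopyKind`, and `classifyCopy` are hypothetical names introduced here for illustration, and only the three outcomes themselves come from the commit message.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a register class; this is NOT LLVM's
// TargetRegisterClass, just enough to compare identities.
struct RegClass {
  std::string Name;
};

// The three outcomes described in the r127368 commit message.
enum class CopyKind {
  Normal,     // hook returned the input class: ordinary copies suffice
  CrossClass, // hook returned a different class: copy through that class
  Impossible  // hook returned null: registers of this class cannot be copied
};

// Interpret a (hypothetical) hook result for a given input class.
// HookResult plays the role of the value returned by the hook.
CopyKind classifyCopy(const RegClass *Input, const RegClass *HookResult) {
  assert(Input && "input register class must be non-null");
  if (HookResult == nullptr)
    return CopyKind::Impossible;
  if (HookResult == Input)
    return CopyKind::Normal;
  return CopyKind::CrossClass;
}
```

A scheduler-side caller would presumably bail out on `Impossible`, emit a plain copy on `Normal`, and route the value through the returned class on `CrossClass`; that dispatch is an assumption about usage, not something the log states.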