summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
diff options
context:
space:
mode:
authorAndrew Trick <atrick@apple.com>2011-04-14 05:15:06 +0000
committerAndrew Trick <atrick@apple.com>2011-04-14 05:15:06 +0000
commitbfbd972b1f7f1584bfa2c22f820fcc45467c5626 (patch)
tree2fc97e9822b86b59d65529cfae68ef5fb59f88b7 /llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
parentd175c98d41e11ad5b6d03e44101f7de217e2b32d (diff)
downloadbcm5719-llvm-bfbd972b1f7f1584bfa2c22f820fcc45467c5626.tar.gz
bcm5719-llvm-bfbd972b1f7f1584bfa2c22f820fcc45467c5626.zip
In the pre-RA scheduler, maintain cmp+br proximity.
This is done by pushing physical register definitions close to their use, which happens to handle flag definitions if they're not glued to the branch. This seems to be generally a good thing though, so I didn't need to add a target hook yet. The primary motivation is to generate code closer to what people expect and rule out missed opportunity from enabling macro-op fusion. As a side benefit, we get several 2-5% gains on x86 benchmarks. There is one regression: SingleSource/Benchmarks/Shootout/lists slows down be -10%. But this is an independent scheduler bug that will be tracked separately. See rdar://problem/9283108. Incidentally, pre-RA scheduling is only half the solution. Fixing the later passes is tracked by: <rdar://problem/8932804> [pre-RA-sched] on x86, attempt to schedule CMP/TEST adjacent with condition jump Fixes: <rdar://problem/9262453> Scheduler unnecessary break of cmp/jump fusion llvm-svn: 129508
Diffstat (limited to 'llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp')
-rw-r--r--llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp8
1 files changed, 8 insertions, 0 deletions
diff --git a/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp b/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
index 078533be843..b258e6eefe2 100644
--- a/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
@@ -87,6 +87,8 @@ SUnit *ScheduleDAGSDNodes::Clone(SUnit *Old) {
SU->isCommutable = Old->isCommutable;
SU->hasPhysRegDefs = Old->hasPhysRegDefs;
SU->hasPhysRegClobbers = Old->hasPhysRegClobbers;
+ SU->isScheduleHigh = Old->isScheduleHigh;
+ SU->isScheduleLow = Old->isScheduleLow;
SU->SchedulingPref = Old->SchedulingPref;
Old->isCloned = true;
return SU;
@@ -335,6 +337,12 @@ void ScheduleDAGSDNodes::BuildSchedUnits() {
if (!HasGlueUse) break;
}
+ // Schedule zero-latency TokenFactor below any nodes that may increase the
+ // schedule height. Otherwise, ancestors of the TokenFactor may appear to
+ // have false stalls.
+ if (NI->getOpcode() == ISD::TokenFactor)
+ NodeSUnit->isScheduleLow = true;
+
// If there are glue operands involved, N is now the bottom-most node
// of the sequence of nodes that are glued together.
// Update the SUnit.
OpenPOWER on IntegriCloud