| author | Chandler Carruth <chandlerc@gmail.com> | 2014-01-27 11:12:14 +0000 |
|---|---|---|
| committer | Chandler Carruth <chandlerc@gmail.com> | 2014-01-27 11:12:14 +0000 |
| commit | 328998b2f7dc1dd9ceca111244c36fd377cce188 (patch) | |
| tree | 9a14c93a64a0bc09cb374f5f1c6267f32399bf9d | |
| parent | 629199ccb30ae73fb9325516bcbb056b3f40f413 (diff) | |
[vectorizer] Fix a trivial oversight where we always requested the
number of vector registers rather than toggling between vector and
scalar register number based on VF. I don't have a test case as
I spotted this by inspection and on X86 it only makes a difference if
your target is lacking SSE and thus has *no* vector registers.

If someone wants to add a test case for this for ARM or somewhere else
where this is more significant, that would be awesome.

Also made the variable name a bit more sensible while I'm here.

llvm-svn: 200211
| -rw-r--r-- | llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | 8 |
1 files changed, 4 insertions, 4 deletions
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 66134bd95d0..f904765f41e 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -4962,9 +4962,9 @@ LoopVectorizationCostModel::selectUnrollFactor(bool OptForSize,
   if (TC > 1 && TC < TinyTripCountUnrollThreshold)
     return 1;
 
-  unsigned TargetVectorRegisters = TTI.getNumberOfRegisters(true);
-  DEBUG(dbgs() << "LV: The target has " << TargetVectorRegisters <<
-        " vector registers\n");
+  unsigned TargetNumRegisters = TTI.getNumberOfRegisters(VF > 1);
+  DEBUG(dbgs() << "LV: The target has " << TargetNumRegisters <<
+        " registers\n");
 
   LoopVectorizationCostModel::RegisterUsage R = calculateRegisterUsage();
   // We divide by these constants so assume that we have at least one
@@ -4978,7 +4978,7 @@ LoopVectorizationCostModel::selectUnrollFactor(bool OptForSize,
   // Next, divide the remaining registers by the number of registers that is
   // required by the loop, in order to estimate how many parallel instances
   // fit without causing spills.
-  unsigned UF = (TargetVectorRegisters - R.LoopInvariantRegs) / R.MaxLocalUsers;
+  unsigned UF = (TargetNumRegisters - R.LoopInvariantRegs) / R.MaxLocalUsers;
 
   // Clamp the unroll factor ranges to reasonable factors.
   unsigned MaxUnrollSize = TTI.getMaximumUnrollFactor();
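To illustrate the effect of the change outside of LLVM, here is a minimal standalone sketch of the register-based unroll estimate. `MockTTI`, `RegisterUsageSketch`, and `estimateUnrollFactor` are hypothetical stand-ins, not the real `TargetTransformInfo` or `LoopVectorizationCostModel` types. The underflow guard is an assumption added for the sketch, and the later clamping steps in `selectUnrollFactor` are omitted.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for TargetTransformInfo (not the real LLVM class).
struct MockTTI {
  unsigned ScalarRegs;
  unsigned VectorRegs;
  // Mirrors the shape of TTI.getNumberOfRegisters(bool Vector) used in the diff.
  unsigned getNumberOfRegisters(bool Vector) const {
    return Vector ? VectorRegs : ScalarRegs;
  }
};

// Hypothetical stand-in for LoopVectorizationCostModel::RegisterUsage.
struct RegisterUsageSketch {
  unsigned LoopInvariantRegs;
  unsigned MaxLocalUsers;
};

// Simplified estimate of how many loop instances fit in the register file.
unsigned estimateUnrollFactor(const MockTTI &TTI, unsigned VF,
                              RegisterUsageSketch R) {
  assert(VF >= 1 && "vectorization factor must be at least 1");

  // The fix: query vector registers only when the loop is actually
  // vectorized (VF > 1); otherwise query scalar registers.
  unsigned TargetNumRegisters = TTI.getNumberOfRegisters(VF > 1);

  // Assume at least one register user so the division is well defined.
  R.MaxLocalUsers = std::max(R.MaxLocalUsers, 1u);

  // Assumption for this sketch only: avoid unsigned underflow when the
  // target reports fewer registers than the loop invariants need.
  if (TargetNumRegisters <= R.LoopInvariantRegs)
    return 1;

  unsigned UF = (TargetNumRegisters - R.LoopInvariantRegs) / R.MaxLocalUsers;
  return std::max(UF, 1u);
}
```

With illustrative numbers for a target that has 8 scalar registers and no vector registers, a scalar loop (`VF == 1`) with one loop-invariant register and two local users gets an estimate of 3. Before the fix, the same loop would have been sized against the zero-entry vector register file.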

