summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/MachineInstrBundle.cpp
diff options
context:
space:
mode:
authorJF Bastien <jfb@google.com>2015-08-10 20:59:36 +0000
committerJF Bastien <jfb@google.com>2015-08-10 20:59:36 +0000
commitfa9746dc8d868df616e1068363d224f25c3ff01a (patch)
tree3bdc2dbe682aca2d45deda740ad2aef15b2ad6ae /llvm/lib/CodeGen/MachineInstrBundle.cpp
parent00b6f749354e9dab19fb5eac66cbfb5f6a67b3b6 (diff)
downloadbcm5719-llvm-fa9746dc8d868df616e1068363d224f25c3ff01a.tar.gz
bcm5719-llvm-fa9746dc8d868df616e1068363d224f25c3ff01a.zip
x86: Emit LAHF/SAHF instead of PUSHF/POPF
NaCl's sandbox doesn't allow PUSHF/POPF out of security concerns (priviledged emulators have forgotten to mask system bits in the past, and EFLAGS's DF bit is a constant source of hilarity). Commit r220529 fixed PR20376 by saving cmpxchg's flags result using EFLAGS, this commit now generated LAHF/SAHF instead, for all of x86 (not just NaCl) because it leads to an overall performance gain over PUSHF/POPF. As with the previous patch this code generation is pretty bad because it occurs very later, after register allocation, and in many cases it rematerializes flags which were already available (e.g. already in a register through SETE). Fortunately it's somewhat rare that this code needs to fire. I did [[ https://github.com/jfbastien/benchmark-x86-flags | a bit of benchmarking ]], the results on an Intel Haswell E5-2690 CPU at 2.9GHz are: | Time per call (ms) | Runtime (ms) | Benchmark | | 0.000012514 | 6257 | sete.i386 | | 0.000012810 | 6405 | sete.i386-fast | | 0.000010456 | 5228 | sete.x86-64 | | 0.000010496 | 5248 | sete.x86-64-fast | | 0.000012906 | 6453 | lahf-sahf.i386 | | 0.000013236 | 6618 | lahf-sahf.i386-fast | | 0.000010580 | 5290 | lahf-sahf.x86-64 | | 0.000010304 | 5152 | lahf-sahf.x86-64-fast | | 0.000028056 | 14028 | pushf-popf.i386 | | 0.000027160 | 13580 | pushf-popf.i386-fast | | 0.000023810 | 11905 | pushf-popf.x86-64 | | 0.000026468 | 13234 | pushf-popf.x86-64-fast | Clearly `PUSHF`/`POPF` are suboptimal. It doesn't really seems to be worth teaching LLVM about individual flags, at least not for this purpose. Reviewers: rnk, jvoung, t.p.northover Subscribers: llvm-commits Differential revision: http://reviews.llvm.org/D6629 llvm-svn: 244503
Diffstat (limited to 'llvm/lib/CodeGen/MachineInstrBundle.cpp')
-rw-r--r--llvm/lib/CodeGen/MachineInstrBundle.cpp2
1 files changed, 1 insertions, 1 deletions
diff --git a/llvm/lib/CodeGen/MachineInstrBundle.cpp b/llvm/lib/CodeGen/MachineInstrBundle.cpp
index cd820ee1ac5..f6e45a4b7c7 100644
--- a/llvm/lib/CodeGen/MachineInstrBundle.cpp
+++ b/llvm/lib/CodeGen/MachineInstrBundle.cpp
@@ -310,7 +310,7 @@ MachineOperandIteratorBase::analyzePhysReg(unsigned Reg,
if (!MOReg || !TargetRegisterInfo::isPhysicalRegister(MOReg))
continue;
- bool IsRegOrSuperReg = MOReg == Reg || TRI->isSubRegister(MOReg, Reg);
+ bool IsRegOrSuperReg = MOReg == Reg || TRI->isSuperRegister(MOReg, Reg);
bool IsRegOrOverlapping = MOReg == Reg || TRI->regsOverlap(MOReg, Reg);
if (IsRegOrSuperReg && MO.readsReg()) {
OpenPOWER on IntegriCloud