| author | Simon Pilgrim <llvm-dev@redking.me.uk> | 2018-01-29 21:24:31 +0000 |
|---|---|---|
| committer | Simon Pilgrim <llvm-dev@redking.me.uk> | 2018-01-29 21:24:31 +0000 |
| commit | 02bdac53e75aa3bff67d3320d29597c1188d641b (patch) | |
| tree | bac3bbc9f526876f5e57e0c513f376ce14d7ad99 /llvm/lib/Target/X86/MCTargetDesc | |
| parent | 08464524c34daa350ba4eaafd6231ddc1c3edee0 (diff) | |
[X86] Emit 11-byte or 15-byte NOPs on recent AMD targets, else default to 10-byte NOPs (PR22965)
We currently emit up to 15-byte NOPs on all targets (apart from Silvermont), which hurts performance on targets whose decoders struggle with 2 or 3 (or more) '66' prefixes.
This patch flags recent AMD targets (btver1/znver1) to keep emitting 15-byte NOPs and bdver* targets to emit 11-byte NOPs. All other targets now emit 10-byte NOPs, apart from Silvermont CPUs, which still emit 7-byte NOPs.
Differential Revision: https://reviews.llvm.org/D42616
llvm-svn: 323693
Diffstat (limited to 'llvm/lib/Target/X86/MCTargetDesc')
| -rw-r--r-- | llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp | 16 |
1 files changed, 12 insertions, 4 deletions
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
index 3e68120041c..da4ca665d83 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
@@ -344,10 +344,18 @@ bool X86AsmBackend::writeNopData(uint64_t Count, MCObjectWriter *OW) const {
     return true;
   }
 
-  uint64_t MaxNopLength = STI.getFeatureBits()[X86::ProcIntelSLM] ? 7 : 15;
-
-  // 15 is the longest single nop instruction. Emit as many 15-byte nops as
-  // needed, then emit a nop of the remaining length.
+  // 15-bytes is the longest single NOP instruction, but 10-bytes is
+  // commonly the longest that can be efficiently decoded.
+  uint64_t MaxNopLength = 10;
+  if (STI.getFeatureBits()[X86::ProcIntelSLM])
+    MaxNopLength = 7;
+  else if (STI.getFeatureBits()[X86::FeatureFast15ByteNOP])
+    MaxNopLength = 15;
+  else if (STI.getFeatureBits()[X86::FeatureFast11ByteNOP])
+    MaxNopLength = 11;
+
+  // Emit as many MaxNopLength NOPs as needed, then emit a NOP of the remaining
+  // length.
   do {
     const uint8_t ThisNopLength = (uint8_t) std::min(Count, MaxNopLength);
     const uint8_t Prefixes = ThisNopLength <= 10 ? 0 : ThisNopLength - 10;
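As a rough illustration of what the new selection logic does to a run of padding, the standalone sketch below mirrors the if/else chain and the chunking loop outside of LLVM. It is not the backend code: `NopTuning`, `maxNopLength`, `splitNops` and `prefixCount` are hypothetical names invented for this example, and the three booleans merely stand in for the `X86::ProcIntelSLM`, `X86::FeatureFast15ByteNOP` and `X86::FeatureFast11ByteNOP` feature bits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the subtarget feature bits tested in the patch.
struct NopTuning {
  bool IsSilvermont = false;  // stands in for X86::ProcIntelSLM
  bool Fast15ByteNop = false; // stands in for X86::FeatureFast15ByteNOP
  bool Fast11ByteNop = false; // stands in for X86::FeatureFast11ByteNOP
};

// Same priority order as the patch: Silvermont first, then the 15-byte
// flag, then the 11-byte flag, otherwise the new 10-byte default.
uint64_t maxNopLength(const NopTuning &T) {
  if (T.IsSilvermont)
    return 7;
  if (T.Fast15ByteNop)
    return 15;
  if (T.Fast11ByteNop)
    return 11;
  return 10;
}

// Split Count bytes of padding into individual NOP lengths: as many
// MaxLen-byte NOPs as fit, then one NOP covering the remainder.
std::vector<uint8_t> splitNops(uint64_t Count, uint64_t MaxLen) {
  assert(MaxLen > 0 && MaxLen <= 15 && "x86 instructions are at most 15 bytes");
  std::vector<uint8_t> Lengths;
  while (Count != 0) {
    const uint8_t Len = static_cast<uint8_t>(std::min(Count, MaxLen));
    Lengths.push_back(Len);
    Count -= Len;
  }
  return Lengths;
}

// Extra 0x66 prefixes needed for one NOP: none up to 10 bytes, then one
// per byte beyond 10 (the same expression as `Prefixes` in the diff).
uint8_t prefixCount(uint8_t Len) {
  return Len <= 10 ? 0 : static_cast<uint8_t>(Len - 10);
}
```

Under these assumptions, 25 bytes of padding become NOPs of 10, 10 and 5 bytes on a default target, and of 15 and 10 bytes when the 15-byte flag is set; only the 15-byte NOP needs extra prefixes (five of them).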

