author | Evan Cheng <evan.cheng@apple.com> | 2009-12-18 07:40:29 +0000 |
---|---|---|
committer | Evan Cheng <evan.cheng@apple.com> | 2009-12-18 07:40:29 +0000 |
commit | 4cf30b72bf0de8b6f138ad617b8e1c26abde3cc0 (patch) | |
tree | 7d17b86f61ce9bec97ab3646f1df79e9a4e2f354 /llvm/lib/Target/X86/X86Subtarget.h | |
parent | a7d0231b66f16e65c17f5a37a7140bca11d45c2d (diff) | |
On recent Intel microarchitectures, folding loads into some unary SSE
instructions can be suboptimal. To be precise, we should avoid folding a load
when the instruction updates only part of the destination register and the
non-updated part is not needed, e.g. cvtss2sd, sqrtss. Unfolding the load from
these instructions breaks the partial register dependency and can improve
performance, e.g.

    movss (%rdi), %xmm0
    cvtss2sd %xmm0, %xmm0

instead of

    cvtss2sd (%rdi), %xmm0

An alternative way to break the dependency is to clear the register first, e.g.

    xorps %xmm0, %xmm0
    cvtss2sd (%rdi), %xmm0

llvm-svn: 91672
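
The three instruction sequences in the commit message can be put side by side in a small sketch. This is not code from the commit: `DepBreak` and `emitCvtss2sd` are hypothetical names invented for illustration, and the function only builds the AT&T-syntax strings shown in the message rather than touching any LLVM data structures.

```cpp
#include <string>
#include <vector>

// Hypothetical strategy names; they do not appear in the LLVM sources.
enum class DepBreak { FoldLoad, UnfoldLoad, ClearFirst };

// Build the assembly text for converting the float at `addr` into `reg`,
// using one of the three sequences from the commit message.
std::vector<std::string> emitCvtss2sd(const std::string &addr,
                                      const std::string &reg,
                                      DepBreak strategy) {
  switch (strategy) {
  case DepBreak::UnfoldLoad:
    // Load into the register first, then convert register to register.
    return {"movss " + addr + ", " + reg, "cvtss2sd " + reg + ", " + reg};
  case DepBreak::ClearFirst:
    // Clear the register, then keep the load folded into the convert.
    return {"xorps " + reg + ", " + reg, "cvtss2sd " + addr + ", " + reg};
  case DepBreak::FoldLoad:
    break;
  }
  // Folded form: a single instruction that only partially updates `reg`.
  return {"cvtss2sd " + addr + ", " + reg};
}
```

Calling `emitCvtss2sd("(%rdi)", "%xmm0", DepBreak::UnfoldLoad)` yields the `movss` / `cvtss2sd` pair from the message. Both non-folded strategies cost one extra instruction, which is presumably why the behavior is gated on a per-subtarget flag.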
Diffstat (limited to 'llvm/lib/Target/X86/X86Subtarget.h')
-rw-r--r-- | llvm/lib/Target/X86/X86Subtarget.h | 9 |
1 files changed, 9 insertions, 0 deletions
diff --git a/llvm/lib/Target/X86/X86Subtarget.h b/llvm/lib/Target/X86/X86Subtarget.h
index fb457ddd880..b2b48edf96c 100644
--- a/llvm/lib/Target/X86/X86Subtarget.h
+++ b/llvm/lib/Target/X86/X86Subtarget.h
@@ -77,6 +77,14 @@ protected:

   /// IsBTMemSlow - True if BT (bit test) of memory instructions are slow.
   bool IsBTMemSlow;
+
+  /// BreakSSEDep - True if codegen should unfold load or insert xorps / pxor
+  /// to break register dependency for a partial register update SSE
+  /// instruction. This is needed for instructions such as CVTSS2SD which
+  /// only update the lower part of the register, and the result of the updated
+  /// part does not depend on the contents of the destination before the
+  /// instruction, and the non-updated portion of the register is not used.
+  bool BreakSSEDep;

   /// DarwinVers - Nonzero if this is a darwin platform: the numeric
   /// version of the platform, e.g. 8 = 10.4 (Tiger), 9 = 10.5 (Leopard), etc.
@@ -142,6 +150,7 @@ public:
   bool hasFMA3() const { return HasFMA3; }
   bool hasFMA4() const { return HasFMA4; }
   bool isBTMemSlow() const { return IsBTMemSlow; }
+  bool shouldBreakSSEDep() const { return BreakSSEDep; }

   bool isTargetDarwin() const { return TargetType == isDarwin; }
   bool isTargetELF() const { return TargetType == isELF; }
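
This diff only adds the flag and its accessor, so how a client consults `shouldBreakSSEDep()` is not shown here. The following is a minimal sketch under stated assumptions: `MockSubtarget`, `InstrInfo`, and `canFoldLoad` are hypothetical stand-ins written for illustration, not LLVM classes or functions, and the real folding logic may be structured quite differently.

```cpp
// Hypothetical stand-in for X86Subtarget; only the new flag is modeled.
class MockSubtarget {
  bool BreakSSEDep;

public:
  explicit MockSubtarget(bool BreakDep) : BreakSSEDep(BreakDep) {}
  bool shouldBreakSSEDep() const { return BreakSSEDep; }
};

// Hypothetical per-instruction facts a folding decision might need.
struct InstrInfo {
  bool PartialDestUpdate; // writes only part of the destination (e.g. cvtss2sd)
  bool UpperBitsUsed;     // the non-updated part of the register is needed
};

// Assumed policy, following the BreakSSEDep comment: refuse to fold a load
// when the subtarget asks for dependency breaking, the instruction only
// partially updates its destination, and the untouched part is unused.
bool canFoldLoad(const MockSubtarget &ST, const InstrInfo &I) {
  if (ST.shouldBreakSSEDep() && I.PartialDestUpdate && !I.UpperBitsUsed)
    return false;
  return true;
}
```

With the flag set, `canFoldLoad` rejects a cvtss2sd-like instruction (`{true, false}`) but still allows folding when the upper part of the register is actually used, since in that case the dependency on the old register contents is real rather than false.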