author | Evan Cheng <evan.cheng@apple.com> | 2012-12-10 23:21:26 +0000 |
---|---|---|
committer | Evan Cheng <evan.cheng@apple.com> | 2012-12-10 23:21:26 +0000 |
commit | 79e2ca90bcfcc3310d5f724409f0bef193726743 (patch) | |
tree | 3ecbf7e33e22074637dbe856ee55298fd4abeedf /llvm/test/CodeGen/ARM/2011-10-26-memset-with-neon.ll | |
parent | edd62b14e5284182231ecb4eb3850205167c4076 (diff) | |
Some enhancements for memcpy / memset inline expansion.
1. Teach it to use overlapping unaligned load / store to copy / set the trailing
bytes, e.g. on x86, use two pairs of movups / movaps for 17 - 31 byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
x86 and ARM.
3. When the memcpy source is a constant string, do *not* replace the load with a
constant if it's not possible to materialize an integer immediate with a single
instruction (requires a new target hook: TLI.isIntImmLegal()).
4. Use unaligned load / stores more aggressively if target hooks indicate they
are "fast".
5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
Also increase the threshold to something reasonable (8 for memset, 4 pairs
for memcpy).
This significantly improves Dhrystone, up to 50% on ARM iOS devices.
rdar://12760078
llvm-svn: 169791
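
The overlapping load / store idea from item 1 can be sketched in plain C. This is only an illustration, not the code LLVM emits: `copy_16_to_32` is a hypothetical helper name, the fixed 16-byte `memcpy` calls stand in for unaligned vector moves such as movups, and the 16 to 32 byte range is an assumption chosen to cover the 17 - 31 byte case mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of LLVM): copy n bytes, 16 <= n <= 32,
 * with exactly two 16-byte block moves instead of a byte-by-byte tail loop.
 * The second block is anchored at the END of the range, so it overlaps the
 * first block whenever n < 32. The overlapping bytes are simply written
 * twice with the same values. src and dst must not overlap each other. */
static void copy_16_to_32(unsigned char *dst, const unsigned char *src, size_t n) {
    unsigned char head[16];
    unsigned char tail[16];

    assert(n >= 16 && n <= 32);

    /* Two unaligned 16-byte loads: bytes [0, 16) and bytes [n-16, n). */
    memcpy(head, src, 16);
    memcpy(tail, src + n - 16, 16);

    /* Two unaligned 16-byte stores to the matching destination offsets. */
    memcpy(dst, head, 16);
    memcpy(dst + n - 16, tail, 16);
}
```

For example, a 20-byte copy moves bytes 0 to 15 and then bytes 4 to 19, so no scalar cleanup for the 4 trailing bytes is needed. The approach only pays off when unaligned accesses are cheap on the target, which is what item 4 refers to.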
Diffstat (limited to 'llvm/test/CodeGen/ARM/2011-10-26-memset-with-neon.ll')
-rw-r--r-- | llvm/test/CodeGen/ARM/2011-10-26-memset-with-neon.ll | 8 |
1 files changed, 0 insertions, 8 deletions
diff --git a/llvm/test/CodeGen/ARM/2011-10-26-memset-with-neon.ll b/llvm/test/CodeGen/ARM/2011-10-26-memset-with-neon.ll
index 6e0ef961965..f563eeef018 100644
--- a/llvm/test/CodeGen/ARM/2011-10-26-memset-with-neon.ll
+++ b/llvm/test/CodeGen/ARM/2011-10-26-memset-with-neon.ll
@@ -1,13 +1,5 @@
 ; RUN: llc -march=arm -mcpu=cortex-a8 < %s | FileCheck %s
 
-; Should trigger a NEON store.
-; CHECK: vstr
-define void @f_0_12(i8* nocapture %c) nounwind optsize {
-entry:
-  call void @llvm.memset.p0i8.i64(i8* %c, i8 0, i64 12, i32 8, i1 false)
-  ret void
-}
-
 ; Trigger multiple NEON stores.
 ; CHECK: vst1.64
 ; CHECK-NEXT: vst1.64