author | Sanjay Patel <spatel@rotateright.com> | 2015-08-25 16:29:21 +0000 |
---|---|---|
committer | Sanjay Patel <spatel@rotateright.com> | 2015-08-25 16:29:21 +0000 |
commit | deb8f826a58260244e8bac596d09ea54485837eb (patch) | |
tree | 8203f1fff3e4e913c8490f9a79ba259da695dca3 /llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | |
parent | 3240cd3421c78b707e80d59ea0bcd5f14c8933fa (diff) | |
make fast unaligned memory accesses implicit with SSE4.2 or SSE4a
This is a follow-on from the discussion in http://reviews.llvm.org/D12154.
This change allows memset/memcpy to use SSE or AVX memory accesses for any chip that has
generally fast unaligned memory ops.
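The implication described above can be sketched in C as follows. This is a minimal illustration only: the enum, the bit values, and both function names are hypothetical and are not LLVM's actual subtarget code. The one fact it relies on is that AVX implies SSE4.2 on x86.

```c
#include <stdbool.h>

/* Hypothetical feature bits for illustration; this is not how LLVM
   represents subtarget features. */
enum cpu_feature {
    FEAT_SSE42 = 1u << 0,
    FEAT_SSE4A = 1u << 1,
    FEAT_AVX   = 1u << 2
};

/* AVX implies SSE4.2 on x86, so requesting AVX also enables SSE4.2. */
unsigned expand_features(unsigned features) {
    if (features & FEAT_AVX)
        features |= FEAT_SSE42;
    return features;
}

/* Treat unaligned memory accesses as fast whenever SSE4.2 or SSE4a is
   available, without requiring an explicit CPU name. */
bool has_fast_unaligned_mem(unsigned features) {
    return (expand_features(features) & (FEAT_SSE42 | FEAT_SSE4A)) != 0;
}
```

Under this sketch, a feature set containing only `FEAT_AVX` is still reported as having fast unaligned accesses, because the SSE4.2 bit is filled in first.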
A motivating use case for this change is a clang invocation that doesn't explicitly set
the CPU, but does target a feature that we know only exists on a CPU that supports fast
unaligned memops. For example:
    $ clang -O1 foo.c -mavx
This resolves a difference in lowering noted in PR24449:
https://llvm.org/bugs/show_bug.cgi?id=24449
Before this patch, we used different store types depending on whether the example can be
lowered as a memset or not.
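A hypothetical foo.c for observing this kind of difference might look like the sketch below. It is not the test case from PR24449; the struct and function names are invented for illustration. Which store instructions each function gets depends on the compiler version and flags, so no particular output is claimed here.

```c
/* foo.c: hypothetical example, not the test case from PR24449. */
#include <string.h>

/* Hypothetical 32-byte buffer with no alignment guarantee beyond char. */
struct buf32 {
    char bytes[32];
};

/* Zeroing written as an explicit memset call, which goes through the
   memset lowering path. */
void zero_with_memset(struct buf32 *p) {
    memset(p->bytes, 0, sizeof p->bytes);
}

/* The same zeroing written as a byte loop. Depending on the optimization
   level, the compiler may or may not turn this into a memset. */
void zero_with_loop(struct buf32 *p) {
    for (size_t i = 0; i < sizeof p->bytes; ++i)
        p->bytes[i] = 0;
}
```

Compiling with `clang -O1 -mavx -S foo.c -o -` prints the generated assembly, so the stores emitted for the two functions can be compared side by side.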
Differential Revision: http://reviews.llvm.org/D12288
llvm-svn: 245950