intrinsics when compiled for 32-bit mode.
All the command lines are for 64-bit mode, but I sometimes compile
the tests in 32-bit mode to see what assembly we get, and these
intrinsics need to be skipped for that to work.
llvm-svn: 365668
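
A minimal sketch of this kind of guard, assuming a 64-bit-mode-only intrinsic such as _mm_extract_epi64 (the message does not name the specific intrinsics):

    #include <immintrin.h>

    #ifdef __x86_64__
    /* _mm_extract_epi64 is only available in 64-bit mode, so the test
       is guarded to stay compilable with -m32. */
    long long test_mm_extract_epi64(__m128i x) {
      return _mm_extract_epi64(x, 1);
    }
    #endif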
|
D48464 contains changes that will loosen some of the range checks in SemaChecking to a DefaultError warning that can be disabled.
This patch adds explicit masking to avoid using the upper bits of immediates to gracefully handle the warning being disabled.
llvm-svn: 335308
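
A sketch of the masking pattern, using a hypothetical wrapper around the SSE4.1 extract builtin (the message does not list the affected intrinsics):

    #include <immintrin.h>

    /* With the Sema range check downgraded to a disabled warning, an
       out-of-range immediate could otherwise reach the backend; masking
       keeps only the bits the instruction encoding actually uses. */
    #define my_extract_epi32(X, N) \
      ((int)__builtin_ia32_vec_ext_v4si((__v4si)(__m128i)(X), (int)(N) & 3))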
|
and 256 bit vector types. Use them to implement the extract and insert intrinsics.
Previously we were just using extended vector operations in the header file.
This unfortunately allowed non-constant indices to be used with the intrinsics, which is incompatible with gcc, icc, and MSVC. It also introduces a different performance characteristic, because a non-constant index gets lowered to a vector store and an element-sized load.
By adding the builtins we can check that the index is a constant and ensure it's in range of the vector element count.
User code still has the option to use extended vector operations themselves if they need non-constant indexing.
llvm-svn: 334057
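
A sketch of the user-facing distinction this creates, assuming SSE4.1 is enabled (function names are illustrative):

    #include <immintrin.h>

    int lane_const(__m128i v) {
      /* Constant index: range-checked at compile time by the builtin. */
      return _mm_extract_epi32(v, 2);
    }

    int lane_runtime(__m128i v, int i) {
      /* Non-constant index: extended vector subscripting still works,
         but may lower to a vector store plus an element-sized load. */
      __v4si tmp = (__v4si)v;
      return tmp[i & 3];
    }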
|
Following r333110:
"Move all Intel defined intrinsic includes into immintrin.h"
llvm-svn: 333160
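
In practice this means user code includes the umbrella header (a minimal usage sketch):

    /* immintrin.h now pulls in all the Intel intrinsic headers, so a
       single include suffices regardless of which extensions are used. */
    #include <immintrin.h>

    __m128i add2(__m128i a, __m128i b) { return _mm_add_epi32(a, b); }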
|
I believe all the pieces are now in place in the backend to make this work correctly. We can either mask the inputs to 32 bits for pmuludq, or shl/ashr them for pmuldq, and then use a regular mul instruction. The backend should combine this to PMULUDQ/PMULDQ and then SimplifyDemandedBits will remove the and/shifts.
Differential Revision: https://reviews.llvm.org/D45421
llvm-svn: 329605
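
A sketch of the generic pattern for the unsigned case, with a local typedef standing in for the header's internal vector types:

    #include <emmintrin.h>

    typedef unsigned long long u64x2 __attribute__((vector_size(16)));

    /* _mm_mul_epu32-style multiply of the even 32-bit lanes: mask each
       64-bit lane to its low 32 bits, then use a plain 64-bit multiply.
       The backend can combine this to PMULUDQ, and SimplifyDemandedBits
       removes the ANDs. (The signed pmuldq case would shl/ashr to
       sign-extend the low halves instead of masking.) */
    static inline __m128i mul_epu32_generic(__m128i a, __m128i b) {
      const u64x2 mask = {0xffffffffULL, 0xffffffffULL};
      return (__m128i)(((u64x2)a & mask) * ((u64x2)b & mask));
    }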
|
MOVNTDQA non-temporal aligned vector loads can be correctly represented using generic builtin loads, allowing us to remove the existing x86 intrinsics.
LLVM companion patch: D31767.
Differential Revision: https://reviews.llvm.org/D31766
llvm-svn: 300326
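
For reference, a sketch of the generic form (helper name illustrative; the header's exact wording may differ):

    #include <immintrin.h>

    /* MOVNTDQA as a generic non-temporal load: clang's
       __builtin_nontemporal_load carries the non-temporal hint without
       an x86-specific intrinsic; the pointer must be 16-byte aligned. */
    static inline __m128i stream_load_generic(const __m128i *p) {
      return __builtin_nontemporal_load(p);
    }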
|
The X86 clang/test/CodeGen/*builtins.c tests define the mm_malloc.h include
guard as a hack to avoid including it (mm_malloc.h requires a hosted
environment since it expects stdlib.h to be available, which is not the case
in these internal clang codegen tests).
This patch removes this hack and instead passes -ffreestanding to clang cc1.
Differential Revision: https://reviews.llvm.org/D24825
llvm-svn: 282581
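
An illustrative RUN line for such a test after the change (flags other than -ffreestanding are representative, not quoted from the patch):

    // RUN: %clang_cc1 -ffreestanding %s -triple=x86_64-unknown-unknown \
    // RUN:   -target-feature +sse4.1 -emit-llvm -o - | FileCheck %s
    #include <smmintrin.h>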
|
possible problems in headers.
llvm-svn: 277696
|
llvm-svn: 274965
|
Sibling patch to r272806:
http://reviews.llvm.org/rL272806
llvm-svn: 272807
|
generic IR (clang)
The (V)PMOVSX and (V)PMOVZX sign/zero extension intrinsics can be safely represented as generic __builtin_convertvector calls instead of x86 intrinsics.
This patch removes the clang builtins and their use in the sse2/avx headers - a companion patch will remove/auto-upgrade the llvm intrinsics.
Note: We already did this for SSE41 PMOVSX some time ago.
Differential Revision: http://reviews.llvm.org/D20684
llvm-svn: 271106
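
A sketch of the zero-extension form, with local typedefs standing in for the header's internal ones:

    #include <immintrin.h>

    typedef unsigned char  u8x16 __attribute__((vector_size(16)));
    typedef unsigned char  u8x8  __attribute__((vector_size(8)));
    typedef unsigned short u16x8 __attribute__((vector_size(16)));

    /* PMOVZXBW without an x86 builtin: shuffle out the low 8 bytes,
       then widen each lane with __builtin_convertvector. */
    static inline __m128i cvtepu8_epi16_generic(__m128i v) {
      u8x8 lo = __builtin_shufflevector((u8x16)v, (u8x16)v,
                                        0, 1, 2, 3, 4, 5, 6, 7);
      return (__m128i)__builtin_convertvector(lo, u16x8);
    }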
|
llvm-svn: 269926
|
equivalent tests
llvm-svn: 262418
|
As discussed on the mailing list, backend tests need to be put in llvm/test/CodeGen/X86 as fast-isel tests using IR that is as close as possible to what is generated here.
The llvm tests will be (re)added in a future commit.
I will update PR24580 with this new plan.
llvm-svn: 254594
|
about optimization options.
llvm-svn: 250271
|
Per Intel intrinsics guide:
- _mm256_stream_load_si256 takes `__m256i const *'
- _mm_stream_load_si128 takes `__m128i *', for no good reason.
Let's accept const* for both.
llvm-svn: 249213
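
The resulting signatures, sketched (paraphrased rather than quoted from the header):

    #include <immintrin.h>

    /* Both now accept a pointer-to-const, as a load should:
         __m128i _mm_stream_load_si128(__m128i const *p);  // was __m128i *
         __m256i _mm256_stream_load_si256(__m256i const *p);
       so loading from a const buffer no longer needs to cast away const: */
    __m128i load_it(const __m128i *p) {
      return _mm_stream_load_si128(p);
    }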
|
recently when we started using direct conversion to model sign
extension. The __v16qi type we use for SSE v16i8 vectors is defined in
terms of 'char' which may or may not be signed! This causes us to
generate pmovsx and pmovzx depending on the setting of -funsigned-char.
This patch just forms an explicitly signed type and uses that to
formulate the sign extension. While this gets the correct behavior
(which we now verify with the enhanced test) this is just the tip of the
iceberg. Now that I know what to look for, I have found errors of this
sort *throughout* our vector code. Fortunately, this is the only
specific place where I know of users actively having their code
miscompiled by Clang due to this, so I'm keeping the fix for those users
minimal and targeted.
I'll be sending a proper email for discussion of how to fix these
systematically, what the implications are, and just how widely broken
this is... From what I can tell, we have never shipped a correct set of
builtin headers for x86 when users rely on -funsigned-char. Oops.
llvm-svn: 248980
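
A sketch of the explicitly signed type trick (the header's own typedef name may differ):

    #include <immintrin.h>

    /* 'char' may be unsigned under -funsigned-char, so a vector typedef
       based on plain char cannot guarantee sign extension; an explicitly
       'signed char' element type can. */
    typedef signed char s8x16 __attribute__((vector_size(16)));
    typedef signed char s8x8  __attribute__((vector_size(8)));
    typedef short       s16x8 __attribute__((vector_size(16)));

    static inline __m128i sext_lo8_to_i16(__m128i v) {  /* pmovsxbw */
      s8x8 lo = __builtin_shufflevector((s8x16)v, (s8x16)v,
                                        0, 1, 2, 3, 4, 5, 6, 7);
      return (__m128i)__builtin_convertvector(lo, s16x8);
    }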
|
128-bit vector integer sign extensions correctly lower to the pmovsx instructions even for debug builds.
This patch removes the builtins and reimplements the _mm_cvtepi*_epi* intrinsics using __builtin_shufflevector (to extract the bottommost subvector) and __builtin_convertvector (to actually perform the sign extension).
Differential Revision: http://reviews.llvm.org/D12835
llvm-svn: 248092
|
Transferred SSE41 instructions from sse-builtins.c
llvm-svn: 246947