path: root/clang/test/CodeGen/sse2-builtins.c
Commit message | Author | Age | Files | Lines
* [X86] Lower _mm[256|512]_[mask[z]]_avg_epu[8|16] intrinsics to native llvm IR | Yael Tsafrir | 2017-09-12 | 1 | -2/+14
  Differential Revision: https://reviews.llvm.org/D37562
  llvm-svn: 313011
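For readers unfamiliar with the operation, here is a minimal scalar sketch of what one lane of _mm_avg_epu8 computes: a rounding average, (a + b + 1) >> 1, evaluated in a wider type so the sum cannot overflow. The function name avg_epu8_lane is hypothetical, and the widen/add/shift/narrow shape is an assumption about how the "native llvm IR" in this patch is structured, not a copy of it.

```c
#include <stdint.h>

/* Hypothetical scalar model of one unsigned 8-bit lane of _mm_avg_epu8.
   Widen to 16 bits, add with a +1 rounding term, shift right, narrow. */
uint8_t avg_epu8_lane(uint8_t a, uint8_t b)
{
    /* Largest possible value is 255 + 255 + 1 = 511, which fits in 16 bits. */
    uint16_t wide = (uint16_t)((unsigned)a + (unsigned)b + 1u);
    return (uint8_t)(wide >> 1);
}
```

The 16-bit variant would follow the same pattern with a 32-bit intermediate.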
* [X86][SSE] Add _mm_set_pd1 (PR32827) | Simon Pilgrim | 2017-04-28 | 1 | -0/+7
  Matches _mm_set_ps1 implementation
  llvm-svn: 301637
* [x86] these aren't the undefs you're looking for (PR32176) | Sanjay Patel | 2017-03-12 | 1 | -2/+2
  x86 has undef SSE/AVX intrinsics that should represent a bogus register operand. This is not the same as LLVM's undef value, which can take on multiple bit patterns.

  There are better solutions / follow-ups to this discussed here: https://bugs.llvm.org/show_bug.cgi?id=32176 ...but this should prevent miscompiles with a one-line code change.

  Differential Revision: https://reviews.llvm.org/D30834
  llvm-svn: 297588
* [X86] Remove the mm_malloc.h include guard hack from the X86 builtins tests | Elad Cohen | 2016-09-28 | 1 | -4/+2
  The X86 clang/test/CodeGen/*builtins.c tests define the mm_malloc.h include guard as a hack for avoiding its inclusion (mm_malloc.h requires a hosted environment since it expects stdlib.h to be available, which is not the case in these internal clang codegen tests). This patch removes this hack and instead passes -ffreestanding to clang cc1.

  Differential Revision: https://reviews.llvm.org/D24825
  llvm-svn: 282581
* [X86] Use v2i64 vectors to implement _mm_and/andn/or/xor_pd. | Craig Topper | 2016-08-31 | 1 | -5/+5
  These will be reused when removing some builtins from avx512vldqintrin.h and this will make the tests for that change show a better number of vector elements.
  llvm-svn: 280196
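A rough sketch of the idea, using the GCC/Clang vector extensions rather than the real header: the double-precision operands are reinterpreted as two 64-bit integer lanes, the bitwise operation is applied there, and the result is reinterpreted back. The names v2df, v2di, and_pd_sketch and abs_pd_sketch are hypothetical and chosen for illustration only.

```c
typedef double    v2df __attribute__((__vector_size__(16)));
typedef long long v2di __attribute__((__vector_size__(16)));

/* Hypothetical stand-in for _mm_and_pd: the AND is performed on 2 x i64
   lanes; the casts only reinterpret bits, they do not convert values. */
v2df and_pd_sketch(v2df a, v2df b)
{
    return (v2df)((v2di)a & (v2di)b);
}

/* Example use: clearing the sign bit of each lane gives the absolute value. */
v2df abs_pd_sketch(v2df a)
{
    const v2di mask = { 0x7fffffffffffffffLL, 0x7fffffffffffffffLL };
    return and_pd_sketch(a, (v2df)mask);
}
```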
* After PR28761 use -Wall with -Werror in builtins tests to identify | Eric Christopher | 2016-08-04 | 1 | -2/+2
  possible problems in headers.
  llvm-svn: 277696
* [X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR | Simon Pilgrim | 2016-07-20 | 1 | -6/+4
  D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead.

  It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems: INF/NAN/out of range values are guaranteed to result in a 0x80000000 value, which plays havoc with constant folding, which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match).

  This patch changes both scalar and packed versions back to using x86-specific builtins. It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding.

  Differential Revision: https://reviews.llvm.org/D22105
  llvm-svn: 276102
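To make the mismatch concrete, here is a small scalar model of the behaviour described above for one 32-bit lane of the truncating conversion. The function name cvtt_lane_model is hypothetical. A plain C cast (or a generic float-to-int conversion in IR) has no defined result for NaN, infinities or out-of-range inputs, so the model has to test for them explicitly before casting.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical scalar model of one lane of a truncating f32 -> i32
   conversion as described in the commit message: NaN, INF and
   out-of-range inputs all produce 0x80000000 (INT32_MIN). */
int32_t cvtt_lane_model(float x)
{
    if (isnan(x) || x < -2147483648.0f || x >= 2147483648.0f)
        return INT32_MIN;   /* the guaranteed 0x80000000 result */
    return (int32_t)x;      /* in range: truncate toward zero */
}
```

A constant folder that treats the out-of-range case as zero or undefined would disagree with this model, which is the problem the patch avoids by going back to the x86-specific builtins.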
* [X86][SSE2] Updated tests to match llvm\test\CodeGen\X86\sse2-intrinsics-fast-isel-x86_64.ll | Simon Pilgrim | 2016-06-29 | 1 | -9/+8
  llvm-svn: 274126
* [X86] add _mm_loadu_si64 | Asaf Badouh | 2016-06-26 | 1 | -0/+9
  Differential Revision: http://reviews.llvm.org/D21504
  llvm-svn: 273812
* [X86] Fix pslldq/psrldq intrinsics to not fail compilation with immediates larger than 16. | Craig Topper | 2016-06-25 | 1 | -0/+12
  This was accidentally broken in r272246.
  llvm-svn: 273775
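As background on why large immediates must compile: the byte-shift instruction is defined for any 8-bit count, and a count above 15 simply produces an all-zero result. The following is a portable sketch of the left shift (pslldq) on a 16-byte array. The name slli_si128_model is hypothetical, and this models the instruction semantics only, not the header code touched by the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical byte-array model of a 128-bit left byte shift:
   source byte i moves to position i + n, zeros are shifted in,
   and any count above 15 yields all zeros. */
void slli_si128_model(uint8_t dst[16], const uint8_t src[16], unsigned n)
{
    memset(dst, 0, 16);
    if (n > 15)
        return;                     /* everything shifted out */
    memcpy(dst + n, src, 16 - n);   /* keep the low 16 - n bytes */
}
```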
* [x86] translate SSE packed FP comparison builtins to IR | Sanjay Patel | 2016-06-15 | 1 | -12/+48
  As noted in the code comment, a potential follow-on would be to remove the builtins themselves. Other than ord/unord, this already works as expected. Eg:

    typedef float v4sf __attribute__((__vector_size__(16)));
    v4sf fcmpgt(v4sf a, v4sf b) { return a > b; }

  Differential Revision: http://reviews.llvm.org/D21268
  llvm-svn: 272840
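A slightly expanded variant of the example in the commit message, written so the result can be inspected lane by lane. With the GCC/Clang vector extensions, comparing two float vectors yields an integer vector of the same lane width, where each lane is -1 (all bits set) for true and 0 for false. The names v4si and fcmpgt_mask are hypothetical additions; v4sf is taken from the commit message.

```c
typedef float v4sf __attribute__((__vector_size__(16)));
typedef int   v4si __attribute__((__vector_size__(16)));

/* Lane-wise a > b expressed with the generic vector comparison operator.
   Each result lane is -1 when the comparison holds and 0 otherwise. */
v4si fcmpgt_mask(v4sf a, v4sf b)
{
    return a > b;
}
```

Returning the integer mask type avoids relying on implicit vector conversions, which the original one-liner does when it returns v4sf.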
* [x86] generate IR for SSE integer min/max builtins | Sanjay Patel | 2016-06-15 | 1 | -4/+8
  Sibling patch to r272806: http://reviews.llvm.org/rL272806
  llvm-svn: 272807
* [X86][SSE] Replace (V)CVTTPS2DQ and VCVTTPD2DQ truncating (round to zero) f32/f64 to i32 with generic IR (clang) | Simon Pilgrim | 2016-06-01 | 1 | -1/+1
  The 'cvtt' truncation (round to zero) conversions can be safely represented as generic __builtin_convertvector (fptosi) calls instead of x86 intrinsics. We already do this (implicitly) for the scalar equivalents.

  Note: I looked at updating _mm_cvttpd_epi32 as well but this still requires a lot more backend work to correctly lower (both for debug and optimized builds).

  Differential Revision: http://reviews.llvm.org/D20859
  llvm-svn: 271436
* [X86] Ensure load/store tests unaligned pointers really are align 1 | Simon Pilgrim | 2016-05-30 | 1 | -5/+5
  llvm-svn: 271227
* [X86][SSE] _mm_store1_ps/_mm_store1_pd should require an aligned pointer | Simon Pilgrim | 2016-05-30 | 1 | -3/+9
  According to the gcc headers, Intel intrinsics docs and MSDN codegen, the _mm_store1_pd (and its _mm_store_pd1 equivalent) should use an aligned pointer. The clang headers are the only implementation I can find that assume non-aligned stores (by storing with _mm_storeu_pd).

  Additionally, according to the Intel intrinsics docs and MSDN codegen, the _mm_store1_ps (_mm_store_ps1) requires a similarly aligned pointer.

  This patch raises the alignment requirements to match the other implementations by calling _mm_store_ps/_mm_store_pd instead.

  I've also added the missing _mm_store_pd1 intrinsic (which maps to _mm_store1_pd like _mm_store_ps1 does to _mm_store1_ps).

  As a followup I'll update the llvm fast-isel tests to match this codegen.

  Differential Revision: http://reviews.llvm.org/D20617
  llvm-svn: 271218
* [X86] Replace unaligned store builtins in SSE/AVX intrinsic files with code that will compile to a native unaligned store. | Craig Topper | 2016-05-30 | 1 | -2/+4
  Remove the builtins since they are no longer used. Intrinsics will be removed from llvm in a future commit.
  llvm-svn: 271214
* [X86] Update test cases to make sure storeu builtins use the storeu intrinsics. | Craig Topper | 2016-05-25 | 1 | -2/+2
  We were previously matching on other stores in the IR from this being an -O0 test. We should probably look into making the storeu builtins just emit a normal store with an alignment of 1.
  llvm-svn: 270664
* [X86][SSE] Replace lossless i32/f32 to f64 conversion intrinsics with generic IR | Simon Pilgrim | 2016-05-23 | 1 | -2/+4
  Both the (V)CVTDQ2PD(Y) (i32 to f64) and (V)CVTPS2PD(Y) (f32 to f64) conversion instructions are lossless and can be safely represented as generic __builtin_convertvector calls instead of x86 intrinsics without affecting final codegen.

  This patch removes the clang builtins and their use in the sse2/avx headers. A future patch will deal with removing the llvm intrinsics, but that will require a bit more work.

  Differential Revision: http://reviews.llvm.org/D20528
  llvm-svn: 270499
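A minimal sketch of the generic form described above, using __builtin_convertvector on two-lane vectors. The type and function names (v2si, v2sf, v2df, cvt_i32_to_f64, cvt_f32_to_f64) are hypothetical and this is not the header code from the patch. It assumes a compiler that provides the builtin: Clang, or a reasonably recent GCC. Because every i32 and every f32 value is exactly representable as an f64, the results compare equal to the inputs.

```c
typedef int    v2si __attribute__((__vector_size__(8)));
typedef float  v2sf __attribute__((__vector_size__(8)));
typedef double v2df __attribute__((__vector_size__(16)));

/* i32 -> f64, lane by lane; lossless because f64 has a 53-bit significand. */
v2df cvt_i32_to_f64(v2si v)
{
    return __builtin_convertvector(v, v2df);
}

/* f32 -> f64, lane by lane; lossless because f64 is a superset of f32. */
v2df cvt_f32_to_f64(v2sf v)
{
    return __builtin_convertvector(v, v2df);
}
```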
* [X86][SSE2] Fixed shuffle of results in _mm_cmpnge_sd/_mm_cmpngt_sd tests | Simon Pilgrim | 2016-05-19 | 1 | -0/+8
  llvm-svn: 270079
* [X86][SSE2] Added _mm_move_* tests | Simon Pilgrim | 2016-05-19 | 1 | -0/+15
  llvm-svn: 270043
* [X86][SSE2] Added _mm_cast* and _mm_set* tests | Simon Pilgrim | 2016-05-19 | 1 | -0/+236
  llvm-svn: 270042
* [X86][SSE2] Sync with llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll | Simon Pilgrim | 2016-05-19 | 1 | -53/+107
  llvm-svn: 270034
* Revert r269967 (SSE2 builtin checks) due to failed buildbots | Simon Pilgrim | 2016-05-18 | 1 | -98/+52
  llvm-svn: 269970
* [X86][SSE2] Sync with llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll | Simon Pilgrim | 2016-05-18 | 1 | -52/+98
  llvm-svn: 269967
* [X86][SSE] Tidied up MMX/SSE/SSE2 builtin tests to the correct test file | Simon Pilgrim | 2016-05-17 | 1 | -0/+54
  llvm-svn: 269852
* [X86] Stripped backend codegen tests | Simon Pilgrim | 2015-12-03 | 1 | -926/+375
  As discussed on the ml, backend tests need to be put in llvm/test/CodeGen/X86 as fast-isel tests using IR that is as close to what is generated here as possible.

  The llvm tests will be (re)added in a future commit.

  I will update PR24580 on this new plan.
  llvm-svn: 254594
* [X86][SSE2] Added SSE2 IR + assembly codegen builtin tests | Simon Pilgrim | 2015-11-29 | 1 | -0/+1656
  Improved tests as discussed in PR24580
  llvm-svn: 254262