path: root/clang/test/CodeGen/sse-builtins.c
Commit message | Author | Age | Files | Lines
* Fix reliance on lax vector conversions in tests for x86 intrinsics. | Richard Smith | 2019-09-17 | 1 | -1/+1
  llvm-svn: 372062
* Allow prefetching from non-zero address spaces | JF Bastien | 2019-07-25 | 1 | -1/+1
  Summary: This is useful for targets which have prefetch instructions for non-default address spaces.
  <rdar://problem/42662136>
  Subscribers: nemanjai, javed.absar, hiraditya, kbarton, jkorous, dexonsmith, cfe-commits, llvm-commits, RKSimon, hfinkel, t.p.northover, craig.topper, anemet
  Tags: #clang, #llvm
  Differential Revision: https://reviews.llvm.org/D65254
  llvm-svn: 367032
* [X86] Change the IR sequence for _mm_storeh_pi and _mm_storel_pi to perform the store as a <2 x float> instead of i64. | Craig Topper | 2019-07-10 | 1 | -6/+4
  This is similar to what we do for loadl_pi and loadh_pi.
  llvm-svn: 365669
* [X86] Add guards to some of the x86 intrinsic tests to skip 64-bit mode only intrinsics when compiled for 32-bit mode. | Craig Topper | 2019-07-10 | 1 | -0/+6
  All the command lines are for 64-bit mode, but sometimes I compile the tests in 32-bit mode to see what assembly we get and we need to skip these to do that.
  llvm-svn: 365668
* [X86] Use __m128_u for _mm_loadu_ps after r353555 | Reid Kleckner | 2019-02-12 | 1 | -0/+1
  Add secondary triple to existing SSE test for it. I audited other uses of __attribute__((__packed__)) in the intrinsic headers, and this seemed to be the only missing one.
  llvm-svn: 353878
* [X86] Add __builtin_ia32_selectss_128 and __builtin_ia32_selectsd_128 that is suitable for use in scalar mask intrinsics. | Craig Topper | 2018-07-10 | 1 | -1/+2
  This will convert the i8 mask argument to <8 x i1> and extract an i1 and then emit a select instruction. This replaces the '(__U & 1)' and ternary operator used in some of the intrinsics. The old sequence was lowered to a scalar AND and compare. The new sequence uses an i1 vector that will interoperate better with other mask intrinsics.
  This removes the need to handle div_ss/sd specially in CGBuiltin.cpp. A follow up patch will add the GCCBuiltin name back in llvm and remove the custom handling.
  I made some adjustments to legacy move_ss/sd intrinsics which we reused here to do a simpler extract and insert instead of 2 extracts and two inserts or a shuffle.
  llvm-svn: 336622
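The select behaviour described in r336622 can be modelled in portable C. This is only a sketch of the semantics for lane 0: `model_select_ss` is a hypothetical name, not the real builtin, and the argument order shown here is an assumption made for illustration.

```c
#include <stdint.h>

/* Hypothetical scalar model (not __builtin_ia32_selectss_128 itself):
   bit 0 of the 8-bit mask picks between two lane-0 candidates.
   The argument order is an assumption for this sketch. */
float model_select_ss(uint8_t mask, float if_set, float if_clear) {
    /* Only the lowest mask bit matters; the other seven bits are ignored. */
    return (mask & 1) ? if_set : if_clear;
}
```

The model uses the same `(mask & 1)` ternary that the commit says was replaced; the real builtin expresses the identical choice as an i1 extract followed by an IR select.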
* [X86] Lowering sqrt intrinsics to native IR | Tomasz Krupa | 2018-06-15 | 1 | -2/+4
  Reviewers: craig.topper, spatel, RKSimon, igorb, uriel.k
  Reviewed By: craig.topper
  Subscribers: tkrupa, cfe-commits
  Differential Revision: https://reviews.llvm.org/D41168
  llvm-svn: 334850
* [X86] Simplify the implementation of _mm_sqrt_ss, _mm_rcp_ss, and _mm_rsqrt_ss. | Craig Topper | 2018-05-30 | 1 | -24/+0
  We don't need the insertion back into the original vector at the end. The builtin already understands that. This is different than _mm_sqrt_sd which takes two arguments and we do need to insert.
  llvm-svn: 333572
* [X86] NFC Include immintrin.h in CodeGen tests | Gabor Buella | 2018-05-24 | 1 | -1/+1
  Following r333110: "Move all Intel defined intrinsic includes into immintrin.h"
  llvm-svn: 333160
* [x86] these aren't the undefs you're looking for (PR32176) | Sanjay Patel | 2017-03-12 | 1 | -1/+1
  x86 has undef SSE/AVX intrinsics that should represent a bogus register operand. This is not the same as LLVM's undef value which can take on multiple bit patterns.
  There are better solutions / follow-ups to this discussed here: https://bugs.llvm.org/show_bug.cgi?id=32176
  ...but this should prevent miscompiles with a one-line code change.
  Differential Revision: https://reviews.llvm.org/D30834
  llvm-svn: 297588
* [X86] Remove the mm_malloc.h include guard hack from the X86 builtins tests | Elad Cohen | 2016-09-28 | 1 | -3/+1
  The X86 clang/test/CodeGen/*builtins.c tests define the mm_malloc.h include guard as a hack for avoiding its inclusion (mm_malloc.h requires a hosted environment since it expects stdlib.h to be available - which is not the case in these internal clang codegen tests).
  This patch removes this hack and instead passes -ffreestanding to clang cc1.
  Differential Revision: https://reviews.llvm.org/D24825
  llvm-svn: 282581
* After PR28761 use -Wall with -Werror in builtins tests to identify possible problems in headers. | Eric Christopher | 2016-08-04 | 1 | -1/+1
  llvm-svn: 277696
* [X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR | Simon Pilgrim | 2016-07-20 | 1 | -6/+3
  D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead.
  It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems, INF/NAN/out of range values are guaranteed to result in a 0x80000000 value - which plays havoc with constant folding which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match).
  This patch changes both scalar and packed versions back to using x86-specific builtins.
  It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding.
  Differential Revision: https://reviews.llvm.org/D22105
  llvm-svn: 276102
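The r276102 message turns on one rule: INF, NaN and out-of-range inputs all convert to 0x80000000. Below is a small portable model of that rule for the scalar float to 32-bit case. It is a sketch for illustration only; `model_cvttss2si` is a hypothetical name and does not call the x86 builtin.

```c
#include <math.h>
#include <stdint.h>

/* Portable model (not the actual builtin) of the truncating
   float -> int32 conversion described above: NaN, infinities and
   out-of-range inputs all produce 0x80000000, i.e. INT32_MIN. */
int32_t model_cvttss2si(float x) {
    /* 2147483648.0f is exactly 2^31; a valid result needs -2^31 <= x < 2^31. */
    if (isnan(x) || x >= 2147483648.0f || x < -2147483648.0f)
        return INT32_MIN;
    /* In range: C's float-to-int cast truncates toward zero and is well defined. */
    return (int32_t)x;
}
```

In plain C the cast alone is undefined for the rejected inputs, which is the same gap the commit describes between the generic IR conversion and the x86 instruction.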
* [x86] translate SSE packed FP comparison builtins to IR | Sanjay Patel | 2016-06-15 | 1 | -12/+48
  As noted in the code comment, a potential follow-on would be to remove the builtins themselves. Other than ord/unord, this already works as expected. Eg:
      typedef float v4sf __attribute__((__vector_size__(16)));
      v4sf fcmpgt(v4sf a, v4sf b) { return a > b; }
  Differential Revision: http://reviews.llvm.org/D21268
  llvm-svn: 272840
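The vector comparison quoted in r272840 can be tried without the SSE headers, using the GCC/Clang vector extension. In this sketch the result is returned as an integer vector so that it does not depend on lax vector conversions. The names `v4si`, `fcmpgt_mask` and `count_greater` are assumptions for illustration and do not come from the commit.

```c
/* GCC/Clang vector extension types: four floats, and four ints for the mask. */
typedef float v4sf __attribute__((__vector_size__(16)));
typedef int v4si __attribute__((__vector_size__(16)));

/* Lane-wise a > b. Each result lane is all ones (-1) where the
   comparison holds and 0 where it does not. */
v4si fcmpgt_mask(v4sf a, v4sf b) {
    return a > b;
}

/* Hypothetical helper: count the lanes in which a is greater than b. */
int count_greater(v4sf a, v4sf b) {
    v4si m = fcmpgt_mask(a, b);
    int n = 0;
    for (int i = 0; i < 4; ++i)
        n += (m[i] != 0);
    return n;
}
```

Returning the mask as `v4si` rather than `v4sf` keeps the types exact, at the cost of differing from the `__m128` return type that the SSE comparison intrinsics use.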
* [X86] Ensure load/store tests unaligned pointers really are align 1 | Simon Pilgrim | 2016-05-30 | 1 | -2/+2
  llvm-svn: 271227
* [X86][SSE] Added missing tests (merge failure) | Simon Pilgrim | 2016-05-30 | 1 | -4/+2
  Differential Revision: http://reviews.llvm.org/D20617
  llvm-svn: 271219
* [X86] Replace unaligned store builtins in SSE/AVX intrinsic files with code that will compile to a native unaligned store. | Craig Topper | 2016-05-30 | 1 | -3/+6
  Remove the builtins since they are no longer used. Intrinsics will be removed from llvm in a future commit.
  llvm-svn: 271214
* [X86][SSE] Updated _mm_store_ps1 test to match _mm_store1_ps | Simon Pilgrim | 2016-05-25 | 1 | -1/+1
  llvm-svn: 270679
* [X86] Update test cases to make sure storeu builtins use the storeu intrinsics. | Craig Topper | 2016-05-25 | 1 | -2/+2
  We were previously matching on other stores in the IR from this being an -O0 test. We should probably look into making the storeu builtins just emit a normal store with an alignment of 1.
  llvm-svn: 270664
* [X86][SSE] Sync with llvm/test/CodeGen/X86/sse-intrinsics-fast-isel.ll | Simon Pilgrim | 2016-05-19 | 1 | -150/+681
  sse-builtins.c now just covers SSE1 intrinsics
  llvm-svn: 270083
* [X86][SSE] Tidied up MMX/SSE/SSE2 builtin tests to the correct test file | Simon Pilgrim | 2016-05-17 | 1 | -301/+42
  llvm-svn: 269852
* [X86] Fix a few intrinsic tests to use the return type that matches the intrinsic they're testing. | Craig Topper | 2016-05-17 | 1 | -4/+4
  llvm-svn: 269735
* [X86] Add 'pause' builtin that's already in llvm and use it instead of inline assembly to implement _mm_pause. | Craig Topper | 2015-11-11 | 1 | -0/+6
  llvm-svn: 252712
* [X86] Use __builtin_ia32_paddq and __builtin_ia32_psubq to implement a couple intrinsics that were supposed to operate on MMX registers. | Craig Topper | 2015-11-11 | 1 | -0/+18
  Otherwise we end up operating on GPRs. Throw in a test for _mm_mul_su32 while I was there.
  llvm-svn: 252711
* [X86][SSSE3] Added SSSE3 IR + assembly codegen builtin tests | Simon Pilgrim | 2015-09-06 | 1 | -10/+0
  Transferred SSSE3 instructions from sse-builtins.c
  llvm-svn: 246948
* [X86][SSE3] Added SSE41 IR + assembly codegen builtin tests | Simon Pilgrim | 2015-09-06 | 1 | -162/+0
  Transferred SSE41 instructions from sse-builtins.c
  llvm-svn: 246947
* [X86][SSE] Add _mm_undefined_* intrinsics | Simon Pilgrim | 2015-08-26 | 1 | -0/+18
  Added missing SSE/AVX 'undefined' intrinsics (PR24040):
  _mm_undefined_pd, _mm_undefined_ps + _mm_undefined_si128
  _mm256_undefined_pd, _mm256_undefined_ps + _mm256_undefined_si256
  _mm512_undefined, _mm512_undefined_ps, _mm512_undefined_pd + _mm512_undefined_epi32
  Added builtin intrinsics: __builtin_ia32_undef128, __builtin_ia32_undef256 + __builtin_ia32_undef512
  Differential Revision: http://reviews.llvm.org/D12052
  llvm-svn: 246083
* Added missing tests for SSE41 pmovsx/pmovzx extension intrinsics | Simon Pilgrim | 2015-08-23 | 1 | -0/+72
  llvm-svn: 245815
* Update Clang tests to handle explicitly typed load changes in LLVM. | David Blaikie | 2015-02-27 | 1 | -10/+10
  llvm-svn: 230795
* Make tests independent of llvm variable naming. | Manuel Klimek | 2015-02-17 | 1 | -1/+1
  llvm-svn: 229484
* [X86] Convert palignr builtin handling to use shuffle form of right shift instead of intrinsics. | Craig Topper | 2015-02-17 | 1 | -1/+1
  This should allow the intrinsics to be removed from the backend.
  llvm-svn: 229474
* [X86] Merge the 2 separate builtin handlers for PALIGNR into a single one that handles both. | Craig Topper | 2015-02-17 | 1 | -0/+10
  llvm-svn: 229469
* [X86] Fix test cases that I foolishly copied and modified from another file that had optimizations on. | Craig Topper | 2015-02-13 | 1 | -4/+4
  This caused the check patterns to not quite match.
  llvm-svn: 229073
* [X86] Add _mm_bslli_si128 and _mm_bsrli_si128 as aliases of _mm_slli_si128 and _mm_srli_si128. | Craig Topper | 2015-02-13 | 1 | -0/+24
  This matches Intel documentation and gcc.
  llvm-svn: 229066
* [x86] Add the (v)cmpps/pd/ss/sd builtins to match gcc. Use them in the sse intrinsic files. | Craig Topper | 2014-12-27 | 1 | -0/+288
  This still lowers to the same intrinsics as before.
  This is preparation for bounds checking the immediate on the avx version of the builtin so we don't pass illegal immediates into the backend. Since SSE uses a smaller size immediate it's not possible to bounds check when using a shared builtin. Rather than creating a clang specific builtin for the different immediate, I decided (after consulting with Chandler) that it was better to match gcc.
  llvm-svn: 224879
* Fix line numbers for code inlined from __nodebug__ functions. | Evgeniy Stepanov | 2014-06-09 | 1 | -2/+2
  Instructions from __nodebug__ functions don't have file:line information even when inlined into no-nodebug functions. As a result, intrinsics (SSE and other) from <*intrin.h> clang headers _never_ have file:line information.
  With this change, an instruction without !dbg metadata gets one from the call instruction when inlined.
  Fixes PR19001.
  llvm-svn: 210459
* Patched clang to emit x86 blends as shufflevectors. | Filipe Cabecinhas | 2014-05-13 | 1 | -0/+18
  Summary: Most of the clang header patch by Simon Pilgrim @ SCEE.
  Also fixed (or added) clang tests for these intrinsics.
  LLVM tests to make sure we get the blend instruction out of these shufflevectors are at http://reviews.llvm.org/D3600
  Reviewers: eli.friedman, craig.topper, rafael
  Subscribers: cfe-commits
  Differential Revision: http://reviews.llvm.org/D3601
  llvm-svn: 208664
* Intrinsics: fix extract & insert when index is out of bound. | Manman Ren | 2013-10-23 | 1 | -0/+24
  Now, all extract & insert intrinsics should have the correct "and" operation to ignore higher bits.
  rdar://15250497
  llvm-svn: 193267
* _mm_extract_epi16: use "& 7" when index is out of bound. | Manman Ren | 2013-10-22 | 1 | -0/+7
  This is in line with implementation of _mm_extract_pi16.
  rdar://15250497
  llvm-svn: 193187
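The "& 7" masking named in r193187 can be shown with a small portable model. This is a sketch only: `vec8x16` and `model_extract_epi16` are hypothetical stand-ins for a 128-bit vector of eight 16-bit lanes and for the intrinsic, and the real header takes the index as an immediate.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit vector of eight 16-bit lanes. */
typedef struct { uint16_t lane[8]; } vec8x16;

/* Model of the masked extract: only the low 3 bits of the index
   select a lane, so an out-of-bound index wraps instead of reading
   past the vector. The 16-bit lane is zero-extended to int. */
int model_extract_epi16(vec8x16 v, int index) {
    return v.lane[index & 7];
}
```

With this masking an index of 11 selects the same lane as an index of 3, because 11 & 7 is 3.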
* Add _mm_stream_si64 intrinsic. | Eli Friedman | 2013-09-23 | 1 | -1/+19
  While I'm here, also fix the alignment computation for the whole family of intrinsics.
  PR17298.
  llvm-svn: 191243
* CHECK-LABEL-ify some code gen tests to improve diagnostic experience when tests fail. | Stephen Lin | 2013-08-15 | 1 | -7/+7
  llvm-svn: 188447
* X86 SSE Intrinsics: update header for sqrt_ss, rsqrt_ss and rcp_ss. | Manman Ren | 2012-10-26 | 1 | -0/+31
  These intrinsics pass through the upper FP values from the input.
  rdar://12558838
  llvm-svn: 166743
* Get rid of storelv4si builtin as it can be expressed directly. | Chad Rosier | 2012-05-01 | 1 | -0/+6
  This is general goodness because it provides opportunities to clean up things. For example,
      uint64_t t1(__m128i vA) {
        uint64_t Alo;
        _mm_storel_epi64((__m128i*)&Alo, vA);
        return Alo;
      }
  was generating
      movq %xmm0, -8(%rbp)
      movq -8(%rbp), %rax
  and now generates
      movd %xmm0, %rax
  rdar://11282581
  llvm-svn: 155924
* Correctly check argument types for some vector macros in smmintrin.h. Put parentheses around uses of vector macro arguments. | Craig Topper | 2012-03-30 | 1 | -0/+44
  llvm-svn: 153732
* Add _mm_minpos_epu16 to smmintrin.h. Fixes PR12399. | Craig Topper | 2012-03-30 | 1 | -0/+5
  llvm-svn: 153726
* test/CodeGen/sse-builtins.c: Make this host-independent to unbreak posix-unlike hosts. | NAKAMURA Takumi | 2011-09-16 | 1 | -1/+1
  Without -ffreestanding, clang tries to seek /usr/include/stdlib.h in host filesystem, even on Windows hosts.
  llvm-svn: 139899
* Tweak *mmintrin.h so that they don't make any bad assumptions about alignment (which probably has little effect in practice, but better to get it right). | Eli Friedman | 2011-09-15 | 1 | -0/+104
  Make the load in _mm_loadh_pi and _mm_loadl_pi a single LLVM IR instruction to make optimizing easier for CodeGen.
  rdar://10054986
  llvm-svn: 139874