summaryrefslogtreecommitdiffstats
path: root/clang/test/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [AArch64] Corrected FP16 Intrinsic range checks in Clang + added Sema testsLuke Geeson2018-06-121-24/+24
| | | | | | | | | | | | | | | | | | Summary: This fixes the ranges for the vcvth family of FP16 intrinsics in the clang front end. Previously it was accepting incorrect ranges -Changed builtin range checking in SemaChecking -added tests SemaCheck changes - included in their own file since no similar one exists -modified existing tests to reflect new ranges Reviewers: SjoerdMeijer, javed.absar Reviewed By: SjoerdMeijer Subscribers: kristof.beyls, cfe-commits Differential Revision: https://reviews.llvm.org/D47592 llvm-svn: 334489
* [Driver] Add aliases for -Qn/-QyMikhail Maltsev2018-06-111-0/+4
| | | | | | | | | This patch adds aliases for -Qn (-fno-ident) and -Qy (-fident) which look less cryptic than -Qn/-Qy. The aliases are compatible with GCC. Differential Revision: https://reviews.llvm.org/D48021 llvm-svn: 334414
* [X86] Remove masking from dbpsadbw builtins, use select builtin instead.Craig Topper2018-06-112-9/+15
| | | | llvm-svn: 334385
* [X86] Use target independent masked expandload and compressstore intrinsics ↵Craig Topper2018-06-104-52/+64
| | | | | | | | | | | | | | | | to implement expandload/compressstore builtins. Summary: We've had these target independent intrinsics for at least a year and a half. Looks like they do exactly what we need here and the backend already supports them. Reviewers: RKSimon, delena, spatel, GBuella Reviewed By: RKSimon Subscribers: cfe-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D47693 llvm-svn: 334366
* [NEON] Support VST1xN intrinsics in AArch32 mode (Clang part)Ivan A. Kosarev2018-06-102-2088/+2312
| | | | | | | | | We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47446 llvm-svn: 334362
* [X86] Remove masking from the 512-bit packed floating point add/sub/mul/div ↵Craig Topper2018-06-101-24/+40
| | | | | | builtins. Use select in IR instead. llvm-svn: 334359
* [X86] Add builtins for vpermq/vpermpd instructions to enable target feature ↵Craig Topper2018-06-083-14/+14
| | | | | | checking. llvm-svn: 334311
* [X86] Add builtins for pshufd, pshuflw, and pshufhw to enable target feature ↵Craig Topper2018-06-086-27/+27
| | | | | | and immediate range checking. llvm-svn: 334265
* [X86] Add subvector insert and extract builtins to enable target feature ↵Craig Topper2018-06-087-62/+62
| | | | | | | | checking and immediate range checking. Test changes are due to differences in how we generate undef elements now. We also changed the types used for extractf128_si256/insertf128_si256 to match the signature of the builtin that previously existed which this patch resurrects. This also matches gcc. llvm-svn: 334261
* [X86] Add builtins for vpermilps/pd instructions to enable target feature ↵Craig Topper2018-06-083-19/+19
| | | | | | checking. llvm-svn: 334256
* [X86] Add builtins for blend with immediate control to enforce target ↵Craig Topper2018-06-082-2/+2
| | | | | | feature requirements and check immediate range. llvm-svn: 334249
* [X86] Add builtins for shuff32x4/shuff64x2/shufi32x4/shuff64x2 to enable ↵Craig Topper2018-06-072-6/+6
| | | | | | target feature checking and immediate range checking. llvm-svn: 334244
* [Frontend] Disallow non-MSVC exception models for windows-msvc targetsShoaib Meenai2018-06-071-9/+6
| | | | | | | | | | | | | | The windows-msvc target is used for MSVC ABI compatibility, including the exceptions model. It doesn't make sense to pair a windows-msvc target with a non-MSVC exception model. This would previously cause an assertion failure; explicitly error out for it in the frontend instead. This also allows us to reduce the matrix of target/exception models a bit (see the modified tests), and we can possibly simplify some of the personality code in a follow-up. Differential Revision: https://reviews.llvm.org/D47853 llvm-svn: 334243
* [MS] Re-add support for the ARM interlocked bittest intrinscsReid Kleckner2018-06-071-11/+22
| | | | | | | | | | | | | | | Adds support for these intrinsics, which are ARM and ARM64 only: _interlockedbittestandreset_acq _interlockedbittestandreset_rel _interlockedbittestandreset_nf _interlockedbittestandset_acq _interlockedbittestandset_rel _interlockedbittestandset_nf Refactor the bittest intrinsic handling to decompose each intrinsic into its action, its width, and its atomicity. llvm-svn: 334239
* [X86] Add back builtins for _mm_slli_si128/_mm_srli_si128 and similar ↵Craig Topper2018-06-073-12/+12
| | | | | | | | | | intrinsics. We still lower them to native shuffle IR, but we do it in CGBuiltin.cpp now. This allows us to check the target feature and ensure the immediate fits in 8 bits. This also improves our -O0 codegen slightly because we're able to see the zeroinitializer in the shuffle. It looks like it got lost behind a store+load previously. llvm-svn: 334208
* [CodeGen] Improve diagnostics related to target attributesGabor Buella2018-06-072-16/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When requirement imposed by __target__ attributes on functions are not satisfied, prefer printing those requirements, which are explicitly mentioned in the attributes. This makes such messages more useful, e.g. printing avx512f instead of avx2 in the following scenario: ``` $ cat foo.c static inline void __attribute__((__always_inline__, __target__("avx512f"))) x(void) { } int main(void) { x(); } $ clang foo.c foo.c:7:2: error: always_inline function 'x' requires target feature 'avx2', but would be inlined into function 'main' that is compiled without support for 'avx2' x(); ^ 1 error generated. ``` bugzilla: https://bugs.llvm.org/show_bug.cgi?id=37338 Reviewers: craig.topper, echristo, dblaikie Reviewed By: craig.topper, echristo Differential Revision: https://reviews.llvm.org/D46541 llvm-svn: 334174
* [X86] Add back _mask, _maskz, and _mask3 builtins for some 512-bit ↵Craig Topper2018-06-071-24/+24
| | | | | | | | | | | | | | | | | | | | | | | fmadd/fmsub/fmaddsub/fmsubadd builtins. Summary: We recently switch to using a selects in the intrinsics header files for FMA instructions. But the 512-bit versions support flavors with rounding mode which must be an Integer Constant Expression. This has forced those intrinsics to be implemented as macros. As it stands now the mask and mask3 intrinsics evaluate one of their macro arguments twice. If that argument itself is another intrinsic macro, we can end up over expanding macros. Or if its something we can CSE later it would show up multiple times when it shouldn't. I tried adding __extension__ around the macro and making it an expression statement and declaring a local variable. But whatever name you choose for the local variable can never be used as the name of an input to the macro in user code. If that happens you would end up with the same name on the LHS and RHS of an assignment after expansion. We might be safe if we use __ in front of the variable names because those names are reserved and user code shouldn't use that, but I wasn't sure I wanted to make that claim. The other option which I've chosen here, is to add back _mask, _maskz, and _mask3 flavors of the builtin which we will expand in CGBuiltin.cpp to replicate the argument as needed and insert any fneg needed on the third operand to make a subtract. The _maskz isn't truly necessary if we have an unmasked version or if we use the masked version with a -1 mask and wrap a select around it. But I've chosen to make things more uniform. I separated out the scalar builtin handling to avoid too many things going on in EmitX86FMAExpr. It was different enough due to the extract and insert that the minor duplication of the CreateCall was probably worth it. Reviewers: tkrupa, RKSimon, spatel, GBuella Reviewed By: tkrupa Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D47724 llvm-svn: 334159
* [PATCH 2/2] [test] Add support for Samsung Exynos M4 (NFC)Evandro Menezes2018-06-061-0/+1
| | | | | | Add test cases for Exynos M4. llvm-svn: 334116
* [MS][ARM64]: Promote _setjmp to_setjmpex as there is no _setjmp in the ARM64 ↵Reid Kleckner2018-06-061-0/+12
| | | | | | | | | | | | libvcruntime.lib Factor out the common setjmp call emission code. Based on a patch by Chris January Differential Revision: https://reviews.llvm.org/D47784 llvm-svn: 334112
* Implement bittest intrinsics generically for non-x86 platformsReid Kleckner2018-06-061-16/+105
| | | | | | | | | | I tested these locally on an x86 machine by disabling the inline asm codepath and confirming that it does the same bitflips as we do with the inline asm. Addresses code review feedback. llvm-svn: 334059
* [X86] Add builtins for vector element insert and extract for different 128 ↵Craig Topper2018-06-065-35/+25
| | | | | | | | | | | | | | and 256 bit vector types. Use them to implement the extract and insert intrinsics. Previously we were just using extended vector operations in the header file. This unfortunately allowed non-constant indices to be used with the intrinsics. This is incompatible with gcc, icc, and MSVC. It also introduces a different performance characteristic because non-constant index gets lowered to a vector store and an element sized load. By adding the builtins we can check for the index to be a constant and ensure its in range of the vector element count. User code still has the option to use extended vector operations themselves if they need non-constant indexing. llvm-svn: 334057
* Reimplement the bittest intrinsic family as builtins with inline asmReid Kleckner2018-06-053-22/+35
| | | | | | | | | | | We need to implement _interlockedbittestandset as a builtin for windows.h, so we might as well do the whole family. It reduces code duplication anyway. Fixes PR33188, a long standing bug in our bittest implementation encountered by Chakra. llvm-svn: 333978
* [ThinLTO] Add testing of new summary index format to a couple CFI testsTeresa Johnson2018-06-042-0/+10
| | | | | | | | | | | | | | | | Summary: Adds testing of combined index summary entries in disassembly format to CFI tests that were already testing the bitcode format. Depends on D46699. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, cfe-commits Differential Revision: https://reviews.llvm.org/D46700 llvm-svn: 333966
* Revert r333791 "Cap "voluntary" vector alignment at 16 for all Darwin ↵Reid Kleckner2018-06-042-62/+26
| | | | | | | | | | | | | | platforms." Adding __attribute__((aligned(32))) to __m256 breaks the implementation of _mm256_loadu_ps on Windows. On Windows, alignment attributes have higher precedence than packing attributes. We also might want to carefully consider the consequences of changing our vector typedefs, since many users copy them and invent their own new, non-Intel specific vector type names. llvm-svn: 333958
* [X86] Avoid passing _mm_undefined* to builtin_shufflevector if we are able ↵Craig Topper2018-06-044-33/+33
| | | | | | | | to pass the first input a second time. This is more consistent with other usages of builtin_shufflevector. Later optimization passes or codegen will detect the duplicate vector and replace it with undef. Using _mm_undefined just puts a zeroinitializer that still needs to be optimized out later. llvm-svn: 333944
* [X86] Replace __builtin_ia32_vbroadcastf128_pd256 and ↵Craig Topper2018-06-031-2/+0
| | | | | | __builtin_ia32_vbroadcastf128_ps256 with an unaligned load intrinsics and a __builtin_shufflevector call. llvm-svn: 333853
* [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)Ivan A. Kosarev2018-06-022-1260/+1411
| | | | | | | | | We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47121 llvm-svn: 333829
* Cap "voluntary" vector alignment at 16 for all Darwin platforms.John McCall2018-06-012-26/+62
| | | | | | | | | | | | | | | | | | | | | This fixes two major problems: - We were not capping vector alignment as desired on 32-bit ARM. - We were using different alignments based on the AVX settings on Intel, so we did not have a consistent ABI. This is an ABI break, but we think we can get away with it because vectors tend to be used mostly in inline code (which is why not having a consistent ABI has not proven disastrous on Intel). Intel's AVX types are specified as having 32-byte / 64-byte alignment, so align them explicitly instead of relying on the base ABI rule. Note that this sort of attribute is stripped from template arguments in template substitution, so there's a possibility that code templated over vectors will produce inadequately-aligned objects. The right long-term solution for this is for alignment attributes to be interpreted as true qualifiers and thus preserved in the canonical type. llvm-svn: 333791
* [WebAssembly] Update to the new names for the memory builtin functions.Dan Gohman2018-06-011-5/+17
| | | | | | | | | The WebAssembly committee has decided on the names `memory.size` and `memory.grow` for the memory intrinsics, so update the clang builtin functions to follow those names, keeping both sets of old names in place for compatibility. llvm-svn: 333712
* IRGen: Write .dwo files when -split-dwarf-file is used together with ↵Peter Collingbourne2018-05-311-0/+21
| | | | | | | | -fthinlto-index. Differential Revision: https://reviews.llvm.org/D47597 llvm-svn: 333677
* [X86] Make 512-bit unmasked load/store builtins more like their 128/256-bit ↵Craig Topper2018-05-311-2/+2
| | | | | | | | equivalents. Previously we were just passing -1 mask to the masked builtin. This changes it to the more generic way that the 128/256 bit use. llvm-svn: 333626
* [X86] Remove __extension__ from macro intrinsics when its not needed.Craig Topper2018-05-311-12/+12
| | | | | | | | | | I think this is a holdover from when we used to declare variables inside the macros. And then its been copy and pasted forward for years every time a new macro intrinsic gets added. Interestingly this caused some tests for IRGen to be slightly more optimized. We now return a zeroinitializer directly instead of going through a store+load. It also removed a bogus error message on another test. llvm-svn: 333613
* Add fopen to the list of builtins that we check and whitelist.Eric Christopher2018-05-301-1/+9
| | | | llvm-svn: 333594
* [X86] Simplify the implementation of _mm_sqrt_ss, _mm_rcp_ss, and _mm_rsqrt_ss.Craig Topper2018-05-301-24/+0
| | | | | | | | We don't need the insertion back into the original vector at the end. The builtin already understands that. This is different than _mm_sqrt_sd which takes two arguments and we do need to insert. llvm-svn: 333572
* [X86] Reduce the number of setzero intrinsics to just the set defined by the ↵Craig Topper2018-05-301-2/+2
| | | | | | | | | | Intel Intrinsics Guide. We had quite a few for different element sizes of integers sometimes with strange target features attached to them. We only need a single version for each of _m128i, _m256i, and _m512i with the target feature that first introduced those types. llvm-svn: 333568
* [X86] Lowering FMA intrinsics to native IR (Clang part)Gabor Buella2018-05-304-250/+1326
| | | | | | | | | | | | | | | | This patch replaces all packed (and scalar without rounding mode) fused intrinsics with fmadd/fmaddsub variations. Then fmadd/fmaddsub are lowered to native IR. Patch by tkrupa Reviewers: craig.topper, sroland, spatel, RKSimon Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D47444 llvm-svn: 333555
* Support __iso_volatile_load8 etc on aarch64-win32.Simon Tatham2018-05-301-0/+13
| | | | | | | | | | | | | | | | | These intrinsics are used by MSVC's header files on AArch64 Windows as well as AArch32, so we should support them for both targets. I've factored them out of CodeGenFunction::EmitARMBuiltinExpr into separate functions that EmitAArch64BuiltinExpr can call as well. Reviewers: javed.absar, mstorsjo Reviewed By: mstorsjo Subscribers: kristof.beyls, cfe-commits Differential Revision: https://reviews.llvm.org/D47476 llvm-svn: 333513
* [Sparc] Add floating-point register namesDaniel Cederman2018-05-302-1/+64
| | | | | | | | | | | | Reviewers: jyknight Reviewed By: jyknight Subscribers: eraman, fedor.sergeev, jrtc27, cfe-commits Differential Revision: https://reviews.llvm.org/D47137 llvm-svn: 333510
* [X86] Remove masking from the AVX512VNNI builtins. Use a select in IR instead.Craig Topper2018-05-302-36/+60
| | | | llvm-svn: 333509
* [X86] Fix the names of a bunch of icelake intrinsics.Craig Topper2018-05-303-216/+216
| | | | | | | | Mostly this fixes the names of all the 128-bit intrinsics to start with _mm_ instead of _mm128_ as is the convention and what the Intel docs say. This also fixes the name of the bitshuffle intrinsics to say epi64 for 128 and 256 bit versions. llvm-svn: 333497
* [X86] Merge the 3 different flavors of masked vpermi2var/vpermt2var builtins ↵Craig Topper2018-05-296-70/+136
| | | | | | to a single version without masking. Use select builtins with appropriate operand instead. llvm-svn: 333387
* Revert r333347 "[X86] Rewrite the max and min reduction intrinsics to make ↵Craig Topper2018-05-261-2598/+2285
| | | | | | | | better use of other functions and to reduce width to 256 and 128 bits were possible." This wasn't supposed to be commited yet. llvm-svn: 333349
* [X86] Remove mask from avx512ifma builtins. Use a select instruction instead.Craig Topper2018-05-262-18/+30
| | | | | | This reduces from 12 builtins to 6 since we no longer need a mask and maskz version. llvm-svn: 333348
* [X86] Rewrite the max and min reduction intrinsics to make better use of ↵Craig Topper2018-05-261-2285/+2598
| | | | | | | | | | | | | | | | | | | other functions and to reduce width to 256 and 128 bits were possible. Summary: We only need to use 512 bit vectors all the way through v8i64 reductions since those max instructions are new to avx512f and only available in 512 bits until SKX. For v16i32 and floating point we have legacy 128/256 bit instructions we can use. I've tried to use other intrinsics to reduce the verbosity of the code and avoid having to mention all the shuffles. I've also removed all the -1 shuffle indices so the output sequence is fully specified and not left to backend optimization. Reviewers: RKSimon, spatel, GBuella Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D47401 llvm-svn: 333347
* Revert "[DebugInfo] Don't bother with MD5 checksums of preprocessed files."Paul Robinson2018-05-251-13/+0
| | | | | | | This reverts commit d734f2aa3f76fbf355ecd2bbe081d0c1f49867ab. Also known as r333311. A very small but nonzero number of bots fail. llvm-svn: 333319
* [DebugInfo] Don't bother with MD5 checksums of preprocessed files.Paul Robinson2018-05-251-0/+13
| | | | | | | | | | The checksum will not reflect the real source, so there's no clear reason to include them in the debug info. Also this was causing a crash on the DWARF side. Differential Revision: https://reviews.llvm.org/D47260 llvm-svn: 333311
* [x86] invpcid intrinsicGabor Buella2018-05-251-0/+12
| | | | | | | | | | | | An intrinsic for an old instruction, as described in the Intel SDM. Reviewers: craig.topper, rnk Reviewed By: craig.topper, rnk Differential Revision: https://reviews.llvm.org/D47142 llvm-svn: 333256
* [X86] NFC Include immintrin.h in CodeGen testsGabor Buella2018-05-2426-26/+26
| | | | | | | Following r333110: "Move all Intel defined intrinsic includes into immintrin.h" llvm-svn: 333160
* Add Builtins.def support for fread and fwrite to ensure that -fno-builtin-Eric Christopher2018-05-241-1/+14
| | | | | | works with them and test accordingly. llvm-svn: 333156
* Migrate libcalls-fno-builtin.c test from checking optimized assemblyEric Christopher2018-05-241-50/+60
| | | | | | | | | | | to checking for attributes on the call site - and fix up builtin functions that we were testing for but not ensuring wouldn't be optimized by the backend. Leave one set of asm tests to make sure that we're also communicating builtin-ness to TLI. llvm-svn: 333154
OpenPOWER on IntegriCloud