bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[CUDA] Added a wrapper header for inclusion of stock CUDA headers.	Artem Belevich	2015-11-17	2	-0/+180
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Header files that come with CUDA are assuming split host/device compilation and are not usable by clang out of the box. With a bit of preprocessor magic it's possible to twist them into something clang can use. This wrapper always includes CUDA headers exactly the same way during host and device compilation passes and produces identical preprocessed content during host and device side compilation for sm_35 GPUs. Device compilation passes for older GPUs will see a smaller subset of device functions supported by particular GPU. The wrapper assumes specific contents of CUDA header files and works only with CUDA 7.0 and 7.5. Differential Revision: http://reviews.llvm.org/D13171 llvm-svn: 253388
*	bmiintrin.h: Allow using the tzcnt intrinsics for non-BMI targets	Hans Wennborg	2015-11-17	1	-3/+9
\| \| \| \| \| \| \| \| \| \| \| \|	The tzcnt intrinsics are used on non-BMI targets by code (e.g. ffmpeg) that uses them as a potentially faster BSF. The TZCNT instruction is special in that it's encoded in a backward-compatible way and behaves as BSF on non-BMI targets. Differential Revision: http://reviews.llvm.org/D14748 llvm-svn: 253358
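The difference between the two instructions only shows up for a zero input, which the following scalar model tries to make concrete. It is a sketch, not the header's code: the function names are hypothetical, and it relies on the documented behaviour that TZCNT returns the operand width for a zero source while BSF leaves its destination undefined in that case.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of TZCNT on a 32-bit operand: counts trailing zero bits
   and returns the operand width (32) when the input is zero. */
static unsigned tzcnt32_model(uint32_t x) {
    unsigned n = 0;
    if (x == 0)
        return 32;
    while ((x & 1u) == 0) {
        x >>= 1;
        ++n;
    }
    return n;
}

/* Software model of BSF: same bit index for non-zero inputs. The hardware
   result is undefined for a zero source, so the model rejects it. */
static unsigned bsf32_model(uint32_t x) {
    assert(x != 0);
    return tzcnt32_model(x);
}

/* Usage: both models agree whenever the input is non-zero. */
static void tzcnt_examples(void) {
    assert(tzcnt32_model(8u) == 3);
    assert(bsf32_model(8u) == 3);
    assert(tzcnt32_model(0u) == 32);
}
```

Code that calls the tzcnt intrinsics as a faster BSF therefore has to guarantee a non-zero argument itself if it may run on a non-BMI CPU.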
*	[ARM,AArch64] Fix __rev16l and __rev16ll intrinsics	Oliver Stannard	2015-11-16	1	-6/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These two intrinsics are defined in arm_acle.h. __rev16l needs to rotate by 16 bits, but it was actually rotating by 2 bits. For AArch64, where long is 64 bits, this would still be wrong. __rev16ll was incorrect: it reversed the bytes in each 32-bit word, rather than each 16-bit halfword. The correct implementation is to apply __rev16 to the top and bottom words of the 64-bit value. For AArch32 targets, these get compiled down to the hardware rev16 instruction at -O1 and above. For AArch64 targets, the 64-bit ones get compiled to two 32-bit rev16 instructions, because there is not currently a pattern for the 64-bit rev16 instruction. Differential Revision: http://reviews.llvm.org/D14609 llvm-svn: 253211
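The intended semantics can be modelled in portable C as below. This is only a sketch of the behaviour the message describes (full byte reversal followed by a 16-bit rotate, and the 64-bit form built from two 32-bit halves); the helper names are hypothetical and this is not the text of arm_acle.h.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (0 < n < 32). */
static uint32_t ror32(uint32_t x, unsigned n) {
    assert(n > 0 && n < 32);
    return (x >> n) | (x << (32u - n));
}

/* Reverse all four bytes of a 32-bit word. */
static uint32_t rev32(uint32_t x) {
    return ((x & 0x000000FFu) << 24) | ((x & 0x0000FF00u) << 8) |
           ((x & 0x00FF0000u) >> 8)  | ((x & 0xFF000000u) >> 24);
}

/* Swap the two bytes inside each 16-bit halfword: a full byte reversal
   followed by a rotate by 16 bits (not by 2). */
static uint32_t rev16_32(uint32_t x) {
    return ror32(rev32(x), 16);
}

/* 64-bit variant: apply the 32-bit operation to the top and bottom words. */
static uint64_t rev16_64(uint64_t x) {
    uint32_t hi = (uint32_t)(x >> 32);
    uint32_t lo = (uint32_t)x;
    return ((uint64_t)rev16_32(hi) << 32) | rev16_32(lo);
}

/* Usage: every halfword keeps its position, only its two bytes swap. */
static void rev16_examples(void) {
    assert(rev16_32(0x11223344u) == 0x22114433u);
    assert(rev16_64(0x1122334455667788ull) == 0x2211443366558877ull);
}
```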
*	[X86] Add 'pause' builtin that's already in llvm and use it instead of ↵	Craig Topper	2015-11-11	1	-1/+1
\| \| \| \| \| \|	inline assembly to implement _mm_pause. llvm-svn: 252712
*	[X86] Use __builtin_ia32_paddq and __builtin_ia32_psubq to implement a ↵	Craig Topper	2015-11-11	1	-2/+2
\| \| \| \| \| \|	couple intrinsics that were supposed to operate on MMX registers. Otherwise we end up operating on GPRs. Throw in a test for _mm_mul_su32 while I was there. llvm-svn: 252711
*	[X86] Header formatting fixes. NFC	Craig Topper	2015-11-11	2	-2/+2
\| \| \| \|	llvm-svn: 252710
*	[X86] Add missing typecasts in intrinsic macros. This should make them more ↵	Craig Topper	2015-11-11	7	-55/+85
\| \| \| \| \| \|	robust against inputs that aren't already the right type. llvm-svn: 252700
*	[X86] Change pointer type in AVX2 gather builtins to be the scalar type ↵	Craig Topper	2015-11-11	1	-160/+118
\| \| \| \| \| \|	instead of the vector type. This matches gcc and removes extra casts. llvm-svn: 252697
*	[X86] Use setzero instead of set1(0) in a few places in intrinsic headers.	Craig Topper	2015-11-10	2	-6/+6
\| \| \| \|	llvm-svn: 252587
*	[X86] Remove temporary variables from macros in x86 intrinsic headers. ↵	Craig Topper	2015-11-10	7	-198/+138
\| \| \| \| \| \|	Prevents duplicate names appearing from multiple macro expansions. NFC llvm-svn: 252586
*	[X86] Fix bad intrinsic header comment. NFC.	Craig Topper	2015-11-10	1	-1/+1
\| \| \| \|	llvm-svn: 252585
*	Fix a couple intrinsic header comments. NFC	Craig Topper	2015-11-03	2	-2/+2
\| \| \| \|	llvm-svn: 251900
*	Handle target builtin options that are all required rather than	Eric Christopher	2015-10-27	1	-20/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	only one of a group of possibilities. This changes the syntax in the builtin files to represent ',' as the "and" operator and '\|' as the "or" operator. The former syntax matches how the backend tablegen files represent multiple subtarget features being required. Updated the builtin and intrinsic headers accordingly for the new syntax. llvm-svn: 251388
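A requirement string in this syntax could be checked roughly as follows. This is a minimal sketch, not Clang's implementation: the function names are hypothetical, and the grouping rule (each ','-separated term is a set of '|'-separated alternatives) is an assumption that the commit message does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if the name [name, name + len) is in the enabled list. */
static bool has_feature(const char *const *enabled, size_t count,
                        const char *name, size_t len) {
    for (size_t i = 0; i < count; ++i)
        if (strlen(enabled[i]) == len && strncmp(enabled[i], name, len) == 0)
            return true;
    return false;
}

/* Evaluates a requirement string such as "avx512vl,avx512bw" (both needed)
   or "sse4.2|crc32" (either one suffices).
   Assumption: '|' groups alternatives inside each ','-separated term. */
static bool features_satisfied(const char *required,
                               const char *const *enabled, size_t count) {
    assert(required != NULL);
    const char *term = required;
    while (*term != '\0') {
        const char *term_end = term + strcspn(term, ",");
        const char *alt = term;
        bool any = false;
        while (alt < term_end) {
            size_t alt_len = strcspn(alt, "|,");
            if (has_feature(enabled, count, alt, alt_len))
                any = true;
            alt += alt_len;
            if (alt < term_end)
                ++alt; /* skip the '|' separator */
        }
        if (!any)
            return false; /* one ','-term has no enabled alternative */
        term = term_end;
        if (*term == ',')
            ++term;
    }
    return true;
}

/* Usage with a hypothetical set of enabled target features. */
static void feature_examples(void) {
    const char *const on[] = { "avx512vl", "avx512bw" };
    assert(features_satisfied("avx512vl,avx512bw", on, 2));
    assert(!features_satisfied("avx512vl,avx512dq", on, 2));
    assert(features_satisfied("avx512dq|avx512bw", on, 2));
}
```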
*	[x86] Fix maskload/store intrinsic definitions in avxintrin.h	Andrea Di Biagio	2015-10-20	1	-16/+16
\| \| \| \| \| \| \| \| \| \| \| \| \|	According to the Intel documentation, the mask operand of a maskload and maskstore intrinsics is always a vector of packed integer/long integer values. This patch introduces the following two changes: 1. It fixes the avx maskload/store intrinsic definitions in avxintrin.h. 2. It changes BuiltinsX86.def to match the correct gcc definitions for avx maskload/store (see D13861 for more details). Differential Revision: http://reviews.llvm.org/D13861 llvm-svn: 250816
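Why the mask is an integer vector becomes clearer from a scalar model of a masked load. The sketch below assumes the documented behaviour of the AVX masked moves, where only the most significant bit of each mask element is consulted and unselected lanes are zeroed on load; the function name and the 4-lane width are illustrative choices, not the header's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of a 4 x float masked load: lane i is read from memory only
   if the most significant bit of mask[i] is set; other lanes become 0.0f. */
static void maskload4_model(const float *src, const int32_t mask[4],
                            float out[4]) {
    assert(src != 0 && mask != 0 && out != 0);
    for (int i = 0; i < 4; ++i)
        out[i] = (mask[i] < 0) ? src[i] : 0.0f; /* negative means MSB set */
}

/* Usage: only lanes 0 and 2 have the sign bit set in the mask. */
static void maskload_examples(void) {
    const float src[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
    const int32_t mask[4] = { -1, 0, INT32_MIN, 1 };
    float out[4];
    maskload4_model(src, mask, out);
    assert(out[0] == 1.0f && out[1] == 0.0f);
    assert(out[2] == 3.0f && out[3] == 0.0f);
}
```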
*	[X86] Add fxsr feature name for fxsave/fxrestore builtins.	Craig Topper	2015-10-16	1	-1/+1
\| \| \| \|	llvm-svn: 250498
*	Headers: Switch some headers to LF line endings for consistency.	Peter Collingbourne	2015-10-15	5	-343/+343
\| \| \| \|	llvm-svn: 250388
*	Intrin.h: implement __emul and __emulu	Hans Wennborg	2015-10-14	1	-0/+11
\| \| \| \|	llvm-svn: 250301
*	Add subtarget feature support for 3dnowa to the 3dnowa intrinsics.	Eric Christopher	2015-10-13	1	-0/+4
\| \| \| \|	llvm-svn: 250202
*	[X86] Add XSAVE intrinsic family	Amjad Aboud	2015-10-13	7	-6/+224
\| \| \| \| \| \| \| \| \| \| \| \|	Add intrinsics for the XSAVE instructions (XSAVE/XSAVE64/XRSTOR/XRSTOR64), XSAVEOPT instructions (XSAVEOPT/XSAVEOPT64), XSAVEC instructions (XSAVEC/XSAVEC64), and XSAVES instructions (XSAVES/XSAVES64/XRSTORS/XRSTORS64). Differential Revision: http://reviews.llvm.org/D13014 llvm-svn: 250158
*	[Headers][X86] Fix stream_load (movntdqa) to accept const*.	Ahmed Bougacha	2015-10-02	2	-4/+4
\| \| \| \| \| \| \| \| \| \|	Per Intel intrinsics guide: - _mm256_stream_load_si256 takes `__m256i const *' - _mm_stream_load_si128 takes `__m128i *', for no good reason. Let's accept const* for both. llvm-svn: 249213
*	Fix the SSE4 byte sign extension in a cleaner way, and more thoroughly	Chandler Carruth	2015-10-01	4	-21/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	test that our intrinsics behave the same under -fsigned-char and -funsigned-char. This further testing uncovered that AVX-2 has a broken cmpgt for 8-bit elements, and has for a long time. This is fixed in the same way as SSE4 handles the case. The other ISA extensions currently work correctly because they use specific instruction intrinsics. As soon as they are rewritten in terms of generic IR, they will need to add these special casts. I've added the necessary testing to catch this however, so we shouldn't have to chase it down again. I considered changing the core typedef to be signed, but that seems like a bad idea. Notably, it would be an ABI break if anyone is reaching into the innards of the intrinsic headers and passing __v16qi on an API boundary. I can't be completely confident that this wouldn't happen due to a macro expanding in a lambda, etc., so it seems much better to leave it alone. It also matches GCC's behavior exactly. A fun side note is that for both GCC and Clang, -funsigned-char really does change the semantics of __v16qi. To observe this, consider: % cat x.cc #include <smmintrin.h> #include <iostream> int main() { __v16qi a = { 1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}; __v16qi b = _mm_set1_epi8(-1); std::cout << (int)(a / b)[0] << ", " << (int)(a / b)[1] << '\n'; } % clang++ -o x x.cc && ./x -1, 1 % clang++ -funsigned-char -o x x.cc && ./x 0, 1 However, while this may be surprising, both Clang and GCC agree. Differential Revision: http://reviews.llvm.org/D13324 llvm-svn: 249097
*	Patch over a really horrible bug in our vector builtins that showed up	Chandler Carruth	2015-10-01	1	-3/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	recently when we started using direct conversion to model sign extension. The __v16qi type we use for SSE v16i8 vectors is defined in terms of 'char' which may or may not be signed! This causes us to generate pmovsx or pmovzx depending on the setting of -funsigned-char. This patch just forms an explicitly signed type and uses that to formulate the sign extension. While this gets the correct behavior (which we now verify with the enhanced test) this is just the tip of the iceberg. Now that I know what to look for, I have found errors of this sort throughout our vector code. Fortunately, this is the only specific place where I know of users actively having their code miscompiled by Clang due to this, so I'm keeping the fix for those users minimal and targeted. I'll be sending a proper email for discussion of how to fix these systematically, what the implications are, and just how widely broken this is... From what I can tell, we have never shipped a correct set of builtin headers for x86 when users rely on -funsigned-char. Oops. llvm-svn: 248980
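The underlying hazard can be reproduced with scalars. The sketch below is not the header fix itself; it uses hypothetical helper names to show that widening through plain 'char' follows the compiler's signedness setting, while widening through an explicitly signed type always sign-extends. Converting an out-of-range value to a signed char is implementation-defined; the results asserted here assume the usual GCC/Clang wrap-around behaviour.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Reports how plain 'char' behaves in this build (-fsigned-char or not). */
static int plain_char_is_signed(void) { return CHAR_MIN < 0; }

/* Widening through plain 'char' depends on the signedness setting. */
static int16_t widen_plain(char c) { return (int16_t)c; }

/* Widening through an explicitly signed type always sign-extends. */
static int16_t widen_signed(char c) { return (int16_t)(signed char)c; }

/* Widening through an explicitly unsigned type always zero-extends. */
static int16_t widen_unsigned(char c) { return (int16_t)(unsigned char)c; }

/* Usage: the byte 0xFF widens to -1 or 255 depending on the type used. */
static void widen_examples(void) {
    char c = (char)0xFF;
    assert(widen_signed(c) == -1);
    assert(widen_unsigned(c) == 255);
    assert(widen_plain(c) == (plain_char_is_signed() ? -1 : 255));
}
```

Only widen_plain changes its result between -fsigned-char and -funsigned-char builds, which is the same split as pmovsx versus pmovzx described above.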
*	Forgot to remove a FIXME that has been fixed. NFC.	Nemanja Ivanovic	2015-09-29	1	-3/+0
\| \| \| \|	llvm-svn: 248815
*	Addition of interfaces to the FE to conform to Table A-2 of ELF V2 ABI V1.1	Nemanja Ivanovic	2015-09-29	1	-170/+564
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch corresponds to review: http://reviews.llvm.org/D13190 Implemented the following interfaces to conform to ELF V2 ABI version 1.1. vector signed __int128 vec_adde (vector signed __int128, vector signed __int128, vector signed __int128); vector unsigned __int128 vec_adde (vector unsigned __int128, vector unsigned __int128, vector unsigned __int128); vector signed __int128 vec_addec (vector signed __int128, vector signed __int128, vector signed __int128); vector unsigned __int128 vec_addec (vector unsigned __int128, vector unsigned __int128, vector unsigned __int128); vector signed int vec_addc(vector signed int __a, vector signed int __b); vector bool char vec_cmpge (vector signed char __a, vector signed char __b); vector bool char vec_cmpge (vector unsigned char __a, vector unsigned char __b); vector bool short vec_cmpge (vector signed short __a, vector signed short __b); vector bool short vec_cmpge (vector unsigned short __a, vector unsigned short __b); vector bool int vec_cmpge (vector signed int __a, vector signed int __b); vector bool int vec_cmpge (vector unsigned int __a, vector unsigned int __b); vector bool char vec_cmple (vector signed char __a, vector signed char __b); vector bool char vec_cmple (vector unsigned char __a, vector unsigned char __b); vector bool short vec_cmple (vector signed short __a, vector signed short __b); vector bool short vec_cmple (vector unsigned short __a, vector unsigned short __b); vector bool int vec_cmple (vector signed int __a, vector signed int __b); vector bool int vec_cmple (vector unsigned int __a, vector unsigned int 
__b); vector double vec_double (vector signed long long __a); vector double vec_double (vector unsigned long long __a); vector bool char vec_eqv(vector bool char __a, vector bool char __b); vector bool short vec_eqv(vector bool short __a, vector bool short __b); vector bool int vec_eqv(vector bool int __a, vector bool int __b); vector bool long long vec_eqv(vector bool long long __a, vector bool long long __b); vector signed short vec_madd(vector signed short __a, vector signed short __b, vector signed short __c); vector signed short vec_madd(vector signed short __a, vector unsigned short __b, vector unsigned short __c); vector signed short vec_madd(vector unsigned short __a, vector signed short __b, vector signed short __c); vector unsigned short vec_madd(vector unsigned short __a, vector unsigned short __b, vector unsigned short __c); vector bool long long vec_mergeh(vector bool long long __a, vector bool long long __b); vector bool long long vec_mergel(vector bool long long __a, vector bool long long __b); vector bool char vec_nand(vector bool char __a, vector bool char __b); vector bool short vec_nand(vector bool short __a, vector bool short __b); vector bool int vec_nand(vector bool int __a, vector bool int __b); vector bool long long vec_nand(vector bool long long __a, vector bool long long __b); vector bool char vec_orc(vector bool char __a, vector bool char __b); vector bool short vec_orc(vector bool short __a, vector bool short __b); vector bool int vec_orc(vector bool int __a, vector bool int __b); vector bool long long vec_orc(vector bool long long __a, vector bool long long __b); vector signed long long vec_sub(vector signed long long __a, vector signed long long __b); vector signed long long vec_sub(vector bool long long __a, vector signed long long __b); vector signed long long vec_sub(vector signed long long __a, vector bool long long __b); vector unsigned long long vec_sub(vector unsigned long long __a, vector unsigned long long __b); vector 
unsigned long long vec_sub(vector bool long long __a, vector unsigned long long __b); vector unsigned long long vec_sub(vector unsigned long long __V2 ABI V1.1 http://ror float vec_sub(vector float __a, vector float __b); unsigned char vec_extract(vector bool char __a, int __b); signed short vec_extract(vector signed short __a, int __b); unsigned short vec_extract(vector bool short __a, int __b); signed int vec_extract(vector signed int __a, int __b); unsigned int vec_extract(vector bool int __a, int __b); signed long long vec_extract(vector signed long long __a, int __b); unsigned long long vec_extract(vector unsigned long long __a, int __b); unsigned long long vec_extract(vector bool long long __a, int __b); double vec_extract(vector double __a, int __b); vector bool char vec_insert(unsigned char __a, vector bool char __b, int __c); vector signed short vec_insert(signed short __a, vector signed short __b, int __c); vector bool short vec_insert(unsigned short __a, vector bool short __b, int __c); vector signed int vec_insert(signed int __a, vector signed int __b, int __c); vector bool int vec_insert(unsigned int __a, vector bool int __b, int __c); vector signed long long vec_insert(signed long long __a, vector signed long long __b, int __c); vector unsigned long long vec_insert(unsigned long long __a, vector unsigned long long __b, int __c); vector bool long long vec_insert(unsigned long long __a, vector bool long long __b, int __c); vector double vec_insert(double __a, vector double __b, int __c); vector signed long long vec_splats(signed long long __a); vector unsigned long long vec_splats(unsigned long long __a); vector signed __int128 vec_splats(signed __int128 __a); vector unsigned __int128 vec_splats(unsigned __int128 __a); vector double vec_splats(double __a); int vec_all_eq(vector double __a, vector double __b); int vec_all_ge(vector double __a, vector double __b); int vec_all_gt(vector double __a, vector double __b); int vec_all_le(vector double __a, 
vector double __b); int vec_all_lt(vector double __a, vector double __b); int vec_all_nan(vector double __a); int vec_all_ne(vector double __a, vector double __b); int vec_all_nge(vector double __a, vector double __b); int vec_all_ngt(vector double __a, vector double __b); int vec_any_eq(vector double __a, vector double __b); int vec_any_ge(vector double __a, vector double __b); int vec_any_gt(vector double __a, vector double __b); int vec_any_le(vector double __a, vector double __b); int vec_any_lt(vector double __a, vector double __b); int vec_any_ne(vector double __a, vector double __b); vector unsigned char vec_sbox_be (vector unsigned char); vector unsigned char vec_cipher_be (vector unsigned char, vector unsigned char); vector unsigned char vec_cipherlast_be (vector unsigned char, vector unsigned char); vector unsigned char vec_ncipher_be (vector unsigned char, vector unsigned char); vector unsigned char vec_ncipherlast_be (vector unsigned char, vector unsigned char); vector unsigned int vec_shasigma_be (vector unsigned int, const int, const int); vector unsigned long long vec_shasigma_be (vector unsigned long long, const int, const int); vector unsigned short vec_pmsum_be (vector unsigned char, vector unsigned char); vector unsigned int vec_pmsum_be (vector unsigned short, vector unsigned short); vector unsigned long long vec_pmsum_be (vector unsigned int, vector unsigned int); vector unsigned __int128 vec_pmsum_be (vector unsigned long long, vector unsigned long long); vector unsigned char vec_gb (vector unsigned char); vector unsigned long long vec_bperm (vector unsigned __int128 __a, vector unsigned char __b); Removed the following interfaces either because their signatures have changed in version 1.1 of the ABI or because they were implemented for ELF V2 ABI but have actually been deprecated in version 1.1. 
vector signed char vec_eqv(vector bool char __a, vector signed char __b); vector signed char vec_eqv(vector signed char __a, vector bool char __b); vector unsigned char vec_eqv(vector bool char __a, vector unsigned char __b); vector unsigned char vec_eqv(vector unsigned char __a, vector bool char __b); vector signed short vec_eqv(vector bool short __a, vector signed short __b); vector signed short vec_eqv(vector signed short __a, vector bool short __b); vector unsigned short vec_eqv(vector bool short __a, vector unsigned short __b); vector unsigned short vec_eqv(vector unsigned short __a, vector bool short __b); vector signed int vec_eqv(vector bool int __a, vector signed int __b); vector signed int vec_eqv(vector signed int __a, vector bool int __b); vector unsigned int vec_eqv(vector bool int __a, vector unsigned int __b); vector unsigned int vec_eqv(vector unsigned int __a, vector bool int __b); vector signed long long vec_eqv(vector bool long long __a, vector signed long long __b); vector signed long long vec_eqv(vector signed long long __a, vector bool long long __b); vector unsigned long long vec_eqv(vector bool long long __a, vector unsigned long long __b); vector unsigned long long vec_eqv(vector unsigned long long __a, vector bool long long __b); vector float vec_eqv(vector bool int __a, vector float __b); vector float vec_eqv(vector float __a, vector bool int __b); vector double vec_eqv(vector bool long long __a, vector double __b); vector double vec_eqv(vector double __a, vector bool long long __b); vector unsigned short vec_nand(vector bool short __a, vector unsigned short __b); llvm-svn: 248813
*	ms Intrin.h: Fix __movsw's and __stosw's inline asm.	Nico Weber	2015-09-22	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	Before, clang's internal assembler would reject the inline asm in clang's Intrin.h. To make sure this doesn't happen for other Intrin.h functions using __asm__ blocks, add 32-bit and 64-bit codegen tests for Intrin.h. Sadly, these tests discovered that __readcr3 and __writecr3 have bad implementations in 64-bit builds. This will have to be fixed in a follow-up. llvm-svn: 248234
*	[X86] Make f16c intrinsics accessible through emmintrin.h, per Intel docs	Michael Kuperstein	2015-09-21	2	-2/+4
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D13015 llvm-svn: 248156
*	[X86] Fix some non-reserved parameter names in intrinsic headers	Michael Kuperstein	2015-09-21	2	-26/+26
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D13009 llvm-svn: 248150
*	[X86][SSE] Replace 128-bit SSE41 PMOVSX intrinsics with native IR	Simon Pilgrim	2015-09-19	1	-6/+6
\| \| \| \| \| \| \| \| \| \|	128-bit vector integer sign extensions correctly lower to the pmovsx instructions even for debug builds. This patch removes the builtins and reimplements the _mm_cvtepi_epi intrinsics using __builtin_shufflevector (to extract the bottom most subvector) and __builtin_convertvector (to actually perform the sign extension). Differential Revision: http://reviews.llvm.org/D12835 llvm-svn: 248092
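The two steps named above (extract the low subvector, then convert) can be modelled with plain arrays. The sketch below stands in for the byte-to-halfword case and deliberately avoids the compiler vector builtins so it builds anywhere; the function name is hypothetical and the element type is an explicit int8_t so that the widening is always a sign extension.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of the two steps: take the 8 lowest of 16 byte lanes
   (the shuffle), then sign-extend each one to 16 bits (the conversion). */
static void cvtepi8_epi16_model(const int8_t in[16], int16_t out[8]) {
    int8_t low[8];
    for (int i = 0; i < 8; ++i)
        low[i] = in[i];            /* extract the bottom-most subvector */
    for (int i = 0; i < 8; ++i)
        out[i] = (int16_t)low[i];  /* element-wise sign extension */
}

/* Usage: negative bytes stay negative after widening; upper lanes are
   ignored. */
static void cvt_examples(void) {
    const int8_t in[16] = { -1, 1, -128, 127, 0, 0, 0, 0,
                            99, 99, 99, 99, 99, 99, 99, 99 };
    int16_t out[8];
    cvtepi8_epi16_model(in, out);
    assert(out[0] == -1 && out[1] == 1);
    assert(out[2] == -128 && out[3] == 127);
    assert(out[7] == 0);
}
```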
*	re-apply r.247881	Asaf Badouh	2015-09-17	1	-0/+536
\| \| \| \| \| \|	fixed the tests. llvm-svn: 247892
*	revert r.247881 due to tests failures	Asaf Badouh	2015-09-17	1	-536/+0
\| \| \| \|	llvm-svn: 247883
*	[X86][AVX512DQ] add new intrinsics	Asaf Badouh	2015-09-17	1	-0/+536
\| \| \| \| \| \| \| \| \| \| \| \|	convert i64 to FP and vice versa, reduceps & reducepd, rangeps & rangepd, all in their 512-bit versions. Differential Revision: http://reviews.llvm.org/D11716 llvm-svn: 247881
*	Clean up trailing whitespace in the builtin headers	Sean Silva	2015-09-12	13	-103/+103
\| \| \| \|	llvm-svn: 247498
*	[X86][SSE] Add _mm_undefined_* intrinsics	Simon Pilgrim	2015-08-26	4	-0/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added missing SSE/AVX 'undefined' intrinsics (PR24040): _mm_undefined_pd, _mm_undefined_ps + _mm_undefined_si128 _mm256_undefined_pd, _mm256_undefined_ps + _mm256_undefined_si256 _mm512_undefined, _mm512_undefined_ps, _mm512_undefined_pd + _mm512_undefined_epi32 Added builtin intrinsics: __builtin_ia32_undef128, __builtin_ia32_undef256 + __builtin_ia32_undef512 Differential Revision: http://reviews.llvm.org/D12052 llvm-svn: 246083
*	[X86] Remove unnecessary MMX declarations from Intrin.h	Simon Pilgrim	2015-08-25	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \|	As discussed in PR23648 - the intrinsics _m_from_int, _m_to_int and _m_prefetch are defined in mmintrin.h and prfchwintrin.h so we don't need to declare them in Intrin.h. Added tests for _m_from_int and _m_to_int; D11338 already added a test for _m_prefetch. Differential Revision: http://reviews.llvm.org/D12272 llvm-svn: 245975
*	Revert r245923 since it breaks mingw.	Michael Kuperstein	2015-08-25	2	-54/+20
\| \| \| \|	llvm-svn: 245929
*	[X86] Expose the various _rot intrinsics on non-MS platforms	Michael Kuperstein	2015-08-25	2	-20/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	_rotl, _rotwl and _lrotl (and their right-shift counterparts) are official x86 intrinsics, and should be supported regardless of environment. This is in contrast to _rotl8, _rotl16, and _rotl64 which are MS-specific. Note that the MS documentation for _lrotl is different from the Intel documentation. Intel explicitly documents it as a 64-bit rotate, while for MS, since sizeof(unsigned long) for MSVC is always 4, a 32-bit rotate is implied. Differential Revision: http://reviews.llvm.org/D12271 llvm-svn: 245923
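The practical effect of the 32-bit versus 64-bit reading of _lrotl can be seen with two portable rotate helpers. These are a sketch with hypothetical names, not the definitions added to the headers; the count is masked so that no shift amount is ever out of range.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left by n bits; masking the count keeps every shift in range. */
static uint32_t rotl32_model(uint32_t x, unsigned n) {
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}

static uint64_t rotl64_model(uint64_t x, unsigned n) {
    n &= 63u;
    return (x << n) | (x >> ((64u - n) & 63u));
}

/* Usage: the same bit pattern rotates differently at the two widths,
   which is why the two documented meanings of _lrotl disagree. */
static void rotate_examples(void) {
    assert(rotl32_model(0x80000001u, 1) == 0x00000003u);
    assert(rotl64_model(0x80000001ull, 1) == 0x100000002ull);
    assert(rotl32_model(0x12345678u, 0) == 0x12345678u);
}
```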
*	[Headers][X86] Use __builtin_shufflevector in AVX2 broadcasts.	Ahmed Bougacha	2015-08-20	1	-11/+11
\| \| \| \| \| \| \| \| \| \|	This lets us optimize them better. We agreed to remove the intrinsics, instead of combining them later, as, at -O0, we generate the expected instructions. Plus, it's a nice cleanup. Differential Revision: http://reviews.llvm.org/D10556 llvm-svn: 245605
*	[X86] Add support for _MM_ALIGN16	Michael Kuperstein	2015-08-06	1	-0/+5
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D11753 llvm-svn: 244201
*	[X86][AVX512VLBW] add pack, cvt, mulhi and madd intrinsics	Asaf Badouh	2015-08-03	1	-0/+429
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D11642 llvm-svn: 243867
*	[X86][AVX512VLDQ] add reduce/range/cvt intrinsics	Asaf Badouh	2015-08-02	1	-0/+600
\| \| \| \| \| \| \| \|	add 128 & 256 width intrinsic versions of reduce/range and cvt i64 to FP and vice versa Differential Revision: http://reviews.llvm.org/D11598 llvm-svn: 243848
*	[SystemZ] Add support for vecintrin.h vector built-in functions	Ulrich Weigand	2015-07-30	4	-0/+8956
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for the System Z vector built-in functions. The API-defined header file has the name vecintrin.h. The user-level functions are defined in the same style as the clang version of altivec.h, making heavy use of the __overloadable__ and __always_inline__ attributes. Where possible the functions expand to generic operations rather than specific built-in functions, in the hope that that form can be optimised better. Where a built-in routine is specified to require an immediate integer argument, the __enable_if__ attribute is used to verify the argument is in fact constant and in the appropriate range. Based on a patch by Richard Sandiford. llvm-svn: 243643
*	[X86][AVX512BW] Remove whitespaces	Asaf Badouh	2015-07-30	1	-68/+53
\| \| \| \|	llvm-svn: 243623
*	[X86][AVX512BW] add convert i16 to i8 and unpack intrinsics	Asaf Badouh	2015-07-29	1	-0/+163
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D11564 llvm-svn: 243514
*	[X86][AVX512BW] Replace attributes with __DEFAULT_FN_ATTRS	Asaf Badouh	2015-07-29	1	-6/+6
\| \| \| \|	llvm-svn: 243512
*	[X86][AVX512VL] add AVX512VL intrinsics 4 out of 4	Asaf Badouh	2015-07-28	1	-0/+456
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D11526 llvm-svn: 243409
*	[X86][AVX512VL] add AVX512VL intrinsics 3 out of 4	Asaf Badouh	2015-07-28	1	-0/+655
\| \| \| \| \| \|	http://reviews.llvm.org/D11526 llvm-svn: 243406
*	[X86][AVX512VL] add AVX512VL intrinsics 2 out of 4	Asaf Badouh	2015-07-28	1	-0/+699
\| \| \| \| \| \|	http://reviews.llvm.org/D11526 llvm-svn: 243402
*	[X86][AVX512VL] add AVX512VL intrinsics 1 out of 4	Asaf Badouh	2015-07-28	1	-0/+816
\| \| \| \| \| \|	http://reviews.llvm.org/D11526 llvm-svn: 243394
*	[X86] Add missing _m_prefetch intrinsic	Simon Pilgrim	2015-07-27	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \|	The 3DNOW/PRFCHW cpu targets define both the PREFETCHW (set cache line modified) and PREFETCH (set cache line exclusive) instructions but only the _m_prefetchw (PREFETCHW) intrinsic is included in the header. This patch adds the missing _m_prefetch intrinsic. I'm basing this off AMD documentation; the Intel documentation on PREFETCHW support isn't clear on whether Silvermont/Broadwell properly support PREFETCH, but given that the intrinsic implementation is a default __builtin_prefetch call, it is safe either way. Fix for PR23648. Differential Revision: http://reviews.llvm.org/D11338 llvm-svn: 243305
*	[X86][AVX512F] Add FP scalar intrinsics	Asaf Badouh	2015-07-23	1	-0/+357
\| \| \| \| \| \| \| \|	intrinsics for: add/sub/mul/div/min/max in their FP scalar versions Differential Revision: http://reviews.llvm.org/D11418 llvm-svn: 243009