bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[X86] Change the IR sequence for _mm_storeh_pi and _mm_storel_pi to perform ↵	Craig Topper	2019-07-10	1	-2/+0
	the store as a <2 x float> instead of i64. This is similar to what we do for loadl_pi and loadh_pi. llvm-svn: 365669
*	[X86][test] Add test cases using immediates to builtins-x86.c	Russell Gallop	2019-06-07	1	-0/+24
	These builtins should work with an immediate or a variable shift operand for gcc compatibility. Differential Revision: https://reviews.llvm.org/D62850 llvm-svn: 362786
*	[OpenCL] Use long instead of long long in x86 builtins	Andrew Savonichev	2019-06-03	1	-11/+47
	Summary: According to the C99 standard, long long is at least 64 bits in size. However, OpenCL C defines long long as a 128-bit signed integer. This prevents the use of x86 builtins when compiling OpenCL C code for x86 targets. The patch changes long long to long for OpenCL only. Patch by: Alexander Batashev <alexander.batashev@intel.com> Reviewers: craig.topper, Ka-Ka, eandrews, erichkeane, Anastasia Reviewed By: Ka-Ka, erichkeane, Anastasia Subscribers: a.elovikov, yaxunl, Anastasia, cfe-commits, ivankara, etyurin, asavonic Tags: #clang Differential Revision: https://reviews.llvm.org/D62580 llvm-svn: 362391
*	Recommit r351160 "[X86] Make _xgetbv/_xsetbv on non-windows platforms"	Craig Topper	2019-01-16	1	-0/+2
	V8 has been fixed now. llvm-svn: 351391
*	Revert "[X86] Make _xgetbv/_xsetbv on non-windows platforms"	Benjamin Kramer	2019-01-15	1	-2/+0
	This reverts commit r351160. Breaks building v8. llvm-svn: 351210
*	[X86] Make _xgetbv/_xsetbv on non-windows platforms	Craig Topper	2019-01-15	1	-0/+2
	Summary: This patch attempts to redo what was tried in r278783, but was reverted. These intrinsics should be available on non-windows platforms with "xsave" feature check. But on Windows platforms they shouldn't have feature check since that's how MSVC behaves. To accomplish this I've added a MS builtin with no feature check. And a normal gcc builtin with a feature check. When _MSC_VER is not defined _xgetbv/_xsetbv will be macros pointing to the gcc builtin name. I've moved the forward declarations from intrin.h to immintrin.h to match the MSDN documentation and used that as the header file for the MS builtin. I'm not super happy with this implementation, and I'm open to suggestions for better ways to do it. Reviewers: rnk, RKSimon, spatel Reviewed By: rnk Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D56686 llvm-svn: 351160
*	[X86] Replace __builtin_ia32_vbroadcastf128_pd256 and ↵	Craig Topper	2018-06-03	1	-2/+0
	__builtin_ia32_vbroadcastf128_ps256 with an unaligned load intrinsic and a __builtin_shufflevector call. llvm-svn: 333853
*	[X86] Remove a builtin that should have been removed in r332882.	Craig Topper	2018-05-21	1	-1/+0
	llvm-svn: 332909
*	[X86] Use __builtin_convertvector to implement some of the packed integer to ↵	Craig Topper	2018-05-21	1	-1/+0
	packed float conversion intrinsics. I believe this is safe assuming the default FP environment. The conversion might be inexact, but it can never overflow the FP type so this shouldn't be undefined behavior for the uitofp/sitofp instructions. We already do something similar for scalar conversions. Differential Revision: https://reviews.llvm.org/D46863 llvm-svn: 332882
*	This patch aims to match the changes introduced	Alexander Ivchenko	2018-05-18	1	-2/+2
	in gcc by https://gcc.gnu.org/ml/gcc-cvs/2018-04/msg00534.html. The -mibt feature flag is being removed, and the -fcf-protection option now also defines a CET macro and causes errors when used on non-X86 targets, while X86 targets no longer check for -mibt and -mshstk to determine if -fcf-protection is supported. -mshstk is now used only to determine availability of shadow stack intrinsics. Comes with an LLVM patch (D46882). Patch by mike.dvoretsky Differential Revision: https://reviews.llvm.org/D46881 llvm-svn: 332704
*	[X86] Introduce cldemote intrinsic	Gabor Buella	2018-04-13	1	-2/+3
	Reviewers: craig.topper, zvi Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45257 llvm-svn: 329993
*	[x86] wbnoinvd intrinsic	Gabor Buella	2018-04-11	1	-2/+3
	The WBNOINVD instruction writes back all modified cache lines in the processor’s internal cache to main memory but does not invalidate (flush) the internal caches. Reviewers: craig.topper, zvi, ashlykov Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D43817 llvm-svn: 329848
*	Control-Flow Enforcement Technology - Shadow Stack and Indirect Branch ↵	Oren Ben Simhon	2017-11-26	1	-2/+15
	Tracking support (Clang side). Shadow stack solution introduces a new stack for return addresses only. The stack has a Shadow Stack Pointer (SSP) that points to the last address to which we expect to return. If we return to a different address an exception is triggered. This patch includes shadow stack intrinsics as well as the corresponding CET header. It includes CET clang flags for shadow stack and Indirect Branch Tracking. For more information, please see the following: https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf Differential Revision: https://reviews.llvm.org/D40224 Change-Id: I79ad0925a028bbc94c8ecad75f6daa2f214171f1 llvm-svn: 318995
*	[X86] Lower _mm[256\|512]_[mask[z]]_avg_epu[8\|16] intrinsics to native llvm IR	Yael Tsafrir	2017-09-12	1	-4/+0
	Differential Revision: https://reviews.llvm.org/D37562 llvm-svn: 313011
*	[X86] Clzero flag addition and inclusion under znver1	Craig Topper	2017-02-09	1	-2/+3
	1. Adds the command line flag for clzero. 2. Includes the clzero flag under znver1. 3. Defines the macro for clzero. 4. Adds a new file which has the intrinsic definition for clzero instruction. Patch by Ganesh Gopalasubramanian with some additional tests from me. Differential revision: https://reviews.llvm.org/D29386 llvm-svn: 294559
*	Add some MS aliases for existing intrinsics	Albert Gutowski	2016-09-14	1	-0/+9
	Reviewers: thakis, compnerd, majnemer, rsmith, rnk Subscribers: alexshap, cfe-commits Differential Revision: https://reviews.llvm.org/D24330 llvm-svn: 281540
*	Reverse commit 281375 (breaks building Chromium)	Albert Gutowski	2016-09-13	1	-9/+0
	llvm-svn: 281399
*	Add some MS aliases for existing intrinsics	Albert Gutowski	2016-09-13	1	-0/+9
	Reviewers: thakis, compnerd, majnemer, rsmith, rnk Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D24330 llvm-svn: 281375
*	Revert "[X86] Add xgetbv/x[X86] Add xgetbv xsetbv intrinsics to non-windows ↵	Reid Kleckner	2016-08-16	1	-2/+0
	platforms" This reverts commit r278783. It breaks usage of _xgetbv on Windows. llvm-svn: 278814
*	[X86] Add xgetbv/x[X86] Add xgetbv xsetbv intrinsics to non-windows platforms	Marina Yatsina	2016-08-16	1	-0/+2
	commit on behalf of guyblank Differential Revision: https://reviews.llvm.org/D21959 llvm-svn: 278783
*	[X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using ↵	Simon Pilgrim	2016-07-20	1	-0/+8
	generic IR. D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead. It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems: INF/NAN/out-of-range values are guaranteed to result in a 0x80000000 value, which plays havoc with constant folding, which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match). This patch changes both scalar and packed versions back to using x86-specific builtins. It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding. Differential Revision: https://reviews.llvm.org/D22105 llvm-svn: 276102
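The mismatch described in this commit can be illustrated with a small scalar model. This is only a sketch in portable C, not the clang implementation: the function name `cvtt_lane_model` is hypothetical, and it models a single 32-bit lane under the assumption stated in the message, namely that NaN, infinities and out-of-range inputs all yield 0x80000000.

```c
#include <stdint.h>

/* Hypothetical model of ONE lane of a truncating float -> int32 conversion,
 * following the behaviour described in the commit message: NaN, +/-Inf and
 * out-of-range inputs all produce the bit pattern 0x80000000. */
static int32_t cvtt_lane_model(float x)
{
    /* NaN is the only value that compares unequal to itself; infinities
     * and large finite values are caught by the two range checks. */
    if (x != x || x >= 2147483648.0f || x < -2147483648.0f)
        return INT32_MIN;   /* 0x80000000 */

    /* In range: a plain C cast truncates toward zero and is well defined. */
    return (int32_t)x;
}
```

A plain C cast of an out-of-range float is undefined behaviour, which is why an optimizer working on generic IR is free to fold such a conversion to something other than 0x80000000. The model makes the fixed result explicit.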
*	[X86] Remove dead builtins that don't exist in the backend intrinsic file ↵	Craig Topper	2016-07-08	1	-1/+0
	and don't have custom handling in CGBuiltins.cpp either. llvm-svn: 274825
*	[Clang][X86] Convert non-temporal store builtins to generic ↵	Simon Pilgrim	2016-06-13	1	-6/+0
	__builtin_nontemporal_store in headers. We can now use __builtin_nontemporal_store instead of target specific builtins for naturally aligned nontemporal stores, which avoids the need for handling in CGBuiltin.cpp. The scalar integer nontemporal (unaligned) store builtins will have to wait as __builtin_nontemporal_store currently assumes natural alignment and doesn't accept the 'packed struct' trick that we use for normal unaligned load/stores. The nontemporal loads require further backend support before we can safely convert them to __builtin_nontemporal_load. Differential Revision: http://reviews.llvm.org/D21272 llvm-svn: 272540
*	[X86][SSE] Replace (V)CVTTPS2DQ and VCVTTPD2DQ truncating (round to zero) ↵	Simon Pilgrim	2016-06-01	1	-3/+0
	f32/f64 to i32 with generic IR (clang). The 'cvtt' truncation (round to zero) conversions can be safely represented as generic __builtin_convertvector (fptosi) calls instead of x86 intrinsics. We already do this (implicitly) for the scalar equivalents. Note: I looked at updating _mm_cvttpd_epi32 as well but this still requires a lot more backend work to correctly lower (both for debug and optimized builds). Differential Revision: http://reviews.llvm.org/D20859 llvm-svn: 271436
*	[X86] Replace unaligned store builtins in SSE/AVX intrinsic files with code ↵	Craig Topper	2016-05-30	1	-6/+0
	that will compile to a native unaligned store. Remove the builtins since they are no longer used. Intrinsics will be removed from llvm in a future commit. llvm-svn: 271214
*	[X86][SSE] Replace VPMOVSX and (V)PMOVZX integer extension intrinsics with ↵	Simon Pilgrim	2016-05-28	1	-6/+0
	generic IR (clang). The VPMOVSX and (V)PMOVZX sign/zero extension intrinsics can be safely represented as generic __builtin_convertvector calls instead of x86 intrinsics. This patch removes the clang builtins and their use in the sse2/avx headers - a companion patch will remove/auto-upgrade the llvm intrinsics. Note: We already did this for SSE41 PMOVSX some time ago. Differential Revision: http://reviews.llvm.org/D20684 llvm-svn: 271106
*	[X86][SSE] Replace lossless i32/f32 to f64 conversion intrinsics with generic IR	Simon Pilgrim	2016-05-23	1	-4/+0
	Both the (V)CVTDQ2PD(Y) (i32 to f64) and (V)CVTPS2PD(Y) (f32 to f64) conversion instructions are lossless and can be safely represented as generic __builtin_convertvector calls instead of x86 intrinsics without affecting final codegen. This patch removes the clang builtins and their use in the sse2/avx headers - a future patch will deal with removing the llvm intrinsics, but that will require a bit more work. Differential Revision: http://reviews.llvm.org/D20528 llvm-svn: 270499
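As a rough illustration of the approach this commit describes, the sketch below converts four i32 lanes to four f64 lanes with `__builtin_convertvector` on generic vector types. It is not the code from the clang headers: the typedef names and the function `cvt_i32x4_to_f64x4` are made up for this example, and it assumes a compiler that provides `__builtin_convertvector` and the `vector_size` attribute (clang, or a recent GCC). Operands are passed by pointer only to keep the sketch independent of the target's vector calling convention.

```c
/* Generic vector types; the names are illustrative, not the header types. */
typedef int    v4si_sketch __attribute__((vector_size(16)));
typedef double v4df_sketch __attribute__((vector_size(32)));

/* Convert four 32-bit integers to four doubles, lane by lane. Every int32
 * value is exactly representable as a double, so the conversion is
 * lossless, which is the property the commit message relies on. */
static void cvt_i32x4_to_f64x4(const v4si_sketch *in, v4df_sketch *out)
{
    *out = __builtin_convertvector(*in, v4df_sketch);
}
```

Because the conversion is expressed as a generic element-wise operation rather than a target builtin, the optimizer can reason about it like any other integer-to-floating-point conversion.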
*	Add new intrinsic support for MONITORX and MWAITX instructions	Ashutosh Nema	2016-05-18	1	-2/+5
	Summary: MONITORX/MWAITX instructions provide similar capability to the MONITOR/MWAIT pair while adding a timer function, such that another termination of the MWAITX instruction occurs when the timer expires. The presence of the MONITORX and MWAITX instructions is indicated by CPUID 8000_0001, ECX, bit 29. The MONITORX and MWAITX instructions are intercepted by the same bits that intercept MONITOR and MWAIT. MONITORX instruction establishes a range to be monitored. MWAITX instruction causes the processor to stop instruction execution and enter an implementation-dependent optimized state until occurrence of a class of events. Opcode of MONITORX instruction is "0F 01 FA". Opcode of MWAITX instruction is "0F 01 FB". This opcode information is used in adding tests for the disassembler. These instructions are enabled for AMD's bdver4 architecture. Patch by Ganesh Gopalasubramanian! Reviewers: echristo, craig.topper Subscribers: RKSimon, joker.eph, llvm-commits, cfe-commits Differential Revision: http://reviews.llvm.org/D19796 llvm-svn: 269907
*	[x86] Fix maskload/store intrinsic definitions in avxintrin.h	Andrea Di Biagio	2015-10-20	1	-8/+8
	According to the Intel documentation, the mask operand of the maskload and maskstore intrinsics is always a vector of packed integer/long integer values. This patch introduces the following two changes: 1. It fixes the avx maskload/store intrinsic definitions in avxintrin.h. 2. It changes BuiltinsX86.def to match the correct gcc definitions for avx maskload/store (see D13861 for more details). Differential Revision: http://reviews.llvm.org/D13861 llvm-svn: 250816
*	[X86] Add fxsr feature name for fxsave/fxrestore builtins.	Craig Topper	2015-10-16	1	-2/+2
	llvm-svn: 250498
*	Add the minimum target features that these tests depend upon.	Eric Christopher	2015-10-15	1	-2/+2
	llvm-svn: 250448
*	[X86] Add XSAVE intrinsic family	Amjad Aboud	2015-10-13	1	-1/+15
	Add intrinsics for the XSAVE instructions (XSAVE/XSAVE64/XRSTOR/XRSTOR64), XSAVEOPT instructions (XSAVEOPT/XSAVEOPT64), XSAVEC instructions (XSAVEC/XSAVEC64), and XSAVES instructions (XSAVES/XSAVES64/XRSTORS/XRSTORS64). Differential Revision: http://reviews.llvm.org/D13014 llvm-svn: 250158
*	[X86][SSE] Replace 128-bit SSE41 PMOVSX intrinsics with native IR	Simon Pilgrim	2015-09-19	1	-6/+0
	128-bit vector integer sign extensions correctly lower to the pmovsx instructions even for debug builds. This patch removes the builtins and reimplements the _mm_cvtepi_epi intrinsics using __builtin_shufflevector (to extract the bottom most subvector) and __builtin_convertvector (to actually perform the sign extension). Differential Revision: http://reviews.llvm.org/D12835 llvm-svn: 248092
*	[X86] Add __builtin_ia32_undef* intrinsics to test	Simon Pilgrim	2015-08-27	1	-0/+3
	Minor tweak to rL246083 llvm-svn: 246200
*	[X86] Add FXSR intrinsics	Michael Kuperstein	2015-06-30	1	-0/+4
	Add intrinsics for the FXSR instructions (FXSAVE/FXSAVE64/FXRSTOR/FXRSTOR64). These were previously declared in Intrin.h for MSVC compatibility, but now that we have them implemented, these declarations can be removed. llvm-svn: 241053
*	[X86, AVX] replace vextractf128 intrinsics with generic shuffles	Sanjay Patel	2015-03-12	1	-3/+0
	This is very much like D8088 (checked in at r231792). Now that we've replaced the vinsertf128 intrinsics, do the same for their extract twins. Differential Revision: http://reviews.llvm.org/D8275 llvm-svn: 232052
*	[X86, AVX] Replace vinsertf128 intrinsics with generic shuffles.	Sanjay Patel	2015-03-10	1	-3/+0
	We want to replace as much custom x86 shuffling via intrinsics as possible because pushing the code down the generic shuffle optimization path allows for better codegen and less complexity in LLVM. This is the sibling patch for the LLVM half of this change: http://reviews.llvm.org/D8086 Differential Revision: http://reviews.llvm.org/D8088 llvm-svn: 231792
*	[X86] Remove pblendw and pblendd builtins that aren't being used by the ↵	Craig Topper	2015-02-27	1	-1/+0
	intrinsic headers. llvm-svn: 230738
*	[X86] Remove the blendps/blendpd builtins. They aren't used by the intrinsic ↵	Craig Topper	2015-02-26	1	-4/+0
	headers. We use appropriate shuffle vector instead. llvm-svn: 230616
*	[X86] Add range checking to the immediate arguments of many of the SSE/AVX ↵	Craig Topper	2015-01-31	1	-8/+8
	builtins. llvm-svn: 227674
*	[x86] Clean up the x86 builtin specs to reflect r217310 in LLVM which	Chandler Carruth	2014-09-06	1	-1/+1
	made the 8-bit masks actually 8-bit arguments to these intrinsics. These builtins are a mess. Many were missing the I qualifier which I added where obviously correct. Most aren't tested, but I've updated the relevant tests. I've tried to catch all the things that should become 'c' in this round. It's also frustrating because the set of these is really ad-hoc and doesn't really map that cleanly to the set supported by either GCC or LLVM. Oh well... llvm-svn: 217311
*	[x86] Add Clang support for intrinsic __rdpmc.	Andrea Di Biagio	2014-06-30	1	-0/+1
	This patch adds intrinsic __rdpmc to header file 'ia32intrin.h'. Intrinsic __rdpmc can be used to read performance monitoring counters. It is implemented as a direct call to __builtin_ia32_rdpmc. It takes as input a value representing the index of the performance counter to read. The value of the performance counter is then returned as an unsigned 64-bit quantity. llvm-svn: 212053
*	Implement AVX1 vbroadcast intrinsics with vector initializers	Adam Nemet	2014-05-29	1	-3/+0
	These intrinsics are special because they directly take a memory operand (AVX2 adds the register counterparts). Typically, other non-memop intrinsics take registers and then it's left to isel to fold memory operands. In order to LICM intrinsics directly reading memory, we require that no stores are in the loop (LICM) or that the folded load accesses constant memory (MachineLICM). When neither is the case we fail to hoist a loop-invariant broadcast. We can work around this limitation if we expose the load as a regular load and then just implement the broadcast using the vector initializer syntax. This exposes the load to LICM and other optimizations. At the IR level this is translated into a series of insertelements. The sequence is already recognized as a broadcast so there is no impact on the quality of codegen. _mm256_broadcast_pd and _mm256_broadcast_ps are not updated by this patch because right now we lack the DAG-combiner smartness to recover the broadcast instructions. This will be tackled in a follow-on. There will be completing changes on the LLVM side to remove the LLVM intrinsics and to auto-upgrade bitcode files. Fixes <rdar://problem/16494520> llvm-svn: 209846
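The "regular load plus vector initializer" idea from this commit can be sketched in a few lines of C using generic vector types. This is an illustration only: the typedef `v4f_sketch` and the function `broadcast_ss_sketch` are hypothetical names, not the definitions from the clang headers, and the code assumes a compiler that supports the `vector_size` attribute (clang or GCC).

```c
/* Generic 4 x float vector; the name is illustrative. */
typedef float v4f_sketch __attribute__((vector_size(16)));

/* Broadcast one float from memory into all four lanes. The load is an
 * ordinary scalar load, so it is visible to loop-invariant code motion
 * and other optimizations; the vector initializer then replicates it. */
static v4f_sketch broadcast_ss_sketch(const float *p)
{
    float f = *p;                        /* plain scalar load */
    return (v4f_sketch){ f, f, f, f };   /* vector initializer */
}
```

Written this way, the memory access is no longer hidden inside a target-specific builtin, which is the property the commit message relies on for hoisting a loop-invariant broadcast.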
*	[X86] Add Clang support for intrinsics __rdtsc and __rdtscp.	Andrea Di Biagio	2014-04-24	1	-0/+3
	This patch: 1. Adds a definition for two new GCCBuiltins in BuiltinsX86.def: __builtin_ia32_rdtsc; __builtin_ia32_rdtscp; 2. Replaces the already existing definition of intrinsic __rdtsc in ia32intrin.h with a simple call to the new GCC builtin __builtin_ia32_rdtsc. 3. Adds a definition for the new intrinsic __rdtscp in ia32intrin.h llvm-svn: 207132
*	Add _mm_stream_si64 intrinsic.	Eli Friedman	2013-09-23	1	-0/+4
	While I'm here, also fix the alignment computation for the whole family of intrinsics. PR17298. llvm-svn: 191243
*	Add C intrinsics for Intel SHA Extensions	Ben Langmuir	2013-09-19	1	-0/+8
	Intrinsics added to shaintrin.h, which is included from x86intrin.h if __SHA__ is enabled. SHA implies SSE2, which is needed for the __m128i type. Also add the -msha/-mno-sha option. llvm-svn: 190999
*	Get rid of storelv4si builtin as it can be expressed directly. This is general	Chad Rosier	2012-05-01	1	-1/+0
	goodness because it provides opportunities to clean things up. For example, uint64_t t1(__m128i vA) { uint64_t Alo; _mm_storel_epi64((__m128i*)&Alo, vA); return Alo; } was generating movq %xmm0, -8(%rbp); movq -8(%rbp), %rax and now generates movd %xmm0, %rax. rdar://11282581 llvm-svn: 155924
*	Convert vperm2f128 and vperm2i128 intrinsics back to using llvm intrinsics. ↵	Craig Topper	2012-04-17	1	-0/+3
	Unfortunately, these instructions have behavior that can't be modeled with shuffle vector. llvm-svn: 154906
*	Remove vperm2f* and vperm2i builtins. Same effect can be achieved with ↵	Craig Topper	2012-02-08	1	-3/+0
	builtin_shufflevector. llvm-svn: 150064
*	Remove vpermilp* builtins. Same effect can be achieved with ↵	Craig Topper	2012-02-08	1	-4/+0
	builtin_shufflevector. llvm-svn: 150056