bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[X86] Disable SGX for Skylake Server	Gabor Buella	2018-04-10	1	-1/+2
	Reviewers: craig.topper, zvi, echristo Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45058 llvm-svn: 329701
*	[CUDA] Revert defining __CUDA_ARCH__ for amdgcn targets	Yaxun Liu	2018-04-09	3	-10/+55
	amdgcn targets only support HIP, which does not define __CUDA_ARCH__. This is a partial unroll of r329232 / D45277. Differential Revision: https://reviews.llvm.org/D45387 llvm-svn: 329584
*	Fix typos in clang	Alexander Kornienko	2018-04-06	2	-2/+2
	Found via codespell -q 3 -I ../clang-whitelist.txt Where whitelist consists of: archtype cas classs checkk compres definit frome iff inteval ith lod methode nd optin ot pres statics te thru Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few files that have dubious fixes reverted.) Differential revision: https://reviews.llvm.org/D44188 llvm-svn: 329399
*	[PATCH] [RISCV] Extend getTargetDefines for RISCVTargetInfo	Shiva Chen	2018-04-05	2	-1/+64
	Summary: This patch extends getTargetDefines, implements handleTargetFeatures and hasFeature, and defines the corresponding macros for those features. Reviewers: asb, apazos, eli.friedman Differential Revision: https://reviews.llvm.org/D44727 Patch by Kito Cheng. llvm-svn: 329278
*	[CUDA] Add amdgpu sub archs	Yaxun Liu	2018-04-04	3	-42/+10
	Patch by Greg Rodgers. Revised and lit tests added by Yaxun Liu. Differential Revision: https://reviews.llvm.org/D45277 llvm-svn: 329232
*	[Hexagon] Remove -mhvx-double and the corresponding subtarget feature	Krzysztof Parzyszek	2018-04-03	1	-1/+0
	Specifying the HVX vector length should be done via the -mhvx-length option. llvm-svn: 329077
*	CodeGenCXX: support PreserveMostCC in MS ABI	Saleem Abdulrasool	2018-04-02	1	-0/+1
	Microsoft has reserved 'U' for the PreserveMostCC which is used in the swift runtime. Add support for this. This allows the swift runtime to be built for Windows again. llvm-svn: 329025
*	[AArch64]: Add support for parsing rN registers.	Manoj Gupta	2018-03-29	1	-1/+34
	Summary: Allow rN registers to be simply parsed as the corresponding xN registers. The "register ... asm("rN")" is a command to the compiler's register allocator, not an operand to any individual assembly instruction. GCC documents this syntax as "...the name of the register that should be used." This is needed to support the changes in Linux kernel (see https://lkml.org/lkml/2018/3/1/268 ) Note: This will add support only for the limited use case of register ... asm("rN"). Any other uses that make rN leak into assembly are not supported. Reviewers: kristof.beyls, rengolin, peter.smith, t.p.northover Reviewed By: peter.smith Subscribers: javed.absar, eraman, cfe-commits, srhines Differential Revision: https://reviews.llvm.org/D44815 llvm-svn: 328829
*	[ObjC++] Make parameter passing and function return compatible with ObjC	Akira Hatanaka	2018-03-28	1	-0/+5
	ObjC and ObjC++ pass non-trivial structs in a way that is incompatible with each other. For example:
    typedef struct { id f0; __weak id f1; } S;

    // this code is compiled in c++.
    extern "C" { void foo(S s); }
    void caller() {
      // the caller passes the parameter indirectly and destructs it.
      foo(S());
    }

    // this function is compiled in c.
    // 'a' is passed directly and is destructed in the callee.
    void foo(S a) { }
This patch fixes the incompatibility by passing and returning structs with __strong or __weak fields using the C ABI in C++ mode. __strong and __weak fields in a struct do not cause the struct to be destructed in the caller and __strong fields do not cause the struct to be passed indirectly. Also, this patch fixes the microsoft ABI bug mentioned here: https://reviews.llvm.org/D41039?id=128767#inline-364710 rdar://problem/38887866 Differential Revision: https://reviews.llvm.org/D44908 llvm-svn: 328731
*	AMDGPU: Update datalayout for stack alignment	Matt Arsenault	2018-03-27	1	-2/+2
	llvm-svn: 328657
*	[AMDGPU] Fix codegen for inline assembly	Yaxun Liu	2018-03-23	1	-0/+13
	Need to override convertConstraint to recognise amdgpu specific register names. Differential Revision: https://reviews.llvm.org/D44533 llvm-svn: 328359
*	Basic: support PreserveMost and PreserveAll on Windows ARM	Saleem Abdulrasool	2018-03-20	1	-0/+2
	Do not ignore these calling conventions on Windows ARM. They are used by the swift runtime for certain calls. llvm-svn: 328007
*	[ARM] Pass half or i16 types for NEON intrinsics	Sjoerd Meijer	2018-03-19	3	-5/+5
	For generating NEON intrinsics, this determines the NEON data type, and whether it should be a half type or an i16 type. That is, we always pass a half type for AArch64 (this hasn't changed), and now also for ARM, but only when FullFP16 is enabled; otherwise i16 is passed. This is intended to be a non-functional change, but together with the backend work in D44538 which adds support for f16 vectors, this enables adding the AArch32 FP16 (vector) intrinsics. Differential Revision: https://reviews.llvm.org/D44561 llvm-svn: 327836
*	[ARM] ACLE FP16 feature test macros	Sjoerd Meijer	2018-03-13	2	-0/+13
	This is a partial recommit of r327189 that was reverted due to test issues. That is, this recommits the minimal functional change, the FP16 feature test macros, and adds tests that were missing in the original commit. llvm-svn: 327455
*	This reverts "r327189 - [ARM] Add ARMv8.2-A FP16 vector intrinsic"	Sjoerd Meijer	2018-03-13	2	-10/+0
	This is causing problems in testing, and PR36683 was raised. Reverting it until we have sorted out how to pass f16 vectors. llvm-svn: 327437
*	[ARM] Add ARMv8.2-A FP16 vector intrinsic	Abderrazek Zaafrani	2018-03-09	2	-0/+10
	Add the fp16 neon vector intrinsic for ARM as described in the ARM ACLE document. Reviews in https://reviews.llvm.org/D43650 llvm-svn: 327189
*	Correct the alignment for the PS4 target	Matthew Voss	2018-03-07	1	-0/+1
	https://reviews.llvm.org/D44218 llvm-svn: 326942
*	[AMDGPU] Clean up old address space mapping and fix constant address space value	Yaxun Liu	2018-03-05	2	-93/+47
	Differential Revision: https://reviews.llvm.org/D43911 llvm-svn: 326725
*	[WebAssembly] Add exception handling option	Heejin Ahn	2018-03-02	2	-1/+11
	Summary: Add exception handling option to clang. Reviewers: dschuff Subscribers: jfb, sbc100, jgravelle-google, sunfish, cfe-commits Differential Revision: https://reviews.llvm.org/D43681 llvm-svn: 326517
*	AMDGPU: Define FP_FAST_FMA{F} macros for amdgcn	Konstantin Zhuravlyov	2018-02-27	2	-148/+194
	- Expand GK_*s (i.e. GFX6 -> GFX600, GFX601, etc.)
	- This allows us to choose features correctly in some cases (for example, fast fmaf is available on gfx600, but not gfx601)
	- Move HasFMAF, HasFP64, HasLDEXPF to GPUInfo tables
	- Add HasFastFMA, HasFastFMAF to GPUInfo tables
	- Add missing tests
	llvm-svn: 326254
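The table-driven feature lookup this commit describes might look roughly like the sketch below. Only the names `GPUInfo`, `HasFastFMAF`, the `GK_` prefix, and the gfx600/gfx601 difference come from the commit message; the enumerators, the field layout, and the `hasFastFMAF` helper are hypothetical and are not clang's actual code.

```cpp
#include <cstddef>

// Hypothetical GPU kinds; the commit expands coarse kinds such as GFX6
// into per-device entries like these.
enum GPUKind { GK_NONE, GK_GFX600, GK_GFX601 };

// Hypothetical table row: one entry per device, carrying a feature flag.
struct GPUInfo {
  GPUKind Kind;
  bool HasFastFMAF;
};

// Per the commit message, fast fmaf is available on gfx600 but not gfx601.
constexpr GPUInfo GPUTable[] = {
    {GK_GFX600, true},
    {GK_GFX601, false},
};

// Look a device up in the table; unknown kinds report no fast fmaf.
constexpr bool hasFastFMAF(GPUKind Kind) {
  for (std::size_t I = 0; I < sizeof(GPUTable) / sizeof(GPUTable[0]); ++I)
    if (GPUTable[I].Kind == Kind)
      return GPUTable[I].HasFastFMAF;
  return false;
}
```

Keeping one row per device is what lets two members of the same family disagree on a feature, which a single GFX6 entry could not express.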
*	[RISCV] Enable __int128_t and __uint128_t through clang flag	Mandeep Singh Grang	2018-02-25	1	-1/+3
	Summary: If the flag -fforce-enable-int128 is passed, it will enable support for __int128_t and __uint128_t types. This flag can then be used to build compiler-rt for RISCV32. Reviewers: asb, kito-cheng, apazos, efriedma Reviewed By: asb, efriedma Subscribers: shiva0217, efriedma, jfb, dschuff, sdardis, sbc100, jgravelle-google, aheejin, rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, cfe-commits Differential Revision: https://reviews.llvm.org/D43105 llvm-svn: 326045
*	bpf: Hook target feature "alu32" with LLVM	Yonghong Song	2018-02-23	1	-1/+8
	LLVM now supports a new target feature "alu32", which can be enabled or disabled by "-mattr=[+\|-]alu32" when using llc. This patch links Clang with it, so it can also be done by passing related options to Clang, for example: -Xclang -target-feature -Xclang +alu32 Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325996
*	[X86] Disable CLWB in Cannon Lake	Craig Topper	2018-02-21	1	-1/+2
	Cannon Lake does not support CLWB, therefore it does not include all features listed under SKX. Patch by Gabor Buella Differential Revision: https://reviews.llvm.org/D43459 llvm-svn: 325655
*	[mips] Spectre variant two mitigation for MIPSR2	Simon Dardis	2018-02-21	1	-1/+5
	This patch provides mitigation for CVE-2017-5715, Spectre variant two, which affects the P5600 and P6600. It provides the option -mindirect-jump=hazard, which instructs the LLVM backend to replace indirect branches with their hazard barrier variants. This option is accepted when targeting MIPS revision two or later. The mitigation strategy suggested by MIPS for these processors is to use two hazard barrier instructions. 'jalr.hb' and 'jr.hb' are hazard barrier variants of the 'jalr' and 'jr' instructions respectively. These instructions impede the execution of the instruction stream until architecturally defined hazards (changes to the instruction stream, privileged registers which may affect execution) are cleared. These instructions in MIPS' designs are not speculated past. These instructions are used with the option -mindirect-jump=hazard when branching indirectly and for indirect function calls. These instructions are defined by the MIPS32R2 ISA, so this mitigation method is not compatible with processors which implement an earlier revision of the MIPS ISA. Implementation note: I've opted to provide this as an -mindirect-jump={hazard,...} style option in case alternative mitigation methods are required for other implementations of the MIPS ISA in future, e.g. retpoline style solutions. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D43487 llvm-svn: 325651
*	[AVR] Set the program address space in the data layout	Dylan McKay	2018-02-19	1	-1/+1
	This is accompanied by r325481 in LLVM. llvm-svn: 325483
*	[X86] Add 'sahf' CPU feature to frontend	Dimitry Andric	2018-02-17	2	-0/+12
	Summary: Make clang accept `-msahf` (and `-mno-sahf`) flags to activate the `+sahf` feature for the backend, for bug 36028 (Incorrect use of pushf/popf enables/disables interrupts on amd64 kernels). This was originally submitted in bug 36037 by Jonathan Looney <jonlooney@gmail.com>. As described there, GCC also uses `-msahf` for this feature, and the backend already recognizes the `+sahf` feature. All that is needed is to teach clang to pass this on to the backend. The mapping of feature support onto CPUs may not be complete; rather, it was chosen to match LLVM's idea of which CPUs support this feature (see lib/Target/X86/X86.td). I also updated the affected test case (CodeGen/attr-target-x86.c) to match the emitted output. Reviewers: craig.topper, coby, efriedma, rsmith Reviewed By: craig.topper Subscribers: emaste, cfe-commits Differential Revision: https://reviews.llvm.org/D43394 llvm-svn: 325446
*	Reapply r325193	Konstantin Zhuravlyov	2018-02-15	2	-91/+101
	llvm-svn: 325203
*	Revert r325193 as it breaks buildbots	Konstantin Zhuravlyov	2018-02-15	2	-101/+91
	llvm-svn: 325200
*	Add missing definition for class static after r325193.	Richard Smith	2018-02-15	1	-1/+1
	llvm-svn: 325195
*	AMDGPU: Cleanup most of the macros	Konstantin Zhuravlyov	2018-02-15	2	-91/+101
	- Insert __AMD__ macro
	- Insert __AMDGPU__ macro
	- Insert __devicename__ macro
	- Add missing tests for arch macros
	Differential Revision: https://reviews.llvm.org/D36802 llvm-svn: 325193
*	[AMDGPU] Change constant addr space to 4	Yaxun Liu	2018-02-13	1	-9/+9
	Differential Revision: https://reviews.llvm.org/D43171 llvm-svn: 325031
*	AMDGPU: Update for datalayout change	Matt Arsenault	2018-02-09	1	-3/+3
	llvm-svn: 324748
*	AMDGPU/GCN: Bring processors in sync with AMDGPUUsage	Konstantin Zhuravlyov	2018-02-09	1	-28/+42
	- Remove gfx800
	- Remove gfx804
	- Remove gfx901
	- Remove gfx903
	Differential Revision: https://reviews.llvm.org/D40045 llvm-svn: 324714
*	Fix UBSan issue with PPC::isValidCPUName	Erich Keane	2018-02-09	1	-2/+1
	Apparently storing the pointer to a StringLiteral as a StringRef caused this section of code to issue a ubsan warning. This will hopefully fix that. llvm-svn: 324687
*	Add size to constexpr Arrays	Erich Keane	2018-02-08	1	-2/+2
	Due to what seems to be a bug in older versions of MSVC, constexpr member arrays with a redefinition (to force emission) require their initial definition to have the size between the brackets. llvm-svn: 324682
*	Add Rest of Targets Support to ValidCPUList (enabling march notes)	Erich Keane	2018-02-08	21	-269/+307
	A followup to: https://reviews.llvm.org/D42978 Most of the rest of the Targets were pretty rote, so this patch knocks them all out at once. Differential Revision: https://reviews.llvm.org/D43057 llvm-svn: 324676
*	Add NVPTX Support to ValidCPUList (enabling march notes)	Erich Keane	2018-02-08	2	-0/+8
	A followup to: https://reviews.llvm.org/D42978 This patch adds NVPTX support for enabling the march notes. Differential Revision: https://reviews.llvm.org/D43045 llvm-svn: 324675
*	Add X86 Support to ValidCPUList (enabling march notes)	Erich Keane	2018-02-08	2	-2/+15
	A followup to: https://reviews.llvm.org/D42978 This patch adds X86 and X86_64 support for enabling the march notes. Differential Revision: https://reviews.llvm.org/D43041 llvm-svn: 324674
*	Make march/target-cpu print a note with the list of valid values for ARM	Erich Keane	2018-02-08	4	-0/+12
	When rejecting a march= or target-cpu command line parameter, the message is quite lacking. This patch adds a note that prints all possible values for the current target, if the target supports it. This adds support for the ARM/AArch64 targets (more to come!). Differential Revision: https://reviews.llvm.org/D42978 llvm-svn: 324673
*	[NFCi] Replace a couple of usages of const StringRef& with StringRef	Erich Keane	2018-02-07	3	-4/+4
	No sense passing these by reference when a copy is about as free, and saves on potential indirection later. llvm-svn: 324540
*	[Myriad] Define __ma2x5x and __ma2x8x	Walter Lee	2018-02-06	1	-0/+7
	Summary: Add architecture defines for ma2x5x and ma2x8x. Reviewers: jyknight Subscribers: fedor.sergeev, MartinO Differential Revision: https://reviews.llvm.org/D42882 llvm-svn: 324420
*	[AMDGPU] Switch to the new addr space mapping by default	Yaxun Liu	2018-02-02	2	-5/+2
	This requires corresponding llvm change. Differential Revision: https://reviews.llvm.org/D40956 llvm-svn: 324102
*	[CUDA] Added partial support for CUDA-9.1	Artem Belevich	2018-01-30	1	-0/+2
	Clang can use CUDA-9.1 now, though new APIs are not implemented yet. The major change is that headers in CUDA-9.1 went through substantial changes that started in CUDA-9.0 which required substantial changes in the cuda compatibility headers provided by clang. There are two major issues:
	* CUDA SDK no longer provides declarations for libdevice functions.
	* A lot of device-side functions have become nvcc's builtins and CUDA headers no longer contain their implementations.
This patch changes the way CUDA headers are handled if we compile with CUDA 9.x. Both 9.0 and 9.1 are affected.
	* Clang provides its own declarations of libdevice functions.
	* For CUDA-9.x clang now provides implementation of device-side 'standard library' functions using libdevice.
This patch should not affect compilation with CUDA-8. There may be some observable differences for CUDA-9.0, though they are not expected to affect functionality. Tested: CUDA test-suite tests for all supported combinations of: CUDA: 7.0,7.5,8.0,9.0,9.1 GPU: sm_20, sm_35, sm_60, sm_70 Differential Revision: https://reviews.llvm.org/D42513 llvm-svn: 323713
*	[X86] Add 'rdrnd' feature to silvermont to match recent gcc bug fix.	Craig Topper	2018-01-26	1	-1/+1
	gcc recently fixed this bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83546 llvm-svn: 323552
*	[X86] Define __IBT__ when -mibt is specified.	Craig Topper	2018-01-26	1	-0/+2
	llvm-svn: 323543
*	Adjust MaxAtomicInlineWidth for i386/i486 targets.	Wei Mi	2018-01-23	1	-3/+6
	This is to fix the bug reported in https://bugs.llvm.org/show_bug.cgi?id=34347#c6. Currently, MaxAtomicInlineWidth is set to 64 for all x86-32 targets. However, i386 doesn't support any cmpxchg-related instructions, and i486 only supports cmpxchg. So in this patch MaxAtomicInlineWidth is reset as follows: For i386, MaxAtomicInlineWidth should be 0 because no cmpxchg is supported. For i486, MaxAtomicInlineWidth should be 32 because it supports cmpxchg. For other 32-bit x86 CPUs, MaxAtomicInlineWidth should be 64 because of cmpxchg8b. Differential Revision: https://reviews.llvm.org/D42154 llvm-svn: 323281
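The mapping described in this commit can be summarized as a small function. This is a minimal sketch under stated assumptions: the `X86CPU` enumeration and the `maxAtomicInlineWidth` helper are hypothetical names for illustration, not the code the patch actually adds to clang's X86 target info.

```cpp
// Hypothetical CPU kinds for 32-bit x86; clang's real enumeration differs.
enum class X86CPU { i386, i486, Other32 };

// Widest atomic operation, in bits, that can be inlined for the given CPU.
constexpr unsigned maxAtomicInlineWidth(X86CPU CPU) {
  if (CPU == X86CPU::i386)
    return 0;  // no cmpxchg-style instruction at all
  if (CPU == X86CPU::i486)
    return 32; // cmpxchg only
  return 64;   // cmpxchg8b is available
}
```

A width of 0 means no atomic operation can be inlined on that CPU, so such operations need some other implementation (typically a library call).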
*	[WebAssembly] Factor out settings common to wasm32 and wasm64. NFC.	Dan Gohman	2018-01-23	1	-2/+1
	MaxAtomicPromoteWidth and MaxAtomicInlineWidth are 64 on both wasm32 and wasm64, so they can be set in shared code. llvm-svn: 323253
*	Introduce the "retpoline" x86 mitigation technique for variant #2 of the ↵	Chandler Carruth	2018-01-22	2	-0/+8
	speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre.

Summary: First, we need to explain the core of the vulnerability. Note that this is a very incomplete description, please see the Project Zero blog post for details: https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html

The basis for branch target injection is to direct speculative execution of the processor to some "gadget" of executable code by poisoning the prediction of indirect branches with the address of that gadget. The gadget in turn contains an operation that provides a side channel for reading data. Most commonly, this will look like a load of secret data followed by a branch on the loaded value and then a load of some predictable cache line. The attacker then uses timing of the processor's cache to determine which direction the branch took in the speculative execution, and in turn what one bit of the loaded value was. Due to the nature of these timing side channels and the branch predictor on Intel processors, this allows an attacker to leak data only accessible to a privileged domain (like the kernel) back into an unprivileged domain.

The goal is simple: avoid generating code which contains an indirect branch that could have its prediction poisoned by an attacker. In many cases, the compiler can simply use directed conditional branches and a small search tree. LLVM already has support for lowering switches in this way and the first step of this patch is to disable jump-table lowering of switches and introduce a pass to rewrite explicit indirectbr sequences into a switch over integers.

However, there is no fully general alternative to indirect calls. We introduce a new construct we call a "retpoline" to implement indirect calls in a non-speculatable way. It can be thought of loosely as a trampoline for indirect calls which uses the RET instruction on x86. Further, we arrange for a specific call->ret sequence which ensures the processor predicts the return to go to a controlled, known location. The retpoline then "smashes" the return address pushed onto the stack by the call with the desired target of the original indirect call. The result is a predicted return to the next instruction after a call (which can be used to trap speculative execution within an infinite loop) and an actual indirect branch to an arbitrary address.

On 64-bit x86 ABIs, this is especially easily done in the compiler by using a guaranteed scratch register to pass the target into this device. For 32-bit ABIs there isn't a guaranteed scratch register and so several different retpoline variants are introduced to use a scratch register if one is available in the calling convention and to otherwise use direct stack push/pop sequences to pass the target address.

This "retpoline" mitigation is fully described in the following blog post: https://support.google.com/faqs/answer/7625886

We also support a target feature that disables emission of the retpoline thunk by the compiler to allow for custom thunks if users want them. These are particularly useful in environments like kernels that routinely do hot-patching on boot and want to hot-patch their thunk to different code sequences. They can write this custom thunk and use `-mretpoline-external-thunk` in addition to `-mretpoline`. In this case, on x86-64 the thunk names must be:
```
__llvm_external_retpoline_r11
```
or on 32-bit:
```
__llvm_external_retpoline_eax
__llvm_external_retpoline_ecx
__llvm_external_retpoline_edx
__llvm_external_retpoline_push
```
And the target of the retpoline is passed in the named register, or in the case of the `push` suffix on the top of the stack via a `pushl` instruction.

There is one other important source of indirect branches in x86 ELF binaries: the PLT. These patches also include support for LLD to generate PLT entries that perform a retpoline-style indirection.

The only other indirect branches remaining that we are aware of are from precompiled runtimes (such as crt0.o and similar). The ones we have found are not really attackable, and so we have not focused on them here, but eventually these runtimes should also be replicated for retpoline-ed configurations for completeness.

For kernels or other freestanding or fully static executables, the compiler switch `-mretpoline` is sufficient to fully mitigate this particular attack. For dynamic executables, you must compile all libraries with `-mretpoline` and additionally link the dynamic executable and all shared libraries with LLD and pass `-z retpolineplt` (or use similar functionality from some other linker). We strongly recommend also using `-z now` as non-lazy binding allows the retpoline-mitigated PLT to be substantially smaller.

When manually applying transformations similar to `-mretpoline` to the Linux kernel we observed very small performance hits to applications running typical workloads, and relatively minor hits (approximately 2%) even for extremely syscall-heavy applications. This is largely due to the small number of indirect branches that occur in performance sensitive paths of the kernel.

When using these patches on statically linked applications, especially C++ applications, you should expect to see a much more dramatic performance hit. For microbenchmarks that are switch-, indirect-, or virtual-call heavy we have seen overheads ranging from 10% to 50%.

However, real-world workloads exhibit substantially lower performance impact. Notably, techniques such as PGO and ThinLTO dramatically reduce the impact of hot indirect calls (by speculatively promoting them to direct calls) and allow optimized search trees to be used to lower switches. If you need to deploy these techniques in C++ applications, we strongly recommend that you ensure all hot call targets are statically linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well tuned servers using all of these techniques saw 5% - 10% overhead from the use of retpoline.

We will add detailed documentation covering these components in subsequent patches, but wanted to make the core functionality available as soon as possible. Happy for more code review, but we'd really like to get these patches landed and backported ASAP for obvious reasons. We're planning to backport this to both 6.0 and 5.0 release streams and get a 5.0 release with just this cherry picked ASAP for distros and vendors.

This patch is the work of a number of people over the past month: Eric, Reid, Rui, and myself. I'm mailing it out as a single commit due to the time sensitive nature of landing this and the need to backport it. Huge thanks to everyone who helped out here, and everyone at Intel who helped out in discussions about how to craft this. Also, credit goes to Paul Turner (at Google, but not an LLVM contributor) for much of the underlying retpoline design.

Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D41723 llvm-svn: 323155
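The commit message's point about switches (direct conditional branches arranged as a small search tree, instead of a jump table reached through one indirect branch) can be pictured at source level as below. This is only an analogy: the real transformation happens inside LLVM's switch lowering, not in user code, and the handler functions, their return values, and `dispatch` are hypothetical names invented for this sketch.

```cpp
// Hypothetical handlers standing in for the bodies of four switch cases.
constexpr int handle0() { return 10; }
constexpr int handle1() { return 11; }
constexpr int handle2() { return 12; }
constexpr int handle3() { return 13; }

// A jump table would dispatch with a single indirect branch through a table
// of addresses. Here the same dispatch is written as a two-level binary
// search of compares, so every branch and every call target is direct.
constexpr int dispatch(int Op) {
  if (Op < 2) {
    if (Op == 0)
      return handle0();
    if (Op == 1)
      return handle1();
  } else {
    if (Op == 2)
      return handle2();
    if (Op == 3)
      return handle3();
  }
  return -1; // default case: no handler for this value
}
```

The tree costs a few extra compares per dispatch, which is consistent with the commit's note that switch-heavy microbenchmarks see noticeable overhead.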
*	[X86] Add rdpid command line option and intrinsics.	Craig Topper	2018-01-20	2	-0/+8
	Summary: This patch adds -mrdpid/-mno-rdpid and the rdpid intrinsic. The corresponding LLVM commit has already been made. Reviewers: RKSimon, spatel, zvi, AndreiGrischenko Reviewed By: RKSimon Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D42272 llvm-svn: 323047
*	[X86] Put the code that defines __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 for the ↵	Craig Topper	2018-01-20	1	-2/+2
	preprocessor with the other __GCC_HAVE_SYNC_COMPARE_AND_SWAP_* defines. NFC llvm-svn: 323046