path: root/clang/lib/Basic
Commit message (Author, Age, Files, Lines)
...
* [NFCi] Replace a couple of usages of const StringRef& with StringRef (Erich Keane, 2018-02-07, 3 files, -4/+4)
| | | | | | | No sense passing these by reference when a copy is about as free, and saves on potential indirection later. llvm-svn: 324540
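The by-value change above can be pictured with a minimal sketch. It uses `std::string_view` as a stand-in for `llvm::StringRef` (an assumption made so the example needs no LLVM headers; both are a non-owning pointer-plus-length pair), and the function names are hypothetical.

```cpp
#include <cstddef>
#include <string_view>

// Before: the callee receives the address of the view and has to load the
// pointer and length through that reference.
inline std::size_t countDotsByRef(const std::string_view &Name) {
  std::size_t Count = 0;
  for (char C : Name)
    if (C == '.')
      ++Count;
  return Count;
}

// After: the two-word view is copied, which on common ABIs means it is
// passed in registers, so there is no extra indirection in the callee.
inline std::size_t countDotsByValue(std::string_view Name) {
  std::size_t Count = 0;
  for (char C : Name)
    if (C == '.')
      ++Count;
  return Count;
}
```

Both functions return the same result for the same input; only the calling convention differs.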
* [Myriad] Define __ma2x5x and __ma2x8x (Walter Lee, 2018-02-06, 1 file, -0/+7)
| | | | | | | | | | | | Summary: Add architecture defines for ma2x5x and ma2x8x. Reviewers: jyknight Subscribers: fedor.sergeev, MartinO Differential Revision: https://reviews.llvm.org/D42882 llvm-svn: 324420
* [RISCV] Create a LinuxTargetInfo when targeting Linux (Alex Bradbury, 2018-02-03, 1 file, -0/+6)
| Previously, RISCV32TargetInfo or RISCV64TargetInfo were created unconditionally. Use LinuxTargetInfo<RISCV??TargetInfo> to ensure that the proper OS-specific defines are present.
|
| This patch only adds logic to instantiate LinuxTargetInfo and leaves a TODO, as I'm reluctant to add logic for other targets (e.g. FreeBSD, RTEMS) until I've produced and tested at least one binary for that OS+target combo.
|
| Thanks to @mgrang for reporting the issue.
|
| llvm-svn: 324170
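The wrapping described above, where an OS layer is templated over an architecture layer, can be sketched roughly as follows. Every type and function name here is a simplified, hypothetical stand-in, not clang's real `TargetInfo` hierarchy, and the define strings are only illustrative.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical architecture layer: contributes architecture defines only.
struct SketchRISCV32TargetInfo {
  virtual ~SketchRISCV32TargetInfo() = default;
  virtual std::vector<std::string> getTargetDefines() const {
    return {"__riscv"};
  }
};

// Hypothetical OS layer: wraps any architecture layer and appends the
// OS-specific defines on top of whatever the architecture provides.
template <typename ArchTarget>
struct SketchLinuxTargetInfo : ArchTarget {
  std::vector<std::string> getTargetDefines() const override {
    std::vector<std::string> Defines = ArchTarget::getTargetDefines();
    Defines.push_back("__linux__");
    return Defines;
  }
};

enum class SketchOS { Unknown, Linux };

// Factory mirroring the commit's idea: wrap for Linux, otherwise fall back
// to the bare architecture target (other OSes are left as a TODO).
inline std::unique_ptr<SketchRISCV32TargetInfo>
makeSketchRISCV32Target(SketchOS OS) {
  if (OS == SketchOS::Linux)
    return std::make_unique<SketchLinuxTargetInfo<SketchRISCV32TargetInfo>>();
  return std::make_unique<SketchRISCV32TargetInfo>();
}
```

Calling `makeSketchRISCV32Target(SketchOS::Linux)->getTargetDefines()` yields both the architecture and the OS define, while the fallback yields only the architecture define.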
* [AMDGPU] Switch to the new addr space mapping by default (Yaxun Liu, 2018-02-02, 2 files, -5/+2)
| | | | | | | | This requires corresponding llvm change. Differential Revision: https://reviews.llvm.org/D40956 llvm-svn: 324102
* [CUDA] Added partial support for CUDA-9.1 (Artem Belevich, 2018-01-30, 2 files, -0/+14)
| Clang can use CUDA-9.1 now, though new APIs are not implemented yet.
|
| The major change is that headers in CUDA-9.1 went through substantial changes that started in CUDA-9.0, which required substantial changes in the CUDA compatibility headers provided by clang.
|
| There are two major issues:
| * CUDA SDK no longer provides declarations for libdevice functions.
| * A lot of device-side functions have become nvcc's builtins and CUDA headers no longer contain their implementations.
|
| This patch changes the way CUDA headers are handled if we compile with CUDA 9.x. Both 9.0 and 9.1 are affected.
| * Clang provides its own declarations of libdevice functions.
| * For CUDA-9.x clang now provides implementation of device-side 'standard library' functions using libdevice.
|
| This patch should not affect compilation with CUDA-8. There may be some observable differences for CUDA-9.0, though they are not expected to affect functionality.
|
| Tested: CUDA test-suite tests for all supported combinations of:
|   CUDA: 7.0, 7.5, 8.0, 9.0, 9.1
|   GPU: sm_20, sm_35, sm_60, sm_70
|
| Differential Revision: https://reviews.llvm.org/D42513
| llvm-svn: 323713
* [X86] Add 'rdrnd' feature to silvermont to match recent gcc bug fix. (Craig Topper, 2018-01-26, 1 file, -1/+1)
| | | | | | gcc recently fixed this bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83546 llvm-svn: 323552
* [X86] Define __IBT__ when -mibt is specified. (Craig Topper, 2018-01-26, 1 file, -0/+2)
| | | | llvm-svn: 323543
* Adjust MaxAtomicInlineWidth for i386/i486 targets. (Wei Mi, 2018-01-23, 1 file, -3/+6)
| This is to fix the bug reported in https://bugs.llvm.org/show_bug.cgi?id=34347#c6.
|
| Currently, MaxAtomicInlineWidth is set to 64 for all x86-32 targets. However, i386 doesn't support any cmpxchg-related instructions, and i486 only supports cmpxchg. So in this patch MaxAtomicInlineWidth is reset as follows:
| * For i386, MaxAtomicInlineWidth should be 0 because no cmpxchg is supported.
| * For i486, MaxAtomicInlineWidth should be 32 because it supports cmpxchg.
| * For other 32-bit x86 CPUs, MaxAtomicInlineWidth should be 64 because of cmpxchg8b.
|
| Differential Revision: https://reviews.llvm.org/D42154
| llvm-svn: 323281
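The three cases listed above amount to a small lookup. A minimal sketch, assuming a hypothetical `SketchX86CPU` enum and function name rather than clang's actual CPU kinds:

```cpp
// Hypothetical CPU classification for 32-bit x86; not clang's real enum.
enum class SketchX86CPU { I386, I486, Other32Bit };

// Widest atomic operation (in bits) that can be inlined without a libcall,
// following the rules stated in the commit message.
constexpr unsigned sketchMaxAtomicInlineWidth(SketchX86CPU CPU) {
  switch (CPU) {
  case SketchX86CPU::I386:
    return 0;  // no cmpxchg at all
  case SketchX86CPU::I486:
    return 32; // cmpxchg, but no cmpxchg8b
  case SketchX86CPU::Other32Bit:
    return 64; // cmpxchg8b is available
  }
  return 64;
}
```

A width of 0 means every atomic operation on that CPU has to go through a library call.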
* [WebAssembly] Factor out settings common to wasm32 and wasm64. NFC. (Dan Gohman, 2018-01-23, 1 file, -2/+1)
| | | | | | | MaxAtomicPromoteWidth and MaxAtomicInlineWidth are 64 on both wasm32 and wasm64, so they can be set in shared code. llvm-svn: 323253
* Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre. (Chandler Carruth, 2018-01-22, 2 files, -0/+8)
| Summary:
| First, we need to explain the core of the vulnerability. Note that this is a very incomplete description, please see the Project Zero blog post for details:
| https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
|
| The basis for branch target injection is to direct speculative execution of the processor to some "gadget" of executable code by poisoning the prediction of indirect branches with the address of that gadget. The gadget in turn contains an operation that provides a side channel for reading data. Most commonly, this will look like a load of secret data followed by a branch on the loaded value and then a load of some predictable cache line. The attacker then uses timing of the processor's cache to determine which direction the branch took *in the speculative execution*, and in turn what one bit of the loaded value was. Due to the nature of these timing side channels and the branch predictor on Intel processors, this allows an attacker to leak data only accessible to a privileged domain (like the kernel) back into an unprivileged domain.
|
| The goal is simple: avoid generating code which contains an indirect branch that could have its prediction poisoned by an attacker. In many cases, the compiler can simply use directed conditional branches and a small search tree. LLVM already has support for lowering switches in this way and the first step of this patch is to disable jump-table lowering of switches and introduce a pass to rewrite explicit indirectbr sequences into a switch over integers.
|
| However, there is no fully general alternative to indirect calls. We introduce a new construct we call a "retpoline" to implement indirect calls in a non-speculatable way. It can be thought of loosely as a trampoline for indirect calls which uses the RET instruction on x86. Further, we arrange for a specific call->ret sequence which ensures the processor predicts the return to go to a controlled, known location. The retpoline then "smashes" the return address pushed onto the stack by the call with the desired target of the original indirect call. The result is a predicted return to the next instruction after a call (which can be used to trap speculative execution within an infinite loop) and an actual indirect branch to an arbitrary address.
|
| On 64-bit x86 ABIs, this is especially easily done in the compiler by using a guaranteed scratch register to pass the target into this device. For 32-bit ABIs there isn't a guaranteed scratch register and so several different retpoline variants are introduced to use a scratch register if one is available in the calling convention and to otherwise use direct stack push/pop sequences to pass the target address.
|
| This "retpoline" mitigation is fully described in the following blog post: https://support.google.com/faqs/answer/7625886
|
| We also support a target feature that disables emission of the retpoline thunk by the compiler to allow for custom thunks if users want them. These are particularly useful in environments like kernels that routinely do hot-patching on boot and want to hot-patch their thunk to different code sequences. They can write this custom thunk and use `-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this case, on x86-64 the thunk names must be:
| ```
| __llvm_external_retpoline_r11
| ```
| or on 32-bit:
| ```
| __llvm_external_retpoline_eax
| __llvm_external_retpoline_ecx
| __llvm_external_retpoline_edx
| __llvm_external_retpoline_push
| ```
| And the target of the retpoline is passed in the named register, or in the case of the `push` suffix on the top of the stack via a `pushl` instruction.
|
| There is one other important source of indirect branches in x86 ELF binaries: the PLT. These patches also include support for LLD to generate PLT entries that perform a retpoline-style indirection.
|
| The only other indirect branches remaining that we are aware of are from precompiled runtimes (such as crt0.o and similar). The ones we have found are not really attackable, and so we have not focused on them here, but eventually these runtimes should also be replicated for retpoline-ed configurations for completeness.
|
| For kernels or other freestanding or fully static executables, the compiler switch `-mretpoline` is sufficient to fully mitigate this particular attack. For dynamic executables, you must compile *all* libraries with `-mretpoline` and additionally link the dynamic executable and all shared libraries with LLD and pass `-z retpolineplt` (or use similar functionality from some other linker). We strongly recommend also using `-z now` as non-lazy binding allows the retpoline-mitigated PLT to be substantially smaller.
|
| When manually applying similar transformations to `-mretpoline` to the Linux kernel we observed very small performance hits to applications running typical workloads, and relatively minor hits (approximately 2%) even for extremely syscall-heavy applications. This is largely due to the small number of indirect branches that occur in performance sensitive paths of the kernel.
|
| When using these patches on statically linked applications, especially C++ applications, you should expect to see a much more dramatic performance hit. For microbenchmarks that are switch, indirect-, or virtual-call heavy we have seen overheads ranging from 10% to 50%.
|
| However, real-world workloads exhibit substantially lower performance impact. Notably, techniques such as PGO and ThinLTO dramatically reduce the impact of hot indirect calls (by speculatively promoting them to direct calls) and allow optimized search trees to be used to lower switches. If you need to deploy these techniques in C++ applications, we *strongly* recommend that you ensure all hot call targets are statically linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well tuned servers using all of these techniques saw 5% - 10% overhead from the use of retpoline.
|
| We will add detailed documentation covering these components in subsequent patches, but wanted to make the core functionality available as soon as possible. Happy for more code review, but we'd really like to get these patches landed and backported ASAP for obvious reasons. We're planning to backport this to both 6.0 and 5.0 release streams and get a 5.0 release with just this cherry picked ASAP for distros and vendors.
|
| This patch is the work of a number of people over the past month: Eric, Reid, Rui, and myself. I'm mailing it out as a single commit due to the time sensitive nature of landing this and the need to backport it. Huge thanks to everyone who helped out here, and everyone at Intel who helped out in discussions about how to craft this. Also, credit goes to Paul Turner (at Google, but not an LLVM contributor) for much of the underlying retpoline design.
|
| Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer
| Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits
| Differential Revision: https://reviews.llvm.org/D41723
| llvm-svn: 323155
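The remark above about speculatively promoting hot indirect calls to direct calls can be illustrated by hand. The sketch below is only a source-level picture of what a profile-guided optimizer might do; the handler names are hypothetical, and it does not show the retpoline thunk itself.

```cpp
using SketchHandler = int (*)(int);

inline int sketchHotHandler(int X) { return X + 1; }
inline int sketchColdHandler(int X) { return X * 2; }

// Promoted form of "return H(X);": the likely target is checked with an
// ordinary conditional branch and called directly. Only the fallback path
// keeps an indirect call, which is the part a retpoline would have to cover.
inline int sketchDispatchPromoted(SketchHandler H, int X) {
  if (H == &sketchHotHandler)
    return sketchHotHandler(X); // direct call, no indirect branch
  return H(X);                  // residual indirect call
}
```

When the profile is accurate, most calls take the direct path, which is consistent with the lower overheads the message reports for builds that use PGO and ThinLTO.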
* [X86] Add rdpid command line option and intrinsics. (Craig Topper, 2018-01-20, 2 files, -0/+8)
| | | | | | | | | | | | | | Summary: This patch adds -mrdpid/-mno-rdpid and the rdpid intrinsic. The corresponding LLVM commit has already been made. Reviewers: RKSimon, spatel, zvi, AndreiGrischenko Reviewed By: RKSimon Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D42272 llvm-svn: 323047
* [X86] Put the code that defines __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 for the preprocessor with the other __GCC_HAVE_SYNC_COMPARE_AND_SWAP_* defines. NFC (Craig Topper, 2018-01-20, 1 file, -2/+2)
| llvm-svn: 323046
* [AArch64] Add ARMv8.2-A FP16 scalar intrinsics (Abderrazek Zaafrani, 2018-01-19, 1 file, -0/+2)
| | | | | | https://reviews.llvm.org/D41792 llvm-svn: 323006
* [WebAssembly] Add target flags for sign-ext opcodes. (Dan Gohman, 2018-01-19, 2 files, -1/+13)
| | | | | | | Add -msign-ext and -mno-sign-ext to control the new sign-ext target feature. llvm-svn: 322967
* Make DiagnosticsEngine() take DiagOpts as DiagnosticsEngine. (Nico Weber, 2018-01-17, 1 file, -5/+5)
| | | | | | | No behavior change, but makes it a bit clearer that DiagnosticsEngine adds a ref to DiagOpts. llvm-svn: 322611
* [SystemZ] Support vector registers with inline asm (Ulrich Weigand, 2018-01-16, 2 files, -3/+21)
| | | | | | | Allow using vector register names and the "v" constraint in inline asm to ensure compatibility with GCC. llvm-svn: 322562
* [OPENMP] Initial codegen for `target teams distribute parallel for simd`. (Alexey Bataev, 2018-01-15, 1 file, -1/+1)
| Added host codegen + codegen for devices with default codegen for `#pragma omp target teams distribute parallel for simd` directive.
|
| llvm-svn: 322515
* [OPENMP] Add codegen for `depend` clauses on `target` directive. (Alexey Bataev, 2018-01-15, 1 file, -0/+4)
| | | | | | | Added basic support for codegen of `depend` clauses on `target` directive. llvm-svn: 322501
* [RISCV] Add the RISCV target and compiler driver (Alex Bradbury, 2018-01-11, 4 files, -0/+162)
| | | | | | | | | As RV64 codegen has not yet been upstreamed into LLVM, we focus on RV32 driver support (RV64 to follow). Differential Revision: https://reviews.llvm.org/D39963 llvm-svn: 322276
* [X86] Make -mavx512f imply -mfma and -mf16c in the frontend like it does in the backend. (Craig Topper, 2018-01-11, 1 file, -1/+5)
| Similarly, make -mno-fma and -mno-f16c imply -mno-avx512f.
|
| Without this, "-mno-sse -mavx512f" ends up with avx512f being enabled in the frontend but disabled in the backend.
|
| llvm-svn: 322245
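The implication rules above can be sketched as a tiny feature-map update. The map type and function name are hypothetical, and the sketch covers only the two rules named in the message, not the rest of the x86 feature hierarchy.

```cpp
#include <map>
#include <string>

using SketchFeatureMap = std::map<std::string, bool>;

// Set one feature and apply the implications from the commit message:
// enabling avx512f enables fma and f16c; disabling fma or f16c disables
// avx512f so that frontend and backend agree.
inline void sketchSetFeature(SketchFeatureMap &Features,
                             const std::string &Name, bool Enabled) {
  Features[Name] = Enabled;
  if (Name == "avx512f" && Enabled) {
    Features["fma"] = true;
    Features["f16c"] = true;
  } else if ((Name == "fma" || Name == "f16c") && !Enabled) {
    Features["avx512f"] = false;
  }
}
```

Applying the flags in command-line order with this helper keeps the dependent features in step with each other.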
* Added Control Flow Protection Flag (Oren Ben Simhon, 2018-01-09, 2 files, -0/+26)
| | | | | | | | | | Cf-protection is a target independent flag that instructs the back-end to instrument control flow mechanisms like: Branch, Return, etc. For example in X86 this flag will be used to instrument Indirect Branch Tracking instructions. Differential Revision: https://reviews.llvm.org/D40478 Change-Id: I5126e766c0e6b84118cae0ee8a20fe78cc373dea llvm-svn: 322063
* Implement Attribute Target MultiVersioning (Erich Keane, 2018-01-08, 2 files, -0/+65)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | GCC's attribute 'target', in addition to being an optimization hint, also allows function multiversioning. We currently have the former implemented, this is the latter's implementation. This works by enabling functions with the same name/signature to coexist, so that they can all be emitted. Multiversion state is stored in the FunctionDecl itself, and SemaDecl manages the definitions. Note that it ends up having to permit redefinition of functions so that they can all be emitted. Additionally, all versions of the function must be emitted, so this also manages that. Note that this includes some additional rules that GCC does not, since defining something as a MultiVersion function after a usage has been made illegal. The only 'history rewriting' that happens is if a function is emitted before it has been converted to a multiversion'ed function, at which point its name needs to be changed. Function templates and virtual functions are NOT yet supported (not supported in GCC either). Additionally, constructors/destructors are disallowed, but the former is planned. llvm-svn: 322028
* Fix TLS support check for Darwin 32-bit simulator targets. (Volodymyr Sapsai, 2018-01-05, 1 file, -9/+15)
| | | | | | | | | | | | | | | | | Also instead of checking architecture explicitly, use recently added "simulator" environment in the triple. rdar://problem/35083787 Reviewers: arphaman, bob.wilson Reviewed By: arphaman Subscribers: gparker42, cfe-commits Differential Revision: https://reviews.llvm.org/D41750 llvm-svn: 321890
* Reapply r321781: [Modules] Allow modules specified by -fmodule-map-file to shadow implicitly found ones (Bruno Cardoso Lopes, 2018-01-05, 1 file, -1/+6)
| When modules come from module map files explicitly specified by -fmodule-map-file= arguments, allow those to override/shadow modules with the same name that are found implicitly by header search. If such a module is looked up by name (e.g. @import), we will always find the one from -fmodule-map-file. If we try to use a shadowed module by including one of its headers, report an error.
|
| This enables developers to force use of a specific copy of their module if there are multiple copies that would otherwise be visible, for example if they develop modules that are installed in the default search paths.
|
| Patch originally by Ben Langmuir, http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20151116/143425.html
|
| Based on cfe-dev discussion: http://lists.llvm.org/pipermail/cfe-dev/2015-November/046164.html
|
| Differential Revision: https://reviews.llvm.org/D31269
| rdar://problem/23612102
| llvm-svn: 321855
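The lookup rule above, where explicitly mapped modules win by name and the implicit copy becomes shadowed, can be sketched as follows. `SketchModuleIndex` and its members are hypothetical and much simpler than clang's module map machinery; paths are plain strings for illustration.

```cpp
#include <map>
#include <optional>
#include <string>

struct SketchModuleIndex {
  // Module name -> module map path given via -fmodule-map-file=.
  std::map<std::string, std::string> Explicit;
  // Module name -> module map path discovered by header search.
  std::map<std::string, std::string> Implicit;

  // Lookup by name (e.g. @import): the explicit entry always wins.
  std::optional<std::string> lookupByName(const std::string &Name) const {
    if (auto It = Explicit.find(Name); It != Explicit.end())
      return It->second;
    if (auto It = Implicit.find(Name); It != Implicit.end())
      return It->second;
    return std::nullopt;
  }

  // True when an implicit copy exists but is hidden by an explicit one;
  // including a header from that hidden copy is what should be diagnosed.
  bool isShadowed(const std::string &Name) const {
    return Explicit.count(Name) != 0 && Implicit.count(Name) != 0;
  }
};
```

A caller would consult `isShadowed` when a header resolves to the implicit copy, and report an error instead of silently mixing the two copies.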
* Revert "[Modules] Allow modules specified by -fmodule-map-file to shadow implicitly found ones" (Bruno Cardoso Lopes, 2018-01-04, 1 file, -6/+1)
| This reverts r321781 until I fix the leaks pointed out by bots:
| http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/12146
| http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/3741
|
| llvm-svn: 321786
* [Modules] Allow modules specified by -fmodule-map-file to shadow implicitly found ones (Bruno Cardoso Lopes, 2018-01-04, 1 file, -1/+6)
| When modules come from module map files explicitly specified by -fmodule-map-file= arguments, allow those to override/shadow modules with the same name that are found implicitly by header search. If such a module is looked up by name (e.g. @import), we will always find the one from -fmodule-map-file. If we try to use a shadowed module by including one of its headers, report an error.
|
| This enables developers to force use of a specific copy of their module if there are multiple copies that would otherwise be visible, for example if they develop modules that are installed in the default search paths.
|
| Patch originally by Ben Langmuir, http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20151116/143425.html
|
| Based on cfe-dev discussion: http://lists.llvm.org/pipermail/cfe-dev/2015-November/046164.html
|
| Differential Revision: https://reviews.llvm.org/D31269
| rdar://problem/23612102
| llvm-svn: 321781
* [OpenMP] Initial implementation of code generation for pragma 'target teams distribute parallel for' on host (Carlo Bertolli, 2018-01-03, 1 file, -1/+5)
| https://reviews.llvm.org/D41709
|
| This patch includes code generation and testing for offloading when target device is host.
|
| llvm-svn: 321759
* Revert r321504 "[X86] Don't accidentally enable PKU on cannon lake and icelake or CLWB on cannonlake." (Craig Topper, 2017-12-29, 1 file, -5/+2)
| I based that commit on what was in Intel's public documentation here: https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
|
| That document specifically said CLWB wasn't available until Icelake. But I've since cross-checked with SDE and it thinks these features exist on CNL and ICL. So now I don't know what to believe. I've added test coverage of the current behavior as part of the revert so we at least now have proof of what we're doing.
|
| llvm-svn: 321547
* Avoid int to string conversion in Twine or raw_ostream contexts. (Benjamin Kramer, 2017-12-28, 2 files, -5/+5)
| | | | | | Some output changes from uppercase hex to lowercase hex, no other functionality change intended. llvm-svn: 321526
* [X86] Don't accidentally enable PKU on cannon lake and icelake or CLWB on cannonlake. (Craig Topper, 2017-12-27, 1 file, -2/+4)
| We have cannonlake and icelake inheriting from skylake server in a switch using fallthroughs. But they aren't perfect supersets of skylake server.
|
| llvm-svn: 321504
* [X86] Enable avx512vpopcntdq and clwb for icelake. (Craig Topper, 2017-12-27, 1 file, -1/+2)
| | | | | | Per table 1-1 of the October 2017 edition of Intel® Architecture Instruction Set Extensions and Future Features Programming Reference llvm-svn: 321502
* [x86][icelake][vbmi2] (Coby Tayree, 2017-12-27, 2 files, -7/+17)
| added vbmi2 feature recognition
| added intrinsics support for vbmi2 instructions
|   _mm[128,256,512]_mask[z]_compress_epi[16,32]
|   _mm[128,256,512]_mask_compressstoreu_epi[16,32]
|   _mm[128,256,512]_mask[z]_expand_epi[16,32]
|   _mm[128,256,512]_mask[z]_expandloadu_epi[16,32]
|   _mm[128,256,512]_mask[z]_sh[l,r]di_epi[16,32,64]
|   _mm[128,256,512]_mask_sh[l,r]dv_epi[16,32,64]
|
| matching similar work on the backend (D40206)
|
| Differential Revision: https://reviews.llvm.org/D41557
| llvm-svn: 321487
* [x86][icelake][vnni] (Coby Tayree, 2017-12-27, 2 files, -2/+11)
| added vnni feature recognition
| added intrinsics support for VNNI instructions
|   _mm256_mask_dpbusd_epi32
|   _mm256_maskz_dpbusd_epi32
|   _mm256_dpbusd_epi32
|   _mm256_mask_dpbusds_epi32
|   _mm256_maskz_dpbusds_epi32
|   _mm256_dpbusds_epi32
|   _mm256_mask_dpwssd_epi32
|   _mm256_maskz_dpwssd_epi32
|   _mm256_dpwssd_epi32
|   _mm256_mask_dpwssds_epi32
|   _mm256_maskz_dpwssds_epi32
|   _mm256_dpwssds_epi32
|   _mm128_mask_dpbusd_epi32
|   _mm128_maskz_dpbusd_epi32
|   _mm128_dpbusd_epi32
|   _mm128_mask_dpbusds_epi32
|   _mm128_maskz_dpbusds_epi32
|   _mm128_dpbusds_epi32
|   _mm128_mask_dpwssd_epi32
|   _mm128_maskz_dpwssd_epi32
|   _mm128_dpwssd_epi32
|   _mm128_mask_dpwssds_epi32
|   _mm128_maskz_dpwssds_epi32
|   _mm128_dpwssds_epi32
|   _mm512_mask_dpbusd_epi32
|   _mm512_maskz_dpbusd_epi32
|   _mm512_dpbusd_epi32
|   _mm512_mask_dpbusds_epi32
|   _mm512_maskz_dpbusds_epi32
|   _mm512_dpbusds_epi32
|   _mm512_mask_dpwssd_epi32
|   _mm512_maskz_dpwssd_epi32
|   _mm512_dpwssd_epi32
|   _mm512_mask_dpwssds_epi32
|   _mm512_maskz_dpwssds_epi32
|   _mm512_dpwssds_epi32
|
| matching similar work on the backend (D40208)
|
| Differential Revision: https://reviews.llvm.org/D41558
| llvm-svn: 321484
* [x86][icelake][bitalg] (Coby Tayree, 2017-12-27, 2 files, -6/+15)
| added bitalg feature recognition
| added intrinsics support for bitalg instructions
|   _mm512_popcnt_epi16
|   _mm512_mask_popcnt_epi16
|   _mm512_maskz_popcnt_epi16
|   _mm512_popcnt_epi8
|   _mm512_mask_popcnt_epi8
|   _mm512_maskz_popcnt_epi8
|   _mm512_mask_bitshuffle_epi64_mask
|   _mm512_bitshuffle_epi64_mask
|   _mm256_popcnt_epi16
|   _mm256_mask_popcnt_epi16
|   _mm256_maskz_popcnt_epi16
|   _mm128_popcnt_epi16
|   _mm128_mask_popcnt_epi16
|   _mm128_maskz_popcnt_epi16
|   _mm256_popcnt_epi8
|   _mm256_mask_popcnt_epi8
|   _mm256_maskz_popcnt_epi8
|   _mm128_popcnt_epi8
|   _mm128_mask_popcnt_epi8
|   _mm128_maskz_popcnt_epi8
|   _mm256_mask_bitshuffle_epi32_mask
|   _mm256_bitshuffle_epi32_mask
|   _mm128_mask_bitshuffle_epi16_mask
|   _mm128_bitshuffle_epi16_mask
|
| matching similar work on the backend (D40222)
|
| Differential Revision: https://reviews.llvm.org/D41564
| llvm-svn: 321483
* [x86][icelake][vpclmulqdq] (Coby Tayree, 2017-12-27, 2 files, -1/+17)
| added vpclmulqdq feature recognition
| added intrinsics support for vpclmulqdq instructions
|   _mm256_clmulepi64_epi128
|   _mm512_clmulepi64_epi128
|
| matching similar work on the backend (D40101)
|
| Differential Revision: https://reviews.llvm.org/D41573
| llvm-svn: 321480
* [x86][icelake][gfni] (Coby Tayree, 2017-12-27, 2 files, -1/+13)
| added gfni feature recognition
| added intrinsics support for gfni instructions
|   _mm_gf2p8affineinv_epi64_epi8
|   _mm_mask_gf2p8affineinv_epi64_epi8
|   _mm_maskz_gf2p8affineinv_epi64_epi8
|   _mm256_gf2p8affineinv_epi64_epi8
|   _mm256_mask_gf2p8affineinv_epi64_epi8
|   _mm256_maskz_gf2p8affineinv_epi64_epi8
|   _mm512_gf2p8affineinv_epi64_epi8
|   _mm512_mask_gf2p8affineinv_epi64_epi8
|   _mm512_maskz_gf2p8affineinv_epi64_epi8
|   _mm_gf2p8affine_epi64_epi8
|   _mm_mask_gf2p8affine_epi64_epi8
|   _mm_maskz_gf2p8affine_epi64_epi8
|   _mm256_gf2p8affine_epi64_epi8
|   _mm256_mask_gf2p8affine_epi64_epi8
|   _mm256_maskz_gf2p8affine_epi64_epi8
|   _mm512_gf2p8affine_epi64_epi8
|   _mm512_mask_gf2p8affine_epi64_epi8
|   _mm512_maskz_gf2p8affine_epi64_epi8
|   _mm_gf2p8mul_epi8
|   _mm_mask_gf2p8mul_epi8
|   _mm_maskz_gf2p8mul_epi8
|   _mm256_gf2p8mul_epi8
|   _mm256_mask_gf2p8mul_epi8
|   _mm256_maskz_gf2p8mul_epi8
|   _mm512_gf2p8mul_epi8
|   _mm512_mask_gf2p8mul_epi8
|   _mm512_maskz_gf2p8mul_epi8
|
| matching similar work on the backend (D40373)
|
| Differential Revision: https://reviews.llvm.org/D41582
| llvm-svn: 321477
* [x86][icelake][vaes] (Coby Tayree, 2017-12-27, 2 files, -1/+17)
| added vaes feature recognition
| added intrinsics support for vaes instructions, matching similar work on the backend (D40078)
|   _mm256_aesenc_epi128
|   _mm512_aesenc_epi128
|   _mm256_aesenclast_epi128
|   _mm512_aesenclast_epi128
|   _mm256_aesdec_epi128
|   _mm512_aesdec_epi128
|   _mm256_aesdeclast_epi128
|   _mm512_aesdeclast_epi128
|
| llvm-svn: 321474
* [X86] Add 'prfchw' to the correct CPUs to match the backend. (Craig Topper, 2017-12-22, 1 file, -0/+3)
| | | | llvm-svn: 321341
* Correct hasFeature/isValidFeatureName's handling of shstk/adx/mwaitx (Erich Keane, 2017-12-21, 1 file, -2/+7)
| https://bugs.llvm.org/show_bug.cgi?id=35721 reports that x86intrin.h is issuing a few warnings. This is because attribute target is using isValidFeatureName for its source. It was also discovered that two of these were missing from hasFeature.
|
| Additionally, shstk and ibu are reordered alphabetically, as came up during code review.
|
| llvm-svn: 321324
* [AARch64] Add ARMv8.2-A FP16 vector intrinsics (Abderrazek Zaafrani, 2017-12-21, 1 file, -0/+3)
| Putting back the code that was reverted a few weeks ago.
|
| Differential Revision: https://reviews.llvm.org/D34161
| llvm-svn: 321294
* Make DiagnosticIDs::getAllDiagnostics use std::vector. NFC. (Gabor Horvath, 2017-12-20, 2 files, -2/+2)
| | | | | | | | | | | | The size of the result vector is currently around 4600 with Flavor::WarningOrError, which makes std::vector a better candidate than llvm::SmallVector. Patch by: Andras Leitereg! Differential Revision: https://reviews.llvm.org/D39372 llvm-svn: 321190
* Remove llvm::MemoryBuffer const_casts (Pavel Labath, 2017-12-20, 1 file, -3/+4)
| Summary:
| llvm has grown a WritableMemoryBuffer class, which is convertible to (inherits from) a MemoryBuffer. We can use it to avoid const_casting the buffer contents when we want to write to it.
|
| Reviewers: dblaikie, rsmith
| Subscribers: cfe-commits
| Differential Revision: https://reviews.llvm.org/D41387
| llvm-svn: 321167
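The design described above, a writable buffer type that is also usable wherever a read-only buffer is expected, can be sketched like this. Both classes are hypothetical simplifications backed by a `std::string`; they are not the real `llvm::MemoryBuffer` API.

```cpp
#include <cstddef>
#include <string>

// Read-only view of some bytes; consumers only ever see const access.
class SketchMemoryBuffer {
public:
  virtual ~SketchMemoryBuffer() = default;
  const char *getBufferStart() const { return Data.data(); }
  std::size_t getBufferSize() const { return Data.size(); }

protected:
  explicit SketchMemoryBuffer(std::size_t Size) : Data(Size, '\0') {}
  std::string Data;
};

// Writable variant: adds a non-const accessor, so producers can fill the
// buffer without const_cast and then hand it out as the read-only base.
class SketchWritableMemoryBuffer : public SketchMemoryBuffer {
public:
  explicit SketchWritableMemoryBuffer(std::size_t Size)
      : SketchMemoryBuffer(Size) {}
  using SketchMemoryBuffer::getBufferStart; // keep the const overload visible
  char *getBufferStart() { return &Data[0]; }
};
```

A producer writes through the derived type and passes a `const SketchMemoryBuffer &` to readers, so mutability is expressed in the type rather than cast away.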
* [c++20] P0515R3: Parsing support and basic AST construction for operator <=>. (Richard Smith, 2017-12-14, 1 file, -0/+1)
| | | | | | | | | | | | | | | Adding the new enumerator forced a bunch more changes into this patch than I would have liked. The -Wtautological-compare warning was extended to properly check the new comparison operator, clang-format needed updating because it uses precedence levels as weights for determining where to break lines (and several operators increased their precedence levels with this change), thread-safety analysis needed changes to build its own IL properly for the new operator. All "real" semantic checking for this operator has been deferred to a future patch. For now, we use the relational comparison rules and arbitrarily give the builtin form of the operator a return type of 'void'. llvm-svn: 320707
* [OPENMP] Initial codegen for `target teams distribute simd` directive. (Alexey Bataev, 2017-12-13, 1 file, -1/+1)
| | | | | | | Host + generic device codegen for `target teams distribute simd` directive. llvm-svn: 320608
* [Hexagon] Add front-end support for Hexagon V65 (Krzysztof Parzyszek, 2017-12-13, 1 file, -0/+4)
| | | | llvm-svn: 320579
* [mips] Minor update to the comment (NFC) (Aleksandar Beserminji, 2017-12-11, 1 file, -1/+1)
| | | | llvm-svn: 320354
* [mips] Removal of microMIPS64R6 (Aleksandar Beserminji, 2017-12-11, 1 file, -0/+7)
| | | | | | | | | | | microMIPS64R6 is removed from backend, and therefore frontend will show an error when target is microMIPS64R6. This is Clang part of patch. Differential Revision: https://reviews.llvm.org/D35624 llvm-svn: 320351
* [CodeGen][X86] Fix handling of __fp16 vectors. (Akira Hatanaka, 2017-12-09, 3 files, -0/+12)
| This commit fixes a bug in IRGen where it generates completely broken code for __fp16 vectors on X86. For example when the following code is compiled:
|
|   half4 hv0, hv1, hv2; // these are vectors of __fp16.
|
|   void foo221() {
|     hv0 = hv1 + hv2;
|   }
|
| clang generates the following IR, in which two i16 vectors are added:
|
|   @hv1 = common global <4 x i16> zeroinitializer, align 8
|   @hv2 = common global <4 x i16> zeroinitializer, align 8
|   @hv0 = common global <4 x i16> zeroinitializer, align 8
|
|   define void @foo221() {
|     %0 = load <4 x i16>, <4 x i16>* @hv1, align 8
|     %1 = load <4 x i16>, <4 x i16>* @hv2, align 8
|     %add = add <4 x i16> %0, %1
|     store <4 x i16> %add, <4 x i16>* @hv0, align 8
|     ret void
|   }
|
| To fix the bug, this commit uses the code committed in r314056, which modified clang to promote and truncate __fp16 vectors to and from float vectors in the AST. It also fixes another IRGen bug where a short value is assigned to an __fp16 variable without any integer-to-floating-point conversion, as shown in the following example:
|
|   __fp16 a;
|   short b;
|
|   void foo1() {
|     a = b;
|   }
|
|   @b = common global i16 0, align 2
|   @a = common global i16 0, align 2
|
|   define void @foo1() #0 {
|     %0 = load i16, i16* @b, align 2
|     store i16 %0, i16* @a, align 2
|     ret void
|   }
|
| rdar://problem/20625184
|
| Differential Revision: https://reviews.llvm.org/D40112
| llvm-svn: 320215
* [OPENMP] Initial codegen for `target teams distribute` directive. (Alexey Bataev, 2017-12-08, 1 file, -1/+1)
| | | | | | Host + default devices codegen for `target teams distribute` directive. llvm-svn: 320149
* [OpenCL] Fix layering violation by getOpenCLTypeAddrSpace (Sven van Haastregt, 2017-12-06, 2 files, -31/+13)
| | | | | | | | | | | | Commit 7ac28eb0a5 / r310911 ("[OpenCL] Allow targets to select address space per type", 2017-08-15) made Basic depend on AST, introducing a circular dependency. Break this dependency by adding the OpenCLTypeKind enum in Basic and map from AST types to this enum in ASTContext. Differential Revision: https://reviews.llvm.org/D40838 llvm-svn: 319883
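The dependency break described above follows a common layering pattern: the lower layer exposes a plain enum, and the upper layer maps its own types onto it. A minimal sketch, in which every name and the particular address-space mapping are hypothetical and chosen only for illustration:

```cpp
// ---- Lower ("Basic") layer: knows nothing about AST types. ----
enum class SketchOpenCLTypeKind { Default, Image, Sampler };
enum class SketchAddrSpace { Private, Global, Constant };

// Target hook expressed purely in terms of the lower-layer enum.
inline SketchAddrSpace sketchGetTypeAddrSpace(SketchOpenCLTypeKind Kind) {
  switch (Kind) {
  case SketchOpenCLTypeKind::Sampler:
    return SketchAddrSpace::Constant;
  case SketchOpenCLTypeKind::Image:
    return SketchAddrSpace::Global;
  case SketchOpenCLTypeKind::Default:
    return SketchAddrSpace::Private;
  }
  return SketchAddrSpace::Private;
}

// ---- Upper ("AST") layer: depends on the lower layer, never vice versa. ----
enum class SketchASTType { Int, Image2D, SamplerT };

// Mapping from AST types to the lower-layer enum lives in the upper layer.
inline SketchOpenCLTypeKind sketchToTypeKind(SketchASTType T) {
  switch (T) {
  case SketchASTType::Image2D:
    return SketchOpenCLTypeKind::Image;
  case SketchASTType::SamplerT:
    return SketchOpenCLTypeKind::Sampler;
  case SketchASTType::Int:
    return SketchOpenCLTypeKind::Default;
  }
  return SketchOpenCLTypeKind::Default;
}

inline SketchAddrSpace sketchAddrSpaceFor(SketchASTType T) {
  return sketchGetTypeAddrSpace(sketchToTypeKind(T));
}
```

Because the target hook only sees the enum, the lower layer compiles without any upper-layer headers, which is what removes the cycle.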