| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
enabling '3dnow'
All SSE capable CPUs have MMX. 3dnow implicitly enables MMX.
We have code that detects if sse is enabled and implicitly enables
MMX unless -mno-mmx is passed. So in most cases we were already
enabling MMX if march passed a CPU that supported SSE.
The exception to this is if you pass -march for a cpu supports SSE
and also pass -mno-sse. We should still enable MMX since its part
of the CPU capability.
|
|
|
|
|
|
|
|
|
|
|
|
| |
For more details about these instructions, please refer to the latest
ISE document:
https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference.
Patch by Tianqing Wang (tianqing)
Differential Revision: https://reviews.llvm.org/D62282
llvm-svn: 362685
|
|
|
|
|
|
|
|
|
|
| |
Support intel AVX512 VP2INTERSECT instructions in clang
Patch by Xiang Zhang (xiangzhangllvm)
Differential Revision: https://reviews.llvm.org/D62367
llvm-svn: 362196
|
|
|
|
|
|
|
|
| |
Previously we were doing this so that the 256 bit selectw builtin could be used in the implementation of the 512->256 bit conversion intrinsic.
After this commit we now use a masked convert builtin that will emit the intrinsic call and the 256-bit select from custom code in CGBuiltin. Then the header only needs to call that one intrinsic.
llvm-svn: 360924
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Lake
Summary:
1. Enable infrastructure of AVX512_BF16, which is supported for BFLOAT16 in Cooper Lake;
2. Enable intrinsics for VCVTNE2PS2BF16, VCVTNEPS2BF16 and DPBF16PS instructions, which are Vector Neural Network Instructions supporting BFLOAT16 inputs and conversion instructions from IEEE single precision.
For more details about BF16 intrinsic, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference
Patch by LiuTianle
Reviewers: craig.topper, smaslov, LuoYuanke, wxiao3, annita.zhang, spatel, RKSimon
Reviewed By: craig.topper
Subscribers: mgorny, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60552
llvm-svn: 360018
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This define should correspond to CMPXCHG16B being available which requires 64-bit mode.
I checked and gcc also seems to only define this in 64-bit mode.
Reviewers: RKSimon, spatel, efriedma, jyknight, jfb
Reviewed By: jfb
Subscribers: jfb, cfe-commits, llvm-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D59287
llvm-svn: 356118
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in gcc by https://gcc.gnu.org/ml/gcc-cvs/2018-04/msg00534.html.
The -mibt feature flag is being removed, and the -fcf-protection
option now also defines a CET macro and causes errors when used
on non-X86 targets, while X86 targets no longer check for -mibt
and -mshstk to determine if -fcf-protection is supported. -mshstk
is now used only to determine availability of shadow stack intrinsics.
Comes with an LLVM patch (D46882).
Patch by mike.dvoretsky
Differential Revision: https://reviews.llvm.org/D46881
llvm-svn: 332704
|
|
|
|
| |
llvm-svn: 323543
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This patch adds -mrdpid/-mno-rdpid and the rdpid intrinsic. The corresponding LLVM commit has already been made.
Reviewers: RKSimon, spatel, zvi, AndreiGrischenko
Reviewed By: RKSimon
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D42272
llvm-svn: 323047
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
added vbmi2 feature recognition
added intrinsics support for vbmi2 instructions
_mm[128,256,512]_mask[z]_compress_epi[16,32]
_mm[128,256,512]_mask_compressstoreu_epi[16,32]
_mm[128,256,512]_mask[z]_expand_epi[16,32]
_mm[128,256,512]_mask[z]_expandloadu_epi[16,32]
_mm[128,256,512]_mask[z]_sh[l,r]di_epi[16,32,64]
_mm[128,256,512]_mask_sh[l,r]dv_epi[16,32,64]
matching a similar work on the backend (D40206)
Differential Revision: https://reviews.llvm.org/D41557
llvm-svn: 321487
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
added bitalg feature recognition
added intrinsics support for bitalg instructions
_mm512_popcnt_epi16
_mm512_mask_popcnt_epi16
_mm512_maskz_popcnt_epi16
_mm512_popcnt_epi8
_mm512_mask_popcnt_epi8
_mm512_maskz_popcnt_epi8
_mm512_mask_bitshuffle_epi64_mask
_mm512_bitshuffle_epi64_mask
_mm256_popcnt_epi16
_mm256_mask_popcnt_epi16
_mm256_maskz_popcnt_epi16
_mm128_popcnt_epi16
_mm128_mask_popcnt_epi16
_mm128_maskz_popcnt_epi16
_mm256_popcnt_epi8
_mm256_mask_popcnt_epi8
_mm256_maskz_popcnt_epi8
_mm128_popcnt_epi8
_mm128_mask_popcnt_epi8
_mm128_maskz_popcnt_epi8
_mm256_mask_bitshuffle_epi32_mask
_mm256_bitshuffle_epi32_mask
_mm128_mask_bitshuffle_epi16_mask
_mm128_bitshuffle_epi16_mask
matching a similar work on the backend (D40222)
Differential Revision: https://reviews.llvm.org/D41564
llvm-svn: 321483
|
|
|
|
|
|
|
|
|
|
|
| |
added vpclmulqdq feature recognition
added intrinsics support for vpclmulqdq instructions
_mm256_clmulepi64_epi128
_mm512_clmulepi64_epi128
matching a similar work on the backend (D40101)
Differential Revision: https://reviews.llvm.org/D41573
llvm-svn: 321480
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
added gfni feature recognition
added intrinsics support for gfni instructions
_mm_gf2p8affineinv_epi64_epi8
_mm_mask_gf2p8affineinv_epi64_epi8
_mm_maskz_gf2p8affineinv_epi64_epi8
_mm256_gf2p8affineinv_epi64_epi8
_mm256_mask_gf2p8affineinv_epi64_epi8
_mm256_maskz_gf2p8affineinv_epi64_epi8
_mm512_gf2p8affineinv_epi64_epi8
_mm512_mask_gf2p8affineinv_epi64_epi8
_mm512_maskz_gf2p8affineinv_epi64_epi8
_mm_gf2p8affine_epi64_epi8
_mm_mask_gf2p8affine_epi64_epi8
_mm_maskz_gf2p8affine_epi64_epi8
_mm256_gf2p8affine_epi64_epi8
_mm256_mask_gf2p8affine_epi64_epi8
_mm256_maskz_gf2p8affine_epi64_epi8
_mm512_gf2p8affine_epi64_epi8
_mm512_mask_gf2p8affine_epi64_epi8
_mm512_maskz_gf2p8affine_epi64_epi8
_mm_gf2p8mul_epi8
_mm_mask_gf2p8mul_epi8
_mm_maskz_gf2p8mul_epi8
_mm256_gf2p8mul_epi8
_mm256_mask_gf2p8mul_epi8
_mm256_maskz_gf2p8mul_epi8
_mm512_gf2p8mul_epi8
_mm512_mask_gf2p8mul_epi8
_mm512_maskz_gf2p8mul_epi8
matching a similar work on the backend (D40373)
Differential Revision: https://reviews.llvm.org/D41582
llvm-svn: 321477
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
added vaes feature recognition
added intrinsics support for vaes instructions, matching a similar work on the backend (D40078)
_mm256_aesenc_epi128
_mm512_aesenc_epi128
_mm256_aesenclast_epi128
_mm512_aesenclast_epi128
_mm256_aesdec_epi128
_mm512_aesdec_epi128
_mm256_aesdeclast_epi128
_mm512_aesdeclast_epi128
llvm-svn: 321474
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tracking support (Clang side)
Shadow stack solution introduces a new stack for return addresses only.
The stack has a Shadow Stack Pointer (SSP) that points to the last address to which we expect to return.
If we return to a different address an exception is triggered.
This patch includes shadow stack intrinsics as well as the corresponding CET header.
It includes CET clang flags for shadow stack and Indirect Branch Tracking.
For more information, please see the following:
https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf
Differential Revision: https://reviews.llvm.org/D40224
Change-Id: I79ad0925a028bbc94c8ecad75f6daa2f214171f1
llvm-svn: 318995
|
|
|
|
|
|
|
|
| |
Missed in rL302418
Differential Revision: https://reviews.llvm.org/D32770
llvm-svn: 302445
|
|
|
|
|
|
| |
__CLFLUSHOPT__ define to match gcc.
llvm-svn: 294411
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
set is also enabled.
Summary: This is needed to make the v64i8 and v32i16 types legal for the 512-bit VBMI instructions. Fixes PR30912.
Reviewers: delena, zvi
Subscribers: RKSimon, cfe-commits
Differential Revision: https://reviews.llvm.org/D26306
llvm-svn: 286340
|
|
|
|
| |
llvm-svn: 265187
|
|
|
|
|
|
| |
Enabling avx512vbmi or avx512ifma should enable avx512f. Add command line switches and header defines for avx512ifma and avx512vbmi.
llvm-svn: 262201
|
|
|
|
|
|
| |
defines for the same. And add the flags to correct CPU names.
llvm-svn: 250368
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
they enable/disable.
This fixes two things:
a) sse4 isn't actually a target feature, don't treat it as one.
b) we weren't correctly disabling sse4.1 when we'd pass -mno-sse4
after enabling it, thus passing preprocessor directives and
(soon) passing the function attribute as well when we shouldn't.
llvm-svn: 233223
|
|
|
|
|
|
| |
Added -madx option
llvm-svn: 218116
|
|
|
|
|
|
|
|
|
| |
a) add SKX support to Clang driver;
b) add tests for SKX target and AVX512BW, AVX512DQ, AVX512VL features into clang driver tests
Patch by Zinovy Nis <zinovy.y.nis@intel.com>
llvm-svn: 214306
|
|
|
|
| |
llvm-svn: 201475
|
|
|
|
|
|
|
|
|
|
| |
clang front end. This change will allow the __PRFCHW__ macro to be set on these
processors and hence include prfchwintrin.h in x86intrin.h header. Support for
the intrinsic itself seems to have already been added in r178041.
Differential Revision: http://llvm-reviews.chandlerc.com/D1934
llvm-svn: 192829
|
|
|
|
|
|
| |
it is enabled. Also enable it on the same architectures that GCC does.
llvm-svn: 192045
|
|
|
|
|
|
|
|
|
| |
x86 TBM instruction set. Also adding a __TBM__ macro if the TBM feature is
enabled. Otherwise there should be no functionality change to existing features.
Phabricator code review is located here: http://llvm-reviews.chandlerc.com/D1693
llvm-svn: 191326
|
|
|
|
|
|
|
|
|
| |
Intrinsics added shaintrin.h, which is included from x86intrin.h if __SHA__ is
enabled. SHA implies SSE2, which is needed for the __m128i type.
Also add the -msha/-mno-sha option.
llvm-svn: 190999
|
|
|
|
| |
llvm-svn: 190977
|
|
|
|
| |
llvm-svn: 190776
|
|
|
|
| |
llvm-svn: 190496
|
|
|
|
|
|
|
| |
Enabling sse4.2 will implicitly enable popcnt unless popcnt is explicitly disabled.
Disabling sse4.2 will not disable popcnt if popcnt is explicitly enabled.
llvm-svn: 190387
|
|
|
|
| |
llvm-svn: 188984
|
|
|
|
|
|
| |
Thanks for Craig Topper for noticing it.
llvm-svn: 188902
|
|
|
|
|
|
|
|
|
| |
This moves the logic for handling -mfoo -mno-foo from the driver to -cc1. It
also changes -cc1 to apply the options in order, fixing pr16943.
The handling of -mno-mmx -msse is now an explicit special case.
llvm-svn: 188817
|
|
|
|
| |
llvm-svn: 185645
|
|
|
|
| |
llvm-svn: 148582
|
|
|
|
| |
llvm-svn: 148141
|
|
|
|
| |
llvm-svn: 148138
|
|
|
|
|
|
|
| |
clang ' or ' clang -cc1 ' or ' clang-cc ' in test lines (by substituting them to
garbage).
llvm-svn: 91460
|
|
|
|
| |
llvm-svn: 86432
|
|
|
|
|
|
| |
- 'for i in $(find . -type f); do sed -e 's#\(RUN:.*[^ ]\) *&& *$#\1#g' $i | FileUpdate $i; done', for the curious.
llvm-svn: 86430
|
|
|
|
|
|
|
| |
- x86 target feature handling should not be feature complete, even if
the code quality is lacking.
llvm-svn: 71123
|
|
- Apologies for the extremely gross code duplication, I want to get
this working and then decide how to get this information out of the
back end.
- This replaces -m[no-]sse4[12] by -m[no-]sse4, it appears gcc
doesn't distinguish them?
- -msse, etc. now properly disable/enable related features.
- Don't always define __SSE3__...
- The main missing functionality bit here is that we don't initialize
the features based on the CPU for all -march options.
llvm-svn: 71117
|