summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/X86/X86Subtarget.h
Commit message (Collapse)AuthorAgeFilesLines
...
* Disable the vzeroupper insertion pass on PS4.Yunzhong Gao2016-02-121-0/+5
| | | | | | Differential Revision: http://reviews.llvm.org/D16837 llvm-svn: 260764
* Added Skylake client to X86 targets and featuresElena Demikhovsky2016-01-241-2/+31
| | | | | | | | | | | | | Changes in X86.td: I set features of Intel processors in incremental form: IVB = SNB + X HSW = IVB + X .. I added Skylake client processor and defined it's features FeatureADX was missing on KNL Added some new features to appropriate processors SMAP, IFMA, PREFETCHWT1, VMFUNC and others Differential Revision: http://reviews.llvm.org/D16357 llvm-svn: 258659
* [AVX512] adding AVXVBMI feature flagMichael Zuckerman2016-01-171-0/+4
| | | | | | | | | | The feature flag is for VPERMB,VPERMI2B,VPERMT2B and VPMULTISHIFTQB instructions. More about the instruction can be found in: hattps://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf Differential Revision: http://reviews.llvm.org/D16190 llvm-svn: 258012
* [x86] adding PKU feature flagAsaf Badouh2015-12-151-0/+4
| | | | | | | | | the feature flag is essential for RDPKRU and WRPKRU instruction more about the instruction can be found in the SDM rev 56, vol 2 from http://www.intel.com/sdm Differential Revision: http://reviews.llvm.org/D15491 llvm-svn: 255644
* X86: Don't emit SAHF/LAHF for 64-bit targets unless explicitly supportedHans Wennborg2015-12-041-0/+4
| | | | | | | | | | | | | | | These instructions are not supported by all CPUs in 64-bit mode. Emitting them causes Chromium to crash on start-up for users with such chips. (GCC puts these instructions behind -msahf on 64-bit for the same reason.) This patch adds FeatureLAHFSAHF, enables it by default for 32-bit targets and modern CPUs, and changes X86InstrInfo::copyPhysReg back to the lowering from before r244503 when the instructions are not available. Differential Revision: http://reviews.llvm.org/D15240 llvm-svn: 254793
* [x86] add a convenience method to check for FMA capability; NFCISanjay Patel2015-12-011-0/+1
| | | | llvm-svn: 254425
* [X86][FMA4] Prefer FMA4 to FMASimon Pilgrim2015-11-301-3/+4
| | | | | | | | | | | | | | | | We currently output FMA instructions on targets which support both FMA4 + FMA (i.e. later Bulldozer CPUS bdver2/bdver3/bdver4). This patch flips this so FMA4 is preferred; this is for several reasons: 1 - FMA4 is non-destructive reducing the need for mov instructions. 2 - Its more straighforward to commute and fold inputs (although the recent work on FMA has reduced this difference). 3 - All supported targets have FMA4 performance equal or better to FMA - Piledriver (bdver2) in particular has half the throughput when executing FMA instructions. Its looks like no future AMD processor lines will support FMA4 after the Bulldozer series so we're not causing problems for later CPUs. Differential Revision: http://reviews.llvm.org/D14997 llvm-svn: 254339
* Add MMX to the 3dnow enum and propagate changes around. This makesEric Christopher2015-11-141-6/+3
| | | | | | it somewhat more consistent with how the feature is used. llvm-svn: 253122
* [X86] Make elfiamcu an OS, not an environment.Michael Kuperstein2015-10-271-1/+1
| | | | | | | | | | GNU tools require elfiamcu to take up the entire OS field, so, e.g. i?86-*-linux-elfiamcu is not considered a legal triple. Make us compatible. Differential Revision: http://reviews.llvm.org/D14081 llvm-svn: 251390
* [X86] Add support for elfiamcu tripleMichael Kuperstein2015-10-251-0/+1
| | | | | | | | This adds support for the i?86-*-elfiamcu triple, which indicates the IAMCU psABI is used. Differential Revision: http://reviews.llvm.org/D13977 llvm-svn: 251222
* [X86] Add fxsr feature flag for fxsave/fxrestore instructions.Craig Topper2015-10-161-0/+4
| | | | llvm-svn: 250497
* [X86] Add XSAVE intrinsic familyAmjad Aboud2015-10-121-0/+13
| | | | | | | | | | | | Add intrinsics for the XSAVE instructions (XSAVE/XSAVE64/XRSTOR/XRSTOR64) XSAVEOPT instructions (XSAVEOPT/XSAVEOPT64) XSAVEC instructions (XSAVEC/XSAVEC64) XSAVES instructions (XSAVES/XSAVES64/XRSTORS/XRSTORS64) Differential Revision: http://reviews.llvm.org/D13012 llvm-svn: 250029
* Add Triple::isAndroid().Evgeniy Stepanov2015-10-081-3/+1
| | | | | | | This is a simple refactoring that replaces Triple.getEnvironment() checks for Android with Triple.isAndroid(). llvm-svn: 249750
* Move the MMX subtarget feature out of the SSE set of features and intoEric Christopher2015-10-081-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | its own variable. This is needed so that we can explicitly turn off MMX without turning off SSE and also so that we can diagnose feature set incompatibilities that involve MMX without SSE. Rationale: // sse3 __m128d test_mm_addsub_pd(__m128d A, __m128d B) { return _mm_addsub_pd(A, B); } // mmx void shift(__m64 a, __m64 b, int c) { _mm_slli_pi16(a, c); _mm_slli_pi32(a, c); _mm_slli_si64(a, c); _mm_srli_pi16(a, c); _mm_srli_pi32(a, c); _mm_srli_si64(a, c); _mm_srai_pi16(a, c); _mm_srai_pi32(a, c); } clang -msse3 -mno-mmx file.c -c For this code we should be able to explicitly turn off MMX without affecting the compilation of the SSE3 function and then diagnose and error on compiling the MMX function. This matches the existing gcc behavior and follows the spirit of the SSE/MMX separation in llvm where we can (and do) turn off MMX code generation except in the presence of intrinsics. Updated a couple of tests, but primarily tested with a couple of tests for turning on only mmx and only sse. This is paired with a patch to clang to take advantage of this behavior. llvm-svn: 249731
* Android support for SafeStack.Evgeniy Stepanov2015-09-231-0/+3
| | | | | | | | | | | | | | | | | Add two new ways of accessing the unsafe stack pointer: * At a fixed offset from the thread TLS base. This is very similar to StackProtector cookies, but we plan to extend it to other backends (ARM in particular) soon. Bionic-side implementation here: https://android-review.googlesource.com/170988. * Via a function call, as a fallback for platforms that provide neither a fixed TLS slot, nor a reasonable TLS implementation (i.e. not emutls). This is a re-commit of a change in r248357 that was reverted in r248358. llvm-svn: 248405
* Revert "Android support for SafeStack."Evgeniy Stepanov2015-09-231-3/+0
| | | | | | | test/Transforms/SafeStack/abi.ll breaks when target is not supported; needs refactoring. llvm-svn: 248358
* Android support for SafeStack.Evgeniy Stepanov2015-09-231-0/+3
| | | | | | | | | | | | | | Add two new ways of accessing the unsafe stack pointer: * At a fixed offset from the thread TLS base. This is very similar to StackProtector cookies, but we plan to extend it to other backends (ARM in particular) soon. Bionic-side implementation here: https://android-review.googlesource.com/170988. * Via a function call, as a fallback for platforms that provide neither a fixed TLS slot, nor a reasonable TLS implementation (i.e. not emutls). llvm-svn: 248357
* rename "slow-unaligned-mem-under-32" to slow-unaligned-mem-16" (NFCI)Sanjay Patel2015-09-011-3/+3
| | | | | | | | | | | | | | | This is a follow-on suggested by: http://reviews.llvm.org/D12154 ( http://reviews.llvm.org/rL245729 ) http://reviews.llvm.org/D10662 ( http://reviews.llvm.org/rL245075 ) This makes the attribute name match most of the existing lowering logic and regression test expectations. But the current use of this attribute is inconsistent; see the FIXME comment for "allowsMisalignedMemoryAccesses()". That change will result in functional changes and should be coming soon. llvm-svn: 246585
* [x86] invert logic for attribute 'FeatureFastUAMem'Sanjay Patel2015-08-211-4/+4
| | | | | | | | | | | | | | | | This is a 'no functional change intended' patch. It removes one FIXME, but adds several more. Motivation: the FeatureFastUAMem attribute may be too general. It is used to determine if any sized misaligned memory access under 32-bytes is 'fast'. From the added FIXME comments, however, you can see that we're not consistent about this. Changing the name of the attribute makes it clearer to see the logic holes. Changing this to a 'slow' attribute also means we don't have to add an explicit 'fast' attribute to new chips; fast unaligned accesses have been standard for several generations of CPUs now. Differential Revision: http://reviews.llvm.org/D12154 llvm-svn: 245729
* Add a target environment for CoreCLR.Pat Gavlin2015-08-141-0/+4
| | | | | | | | | | Although targeting CoreCLR is similar to targeting MSVC, there are certain important differences that the backend must be aware of (e.g. differences in stack probes, EH, and library calls). Differential Revision: http://reviews.llvm.org/D11012 llvm-svn: 245115
* [Win64] Only treat some functions as having the Win64 conventionReid Kleckner2015-07-081-2/+20
| | | | | | | | | | | | | | | | All the usual X86 target-specific conventions are collapsed to the normal Win64 convention, but the custom conventions like GHC and webkit should not be. Previously we would assume that the caller allocated 32 bytes of shadow space for us, which is not how webkit_jscc or other custom conventions are supposed to work. Based on a patch by peavo@outlook.com. Fixes PR24051. llvm-svn: 241725
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)Alexander Kornienko2015-06-231-1/+1
| | | | | | Apparently, the style needs to be agreed upon first. llvm-svn: 240390
* Fixed/added namespace ending comments using clang-tidy. NFCAlexander Kornienko2015-06-191-1/+1
| | | | | | | | | | | | | The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
* Replace string GNU Triples with llvm::Triple in MCSubtargetInfo and ↵Daniel Sanders2015-06-101-3/+2
| | | | | | | | | | | | | | | | | | create*MCSubtargetInfo(). NFC. Summary: This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: rafael Reviewed By: rafael Subscribers: rafael, ted, jfb, llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10311 llvm-svn: 239467
* make reciprocal estimate code generation more flexible by adding ↵Sanjay Patel2015-06-041-12/+0
| | | | | | | | | | | | | | | | | | | | | | command-line options (3rd try) The first try (r238051) to land this was reverted due to ExecutionEngine build failure; that was hopefully addressed by r238788. The second try (r238842) to land this was reverted due to BUILD_SHARED_LIBS failure; that was hopefully addressed by r238953. This patch adds a TargetRecip class for processing many recip codegen possibilities. The class is intended to handle both command-line options to llc as well as options passed in from a front-end such as clang with the -mrecip option. The x86 backend is updated to use the new functionality. Only -mcpu=btver2 with -ffast-math should see a functional change from this patch. All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math. Differential Revision: http://reviews.llvm.org/D8982 llvm-svn: 239001
* X86: Added MPX feature and bound registers.Elena Demikhovsky2015-06-031-0/+4
| | | | | | | | | Intel® Memory Protection Extensions (Intel® MPX) is a new feature in Skylake. It is a part of KNL and SKX sets. It is also a part of Skylake client. I added definition of %bnd0 - %bnd3 registers, each register is a pair of 64-bit integers. llvm-svn: 238916
* Revert "make reciprocal estimate code generation more flexible by adding ↵Rafael Espindola2015-06-031-0/+12
| | | | | | | | | | command-line options (2nd try)" This reverts commit r238842. It broke -DBUILD_SHARED_LIBS=ON build. llvm-svn: 238900
* make reciprocal estimate code generation more flexible by adding ↵Sanjay Patel2015-06-021-12/+0
| | | | | | | | | | | | | | | | | | | command-line options (2nd try) The first try (r238051) to land this was reverted due to bot failures that were hopefully addressed by r238788. This patch adds a TargetRecip class for processing many recip codegen possibilities. The class is intended to handle both command-line options to llc as well as options passed in from a front-end such as clang with the -mrecip option. The x86 backend is updated to use the new functionality. Only -mcpu=btver2 with -ffast-math should see a functional change from this patch. All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math. Differential Revision: http://reviews.llvm.org/D8982 llvm-svn: 238842
* Revert "make reciprocal estimate code generation more flexible by adding ↵Rafael Espindola2015-05-231-0/+12
| | | | | | | | | | | | command-line options" This reverts commit r238051. It broke some bots: http://lab.llvm.org:8011/builders/llvm-ppc64-linux1/builds/18190 llvm-svn: 238075
* make reciprocal estimate code generation more flexible by adding ↵Sanjay Patel2015-05-221-12/+0
| | | | | | | | | | | | | | | | command-line options This patch adds a class for processing many recip codegen possibilities. The TargetRecip class is intended to handle both command-line options to llc as well as options passed in from a front-end such as clang with the -mrecip option. The x86 backend is updated to use the new functionality. Only -mcpu=btver2 with -ffast-math should see a functional change from this patch. All other CPUs continue to *not* use reciprocal estimates by default with -ffast-math. Differential Revision: http://reviews.llvm.org/D8982 llvm-svn: 238051
* Migrate existing backends that care about software floating pointEric Christopher2015-05-121-0/+4
| | | | | | | | | | | | | | | | | | | | to use the information in the module rather than TargetOptions. We've had and clang has used the use-soft-float attribute for some time now so have the backends set a subtarget feature based on a particular function now that subtargets are created based on functions and function attributes. For the one middle end soft float check go ahead and create an overloadable TargetLowering::useSoftFloat function that just checks the TargetSubtargetInfo in all cases. Also remove the command line option that hard codes whether or not soft-float is set by using the attribute for all of the target specific test cases - for the generic just go ahead and add the attribute in the one case that showed up. llvm-svn: 237079
* [X86] Remove two feature flags that covered sets of instructions that have ↵Craig Topper2015-02-051-8/+0
| | | | | | no patterns or intrinsics. Since we don't check feature flags in the assembler parser for any instruction sets, these flags don't provide any value. This frees up 2 of the fully utilized feature flags. llvm-svn: 228282
* remove variable names from comments; NFCSanjay Patel2015-02-031-63/+57
| | | | | | | I didn't bother to fix the self-referential definitions and grammar because my eyes started to bleed. llvm-svn: 228004
* Fix program crashes due to alignment exceptions generated for SSE memop ↵Sanjay Patel2015-02-031-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | instructions (PR22371). r224330 introduced a bug by misinterpreting the "FeatureVectorUAMem" bit. The commit log says that change did not affect anything, but that's not correct. That change allowed SSE instructions to have unaligned mem operands folded into math ops, and that's not allowed in the default specification for any SSE variant. The bug is exposed when compiling for an AVX-capable CPU that had this feature flag but without enabling AVX codegen. Another mistake in r224330 was not adding the feature flag to all AVX CPUs; the AMD chips were excluded. This is part of the fix for PR22371 ( http://llvm.org/bugs/show_bug.cgi?id=22371 ). This feature bit is SSE-specific, so I've renamed it to "FeatureSSEUnalignedMem". Changed the existing test case for the feature bit to reflect the new name and renamed the test file itself to better reflect the feature. Added runs to fold-vex.ll to check for the failing codegen. Note that the feature bit is not set by default on any CPU because it may require a configuration register setting to enable the enhanced unaligned behavior. llvm-svn: 227983
* Use a different encoding for debugtrap on PS4.Alex Rosenberg2015-01-261-0/+1
| | | | llvm-svn: 227116
* Move DataLayout back to the TargetMachine from TargetSubtargetInfoEric Christopher2015-01-261-3/+0
| | | | | | | | | | | | | | | | | | | derived classes. Since global data alignment, layout, and mangling is often based on the DataLayout, move it to the TargetMachine. This ensures that global data is going to be layed out and mangled consistently if the subtarget changes on a per function basis. Prior to this all targets(*) have had subtarget dependent code moved out and onto the TargetMachine. *One target hasn't been migrated as part of this change: R600. The R600 port has, as a subtarget feature, the size of pointers and this affects global data layout. I've currently hacked in a FIXME to enable progress, but the port needs to be updated to either pass the 64-bitness to the TargetMachine, or fix the DataLayout to avoid subtarget dependent features. llvm-svn: 227113
* Add segmented stack support for DragonFlyBSD.Rafael Espindola2014-12-291-0/+1
| | | | | | Patch by Michael Neumann. llvm-svn: 224936
* Rename the x86 isTargetMacho to isTargetMachO for uniformity.Eric Christopher2014-12-051-1/+1
| | | | llvm-svn: 223421
* [X86] Clean up whitespace as well as minor coding styleMichael Liao2014-12-041-2/+2
| | | | llvm-svn: 223339
* Tidied up target triple OS detection. NFCSimon Pilgrim2014-11-221-8/+4
| | | | | | Use Triple::isOS*() helper functions where possible. llvm-svn: 222622
* Add a feature flag for slow 32-byte unaligned memory accesses [x86].Sanjay Patel2014-11-211-0/+4
| | | | | | | | | | | | | | This patch adds a feature flag to avoid unaligned 32-byte load/store AVX codegen for Sandy Bridge and Ivy Bridge. There is no functionality change intended for those chips. Previously, the absence of AVX2 was being used as a proxy to detect this feature. But that hindered codegen for AVX-enabled AMD chips such as btver2 that do not have the 32-byte unaligned access slowdown. Performance measurements are included in PR21541 ( http://llvm.org/bugs/show_bug.cgi?id=21541 ). Differential Revision: http://reviews.llvm.org/D6355 llvm-svn: 222544
* [X86] For Silvermont CPU use 16-bit division instead of 64-bit for small ↵Alexey Volkov2014-11-211-4/+9
| | | | | | | | positive numbers Differential Revision: http://reviews.llvm.org/D5938 llvm-svn: 222521
* X86: use the correct alloca symbol for Windows ItaniumSaleem Abdulrasool2014-11-201-0/+4
| | | | | | | Windows itanium targets the MSVCRT, and the stack probe symbol is provided by MSVCRT. This corrects the emission of stack probes on i686-windows-itanium. llvm-svn: 222439
* Use rcpss/rcpps (X86) to speed up reciprocal calcs (PR21385).Sanjay Patel2014-11-111-0/+6
| | | | | | | | | | | | | | | | | | | | | | This is a first step for generating SSE rcp instructions for reciprocal calcs when fast-math allows it. This is very similar to the rsqrt optimization enabled in D5658 ( http://reviews.llvm.org/rL220570 ). For now, be conservative and only enable this for AMD btver2 where performance improves significantly both in terms of latency and throughput. We may never enable this codegen for Intel Core* chips because the divider circuits are just too fast. On SandyBridge, divss can be as fast as 10 cycles versus the 21 cycle critical path for the rcp + mul + sub + mul + add estimate. Follow-on patches may allow configuration of the number of Newton-Raphson refinement steps, add AVX512 support, and enable the optimization for more chips. More background here: http://llvm.org/bugs/show_bug.cgi?id=21385 Differential Revision: http://reviews.llvm.org/D6175 llvm-svn: 221706
* Use rsqrt (X86) to speed up reciprocal square root calcsSanjay Patel2014-10-241-0/+6
| | | | | | | | | | | | | | | | | | | | | This is a first step for generating SSE rsqrt instructions for reciprocal square root calcs when fast-math is allowed. For now, be conservative and only enable this for AMD btver2 where performance improves significantly - for example, 29% on llvm/projects/test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c (if we convert the data type to single-precision float). This patch adds a two constant version of the Newton-Raphson refinement algorithm to DAGCombiner that can be selected by any target via a parameter returned by getRsqrtEstimate().. See PR20900 for more details: http://llvm.org/bugs/show_bug.cgi?id=20900 Differential Revision: http://reviews.llvm.org/D5658 llvm-svn: 220570
* constify TargetMachine parameter for X86TargetLowering.Eric Christopher2014-10-011-1/+1
| | | | llvm-svn: 218804
* Remove resetSubtargetFeatures as it is unused.Eric Christopher2014-09-031-3/+1
| | | | llvm-svn: 217071
* Reinstate "Nuke the old JIT."Eric Christopher2014-09-021-3/+0
| | | | | | | | Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reinstates commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 216982
* [x86] Enable Broadwell target.Robert Khasanov2014-08-211-0/+4
| | | | | | | | Added FeatureSMAP. Broadwell ISA includes Haswell ISA + ADX + RDSEED + SMAP llvm-svn: 216161
* Canonicalize header guards into a common format.Benjamin Kramer2014-08-131-2/+2
| | | | | | | | | | Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful) Changes made by clang-tidy with minor tweaks. llvm-svn: 215558
OpenPOWER on IntegriCloud