summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/X86/X86Subtarget.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* Add MMX to the 3dnow enum and propagate changes around. This makesEric Christopher2015-11-141-1/+0
| | | | | | it somewhat more consistent with how the feature is used. llvm-svn: 253122
* [X86] Add fxsr feature flag for fxsave/fxrestore instructions.Craig Topper2015-10-161-0/+1
| | | | llvm-svn: 250497
* [X86] Add XSAVE intrinsic familyAmjad Aboud2015-10-121-0/+4
| | | | | | | | | | | | Add intrinsics for the XSAVE instructions (XSAVE/XSAVE64/XRSTOR/XRSTOR64) XSAVEOPT instructions (XSAVEOPT/XSAVEOPT64) XSAVEC instructions (XSAVEC/XSAVEC64) XSAVES instructions (XSAVES/XSAVES64/XRSTORS/XRSTORS64) Differential Revision: http://reviews.llvm.org/D13012 llvm-svn: 250029
* Move the MMX subtarget feature out of the SSE set of features and intoEric Christopher2015-10-081-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | its own variable. This is needed so that we can explicitly turn off MMX without turning off SSE and also so that we can diagnose feature set incompatibilities that involve MMX without SSE. Rationale: // sse3 __m128d test_mm_addsub_pd(__m128d A, __m128d B) { return _mm_addsub_pd(A, B); } // mmx void shift(__m64 a, __m64 b, int c) { _mm_slli_pi16(a, c); _mm_slli_pi32(a, c); _mm_slli_si64(a, c); _mm_srli_pi16(a, c); _mm_srli_pi32(a, c); _mm_srli_si64(a, c); _mm_srai_pi16(a, c); _mm_srai_pi32(a, c); } clang -msse3 -mno-mmx file.c -c For this code we should be able to explicitly turn off MMX without affecting the compilation of the SSE3 function and then diagnose and error on compiling the MMX function. This matches the existing gcc behavior and follows the spirit of the SSE/MMX separation in llvm where we can (and do) turn off MMX code generation except in the presence of intrinsics. Updated a couple of tests, but primarily tested with a couple of tests for turning on only mmx and only sse. This is paired with a patch to clang to take advantage of this behavior. llvm-svn: 249731
* Revert r247692: Replace Triple with a new TargetTuple in MCTargetDesc/* and ↵Daniel Sanders2015-09-151-1/+1
| | | | | | | | related. NFC. Eric has replied and has demanded the patch be reverted. llvm-svn: 247702
* Re-commit r247683: Replace Triple with a new TargetTuple in MCTargetDesc/* ↵Daniel Sanders2015-09-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | and related. NFC. Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Thanks go to Pavel Labath for fixing LLDB for me. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 llvm-svn: 247692
* Revert r247684 - Replace Triple with a new TargetTuple ...Daniel Sanders2015-09-151-1/+1
| | | | | | LLDB needs to be updated in the same commit. llvm-svn: 247686
* Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC.Daniel Sanders2015-09-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 llvm-svn: 247683
* rename "slow-unaligned-mem-under-32" to slow-unaligned-mem-16" (NFCI)Sanjay Patel2015-09-011-2/+2
| | | | | | | | | | | | | | | This is a follow-on suggested by: http://reviews.llvm.org/D12154 ( http://reviews.llvm.org/rL245729 ) http://reviews.llvm.org/D10662 ( http://reviews.llvm.org/rL245075 ) This makes the attribute name match most of the existing lowering logic and regression test expectations. But the current use of this attribute is inconsistent; see the FIXME comment for "allowsMisalignedMemoryAccesses()". That change will result in functional changes and should be coming soon. llvm-svn: 246585
* make fast unaligned memory accesses implicit with SSE4.2 or SSE4aSanjay Patel2015-08-251-0/+7
| | | | | | | | | | | | | | | | | | | | | | This is a follow-on from the discussion in http://reviews.llvm.org/D12154. This change allows memset/memcpy to use SSE or AVX memory accesses for any chip that has generally fast unaligned memory ops. A motivating use case for this change is a clang invocation that doesn't explicitly set the CPU, but does target a feature that we know only exists on a CPU that supports fast unaligned memops. For example: $ clang -O1 foo.c -mavx This resolves a difference in lowering noted in PR24449: https://llvm.org/bugs/show_bug.cgi?id=24449 Before this patch, we used different store types depending on whether the example can be lowered as a memset or not. Differential Revision: http://reviews.llvm.org/D12288 llvm-svn: 245950
* [x86] invert logic for attribute 'FeatureFastUAMem'Sanjay Patel2015-08-211-1/+1
| | | | | | | | | | | | | | | | This is a 'no functional change intended' patch. It removes one FIXME, but adds several more. Motivation: the FeatureFastUAMem attribute may be too general. It is used to determine if any sized misaligned memory access under 32-bytes is 'fast'. From the added FIXME comments, however, you can see that we're not consistent about this. Changing the name of the attribute makes it clearer to see the logic holes. Changing this to a 'slow' attribute also means we don't have to add an explicit 'fast' attribute to new chips; fast unaligned accesses have been standard for several generations of CPUs now. Differential Revision: http://reviews.llvm.org/D12154 llvm-svn: 245729
* don't repeaat function names in comments; NFCSanjay Patel2015-08-141-11/+8
| | | | llvm-svn: 245058
* MC: Remove MCSubtargetInfo::InitCPUSched()Duncan P. N. Exon Smith2015-07-101-4/+1
| | | | | | | | | | | | Remove all calls to `MCSubtargetInfo::InitCPUSched()` and merge its body into the only relevant caller, `MCSubtargetInfo::InitMCProcessorInfo()`. We were only calling the former after explicitly calling the latter with the same CPU; it's confusing to have both methods exposed. Besides a minor (surely unmeasurable) speedup in ARM and X86 from avoiding running the logic twice, no functionality change. llvm-svn: 241956
* Remove getDataLayout() from TargetSelectionDAGInfo (had no users)Mehdi Amini2015-07-091-3/+2
| | | | | | | | | | | | | | | | | | Summary: Remove empty subclass in the process. This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren, ted Differential Revision: http://reviews.llvm.org/D11045 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241780
* IR: Do not consider available_externally linkage to be linker-weak.Peter Collingbourne2015-07-051-6/+5
| | | | | | | | | | | | | | | From the linker's perspective, an available_externally global is equivalent to an external declaration (per isDeclarationForLinker()), so it is incorrect to consider it to be a weak definition. Also clean up some logic in the dead argument elimination pass and clarify its comments to better explain how its behavior depends on linkage, introduce GlobalValue::isStrongDefinitionForLinker() and start using it throughout the optimizers and backend. Differential Revision: http://reviews.llvm.org/D10941 llvm-svn: 241413
* Re-land "[X86] Cache variables that only depend on the subtarget"Reid Kleckner2015-06-171-2/+1
| | | | | | Re-instates r239949 without accidentally flipping the sense of UseLEA. llvm-svn: 239950
* Revert "[X86] Cache variables that only depend on the subtarget"Reid Kleckner2015-06-171-1/+2
| | | | | | This reverts commit r239948, tests seem to be failing. llvm-svn: 239949
* [X86] Cache variables that only depend on the subtargetReid Kleckner2015-06-171-2/+1
| | | | | | | | | | | | | | There is a one-to-one relationship between X86Subtarget and X86FrameLowering, but every frame lowering method would previously pull the subtarget off the MachineFunction and query some subtarget properties. Over time, these locals began to grow in complexity and it became important to keep their names and meaning in sync across all of the frame lowering methods, leading to duplication. We can eliminate that duplication by computing them once in the constructor. llvm-svn: 239948
* Replace string GNU Triples with llvm::Triple in MCSubtargetInfo and ↵Daniel Sanders2015-06-101-1/+1
| | | | | | | | | | | | | | | | | | create*MCSubtargetInfo(). NFC. Summary: This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: rafael Reviewed By: rafael Subscribers: rafael, ted, jfb, llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10311 llvm-svn: 239467
* make reciprocal estimate code generation more flexible by adding ↵Sanjay Patel2015-06-041-2/+0
| | | | | | | | | | | | | | | | | | | | | | command-line options (3rd try) The first try (r238051) to land this was reverted due to ExecutionEngine build failure; that was hopefully addressed by r238788. The second try (r238842) to land this was reverted due to BUILD_SHARED_LIBS failure; that was hopefully addressed by r238953. This patch adds a TargetRecip class for processing many recip codegen possibilities. The class is intended to handle both command-line options to llc as well as options passed in from a front-end such as clang with the -mrecip option. The x86 backend is updated to use the new functionality. Only -mcpu=btver2 with -ffast-math should see a functional change from this patch. All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math. Differential Revision: http://reviews.llvm.org/D8982 llvm-svn: 239001
* X86: Added MPX feature and bound registers.Elena Demikhovsky2015-06-031-0/+1
| | | | | | | | | Intel® Memory Protection Extensions (Intel® MPX) is a new feature in Skylake. It is a part of KNL and SKX sets. It is also a part of Skylake client. I added definition of %bnd0 - %bnd3 registers, each register is a pair of 64-bit integers. llvm-svn: 238916
* Revert "make reciprocal estimate code generation more flexible by adding ↵Rafael Espindola2015-06-031-0/+2
| | | | | | | | | | command-line options (2nd try)" This reverts commit r238842. It broke -DBUILD_SHARED_LIBS=ON build. llvm-svn: 238900
* make reciprocal estimate code generation more flexible by adding ↵Sanjay Patel2015-06-021-2/+0
| | | | | | | | | | | | | | | | | | | command-line options (2nd try) The first try (r238051) to land this was reverted due to bot failures that were hopefully addressed by r238788. This patch adds a TargetRecip class for processing many recip codegen possibilities. The class is intended to handle both command-line options to llc as well as options passed in from a front-end such as clang with the -mrecip option. The x86 backend is updated to use the new functionality. Only -mcpu=btver2 with -ffast-math should see a functional change from this patch. All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math. Differential Revision: http://reviews.llvm.org/D8982 llvm-svn: 238842
* Revert "make reciprocal estimate code generation more flexible by adding ↵Rafael Espindola2015-05-231-0/+2
| | | | | | | | | | | | command-line options" This reverts commit r238051. It broke some bots: http://lab.llvm.org:8011/builders/llvm-ppc64-linux1/builds/18190 llvm-svn: 238075
* make reciprocal estimate code generation more flexible by adding ↵Sanjay Patel2015-05-221-2/+0
| | | | | | | | | | | | | | | | command-line options This patch adds a class for processing many recip codegen possibilities. The TargetRecip class is intended to handle both command-line options to llc as well as options passed in from a front-end such as clang with the -mrecip option. The x86 backend is updated to use the new functionality. Only -mcpu=btver2 with -ffast-math should see a functional change from this patch. All other CPUs continue to *not* use reciprocal estimates by default with -ffast-math. Differential Revision: http://reviews.llvm.org/D8982 llvm-svn: 238051
* Migrate existing backends that care about software floating pointEric Christopher2015-05-121-0/+1
| | | | | | | | | | | | | | | | | | | | to use the information in the module rather than TargetOptions. We've had and clang has used the use-soft-float attribute for some time now so have the backends set a subtarget feature based on a particular function now that subtargets are created based on functions and function attributes. For the one middle end soft float check go ahead and create an overloadable TargetLowering::useSoftFloat function that just checks the TargetSubtargetInfo in all cases. Also remove the command line option that hard codes whether or not soft-float is set by using the attribute for all of the target specific test cases - for the generic just go ahead and add the attribute in the one case that showed up. llvm-svn: 237079
* [X86] Remove two feature flags that covered sets of instructions that have ↵Craig Topper2015-02-051-2/+0
| | | | | | no patterns or intrinsics. Since we don't check feature flags in the assembler parser for any instruction sets, these flags don't provide any value. This frees up 2 of the fully utilized feature flags. llvm-svn: 228282
* Fix program crashes due to alignment exceptions generated for SSE memop ↵Sanjay Patel2015-02-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | instructions (PR22371). r224330 introduced a bug by misinterpreting the "FeatureVectorUAMem" bit. The commit log says that change did not affect anything, but that's not correct. That change allowed SSE instructions to have unaligned mem operands folded into math ops, and that's not allowed in the default specification for any SSE variant. The bug is exposed when compiling for an AVX-capable CPU that had this feature flag but without enabling AVX codegen. Another mistake in r224330 was not adding the feature flag to all AVX CPUs; the AMD chips were excluded. This is part of the fix for PR22371 ( http://llvm.org/bugs/show_bug.cgi?id=22371 ). This feature bit is SSE-specific, so I've renamed it to "FeatureSSEUnalignedMem". Changed the existing test case for the feature bit to reflect the new name and renamed the test file itself to better reflect the feature. Added runs to fold-vex.ll to check for the failing codegen. Note that the feature bit is not set by default on any CPU because it may require a configuration register setting to enable the enhanced unaligned behavior. llvm-svn: 227983
* Reuse a bunch of cached subtargets and remove getSubtarget callsEric Christopher2015-02-021-1/+1
| | | | | | without a Function argument. llvm-svn: 227814
* Move DataLayout back to the TargetMachine from TargetSubtargetInfoEric Christopher2015-01-261-44/+4
| | | | | | | | | | | | | | | | | | | derived classes. Since global data alignment, layout, and mangling is often based on the DataLayout, move it to the TargetMachine. This ensures that global data is going to be layed out and mangled consistently if the subtarget changes on a per function basis. Prior to this all targets(*) have had subtarget dependent code moved out and onto the TargetMachine. *One target hasn't been migrated as part of this change: R600. The R600 port has, as a subtarget feature, the size of pointers and this affects global data layout. I've currently hacked in a FIXME to enable progress, but the port needs to be updated to either pass the 64-bitness to the TargetMachine, or fix the DataLayout to avoid subtarget dependent features. llvm-svn: 227113
* Add a feature flag for slow 32-byte unaligned memory accesses [x86].Sanjay Patel2014-11-211-0/+1
| | | | | | | | | | | | | | This patch adds a feature flag to avoid unaligned 32-byte load/store AVX codegen for Sandy Bridge and Ivy Bridge. There is no functionality change intended for those chips. Previously, the absence of AVX2 was being used as a proxy to detect this feature. But that hindered codegen for AVX-enabled AMD chips such as btver2 that do not have the 32-byte unaligned access slowdown. Performance measurements are included in PR21541 ( http://llvm.org/bugs/show_bug.cgi?id=21541 ). Differential Revision: http://reviews.llvm.org/D6355 llvm-svn: 222544
* [X86] For Silvermont CPU use 16-bit division instead of 64-bit for small ↵Alexey Volkov2014-11-211-1/+2
| | | | | | | | positive numbers Differential Revision: http://reviews.llvm.org/D5938 llvm-svn: 222521
* Initialize new subtarget feature variable for generating reciprocal estimate ↵Sanjay Patel2014-11-111-0/+1
| | | | | | | | instructions. This was missed in r221706. llvm-svn: 221731
* Remove redundant calls to isMaterializable.Rafael Espindola2014-11-011-6/+1
| | | | | | | | | | This removes calls to isMaterializable in the following cases: * It was redundant with a call to isDeclaration now that isDeclaration returns the correct answer for materializable functions. * It was followed by a call to Materialize. Just call Materialize and check EC. llvm-svn: 221050
* Use rsqrt (X86) to speed up reciprocal square root calcsSanjay Patel2014-10-241-0/+1
| | | | | | | | | | | | | | | | | | | | | This is a first step for generating SSE rsqrt instructions for reciprocal square root calcs when fast-math is allowed. For now, be conservative and only enable this for AMD btver2 where performance improves significantly - for example, 29% on llvm/projects/test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c (if we convert the data type to single-precision float). This patch adds a two constant version of the Newton-Raphson refinement algorithm to DAGCombiner that can be selected by any target via a parameter returned by getRsqrtEstimate().. See PR20900 for more details: http://llvm.org/bugs/show_bug.cgi?id=20900 Differential Revision: http://reviews.llvm.org/D5658 llvm-svn: 220570
* constify TargetMachine parameter for X86TargetLowering.Eric Christopher2014-10-011-1/+1
| | | | llvm-svn: 218804
* Remove resetSubtargetFeatures as it is unused.Eric Christopher2014-09-031-18/+2
| | | | llvm-svn: 217071
* Reinstate "Nuke the old JIT."Eric Christopher2014-09-021-2/+1
| | | | | | | | Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reinstates commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 216982
* [x86] Enable Broadwell target.Robert Khasanov2014-08-211-0/+1
| | | | | | | | Added FeatureSMAP. Broadwell ISA includes Haswell ISA + ADX + RDSEED + SMAP llvm-svn: 216161
* Fix whitespace error from r215279, NFCDuncan P. N. Exon Smith2014-08-141-1/+1
| | | | llvm-svn: 215667
* Initialize X86 DataLayout based on the Triple only.Eric Christopher2014-08-091-12/+14
| | | | llvm-svn: 215279
* Move some X86 subtarget configuration onto the subtarget that's beingEric Christopher2014-08-091-1/+22
| | | | | | created. llvm-svn: 215271
* Temporarily Revert "Nuke the old JIT." as it's not quite ready toEric Christopher2014-08-071-1/+2
| | | | | | | | | | | be deleted. This will be reapplied as soon as possible and before the 3.6 branch date at any rate. Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reverts commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 215154
* Nuke the old JIT.Rafael Espindola2014-08-071-2/+1
| | | | | | | | | I am sure we will be finding bits and pieces of dead code for years to come, but this is a good start. Thanks to Lang Hames for making MCJIT a good replacement! llvm-svn: 215111
* Add support for the X86 secure guard extensions instructions in assembler (SGX).Kevin Enderby2014-07-311-0/+1
| | | | | | | | | | | | | This allows assembling the two new instructions, encls and enclu for the SKX processor model. Note the diffs are a bigger than what might think, but to fit the new MRM_CF and MRM_D7 in things in the right places things had to be renumbered and shuffled down causing a bit more diffs. rdar://16228228 llvm-svn: 214460
* [SKX] Enabling SKX target and AVX512BW, AVX512DQ, AVX512VL features.Robert Khasanov2014-07-211-0/+3
| | | | | | | | | | | | Enabling HasAVX512{DQ,BW,VL} predicates. Adding VK2, VK4, VK32, VK64 masked register classes. Adding new types (v64i8, v32i16) to VR512. Extending calling conventions for new types (v64i8, v32i16) Patch by Zinovy Nis <zinovy.y.nis@intel.com> Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com> llvm-svn: 213545
* Move Post RA Scheduling flag bit into SchedMachineModelSanjay Patel2014-07-151-15/+2
| | | | | | | | | | | | | | | | | | | | | Refactoring; no functional changes intended Removed PostRAScheduler bits from subtargets (X86, ARM). Added PostRAScheduler bit to MCSchedModel class. This bit is set by a CPU's scheduling model (if it exists). Removed enablePostRAScheduler() function from TargetSubtargetInfo and subclasses. Fixed the existing enablePostMachineScheduler() method to use the MCSchedModel (was just returning false!). Added methods to TargetSubtargetInfo to allow overrides for AntiDepBreakMode, CriticalPathRCs, and OptLevel for PostRAScheduling. Added enablePostRAScheduler() function to PostRAScheduler class which queries the subtarget for the above values. Preserved existing scheduler behavior for ARM, MIPS, PPC, and X86: a. ARM overrides the CPU's postRA settings by enabling postRA for any non-Thumb or Thumb2 subtarget. b. MIPS overrides the CPU's postRA settings by enabling postRA for everything. c. PPC overrides the CPU's postRA settings by enabling postRA for everything. d. X86 is the only target that actually has postRA specified via sched model info. Differential Revision: http://reviews.llvm.org/D4217 llvm-svn: 213101
* Move to a private function to initialize the subtarget dependenciesEric Christopher2014-06-111-12/+12
| | | | | | so that we can use initializer lists for the X86Subtarget. llvm-svn: 210614
* Use unique_ptr for X86Subtarget pointer members.Eric Christopher2014-06-101-13/+6
| | | | llvm-svn: 210606
* Remove the use of TargetMachine from X86InstrInfo.Eric Christopher2014-06-101-1/+1
| | | | llvm-svn: 210596
OpenPOWER on IntegriCloud