summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/X86/X86Subtarget.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Add a feature flag for slow 32-byte unaligned memory accesses [x86].Sanjay Patel2014-11-211-0/+1
| | | | | | | | | | | | | | This patch adds a feature flag to avoid unaligned 32-byte load/store AVX codegen for Sandy Bridge and Ivy Bridge. There is no functionality change intended for those chips. Previously, the absence of AVX2 was being used as a proxy to detect this feature. But that hindered codegen for AVX-enabled AMD chips such as btver2 that do not have the 32-byte unaligned access slowdown. Performance measurements are included in PR21541 ( http://llvm.org/bugs/show_bug.cgi?id=21541 ). Differential Revision: http://reviews.llvm.org/D6355 llvm-svn: 222544
* [X86] For Silvermont CPU use 16-bit division instead of 64-bit for small ↵Alexey Volkov2014-11-211-1/+2
| | | | | | | | positive numbers Differential Revision: http://reviews.llvm.org/D5938 llvm-svn: 222521
* Initialize new subtarget feature variable for generating reciprocal estimate ↵Sanjay Patel2014-11-111-0/+1
| | | | | | | | instructions. This was missed in r221706. llvm-svn: 221731
* Remove redundant calls to isMaterializable.Rafael Espindola2014-11-011-6/+1
| | | | | | | | | | This removes calls to isMaterializable in the following cases: * It was redundant with a call to isDeclaration now that isDeclaration returns the correct answer for materializable functions. * It was followed by a call to Materialize. Just call Materialize and check EC. llvm-svn: 221050
* Use rsqrt (X86) to speed up reciprocal square root calcsSanjay Patel2014-10-241-0/+1
| | | | | | | | | | | | | | | | | | | | | This is a first step for generating SSE rsqrt instructions for reciprocal square root calcs when fast-math is allowed. For now, be conservative and only enable this for AMD btver2 where performance improves significantly - for example, 29% on llvm/projects/test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c (if we convert the data type to single-precision float). This patch adds a two constant version of the Newton-Raphson refinement algorithm to DAGCombiner that can be selected by any target via a parameter returned by getRsqrtEstimate().. See PR20900 for more details: http://llvm.org/bugs/show_bug.cgi?id=20900 Differential Revision: http://reviews.llvm.org/D5658 llvm-svn: 220570
* constify TargetMachine parameter for X86TargetLowering.Eric Christopher2014-10-011-1/+1
| | | | llvm-svn: 218804
* Remove resetSubtargetFeatures as it is unused.Eric Christopher2014-09-031-18/+2
| | | | llvm-svn: 217071
* Reinstate "Nuke the old JIT."Eric Christopher2014-09-021-2/+1
| | | | | | | | Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reinstates commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 216982
* [x86] Enable Broadwell target.Robert Khasanov2014-08-211-0/+1
| | | | | | | | Added FeatureSMAP. Broadwell ISA includes Haswell ISA + ADX + RDSEED + SMAP llvm-svn: 216161
* Fix whitespace error from r215279, NFCDuncan P. N. Exon Smith2014-08-141-1/+1
| | | | llvm-svn: 215667
* Initialize X86 DataLayout based on the Triple only.Eric Christopher2014-08-091-12/+14
| | | | llvm-svn: 215279
* Move some X86 subtarget configuration onto the subtarget that's beingEric Christopher2014-08-091-1/+22
| | | | | | created. llvm-svn: 215271
* Temporarily Revert "Nuke the old JIT." as it's not quite ready toEric Christopher2014-08-071-1/+2
| | | | | | | | | | | be deleted. This will be reapplied as soon as possible and before the 3.6 branch date at any rate. Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reverts commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 215154
* Nuke the old JIT.Rafael Espindola2014-08-071-2/+1
| | | | | | | | | I am sure we will be finding bits and pieces of dead code for years to come, but this is a good start. Thanks to Lang Hames for making MCJIT a good replacement! llvm-svn: 215111
* Add support for the X86 secure guard extensions instructions in assembler (SGX).Kevin Enderby2014-07-311-0/+1
| | | | | | | | | | | | | This allows assembling the two new instructions, encls and enclu for the SKX processor model. Note the diffs are a bigger than what might think, but to fit the new MRM_CF and MRM_D7 in things in the right places things had to be renumbered and shuffled down causing a bit more diffs. rdar://16228228 llvm-svn: 214460
* [SKX] Enabling SKX target and AVX512BW, AVX512DQ, AVX512VL features.Robert Khasanov2014-07-211-0/+3
| | | | | | | | | | | | Enabling HasAVX512{DQ,BW,VL} predicates. Adding VK2, VK4, VK32, VK64 masked register classes. Adding new types (v64i8, v32i16) to VR512. Extending calling conventions for new types (v64i8, v32i16) Patch by Zinovy Nis <zinovy.y.nis@intel.com> Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com> llvm-svn: 213545
* Move Post RA Scheduling flag bit into SchedMachineModelSanjay Patel2014-07-151-15/+2
| | | | | | | | | | | | | | | | | | | | | Refactoring; no functional changes intended Removed PostRAScheduler bits from subtargets (X86, ARM). Added PostRAScheduler bit to MCSchedModel class. This bit is set by a CPU's scheduling model (if it exists). Removed enablePostRAScheduler() function from TargetSubtargetInfo and subclasses. Fixed the existing enablePostMachineScheduler() method to use the MCSchedModel (was just returning false!). Added methods to TargetSubtargetInfo to allow overrides for AntiDepBreakMode, CriticalPathRCs, and OptLevel for PostRAScheduling. Added enablePostRAScheduler() function to PostRAScheduler class which queries the subtarget for the above values. Preserved existing scheduler behavior for ARM, MIPS, PPC, and X86: a. ARM overrides the CPU's postRA settings by enabling postRA for any non-Thumb or Thumb2 subtarget. b. MIPS overrides the CPU's postRA settings by enabling postRA for everything. c. PPC overrides the CPU's postRA settings by enabling postRA for everything. d. X86 is the only target that actually has postRA specified via sched model info. Differential Revision: http://reviews.llvm.org/D4217 llvm-svn: 213101
* Move to a private function to initialize the subtarget dependenciesEric Christopher2014-06-111-12/+12
| | | | | | so that we can use initializer lists for the X86Subtarget. llvm-svn: 210614
* Use unique_ptr for X86Subtarget pointer members.Eric Christopher2014-06-101-13/+6
| | | | llvm-svn: 210606
* Remove the use of TargetMachine from X86InstrInfo.Eric Christopher2014-06-101-1/+1
| | | | llvm-svn: 210596
* Delete X86JITInfo in the subtarget destructor.Eric Christopher2014-06-101-0/+1
| | | | llvm-svn: 210516
* Move all of the x86 subtarget initialized variables down into the x86 subtargetEric Christopher2014-06-091-2/+56
| | | | | | from the x86 target machine. Should be no functional change. llvm-svn: 210479
* [X86] Use ADD/SUB instead of INC/DEC for SilvermontAlexey Volkov2014-06-091-0/+1
| | | | | | | | | | | | According to Intel Software Optimization Manual on Silvermont INC or DEC instructions require an additional uop to merge the flags. As a result, a branch instruction depending on an INC or a DEC instruction incurs a 1 cycle penalty. Differential Revision: http://reviews.llvm.org/D3990 llvm-svn: 210466
* Fix compilation issues.Eric Christopher2014-05-211-2/+3
| | | | llvm-svn: 209342
* Make early if conversion dependent upon the subtarget and addEric Christopher2014-05-211-0/+12
| | | | | | | a subtarget hook to enable. Unconditionally add to the pass pipeline for targets that might want to use it. No functional change. llvm-svn: 209340
* [X86] Tune LEA usage for SilvermontAlexey Volkov2014-05-201-0/+1
| | | | | | | | | | | | | According to Intel Software Optimization Manual on Silvermont in some cases LEA is better to be replaced with ADD instructions: "The rule of thumb for ADDs and LEAs is that it is justified to use LEA with a valid index and/or displacement for non-destructive destination purposes (especially useful for stack offset cases), or to use a SCALE. Otherwise, ADD(s) are preferable." Differential Revision: http://reviews.llvm.org/D3826 llvm-svn: 209198
* Reformat a couple of functions for clarity.Eric Christopher2014-05-071-22/+19
| | | | llvm-svn: 208248
* [C++] Use 'nullptr'. Target edition.Craig Topper2014-04-251-1/+1
| | | | llvm-svn: 207197
* [Modules] Fix potential ODR violations by sinking the DEBUG_TYPEChandler Carruth2014-04-221-3/+4
| | | | | | | definition below all of the header #include lines, lib/Target/... edition. llvm-svn: 206842
* [cleanup] Lift using directives, DEBUG_TYPE definitions, and even someChandler Carruth2014-04-221-4/+4
| | | | | | | | | | | | system headers above the includes of generated '.inc' files that actually contain code. In a few targets this was already done pretty consistently, but it wasn't done *really* consistently anywhere. It is strictly cleaner IMO and necessary in a bunch of places where the DEBUG_TYPE is referenced from the generated code. Consistency with the necessary places trumps. Hopefully the build bots are OK with the movement of intrin.h... llvm-svn: 206838
* X86: Remove TargetMachine CPU auto-detection.Jim Grosbach2014-04-121-281/+15
| | | | | | | | This logic is properly in the realm of whatever is creating the TargetMachine. This makes plain 'llc foo.ll' consistent across heterogenous machines. llvm-svn: 206094
* X86: Disable IsLegalToCallImmediateAddr for Win32David Majnemer2014-03-281-1/+4
| | | | | | | | | | WinCOFF cannot form PC relative relocations to support absolute MCValues. We should reenable this once WinCOFF supports emission of IMAGE_REL_I386_REL32 relocations. This fixes PR19272. llvm-svn: 205058
* [x86] Support i386-*-*-code16 triple for emitting 16-bit codeDavid Woodhouse2014-01-201-2/+4
| | | | llvm-svn: 199648
* Decouple dllexport/dllimport from linkageNico Rieck2014-01-141-1/+1
| | | | | | | | | | | | | | | | | | | Representing dllexport/dllimport as distinct linkage types prevents using these attributes on templates and inline functions. Instead of introducing further mixed linkage types to include linkonce and weak ODR, the old import/export linkage types are replaced with a new separate visibility-like specifier: define available_externally dllimport void @f() {} @Var = dllexport global i32 1, align 4 Linkage for dllexported globals and functions is now equal to their linkage without dllexport. Imported globals and functions must be either declarations with external linkage, or definitions with AvailableExternallyLinkage. llvm-svn: 199218
* Revert "Decouple dllexport/dllimport from linkage"Nico Rieck2014-01-141-1/+1
| | | | | | | | Revert this for now until I fix an issue in Clang with it. This reverts commit r199204. llvm-svn: 199207
* Decouple dllexport/dllimport from linkageNico Rieck2014-01-141-1/+1
| | | | | | | | | | | | | | | | | | | Representing dllexport/dllimport as distinct linkage types prevents using these attributes on templates and inline functions. Instead of introducing further mixed linkage types to include linkonce and weak ODR, the old import/export linkage types are replaced with a new separate visibility-like specifier: define available_externally dllimport void @f() {} @Var = dllexport global i32 1, align 4 Linkage for dllexported globals and functions is now equal to their linkage without dllexport. Imported globals and functions must be either declarations with external linkage, or definitions with AvailableExternallyLinkage. llvm-svn: 199204
* [x86] Kill gratuitous X86_{32,64}TargetMachine subclasses, use X86TargetMachineDavid Woodhouse2014-01-081-3/+3
| | | | llvm-svn: 198720
* [x86] Add basic support for .code16Craig Topper2014-01-061-1/+9
| | | | | | | | | | | This is not really expected to work right yet. Mostly because we will still emit the OpSize (0x66) prefix in all the wrong places, along with a number of other corner cases. Those will all be fixed in the subsequent commits. Patch from David Woodhouse. llvm-svn: 198584
* X86: enable AVX2 under Haswell native compilationTim Northover2013-11-251-1/+6
| | | | | | Patch by Adam Strzelecki llvm-svn: 195632
* SHLD/SHRD are VectorPath (microcode) instructions known to have poor latency ↵Ekaterina Romanova2013-11-211-0/+10
| | | | | | | | | | on certain architectures. While generating SHLD/SHRD instructions is acceptable when optimizing for size, optimizing for speed on these platforms should be implemented using alternative sequences of instructions composed of add, adc, shr, shl, or and lea which are directPath instructions. These alternative instructions not only have a lower latency but they also increase the decode bandwidth by allowing simultaneous decoding of a third directPath instruction. AMD's processors family K7, K8, K10, K12, K15 and K16 are known to have SHLD/SHRD instructions with very poor latency. Optimization guides for these processors recommend using an alternative sequence of instructions. For these AMD's processors, I disabled folding (or (x << c) | (y >> (64 - c))) when we are not optimizing for size. It might be beneficial to disable this folding for some of the Intel's processors. However, since I couldn't find specific recommendations regarding using SHLD/SHRD instructions on Intel's processors, I haven't disabled this peephole for Intel. llvm-svn: 195383
* Adding a feature flag to the llvm backend for x86 TBM instruction set.Yunzhong Gao2013-09-241-0/+5
| | | | | | | | | | Adding TBM feature to bdver2 processor; piledriver supports this instruction set according to the following document: http://developer.amd.com/wordpress/media/2012/10/New-Bulldozer-and-Piledriver-Instructions.pdf Phabricator code review is located here: http://llvm-reviews.chandlerc.com/D1692 llvm-svn: 191324
* Prevent extra calls to ToggleFeature for Feature64Bit and FeatureCMOV if ↵Craig Topper2013-09-181-2/+2
| | | | | | they've already been enabled. The extra call ends up clearing the bit in FeatureBits since its a 'toggle'. Can't prove that anything was broken because of this since I don't think the FeatureBits for these are used. llvm-svn: 190920
* Fix X86 subtarget to not overwrite the autodetected features by calling ↵Craig Topper2013-09-181-1/+1
| | | | | | InitMCProcessorInfo right after detecting them. Instead add a new function that only updates the scheduling model and call that. llvm-svn: 190919
* Adds support for Atom Silvermont (SLM) - -march=slmPreston Gurd2013-09-131-2/+6
| | | | | | | | | | | Implements Instruction scheduler latencies for Silvermont, using latencies from the Intel Silvermont Optimization Guide. Auto detects SLM. Turns on post RA scheduler when generating code for SLM. llvm-svn: 190717
* Move operator to end of previous line to match coding standards.Craig Topper2013-09-131-2/+2
| | | | llvm-svn: 190659
* Partial support for Intel SHA Extensions (sha1rnds4)Ben Langmuir2013-09-121-0/+5
| | | | | | | | | Add basic assembly/disassembly support for the first Intel SHA instruction 'sha1rnds4'. Also includes feature flag, and test cases. Support for the remaining instructions will follow in a separate patch. llvm-svn: 190611
* Rename mattr names for AVX-512 to from avx-512 -> avx512f, avx-512-pfi -> ↵Craig Topper2013-08-211-1/+1
| | | | | | av512pf, avx-512-cdi -> avx512cd, avx-512-eri->avx512er. This matches better with official docs and what gcc patches appearto be using. I didn't touch the has* functions or the feature flag names to avoid change the td and lowering file while commits are still happening. llvm-svn: 188859
* Fix formatting. No functional change.Craig Topper2013-08-201-1/+1
| | | | llvm-svn: 188746
* Add AVX-512 and related features to the CPUID detection code.Craig Topper2013-08-201-3/+19
| | | | llvm-svn: 188745
* Added encoding prefixes for KNL instructions (EVEX).Elena Demikhovsky2013-07-281-0/+3
| | | | | | | Added 512-bit operands printing. Added instruction formats for KNL instructions. llvm-svn: 187324
OpenPOWER on IntegriCloud