path: root/llvm/lib/Target/X86/X86Subtarget.h
Commit message  Author  Age  Files  Lines
...
* added basic support for Intel ADX instructionsKay Tiong Khoo2013-02-141-0/+4
| | | | | | -feature flag, instruction definitions, test cases llvm-svn: 175196
* Teach SDISel to combine fsin / fcos into a fsincos node if the followingEvan Cheng2013-01-291-0/+4
| | | | | | | | | | | | | | | | | | conditions are met: 1. They share the same operand and are in the same BB. 2. Both outputs are used. 3. The target has a native instruction that maps to ISD::FSINCOS node or the target provides a sincos library call. Implemented the generic optimization in sdisel and enabled it for Mac OSX. Also added an additional optimization for x86_64 Mac OSX by using an alternative entry point __sincos_stret which returns the two results in xmm0 / xmm1. rdar://13087969 PR13204 llvm-svn: 173755
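The three conditions listed in this commit message can be sketched as a single predicate. This is only an illustration: `TrigCall`, `TargetInfo`, and `canCombineToSinCos` are hypothetical stand-ins invented here, not SelectionDAG types or LLVM functions.

```cpp
#include <string>

// Hypothetical, simplified stand-ins for SelectionDAG concepts (not LLVM's API).
struct TrigCall {
  std::string operand;  // name of the value passed to sin or cos
  int basicBlock;       // id of the enclosing basic block
  bool resultUsed;      // whether the result has at least one user
};

struct TargetInfo {
  bool hasNativeSinCos;   // an instruction that maps to an FSINCOS-style node
  bool hasSinCosLibcall;  // a sincos library entry point
};

// Returns true when all three conditions from the commit message hold.
bool canCombineToSinCos(const TrigCall &sinCall, const TrigCall &cosCall,
                        const TargetInfo &target) {
  // 1. Same operand and same basic block.
  bool sameOperandAndBlock = sinCall.operand == cosCall.operand &&
                             sinCall.basicBlock == cosCall.basicBlock;
  // 2. Both outputs are used.
  bool bothUsed = sinCall.resultUsed && cosCall.resultUsed;
  // 3. Native instruction or library call is available.
  bool targetSupports = target.hasNativeSinCos || target.hasSinCosLibcall;
  return sameOperandAndBlock && bothUsed && targetSupports;
}
```

If any one condition fails (for example, the cosine result is dead), the two calls would be left as they are.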
* In this patch, we teach X86_64TargetMachine that it has an ILP32Eli Bendersky2013-01-251-1/+14
| | | | | | | | | | | | | | | | | | | | | (defined by the x32 ABI) mode, in which case its pointers are 32-bits in size. This knowledge is also added to X86RegisterInfo that now returns the appropriate registers in getPointerRegClass. There are many outcomes to this change. In order to keep the patches separate and manageable, we start by focusing on some simple testable cases. The patch adds a test with passing a pointer to a function - focusing on the difference between the two data models for x86-64. Another test is added for handling of 'sret' arguments (and functionality is added in X86ISelLowering to make it work). A note on naming: the "x32 ABI" document refers to the AMD64 architecture (in LLVM it's distinguished by being is64Bits() in the x86 subtarget) with two variations: the LP64 (default) data model, and the ILP32 data model. This patch adds predicates to the subtarget which are consistent with this naming scheme. llvm-svn: 173503
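The naming scheme described above (one 64-bit architecture, two data models) can be sketched with a toy model. Everything here is an assumption for illustration: `DataModelSubtarget`, the `Environment` enum, and the predicate names are not taken from the patch itself.

```cpp
// Toy model of the two x86-64 data models (hypothetical; not LLVM's X86Subtarget).
enum class Environment { Default, X32 };

struct DataModelSubtarget {
  bool in64BitMode;  // true for the AMD64 architecture
  Environment env;   // X32 selects the ILP32 data model

  bool is64Bit() const { return in64BitMode; }

  // AMD64 with 32-bit pointers (the x32 ABI).
  bool isTarget64BitILP32() const {
    return in64BitMode && env == Environment::X32;
  }

  // AMD64 with 64-bit pointers (the default data model).
  bool isTarget64BitLP64() const {
    return in64BitMode && env != Environment::X32;
  }

  // Pointer width implied by the data model.
  unsigned pointerSizeInBits() const { return isTarget64BitLP64() ? 64 : 32; }
};
```

Note that both predicates imply `is64Bit()`; only the pointer size differs between them.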
* Pad Short Functions for Intel AtomPreston Gurd2013-01-081-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. This patch has been updated to address Nadav's review comments - Optimize only at >= O1 and don't do optimization if -Os is set - Stores MachineBasicBlock* instead of BBNum - Uses DenseMap instead of std::map - Fixes placement of braces Patch by Andy Zhang. llvm-svn: 171879
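The gating and padding rule described in this message can be sketched as two small functions. This is a simplified model under stated assumptions: the function names, the `OptLevel` enum, and the idea of counting padding in whole cycles are illustrative and not taken from the X86PadShortFunction pass itself.

```cpp
// Hypothetical model of the decision logic (not the actual pass).
enum class OptLevel { O0, O1, O2, O3 };

// The pass runs only for Atom, at -O1 or above, and not when optimizing for size.
bool shouldPadShortFunctions(bool isAtom, OptLevel level, bool optimizeForSize) {
  return isAtom && level != OptLevel::O0 && !optimizeForSize;
}

// Number of cycles of padding needed so that at least `threshold` cycles
// elapse between function entry and return (four in the commit message).
unsigned paddingCycles(unsigned cyclesToReturn, unsigned threshold = 4) {
  return cyclesToReturn < threshold ? threshold - cyclesToReturn : 0;
}
```

A function that reaches its return after one cycle would need three cycles of padding under this model, while one that already takes four or more needs none.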
* Revert revision 171524. Original message:Nadav Rotem2013-01-051-5/+0
| | | | | | | | | | | | | | | | | | | | URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev Log: The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. llvm-svn: 171603
* The current Intel Atom microarchitecture has a feature whereby when a functionPreston Gurd2013-01-041-0/+5
| | | | | | | | | | | | | | | | | returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. llvm-svn: 171524
* Move all of the header files which are involved in modelling the LLVM IRChandler Carruth2013-01-021-1/+1
| | | | | | | | | | | | | | | | | | | | | into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366
* Make NaCl naming consistent. The triple OSType is called NaCl and is representedEli Bendersky2012-12-041-1/+1
| | | | | | | | | textually as NativeClient. Also added a link to the native client project for readers unfamiliar with it. A Clang patch will follow shortly. llvm-svn: 169291
* Sort includes for all of the .h files under the 'lib' tree. These wereChandler Carruth2012-12-041-1/+1
| | | | | | | | | | missed in the first pass because the script didn't yet handle include guards. Note that the script is now able to handle all of these headers without manual edits. =] llvm-svn: 169224
* I changed hasAVX() to hasFp256() and hasAVX2() to hasInt256() in ↵Elena Demikhovsky2012-11-291-0/+2
| | | | | | | | X86IselLowering.cpp. The logic was not changed, only names. llvm-svn: 168875
* Add support of RTM from TSX extensionMichael Liao2012-11-081-0/+4
| | | | | | | | - Add RTM code generation support through 3 X86 intrinsics: xbegin()/xend() to start/end a transaction region, and xabort() to abort a transaction region llvm-svn: 167573
* misched: remove the unused getSpecialAddressLatency hook.Andrew Trick2012-10-081-6/+0
| | | | llvm-svn: 165418
* Support for generating ELF objects on Windows.Andrew Kaylor2012-10-021-5/+8
| | | | | | This adds 'elf' as a recognized target triple environment value and overrides the default generated object format on Windows platforms if that value is present. This patch also enables MCJIT tests on Windows using the new environment value. llvm-svn: 165030
* Remove hasNoAVX method. Can just invert hasAVX instead.Craig Topper2012-09-261-1/+0
| | | | llvm-svn: 164664
* Generic Bypass Slow DivPreston Gurd2012-09-041-0/+5
| | | | | | | | | | | | | | | | | | | | | | | - CodeGenPrepare pass for identifying div/rem ops - Backend specifies the type mapping using addBypassSlowDivType - Enabled only for Intel Atom with O2 32-bit -> 8-bit - Replace IDIV with instructions which test its value and use DIVB if the value is positive and less than 256. - In the case when the quotient and remainder of a divide are used a DIV and a REM instruction will be present in the IR. In the non-Atom case they are both lowered to IDIVs and CSE removes the redundant IDIV instruction, using the quotient and remainder from the first IDIV. However, due to this optimization CSE is not able to eliminate redundant IDIV instructions because they are located in different basic blocks. This is overcome by calculating both the quotient (DIV) and remainder (REM) in each basic block that is inserted by the optimization and reusing the result values when a subsequent DIV or REM instruction uses the same operands. - Test cases check for the presence of the optimization when calculating either the quotient, remainder, or both. Patch by Tyler Nowicki! llvm-svn: 163150
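The bypass idea can be illustrated at the source level. The sketch below is not the CodeGenPrepare transformation itself, which rewrites IR; `bypassSlowDiv` and `DivRem` are hypothetical names, and the single OR-and-mask test is an assumed way of checking that both operands are non-negative and below 256.

```cpp
#include <cassert>
#include <cstdint>

// Quotient and remainder computed together, mirroring how the optimization
// produces both values in each inserted basic block.
struct DivRem {
  int32_t quotient;
  int32_t remainder;
};

// Hypothetical source-level model of the 32-bit -> 8-bit bypass.
DivRem bypassSlowDiv(int32_t dividend, int32_t divisor) {
  assert(divisor != 0 && "division by zero");
  // If no bit above the low 8 is set in either operand, both values are
  // non-negative and less than 256, so an 8-bit unsigned divide suffices.
  uint32_t combined =
      static_cast<uint32_t>(dividend) | static_cast<uint32_t>(divisor);
  if ((combined & 0xFFFFFF00u) == 0) {
    uint8_t a = static_cast<uint8_t>(dividend);
    uint8_t b = static_cast<uint8_t>(divisor);
    return {static_cast<int32_t>(a / b), static_cast<int32_t>(a % b)};  // fast path
  }
  return {dividend / divisor, dividend % divisor};  // slow path: full-width divide
}
```

Both paths return the same values as a plain 32-bit signed divide; only the width of the divide differs.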
* Introduce 'UseSSEx' to force SSE legacy encodingMichael Liao2012-08-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Add 'UseSSEx' to prevent SSE legacy insns from being selected when AVX is enabled. Because of the penalty of inter-mixing SSE and AVX instructions, we need to prevent SSE legacy insns from being generated unless explicitly requested through some intrinsics. For patterns supported by both SSE and AVX, so far, we have forced the AVX insn to be tried first by relying on AddedComplexity or position in the td file. That is error-prone and introduces bugs accidentally. 'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited by AVX, we need this predicate to force VEX encoding or SSE legacy encoding only. For insns not inherited by AVX, we still use the previous predicates, i.e. 'HasSSEx'. So far, these insns fall into the following categories: * SSE insns with MMX operands * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH, CRC, etc.) * SSE4A insns. * MMX insns. * x87 insns added by SSE. 2 test cases are modified: - test/CodeGen/X86/fast-isel-x86-64.ll AVX code generation is different from the SSE one. 'vcvtsi2sdq' cannot be selected by fast-isel due to its complicated pattern, so fast-isel falls back to materializing it from the constant pool. - test/CodeGen/X86/widen_load-1.ll AVX code generation is different from the SSE one after fixing SSE/AVX inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of 'vmovaps'. llvm-svn: 162919
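The difference between a 'HasSSEx' predicate and a 'UseSSEx' predicate can be sketched with an ordered feature level. This is a simplified model, not the real predicates: `SimdLevel`, `SimdSubtarget`, and the method names are assumptions made for illustration.

```cpp
// Hypothetical ordered feature levels, with AVX treated as a level above SSE4.2.
enum SimdLevel { NoSSE, SSE1, SSE2, SSE3, SSSE3, SSE41, SSE42, AVX, AVX2 };

struct SimdSubtarget {
  SimdLevel level;

  // "Has" predicates: the feature is available, possibly through a higher level.
  bool hasSSE2() const { return level >= SSE2; }
  bool hasAVX() const { return level >= AVX; }

  // "Use" predicate: select the legacy SSE encoding only when AVX is off,
  // so that SSE and VEX-encoded instructions are not inter-mixed.
  bool useSSE2() const { return hasSSE2() && !hasAVX(); }
};
```

Under this model an AVX-capable target still reports `hasSSE2()`, which suits instructions with no AVX variant, while `useSSE2()` is false, which steers shared patterns to the VEX-encoded form.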
* Custom lower FMA intrinsics to target specific nodes and remove the patterns.Craig Topper2012-08-241-1/+1
| | | | llvm-svn: 162534
* Favor FMA3 over FMA4 if both are enabled.Craig Topper2012-08-231-1/+2
| | | | llvm-svn: 162454
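One way to express "favor FMA3 over FMA4 if both are enabled" is to make the FMA4 predicate false whenever FMA3 is available. The sketch below assumes that approach; `FmaFeatures` and its members are hypothetical names and may not match what the commit actually changed.

```cpp
// Hypothetical feature model (not LLVM's X86Subtarget).
struct FmaFeatures {
  bool fma3Enabled;
  bool fma4Enabled;

  bool hasFMA() const { return fma3Enabled; }

  // FMA4 patterns are considered only when FMA3 is not available,
  // so FMA3 wins when both feature flags are set.
  bool hasFMA4() const { return fma4Enabled && !fma3Enabled; }
};
```

With both flags set, only `hasFMA()` is true, so instruction selection guarded by these predicates would pick the FMA3 forms.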
* Whitespace.Chad Rosier2012-08-011-2/+2
| | | | llvm-svn: 161122
* Rename FMA3 feature flag to just FMA to match gcc so it can be added to clang.Craig Topper2012-06-031-3/+3
| | | | llvm-svn: 157903
* X86: Rename the CLMUL target feature to PCLMUL.Benjamin Kramer2012-05-311-3/+3
| | | | | | | It was renamed in gcc/gas a while ago and causes all kinds of confusion because it was named differently in llvm and clang. llvm-svn: 157745
* This patch fixes a problem which arose when using the Post-RA schedulerPreston Gurd2012-04-231-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | on X86 Atom. Some of our tests failed because the tail merging part of the BranchFolding pass was creating new basic blocks which did not contain live-in information. When the anti-dependency code in the Post-RA scheduler ran, it would sometimes rename the register containing the function return value because the fact that the return value was live-in to the subsequent block had been lost. To fix this, it is necessary to run the RegisterScavenging code in the BranchFolding pass. This patch makes sure that the register scavenging code is invoked in the X86 subtarget only when post-RA scheduling is being done. Post RA scheduling in the X86 subtarget is only done for Atom. This patch adds a new function to the TargetRegisterClass to control whether or not live-ins should be preserved during branch folding. This is necessary in order for the anti-dependency optimizations done during the PostRASchedulerList pass to work properly when doing Post-RA scheduling for the X86 in general and for the Intel Atom in particular. The patch adds and invokes the new function trackLivenessAfterRegAlloc() instead of using the existing requiresRegisterScavenging(). It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of requiresRegisterScavenging(). It changes all the targets that implemented requiresRegisterScavenging() to also implement trackLivenessAfterRegAlloc(). It adds an assertion in the Post RA scheduler to make sure that post RA liveness information is available when it is needed. It changes the X86 break-anti-dependencies test to use -mcpu=atom, in order to avoid running into the added assertion. Finally, this patch restores the use of anti-dependency checking (which was turned off temporarily for the 3.1 release) for Intel Atom in the Post RA scheduler. Patch by Andy Zhang! Thanks to Jakob and Anton for their reviews. llvm-svn: 155395
* Reorder includes in Target backends to following coding standards. Remove ↵Craig Topper2012-03-171-1/+1
| | | | | | some superfluous forward declarations. llvm-svn: 152997
* some comment fix for X86 and ARMJia Liu2012-02-191-1/+1
| | | | llvm-svn: 150902
* Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, ↵Jia Liu2012-02-181-1/+1
| | | | | | MSP430, PPC, PTX, Sparc, X86, XCore. llvm-svn: 150878
* Use LEA to adjust stack ptr for Atom. Patch by Andy Zhang.Evan Cheng2012-02-071-0/+5
| | | | llvm-svn: 150008
* Begin fleshing out more convenience predicates in llvm::Triple andChandler Carruth2012-02-051-17/+7
| | | | | | | | | | convert at least one client over to use them. Subsequent patches both to LLVM and Clang will try to convert more people over to a common set of predicates. This round of predicates is focused on OS-categorization predicates. llvm-svn: 149815
* Instruction scheduling itinerary for Intel Atom.Andrew Trick2012-02-011-0/+24
| | | | | | | | | | | | | | Adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT. Sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches. Adds a test to verify that the scheduler is working. Also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP. Patch by Preston Gurd! llvm-svn: 149558
* Remove hasXMM/hasXMMInt functions. Move callers to hasSSE1/hasSSE2. This is ↵Craig Topper2012-01-101-4/+2
| | | | | | the final piece to remove the AVX hack that disabled SSE. llvm-svn: 147843
* Remove hasSSE*orAVX functions and change all callers to use just hasSSE*. ↵Craig Topper2012-01-101-4/+0
| | | | | | AVX is now an SSE level and no longer disables SSE checks. llvm-svn: 147842
* Instruction selection priority fixes to remove the XMM/XMMInt/orAVX ↵Craig Topper2012-01-101-6/+6
| | | | | | predicates. Another commit will remove orAVX functions from X86SubTarget. llvm-svn: 147841
* Remove AVX hack in X86Subtarget. AVX/AVX2 are now treated as an SSE level. ↵Craig Topper2012-01-091-21/+15
| | | | | | Predicate functions have been altered to maintain previous names and behavior. llvm-svn: 147770
* Remove hasSSE1orAVX(). It's the same as hasXMM().Evan Cheng2011-12-091-1/+0
| | | | llvm-svn: 146246
* Many of the SSE patterns should not be selected when AVX is available. This ↵Evan Cheng2011-12-081-0/+1
| | | | | | | | | | | | | | | | | | led to the following code in X86Subtarget.cpp if (HasAVX) X86SSELevel = NoMMXSSE; This is so patterns that are predicated on hasSSE3, etc. would not be selected when avx is available. Instead, the AVX variant is selected. However, this breaks instructions which do not have AVX variants. The right way to fix this is for the SSE but not-AVX patterns to predicate on something like hasSSE3() && !hasAVX(). Then we can take out the hack in X86Subtarget.cpp. Patterns which do not have AVX variants do not need to change. However, we need to audit all the patterns before we make the change. This patch is a workaround that fixes one specific case, the prefetch instructions. rdar://10538297 llvm-svn: 146163
* Add XOP feature flag.Jan Sjödin2011-12-021-0/+4
| | | | llvm-svn: 145682
* Add methods for querying minimum SSE version along with AVX. Simplifies all ↵Craig Topper2011-11-221-0/+4
| | | | | | the places that had to check a version of SSE and AVX. llvm-svn: 145053
* Add intrinsics and feature flag for read/write FS/GS base instructions. Also ↵Craig Topper2011-10-301-0/+8
| | | | | | add AVX2 feature flag. llvm-svn: 143319
* Remove NaClModeDavid Meyer2011-10-181-3/+0
| | | | llvm-svn: 142338
* Add X86 BZHI instruction as well as BMI2 feature detection.Craig Topper2011-10-161-0/+4
| | | | llvm-svn: 142122
* Add X86 TZCNT instruction and patterns to select it. Also added core-avx2 ↵Craig Topper2011-10-141-0/+4
| | | | | | processor which is gcc's name for Haswell. llvm-svn: 141939
* Revert r141854 because it was causing failures:Bill Wendling2011-10-131-4/+0
| | | | | | | | | | | | | | | http://lab.llvm.org:8011/builders/llvm-x86_64-linux/builds/101 --- Reverse-merging r141854 into '.': U test/MC/Disassembler/X86/x86-32.txt U test/MC/Disassembler/X86/simple-tests.txt D test/CodeGen/X86/bmi.ll U lib/Target/X86/X86InstrInfo.td U lib/Target/X86/X86ISelLowering.cpp U lib/Target/X86/X86.td U lib/Target/X86/X86Subtarget.h llvm-svn: 141857
* Add X86 TZCNT instruction and patterns to select it. Also added core-avx2 ↵Craig Topper2011-10-131-0/+4
| | | | | | processor which is gcc's name for Haswell. llvm-svn: 141854
* Add X86 LZCNT instruction. Including instruction selection support.Craig Topper2011-10-111-0/+4
| | | | llvm-svn: 141651
* Add Ivy Bridge 16-bit floating point conversion instructions for the X86 ↵Craig Topper2011-10-091-2/+6
| | | | | | disassembler. llvm-svn: 141505
* Add support for MOVBE and RDRAND instructions for the assembler and ↵Craig Topper2011-10-031-0/+8
| | | | | | disassembler. Includes feature flag checking, but no intrinsic support. Fixes PR10832, PR11026 and PR11027. llvm-svn: 141007
* Add a new MC bit for NaCl (Native Client) mode. NaCl requires that certainNick Lewycky2011-09-051-0/+8
| | | | | | | instructions are more aligned than the CPU requires, and adds some additional directives, to follow in future patches. Patch by David Meyer! llvm-svn: 139125
* Add support for generating CMPXCHG16B on x86-64 for the cmpxchg IR instruction.Eli Friedman2011-08-261-0/+5
| | | | llvm-svn: 138660
* X86Subtarget.h: Assume "x86_64-cygwin", though it has not been released yet, ↵NAKAMURA Takumi2011-07-201-1/+2
| | | | | | to appease test/CodeGen/X86 on cygwin. llvm-svn: 135564
* Restore old behavior. Always auto-detect features unless cpu or features are ↵Evan Cheng2011-07-081-1/+1
| | | | | | specified. llvm-svn: 134757
* Add Mode64Bit feature and sink it down to MC layer.Evan Cheng2011-07-071-7/+6
| | | | llvm-svn: 134641