path: root/llvm/lib/Target/X86/X86.h
Commit history (newest first): commit message, author, date, files changed, lines changed
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)
  Alexander Kornienko, 2015-06-23 (1 file, -1/+1)
  Apparently, the style needs to be agreed upon first. llvm-svn: 240390
* Fixed/added namespace ending comments using clang-tidy. NFC
  Alexander Kornienko, 2015-06-19 (1 file, -1/+1)
  The patch is generated using this command:
    tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
      -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
      llvm/lib/
  Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
* Reapply r238011 with a fix for the trap instruction.
  Quentin Colombet, 2015-05-22 (1 file, -0/+5)
  The problem was that I slipped a change required for shrink-wrapping, namely I used getFirstTerminator instead of the getLastNonDebugInstr that was here before the refactoring, whereas the surrounding code is not yet patched for that.
  Original message: [X86] Refactor the prologue emission to prepare for shrink-wrapping.
  - Add a late pass to expand pseudo instructions (tail call and EH returns), instead of doing it in the prologue emission.
  - Factor some static methods in X86FrameLowering to ease code sharing. NFC.
  Related to <rdar://problem/20821487> llvm-svn: 238035
* Revert "[X86] Fix a variable name for r237977 so that it works with every ↵Tamas Berghammer2015-05-221-5/+0
| | | | | | | | | | | compilers." Revert "[X86] Refactor the prologue emission to prepare for shrink-wrapping." This reverts commit 6b3b93fc8b68a2c806aa992ee4bd3d7f61898d4b. This reverts commit ab0b15dff8539826283a59c2dd700a18a9680e0f. llvm-svn: 238011
* [X86] Refactor the prologue emission to prepare for shrink-wrapping.
  Quentin Colombet, 2015-05-22 (1 file, -0/+5)
  - Add a late pass to expand pseudo instructions (tail call and EH returns), instead of doing it in the prologue emission.
  - Factor some static methods in X86FrameLowering to ease code sharing. NFC.
  Related to <rdar://problem/20821487> llvm-svn: 237977
* Re-land "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"Reid Kleckner2015-05-051-0/+6
| | | | | | | | | | | | This reverts commit r236360. This change exposed a bug in WinEHPrepare by opting win32 code into EH preparation. We already knew that WinEHPrepare has bugs, and is the status quo for x64, so I don't think that's a reason to hold off on this change. I disabled exceptions in the sanitizer tests in r236505 and an earlier revision. llvm-svn: 236508
* Revert "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"Reid Kleckner2015-05-011-6/+0
| | | | | | This reverts commit r236359. Things are still broken despite testing. :( llvm-svn: 236360
* Re-land "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"Reid Kleckner2015-05-011-0/+6
| | | | | | This reverts commit r236340. llvm-svn: 236359
* Revert "[WinEH] Add an EH registration and state insertion pass for 32-bit x86"Reid Kleckner2015-05-011-6/+0
| | | | | | This reverts commit r236339, it breaks the win32 clang-cl self-host. llvm-svn: 236340
* [WinEH] Add an EH registration and state insertion pass for 32-bit x86
  Reid Kleckner, 2015-05-01 (1 file, -0/+6)
  This pass is responsible for constructing the EH registration object that gets linked into fs:00, which is all it does in this change. In the future, it will also insert stores to update the EH state number. I considered keeping this functionality in WinEHPrepare, but it's pretty separable and X86 specific. It has conceptually very little to do with the task of WinEHPrepare, which is currently outlining. WinEHPrepare is also in theory useful on ARM, but this logic is pretty x86 specific.
  Reviewers: andrew.w.kaylor, majnemer
  Differential Revision: http://reviews.llvm.org/D9422
  llvm-svn: 236339
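To make the fs:00 registration mechanism concrete, here is a rough C++ sketch of the kind of per-function record such a pass materializes on the stack and links into the list rooted at fs:00. The struct and field names are illustrative assumptions, not LLVM's or the Microsoft CRT's actual types.

```cpp
// Illustrative only: a simplified 32-bit SEH registration node. The real
// layouts used by the MSVC runtime (and built by the LLVM pass) differ.
struct EHRegistrationNode {
  EHRegistrationNode *Next; // previous head of the list rooted at fs:00
  void *Handler;            // language-specific handler / personality thunk
  int State;                // current EH state number, updated as scopes change
};
```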
* [X86] Convert esp-relative movs of function arguments to pushes, step 2
  Michael Kuperstein, 2015-02-01 (1 file, -0/+5)
  This moves the transformation introduced in r223757 into a separate MI pass. This allows it to cover many more cases (not only cases where there must be a reserved call frame), and perform rudimentary call folding. It still doesn't have a heuristic, so it is enabled only for optsize/minsize, with stack alignment <= 8, where it ought to be a fairly clear win. (Re-commit of r227728)
  Differential Revision: http://reviews.llvm.org/D6789
  llvm-svn: 227752
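As a rough illustration of what this transformation targets: the function below is hypothetical, and the code sequences in the comments are approximate sketches of 32-bit x86 output at -Os, not exact compiler output.

```cpp
// Hypothetical example: on 32-bit x86 at optsize, the three outgoing
// argument stores for this call are candidates for the mov-to-push conversion.
extern void callee(int, int, int);

void caller() {
  // With a reserved call frame, the arguments are written with ESP-relative
  // movs into pre-allocated stack slots, roughly:
  //   movl $3, 8(%esp); movl $2, 4(%esp); movl $1, (%esp); calll callee
  // After the conversion, the same stores become pushes, roughly:
  //   pushl $3; pushl $2; pushl $1; calll callee; addl $12, %esp
  callee(1, 2, 3);
}
```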
* Revert r227728 due to bad line endings.
  Michael Kuperstein, 2015-02-01 (1 file, -5/+0)
  llvm-svn: 227746
* [X86] Convert esp-relative movs of function arguments to pushes, step 2
  Michael Kuperstein, 2015-02-01 (1 file, -0/+5)
  This moves the transformation introduced in r223757 into a separate MI pass. This allows it to cover many more cases (not only cases where there must be a reserved call frame), and perform rudimentary call folding. It still doesn't have a heuristic, so it is enabled only for optsize/minsize, with stack alignment <= 8, where it ought to be a fairly clear win.
  Differential Revision: http://reviews.llvm.org/D6789
  llvm-svn: 227728
* [PM] Remove a bunch of stale TTI creation method declarations. I nuked
  Chandler Carruth, 2015-02-01 (1 file, -3/+0)
  their definitions, but forgot to clean up all the declarations which are in different files. llvm-svn: 227698
* [X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPass
  Robin Morisset, 2014-09-17 (1 file, -4/+0)
  This required a new hook called hasLoadLinkedStoreConditional to know whether to expand atomics to LL/SC (ARM, AArch64, in a future patch Power) or to CmpXchg (X86). Apart from that, the new code in AtomicExpandPass is mostly moved from X86AtomicExpandPass. The main result of this patch is to get rid of that pass, which had lots of code duplicated with AtomicExpandPass. llvm-svn: 217928
* Reinstate "Nuke the old JIT."Eric Christopher2014-09-021-6/+0
| | | | | | | | Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reinstates commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 216982
* Canonicalize header guards into a common format.
  Benjamin Kramer, 2014-08-13 (1 file, -2/+2)
  Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful). Changes made by clang-tidy with minor tweaks. llvm-svn: 215558
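For reference, the common format is a guard macro derived from the file path. A minimal sketch for this header follows; the exact macro spelling is assumed from that convention rather than copied from the commit.

```cpp
// Path-derived header guard of the kind this canonicalization produces.
#ifndef LLVM_LIB_TARGET_X86_X86_H
#define LLVM_LIB_TARGET_X86_X86_H

// Pass-creation declarations such as the createX86*Pass functions mentioned
// throughout this log sit between the guards.

#endif
```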
* Temporarily Revert "Nuke the old JIT." as it's not quite ready toEric Christopher2014-08-071-0/+6
| | | | | | | | | | | be deleted. This will be reapplied as soon as possible and before the 3.6 branch date at any rate. Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reverts commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 215154
* Nuke the old JIT.
  Rafael Espindola, 2014-08-07 (1 file, -6/+0)
  I am sure we will be finding bits and pieces of dead code for years to come, but this is a good start. Thanks to Lang Hames for making MCJIT a good replacement! llvm-svn: 215111
* X86: expand atomics in IR instead of as MachineInstrs.
  Tim Northover, 2014-07-01 (1 file, -0/+4)
  The logic for expanding atomics that aren't natively supported in terms of cmpxchg loops is much simpler to express at the IR level. It also allows the normal optimisations and CodeGen improvements to help out with atomics, instead of using a limited set of possible instructions. rdar://problem/13496295 llvm-svn: 212119
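The shape of that expansion is easiest to see in source form. The following is a minimal C++ sketch of an atomic read-modify-write expressed as a compare-exchange loop; it illustrates the general pattern, not LLVM's own code, and the function name is made up.

```cpp
#include <atomic>

// Sketch: an atomic fetch-and-add written as the kind of cmpxchg loop the
// IR-level expansion produces for operations a target cannot do natively.
int fetch_add_via_cmpxchg(std::atomic<int> &Val, int Inc) {
  int Old = Val.load(std::memory_order_relaxed);
  // On failure, compare_exchange_weak reloads Old with the current value,
  // so the loop retries until no other thread intervenes.
  while (!Val.compare_exchange_weak(Old, Old + Inc))
    ;
  return Old;
}
```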
* Rename createGlobalBaseRegPass -> createX86GlobalBaseRegPass to make
  Eric Christopher, 2014-05-22 (1 file, -2/+2)
  it obvious that it's a target specific pass. llvm-svn: 209380
* Prune includes in X86 target.
  Craig Topper, 2014-03-19 (1 file, -4/+2)
  llvm-svn: 204216
* This patch adds the X86FixupLEAs pass, which will reduce instruction
  Preston Gurd, 2013-04-25 (1 file, -0/+5)
  latency for certain models of the Intel Atom family, by converting instructions into their equivalent LEA instructions, when it is both useful and possible to do so. llvm-svn: 180573
* Pad Short Functions for Intel Atom
  Preston Gurd, 2013-01-08 (1 file, -0/+4)
  The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready.
  When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction", which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests.
  This patch has been updated to address Nadav's review comments:
  - Optimize only at >= O1 and don't do optimization if -Os is set
  - Stores MachineBasicBlock* instead of BBNum
  - Uses DenseMap instead of std::map
  - Fixes placement of braces
  Patch by Andy Zhang. llvm-svn: 171879
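A heavily simplified C++ sketch of the shape of such a machine-function pass is below. This is not the actual X86PadShortFunction implementation: the cycle accounting and pass registration are omitted, the pass and include paths are assumed from roughly contemporary LLVM APIs, and it pads every return block rather than only those reached in under four cycles.

```cpp
#include "X86InstrInfo.h" // in-tree X86 target header; provides X86::NOOP
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineFunctionPass.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"
#include "llvm/CodeGen/TargetInstrInfo.h"

using namespace llvm;

namespace {
// Toy pass: pad every return block with a couple of NOPs. The real pass
// walks from the entry block and pads only "short" paths to a return.
struct PadShortReturnsSketch : MachineFunctionPass {
  static char ID;
  PadShortReturnsSketch() : MachineFunctionPass(ID) {}

  bool runOnMachineFunction(MachineFunction &MF) override {
    const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
    bool Changed = false;
    for (MachineBasicBlock &MBB : MF) {
      if (MBB.empty() || !MBB.back().isReturn())
        continue;
      // Insert NOPs immediately before the return instruction.
      MachineBasicBlock::iterator Ret = MBB.getFirstTerminator();
      for (int I = 0; I < 2; ++I)
        BuildMI(MBB, Ret, DebugLoc(), TII->get(X86::NOOP));
      Changed = true;
    }
    return Changed;
  }
};
char PadShortReturnsSketch::ID = 0;
} // end anonymous namespace
```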
* Switch TargetTransformInfo from an immutable analysis pass that requires
  Chandler Carruth, 2013-01-07 (1 file, -0/+3)
  a TargetMachine to construct (and thus isn't always available), to an analysis group that supports layered implementations much like AliasAnalysis does.
  This is a pretty massive change, with a few parts that I was unable to easily separate (sorry), so I'll walk through it.
  The first step of this conversion was to make TargetTransformInfo an analysis group, and to sink the nonce implementations in ScalarTargetTransformInfo and VectorTargetTransformInfo into a NoTargetTransformInfo pass. This allows other passes to add a hard requirement on TTI, and assume they will always get at least one implementation. The TargetTransformInfo analysis group leverages the delegation chaining trick that AliasAnalysis uses, where the base class for the analysis group delegates to the previous analysis *pass*, allowing all but the NoFoo analysis passes to only implement the parts of the interfaces they support. It also introduces a new trick where each pass in the group retains a pointer to the top-most pass that has been initialized. This allows passes to implement one API in terms of another API and benefit when some other pass above them in the stack has more precise results for the second API.
  The second step of this conversion is to create a pass that implements the TargetTransformInfo analysis using the target-independent abstractions in the code generator. This replaces the ScalarTargetTransformImpl and VectorTargetTransformImpl classes in lib/Target with a single pass in lib/CodeGen called BasicTargetTransformInfo. This class actually provides most of the TTI functionality, basing it upon the TargetLowering abstraction and other information in the target independent code generator.
  The third step of the conversion adds support to all TargetMachines to register custom analysis passes. This allows building those passes with access to TargetLowering or other target-specific classes, and it also allows each target to customize the set of analysis passes desired in the pass manager. The baseline LLVMTargetMachine implements this interface to add the BasicTTI pass to the pass manager, and all of the tools that want to support target-aware TTI passes call this routine on whatever target machine they end up with to add the appropriate passes.
  The fourth step of the conversion created target-specific TTI analysis passes for the X86 and ARM backends. These passes contain the custom logic that was previously in their extensions of the ScalarTargetTransformInfo and VectorTargetTransformInfo interfaces. I separated them into their own file, as now all of the interface bits are private and they just expose a function to create the pass itself. Then I extended these target machines to set up a custom set of analysis passes, first adding BasicTTI as a fallback, and then adding their customized TTI implementations.
  The fourth step required logic that was shared between the target independent layer and the specific targets to move to a different interface, as they no longer derive from each other. As a consequence, helper functions were added to TargetLowering representing the common logic needed both in the target implementation and the codegen implementation of the TTI pass. While technically this is the only change that could have been committed separately, it would have been a nightmare to extract.
  The final step of the conversion was just to delete all the old boilerplate. This got rid of the ScalarTargetTransformInfo and VectorTargetTransformInfo classes, all of the support in all of the targets for producing instances of them, and all of the support in the tools for manually constructing a pass based around them.
  Now that TTI is a relatively normal analysis group, two things become straightforward. First, we can sink it into lib/Analysis, which is a more natural layer for it to live. Second, clients of this interface can depend on it *always* being available, which will simplify their code and behavior. These (and other) simplifications will follow in subsequent commits; this one is clearly big enough.
  Finally, I'm very aware that much of the comments and documentation needs to be updated. As soon as I had this working, and plausibly well commented, I wanted to get it committed and in front of the build bots. I'll be doing a few passes over documentation later if it sticks.
  Commits to update DragonEgg and Clang will be made presently. llvm-svn: 171681
* Revert revision 171524. Original message:
  Nadav Rotem, 2013-01-05 (1 file, -5/+0)
  URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev
  Log: The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready.
  When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction", which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests.
  Patch by Andy Zhang. llvm-svn: 171603
* The current Intel Atom microarchitecture has a feature whereby when a function
  Preston Gurd, 2013-01-04 (1 file, -0/+5)
  returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready.
  When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction", which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests.
  Patch by Andy Zhang. llvm-svn: 171524
* Remove the X86 Maximal Stack Alignment Check pass as it is no longer necessary.
  Chad Rosier, 2012-11-26 (1 file, -6/+0)
  This pass was conservative in that it always reserved the FP to enable dynamic stack realignment, which allowed the RA to use aligned spills for vector registers. This happened even when spills were not necessary. The RA has since been improved to use unaligned spills when necessary.
  The new behavior is to realign the stack if the frame pointer was already reserved for some other reason, but don't reserve the frame pointer just because a function contains vector virtual registers.
  Part of rdar://12719844 llvm-svn: 168627
* Whitespace.
  Chad Rosier, 2012-08-01 (1 file, -1/+1)
  llvm-svn: 161122
* Implement the local-dynamic TLS model for x86 (PR3985)
  Hans Wennborg, 2012-06-01 (1 file, -0/+5)
  This implements codegen support for accesses to thread-local variables using the local-dynamic model, and adds a clean-up pass so that the base address for the TLS block can be re-used between local-dynamic accesses on an execution path. llvm-svn: 157818
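A small C++ example of the access pattern that clean-up targets is shown below. The example is illustrative; under assumptions such as building with -fPIC and requesting the local-dynamic model (for instance with -ftls-model=local-dynamic), both accesses can share one computation of the module's TLS block base via __tls_get_addr.

```cpp
// Two internal-linkage TLS variables in the same module. Under the
// local-dynamic model, each access needs the module's TLS block base;
// the clean-up pass lets one __tls_get_addr result be reused for both
// accesses on a path instead of recomputing the base.
static thread_local int Counter = 0;
static thread_local int HighWater = 0;

int bump(int Amount) {
  Counter += Amount;
  if (Counter > HighWater)
    HighWater = Counter;
  return Counter;
}
```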
* Reorder includes in Target backends to follow coding standards. Remove some superfluous forward declarations.
  Craig Topper, 2012-03-17 (1 file, -2/+0)
  llvm-svn: 152997
* Remove X86-dependent stuff from SSEDomainFix.
  Jakob Stoklund Olesen, 2011-09-27 (1 file, -4/+0)
  This also enables domain swizzling for AVX code, which required a few trivial test changes. The pass will be moved to lib/CodeGen shortly. llvm-svn: 140659
* Introduce a pass to insert vzeroupper instructions to avoid AVX to
  Bruno Cardoso Lopes, 2011-08-23 (1 file, -0/+5)
  SSE transition penalty. The pass is enabled through the "x86-use-vzeroupper" llc command line option. This is only the first step (a very naive and conservative one) to sketch out the idea, but proper DFA is coming next to allow smarter decisions. Comments and ideas now and in further commits will be very appreciated. llvm-svn: 138317
* Code clean up.
  Evan Cheng, 2011-07-25 (1 file, -8/+0)
  llvm-svn: 135954
* More refactoring.
  Evan Cheng, 2011-07-25 (1 file, -10/+0)
  llvm-svn: 135939
* Refactor X86 target to separate MC code from Target code.
  Evan Cheng, 2011-07-25 (1 file, -4/+1)
  llvm-svn: 135930
* - Eliminate MCCodeEmitter's dependency on TargetMachine. It now uses MCInstrInfo
  Evan Cheng, 2011-07-11 (1 file, -5/+6)
  and MCSubtargetInfo.
  - Added methods to update subtarget features (used when targets automatically detect subtarget features or switch modes).
  - Teach X86Subtarget to update MCSubtargetInfo feature bits since the MCSubtargetInfo layer can be shared with other modules.
  - This fixes .code 16 / .code 32 support since the mode switch is updated in MCSubtargetInfo, so the MC code emitter can do the right thing.
  llvm-svn: 134884
* Rename files for consistency.
  Evan Cheng, 2011-07-06 (1 file, -1/+1)
  llvm-svn: 134546
* Merge XXXGenRegisterNames.inc into XXXGenRegisterInfo.inc
  Evan Cheng, 2011-06-28 (1 file, -6/+1)
  llvm-svn: 134024
* Rename TargetDesc to MCTargetDesc
  Evan Cheng, 2011-06-24 (1 file, -3/+1)
  llvm-svn: 133846
* Starting to refactor Target to separate out code that's needed to fully describe
  Evan Cheng, 2011-06-24 (1 file, -4/+1)
  target machine from those that are only needed by codegen. The goal is to sink the essential target description into the MC layer so we can start building MC based tools without needing to link in the entire codegen.
  First step is to refactor TargetRegisterInfo. This patch added a base class MCRegisterInfo which TargetRegisterInfo is derived from. Changed TableGen to separate register description from the rest of the stuff.
  llvm-svn: 133782
* Add header...
  Daniel Dunbar, 2010-12-20 (1 file, -0/+1)
  llvm-svn: 122247
* X86/MC/Mach-O: Split out createX86MachObjectWriter().
  Daniel Dunbar, 2010-12-20 (1 file, -0/+9)
  llvm-svn: 122246
* Remove the X86::FP_REG_KILL pseudo-instruction and the X86FloatingPointRegKill
  Jakob Stoklund Olesen, 2010-07-16 (1 file, -5/+0)
  pass that inserted it. It is no longer necessary to limit the live ranges of FP registers to a single basic block. llvm-svn: 108536
* Reapply bottom-up fast-isel, with several fixes for x86-32:
  Dan Gohman, 2010-07-10 (1 file, -0/+4)
  - Check getBytesToPopOnReturn().
  - Eschew ST0 and ST1 for return values.
  - Fix the PIC base register initialization so that it doesn't ever fail to end up at the top of the entry block.
  llvm-svn: 108039
* Fix PR6696 and PR6663
  Jim Grosbach, 2010-04-06 (1 file, -0/+6)
  When a frame pointer is not otherwise required, and dynamic stack alignment is necessary solely due to the spilling of a register with larger alignment requirements than the default stack alignment, the frame pointer can be both used as a general purpose register and a frame pointer. That goes poorly, for obvious reasons. This patch brings back a bit of old logic for identifying the use of such registers and conservatively reserves the frame pointer during register allocation in such cases.
  For now, implement for X86 only, since it's 32-bit linux which is hitting this and we want a targeted fix for 2.7. As a follow-on, this will be expanded to handle other targets, as theoretically the problem could arise elsewhere as well.
  llvm-svn: 100559
* Add a late SSEDomainFix pass that twiddles SSE instructions to avoid domain crossings.
  Jakob Stoklund Olesen, 2010-03-25 (1 file, -0/+4)
  On Nehalem and newer CPUs there is a 2 cycle latency penalty on using a register in a different domain than where it was defined. Some instructions have equivalents for different domains, like por/orps/orpd. The SSEDomainFix pass tries to minimize the number of domain crossings by changing between equivalent opcodes where possible.
  This is a work in progress; in particular, the pass doesn't do anything yet. SSE instructions are tagged with their execution domain in TableGen using the last two bits of TSFlags. Note that not all instructions are tagged correctly. Life just isn't that simple.
  The SSE execution domain issue is very similar to the ARM NEON/VFP pipeline issue handled by NEONMoveFixPass. This pass may become target independent to handle both.
  llvm-svn: 99524
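To make the "equivalent opcodes" idea concrete, here is a toy C++ table. The names and layout are illustrative assumptions only; the real tables live in X86InstrInfo and are driven by the TSFlags domain bits mentioned above.

```cpp
// Toy illustration: the same bitwise OR exists in all three SSE execution
// domains, so a domain-fixing pass can swap an instruction for the variant
// matching its neighbours' domain and avoid the cross-domain bypass delay.
enum ExecDomain { PackedInt, PackedSingle, PackedDouble };

struct DomainEquivalents {
  const char *Mnemonic[3]; // indexed by ExecDomain
};

static const DomainEquivalents OrEquiv = {{"por", "orps", "orpd"}};

const char *pickOrOpcodeFor(ExecDomain PreferredDomain) {
  return OrEquiv.Mnemonic[PreferredDomain];
}
```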
* Revert "Add a late SSEDomainFix pass that twiddles SSE instructions to avoid ↵Jakob Stoklund Olesen2010-03-231-4/+0
| | | | | | | | domain crossings." This reverts commit 99345. It was breaking buildbots. llvm-svn: 99352
* Add a late SSEDomainFix pass that twiddles SSE instructions to avoid domain crossings.
  Jakob Stoklund Olesen, 2010-03-23 (1 file, -0/+4)
  This is work in progress. So far, SSE execution domain tables are added to X86InstrInfo, and a skeleton pass is enabled with -sse-domain-fix. llvm-svn: 99345
* MC: Provide the target triple to AsmBackend constructors.
  Daniel Dunbar, 2010-03-11 (1 file, -3/+2)
  llvm-svn: 98220