summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/X86/X86InstrInfo.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* [X86] Add more instructions to the memory folding tables using the ↵Craig Topper2018-06-151-1/+220
| | | | | | | | | | autogenerated table as a guide. I think this covers most of the unmasked vector instructions. We're still missing a lot of the masked instructions. There are some test changes here because of the new folding support. I don't think these particular cases should be folded because it creates an undef register dependency. I think the changes introduced in r334175 are not handling stack folding. They're only blocking the peephole pass. llvm-svn: 334800
* [X86] Add 'Z' to the internal names of various EVEX instructions for overall ↵Craig Topper2018-06-151-32/+32
| | | | | | consistency. llvm-svn: 334785
* [X86] Fix stale comment in folding tables.Craig Topper2018-06-141-3/+3
| | | | llvm-svn: 334758
* [X86] Add more vector instructions to the memory folding table using the ↵Craig Topper2018-06-141-1/+215
| | | | | | | | autogenerated table as a guide. The test cahnge is because we now fold stack reload into RNDSCALE and RNDSCALE can be turned into ROUND by EVEX->VEX. llvm-svn: 334728
* [X86] Disable load unfolding for a bunch of instruction where unfolding ↵Craig Topper2018-06-141-16/+16
| | | | | | | | would increase the size of the load. Found by an audit of the manual table vs the autogenerated table. llvm-svn: 334726
* [X86] Move RCPSSr_Int, RSQRTSSr_Int, SQRTSDr_Int, SQRTSSr_Int to the correct ↵Craig Topper2018-06-131-4/+4
| | | | | | | | load folding table. They were in the operand 1 folding table, but their foldable operand is operand 2. llvm-svn: 334648
* [X86] Remove TB_ALIGN_16 from VEXTRACTF128/VEXTRACTI128 in the memory ↵Craig Topper2018-06-121-2/+2
| | | | | | folding table. llvm-svn: 334511
* [X86] Block UndefRegUpdateTomasz Krupa2018-06-071-10/+22
| | | | | | | | | | | | Summary: Prevent folding of operations with memory loads when one of the sources has undefined register update. Reviewers: craig.topper Subscribers: llvm-commits, mike.dvoretsky, ashlykov Differential Revision: https://reviews.llvm.org/D47621 llvm-svn: 334175
* Change TII isCopyInstr way of returning arguments(NFC)Petar Jovanovic2018-06-061-4/+5
| | | | | | | | | | | Make TII isCopyInstr() return MachineOperands through pointer to pointer instead via reference. Patch by Nikola Prica. Differential Revision: https://reviews.llvm.org/D47364 llvm-svn: 334105
* [MachineOutliner] NFC - Move intermediate data structures to MachineOutliner.hJessica Paquette2018-06-041-24/+22
| | | | | | | | | | | | | | | | | | | | | This is setting up to fix bug 37573 cleanly. This moves data structures that are technically both used in some way by the target and the general-purpose outlining algorithm into MachineOutliner.h. In particular, the `Candidate` class is of importance. Before, the outliner passed the locations of `Candidates` to the target, which would then make some decisions about the prospective outlined function. This change allows us to just pass `Candidates` along to the target. This will allow the target to discard `Candidates` that would be considered unsafe before cost calculation. Thus, we will be able to remove the unsafe candidates described in the bug without resorting to torching the entire prospective function. Also, as a side-effect, it makes the outliner a bit cleaner. https://bugs.llvm.org/show_bug.cgi?id=37573 llvm-svn: 333952
* [X86] Fix typo in comment. NFCCraig Topper2018-05-281-2/+2
| | | | llvm-svn: 333382
* [X86][MIPS][ARM] New machine instruction property 'isMoveReg'Petar Jovanovic2018-05-231-0/+10
| | | | | | | | | | | | | This property is needed in order to follow values movement between registers. This property is used in TII to implement method that returns true if simple copy like instruction is recognized, along with source and destination machine operands. Patch by Nikola Prica. Differential Revision: https://reviews.llvm.org/D45204 llvm-svn: 333093
* Fix unused lambda capture.Eli Friedman2018-05-181-1/+1
| | | | llvm-svn: 332686
* [MachineOutliner] Count savings from outlining in bytes.Eli Friedman2018-05-181-4/+15
| | | | | | | | | | Counting the number of instructions is both unintuitive and inaccurate. On AArch64, this only affects the generated remarks and certain rare pseudo-instructions, but it will have a bigger impact on other targets. Differential Revision: https://reviews.llvm.org/D46921 llvm-svn: 332685
* Rename DEBUG macro to LLVM_DEBUG.Nicola Zaghen2018-05-141-2/+2
| | | | | | | | | | | | | | | | The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240
* [DebugInfo] Examine all uses of isDebugValue() for debug instructions.Shiva Chen2018-05-091-8/+8
| | | | | | | | | | | | | | | | | | Because we create a new kind of debug instruction, DBG_LABEL, we need to check all passes which use isDebugValue() to check MachineInstr is debug instruction or not. When expelling debug instructions, we should expel both DBG_VALUE and DBG_LABEL. So, I create a new function, isDebugInstr(), in MachineInstr to check whether the MachineInstr is debug instruction or not. This patch has no new test case. I have run regression test and there is no difference in regression test. Differential Revision: https://reviews.llvm.org/D45342 Patch by Hsiangkai Wang. llvm-svn: 331844
* Remove \brief commands from doxygen comments.Adrian Prantl2018-05-011-2/+2
| | | | | | | | | | | | | | | | We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272
* [X86] Correct spill slot size.Andrea Di Biagio2018-05-011-2/+2
| | | | | | | | | | | | | | | This patch fixes a bug introduced by revision 330778 (originally reviewed at: https://reviews.llvm.org/D44782), where function isFrameLoadOpcode returned the wrong number of bytes read for opcodes VMOVSSrm and VMOVSDrm. This corrects that mistake, and extends the regression test to catch cases where the dead stores should be removed. Patch by Jeremy Morse. Differential Revision: https://reviews.llvm.org/D46256 llvm-svn: 331252
* [X86] Rename BNDMOV instructions and hide redundant instruction encoding ↵Craig Topper2018-04-281-2/+2
| | | | | | | | | | | | from the assembler. Favor the 0x1a encoding for register/register move to match gas. The instructions used RM and MR in their name along with rr/rm/mr at the end. To make more consistent with other instructions remove the RM/MR and use rr/rm/mr/rr_REV. Hide the _REV encoding from the assembler but leave it for the disassembler. llvm-svn: 331101
* [X86] Make the STTNI flag intrinsics use the flags from pcmpestrm/pcmpistrm ↵Craig Topper2018-04-271-5/+5
| | | | | | | | | | | | | | | | | | | | | | | if the mask instrinsics are also used in the same basic block. Summary: Previously the flag intrinsics always used the index instructions even if a mask instruction also exists. To fix fix this I've created a single ISD node type that returns index, mask, and flags. The SelectionDAG CSE process will merge all flavors of intrinsics with the same inputs to a s ingle node. Then during isel we just have to look at which results are used to know what instruction to generate. If both mask and index are used we'll need to emit two instructions. But for all other cases we can emit a single instruction. Since I had to do manual isel anyway, I've removed the pseudo instructions and custom inserter code that was working around tablegen limitations with multiple implicit defs. I've also renamed the recently added sse42.ll test case to sttni.ll since it focuses on that subset of the sse4.2 instructions. Reviewers: chandlerc, RKSimon, spatel Reviewed By: chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46202 llvm-svn: 331091
* [x86] Allow folding unaligned memory operands into pcmp[ei]str*Chandler Carruth2018-04-261-4/+4
| | | | | | | | | | | instructions. These have special permission according to the x86 manual to read unaligned memory, and this folding is done by ICC and GCC as well. This corrects one of the issues identified in PR37246. llvm-svn: 330896
* [X86] Account for partial stack slot spills (PR30821)Warren Ristow2018-04-241-78/+121
| | | | | | | | | | | | | | | | | | | | | | | Previously, _any_ store or load instruction was considered to be operating on a spill if it had a frameindex as an operand, and thus was fair game for optimisations such as "StackSlotColoring". This usually works, except on architectures where spills can be partially restored, for example on X86 where a spilt vector can have a single component loaded (zeroing the rest of the target register). This can be mis-interpreted and the zero extension unsoundly eliminated, see pr30821. To avoid this, this commit optionally provides the caller to isLoadFromStackSlot and isStoreToStackSlot with the number of bytes spilt/loaded by the given instruction. Optimisations can then determine that a full spill followed by a partial load (or vice versa), for example, cannot necessarily be commuted. Patch by Jeremy Morse! Differential Revision: https://reviews.llvm.org/D44782 llvm-svn: 330778
* [x86] Model the direction flag (DF) separately from the rest of EFLAGS.Chandler Carruth2018-04-101-2/+3
| | | | | | | | | | | | | | | | | | | | | This cleans up a number of operations that only claimed te use EFLAGS due to using DF. But no instructions which we think of us setting EFLAGS actually modify DF (other than things like popf) and so this needlessly creates uses of EFLAGS that aren't really there. In fact, DF is so restrictive it is pretty easy to model. Only STD, CLD, and the whole-flags writes (WRFLAGS and POPF) need to model this. I've also somewhat cleaned up some of the flag management instruction definitions to be in the correct .td file. Adding this extra register also uncovered a failure to use the correct datatype to hold X86 registers, and I've corrected that as necessary here. Differential Revision: https://reviews.llvm.org/D45154 llvm-svn: 329673
* [x86] Introduce a pass to begin more systematically fixing PR36028 and ↵Chandler Carruth2018-04-101-96/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | similar issues. The key idea is to lower COPY nodes populating EFLAGS by scanning the uses of EFLAGS and introducing dedicated code to preserve the necessary state in a GPR. In the vast majority of cases, these uses are cmovCC and jCC instructions. For such cases, we can very easily save and restore the necessary information by simply inserting a setCC into a GPR where the original flags are live, and then testing that GPR directly to feed the cmov or conditional branch. However, things are a bit more tricky if arithmetic is using the flags. This patch handles the vast majority of cases that seem to come up in practice: adc, adcx, adox, rcl, and rcr; all without taking advantage of partially preserved EFLAGS as LLVM doesn't currently model that at all. There are a large number of operations that techinaclly observe EFLAGS currently but shouldn't in this case -- they typically are using DF. Currently, they will not be handled by this approach. However, I have never seen this issue come up in practice. It is already pretty rare to have these patterns come up in practical code with LLVM. I had to resort to writing MIR tests to cover most of the logic in this pass already. I suspect even with its current amount of coverage of arithmetic users of EFLAGS it will be a significant improvement over the current use of pushf/popf. It will also produce substantially faster code in most of the common patterns. This patch also removes all of the old lowering for EFLAGS copies, and the hack that forced us to use a frame pointer when EFLAGS copies were found anywhere in a function so that the dynamic stack adjustment wasn't a problem. None of this is needed as we now lower all of these copies directly in MI and without require stack adjustments. Lots of thanks to Reid who came up with several aspects of this approach, and Craig who helped me work out a couple of things tripping me up while working on this. Differential Revision: https://reviews.llvm.org/D45146 llvm-svn: 329657
* [MachineOutliner] Test for X86FI->getUsesRedZone() as well as ↵Jessica Paquette2018-04-031-1/+5
| | | | | | | | | | | | | | | | Attribute::NoRedZone This commit is similar to r329120, but uses the existing getUsesRedZone() function in X86MachineFunctionInfo. This teaches the outliner to look at whether or not a function *truly* uses a redzone instead of just the noredzone attribute on a function. Thus, after this commit, it's possible to outline from x86 without using -mno-red-zone and still get outlining results. This also adds a new test for the new redzone behaviour. llvm-svn: 329134
* [x86] Expose more of the condition conversion routines in the public APIChandler Carruth2018-04-011-7/+7
| | | | | | | | for X86's instruction information. I've now got a second patch under review that needs these same APIs. This bit is nicely orthogonal and obvious, so landing it. NFC. llvm-svn: 328944
* [X86] Rename VROUNDYPS* and VROUNDYPD* instructions to VROUNDPSY* and ↵Craig Topper2018-03-221-2/+2
| | | | | | | | | | VROUNDPDY*. Fix itinerary mistake on all memory forms of VROUNDPD This makes the Y position consistent with other instructions. This should have been NFC, but while refactoring the multiclass I noticed that VROUNDPD memory forms were using the register itinerary. llvm-svn: 328254
* [X86] Rename MOVSX32_NOREXrr8 to MOVSX32rr8_NOREX so that the scheduler ↵Craig Topper2018-03-201-2/+2
| | | | | | | | model regular expressions will pick it up with the regular version. Do the same for MOVSX32_NOREXrm8, MOVZX32_NOREXrr8, and MOVZX32_NOREXrm8 llvm-svn: 327948
* [MachineOutliner] Make KILLs invisibleJessica Paquette2018-03-161-0/+5
| | | | | | | | | At the point the outliner runs, KILLs don't impact anything, but they're still considered unique instructions. This commit makes them invisible like DebugValues so that they can still be outlined without impacting outlining decisions. llvm-svn: 327760
* [X86] Stop passing two arguments by reference. NFCCraig Topper2018-03-011-1/+1
| | | | | | I think these used to be out parameters, but they haven't been for a while. llvm-svn: 326417
* Revert r326225 "[X86] Move the load folding tables to a separate .inc file"Craig Topper2018-02-271-10/+3623
| | | | | | The bots don't seem to like the .inc file. I must be missing some cmake incantation. llvm-svn: 326228
* [X86] Move the load folding tables to a separate .inc fileCraig Topper2018-02-271-3623/+10
| | | | | | | | These tables add 3000 lines to X86InstrInfo.cpp. And if we ever manage to auto generate them they'll be a separate file anyway. Differential Revision: https://reviews.llvm.org/D43806 llvm-svn: 326225
* [X86] Make XOP VPCOM instructions commutable to fold loads during isel.Craig Topper2018-02-201-12/+19
| | | | llvm-svn: 325547
* [X86] Make a helper function for commuting AVX512 VPCMP immediates since we ↵Craig Topper2018-02-201-12/+19
| | | | | | do it in two places. llvm-svn: 325546
* [x86] Fix nasty bug in the x86 backend that is essentially impossible toChandler Carruth2018-02-071-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | hit from IR but creates a minefield for MI passes. The x86 backend has fairly powerful logic to try and fold loads that feed register operands to instructions into a memory operand on the instruction. This is almost always a good thing, but there are specific relocated loads that are only allowed to appear in specific instructions. Notably, R_X86_64_GOTTPOFF is only allowed in `movq` and `addq`. This patch blocks folding of memory operands using this relocation unless the target is in fact `addq`. The particular relocation indicates why we simply don't hit this under normal circumstances. This relocation is only used for TLS, and it gets used in very specific ways in conjunction with %fs-relative addressing. The result is that loads using this relocation are essentially never eligible for folding into an instruction's memory operands. Unless, of course, you have an MI pass that inserts usage of such a load. I have exactly such an MI pass and was greeted by truly mysterious miscompiles where the linker replaced my instruction with a completely garbage byte sequence. Go team. This is the only such relocation I'm aware of in x86, but there may be others that need to be similarly restricted. Fixes PR36165. Differential Revision: https://reviews.llvm.org/D42732 llvm-svn: 324546
* [X86] When doing callee save/restore for k-registers make sure we don't use ↵Craig Topper2018-02-071-2/+6
| | | | | | | | | | | | | | KMOVQ on non-BWI targets If we are saving/restoring k-registers, the default behavior of getMinimalRegisterClass will find the VK64 class with a spill size of 64 bits. This will cause the KMOVQ opcode to be used for save/restore. If we don't have have BWI instructions we need to constrain the class returned to give us VK16 with a 16-bit spill size. We can do this by passing the either v16i1 or v64i1 into getMinimalRegisterClass. Also add asserts to make sure BWI is enabled anytime we use KMOVD/KMOVQ. These are what caught this bug. Fixes PR36256 Differential Revision: https://reviews.llvm.org/D42989 llvm-svn: 324533
* [X86] Teach DAG unfoldMemoryOperand to reconvert CMPs to testsNirav Dave2018-02-051-0/+24
| | | | | | | | | | | | | | | | Summary: Copy MI-level cmp->test conversion to SelectionDAG-level memory unfold. This fixes a regression from upcoming D41293 change. Reviewers: craig.topper, RKSimon Reviewed By: craig.topper Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D42808 llvm-svn: 324261
* [X86] Avoid using high register trick for test instructionAmaury Sechet2018-01-311-3/+0
| | | | | | | | | | | | | Summary: It seems it's main effect is to create addition copies when values are inr register that do not support this trick, which increase register pressure and makes the code bigger. Reviewers: craig.topper, niravd, spatel, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42646 llvm-svn: 323888
* Revert "[X86] Avoid using high register trick for test instruction"Eric Liu2018-01-301-0/+3
| | | | | | This reverts commit r323690. This causes crash in llc. See the original commit thread for details. llvm-svn: 323761
* [X86] Avoid using high register trick for test instructionAmaury Sechet2018-01-291-3/+0
| | | | | | | | | | | | | | | Summary: It seems it's main effect is to create addition copies when values are inr register that do not support this trick, which increase register pressure and makes the code bigger. The main noteworthy regression I was able to observe was pattern of the type (setcc (trunc (and X, C)), 0) where C is such as it would benefit from the hi register trick. To prevent this, a new pattern is added to materialize such pattern using a 32 bits test. This has the added benefit of working with any constant that is materializable as a 32bits immediate, not just the ones that can leverage the high register trick, as demonstrated by the test case in test-shrink.ll using the constant 2049 . Reviewers: craig.topper, niravd, spatel, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42646 llvm-svn: 323690
* [X86] Name the MMX phaddd instruction with 3 Ds instead of just 2. NFCCraig Topper2018-01-251-1/+1
| | | | llvm-svn: 323403
* [X86] Remove 64/128/256 from MMX/SSE/AVX instruction names for overall ↵Craig Topper2018-01-251-33/+33
| | | | | | | | | | consistency. NFC MMX instrutions all start with MMX_ so the 64 isn't needed for disambigutation. SSE/AVX1 instructions are assumed 128-bit so we don't need to say 128. AVX2 instructions should use a Y to indicate 256-bits. llvm-svn: 323402
* [X86] Adjust names of PINSRW/PEXTRW intructions between MMX/SSE/AVX/AVX512 ↵Craig Topper2018-01-241-3/+3
| | | | | | for consistency and to maybe enable more regular expression compaction in the scheduler models. NFCI llvm-svn: 323352
* [X86] Rename 256-bit VFRCZ instructions to have the Y before the rr/rm to ↵Craig Topper2018-01-241-2/+2
| | | | | | match other instructions. NFC llvm-svn: 323304
* [X86] Move 'Int_' to the end of the name of the VCOMISS/VUCOMISS and ↵Craig Topper2018-01-231-8/+8
| | | | | | | | instructions to get them picked up by the scheduler model regexs. All other intrinsic instructions put the _Int on the end. This make these instructions consistent and gets the prefix instregexs in the scheduler models to pick them up. llvm-svn: 323261
* [X86] Add missing MOVSX/MOVZX instructions to load folding tables.Craig Topper2018-01-231-0/+3
| | | | | | | | I'm not sure there's any way to generate these folding cases especially the movzx ones since even the register form is never emitted by codegen. I'm just adding them to remove the difference with the autogenerated version of the folding table. llvm-svn: 323200
* Fix bug in commit 323096 exposed by test in ↵Marina Yatsina2018-01-221-1/+1
| | | | | | | test-suite-verify-machineinstrs-x86_64h-O3 Change-Id: I0a4b10d0d6c8de606d989c567ec07944ae283a87 llvm-svn: 323126
* Break false dependencies for POPCNT, LZCNT, TZCNTMarina Yatsina2018-01-221-5/+38
| | | | | | | | | | | | | | | | | | | | | | Add POPCNT, LZCNT, TZCNT to the list of instructions that have false dependency. Add a test to make sure BreakFalseDeps breaks the dependencies for these instructions. Update affected tests. This fixes bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869 This is the final of multiple patches that fix this bugzilla. Most of the patches are intended at refactoring the existent code. Reviews of the refactoring done to enable this change: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40331 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 Differential Revision: https://reviews.llvm.org/D40334 Change-Id: If95cbf1a3f5c7dccff8f1b22ecb397542147303d llvm-svn: 323096
* Separate ExecutionDepsFix into 4 parts:Marina Yatsina2018-01-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. ReachingDefsAnalysis - Allows to identify for each instruction what is the “closest” reaching def of a certain register. Used by BreakFalseDeps (for clearance calculation) and ExecutionDomainFix (for arbitrating conflicting domains). 2. ExecutionDomainFix - Changes the variant of the instructions in order to minimize domain crossings. 3. BreakFalseDeps - Breaks false dependencies. 4. LoopTraversal - Creatws a traversal order of the basic blocks that is optimal for loops (introduced in revision L293571). Both ExecutionDomainFix and ReachingDefsAnalysis use this to determine the order they will traverse the basic blocks. This also included the following changes to ExcecutionDepsFix original logic: 1. BreakFalseDeps and ReachingDefsAnalysis logic no longer restricted by a register class. 2. ReachingDefsAnalysis tracks liveness of reg units instead of reg indices into a given reg class. Additional changes in affected files: 1. X86 and ARM targets now inherit from ExecutionDomainFix instead of ExecutionDepsFix. BreakFalseDeps also was added to the passes they activate. 2. Comments and references to ExecutionDepsFix replaced with ExecutionDomainFix and BreakFalseDeps, as appropriate. Additional refactoring changes will follow. This commit is (almost) NFC. The only functional change is that now BreakFalseDeps will break dependency for all register classes. Since no additional instructions were added to the list of instructions that have false dependencies, there is no actual change yet. In a future commit several instructions (and tests) will be added. This is the first of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869 Most of the patches are intended at refactoring the existent code. Additional relevant reviews: https://reviews.llvm.org/D40331 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334 Differential Revision: https://reviews.llvm.org/D40330 Change-Id: Icaeb75e014eff96a8f721377783f9a3e6c679275 llvm-svn: 323087
* Avoid Wparentheses warning.Simon Pilgrim2018-01-151-2/+2
| | | | llvm-svn: 322526
OpenPOWER on IntegriCloud