summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/X86/X86InstrInfo.cpp
Commit message (Collapse)AuthorAgeFilesLines
* fix trivial typos in comments; NFCHiroshi Inoue2017-07-031-1/+1
| | | | llvm-svn: 307004
* [AVX-512] Mark masked VPCMP instructions as commutable.Craig Topper2017-06-131-14/+26
| | | | llvm-svn: 305276
* [X86] Add masked integer compare instructions to load folding tables.Craig Topper2017-06-131-0/+58
| | | | llvm-svn: 305274
* [AVX-512] Add VPCONFLICT and VPLZCNT to load folding tables.Craig Topper2017-06-121-0/+36
| | | | llvm-svn: 305180
* [x86] Revert the X86FoldTablesEmitter due to more miscompiles.Chandler Carruth2017-06-061-2/+3398
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In testing, we've found yet another miscompile caused by the new tables. And this one is even less clear how to fix (we could teach it to fold a 16-bit load instead of the 32-bit load it wants, or block folding entirely). Also, the approach to excluding instructions seems increasingly to not scale well. I have left a more detailed analysis on the review log for the original patch (https://reviews.llvm.org/D32684) along with suggested path forward. I will land an additional test case that I wrote which covers the code that was miscompiling (folding into the output of `pextrw`) in a subsequent commit to keep this a pure revert. For each commit reverted here, I've restricted the revert to the non-test code touching the x86 fold table emission until the last commit where I did revert the test updates. This means the *new* test cases added for `insertps` and `xchg` remain untouched (and continue to pass). Reverted commits: r304540: [X86] Don't fold into memory operands into insertps in the ... r304347: [TableGen] Adapt more places to getValueAsString now ... r304163: [X86] Don't fold away the memory operand of an xchg. r304123: Don't capture a temporary std::string in a StringRef. r304122: Resubmit "[X86] Adding new LLVM TableGen backend that ..." Original commit was in r304088, and after a string of fixes was reverted previously in r304121 to fix build bots, and then re-landed in r304122. llvm-svn: 304762
* Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.Galina Kistanova2017-05-311-0/+1
| | | | llvm-svn: 304355
* Resubmit "[X86] Adding new LLVM TableGen backend that generates the X86 ↵Zachary Turner2017-05-291-3398/+2
| | | | | | | | | | | | | backend memory folding tables." This was reverted due to buildbot breakages and I was not familiar with this code to investigate it. But while trying to get a useful backtrace for the author, it turns out the fix was very obvious. Resubmitting this patch as is, and will submit the fix in a followup so that the fix is not hidden in the larger CL. llvm-svn: 304122
* Revert "[X86] Adding new LLVM TableGen backend that generates the X86 ↵Zachary Turner2017-05-291-2/+3398
| | | | | | | | | | | | | backend memory folding tables." This reverts commit 28cb1003507f287726f43c771024a1dc102c45fe as well as all subsequent followups. llvm-tblgen currently segfaults with this change, and it seems it has been broken on the bots all day with no fixes in preparation. See, for example: http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/ llvm-svn: 304121
* [X86] Adding new LLVM TableGen backend that generates the X86 backend memory ↵Ayman Musa2017-05-281-3398/+2
| | | | | | | | | | | folding tables. X86 backend holds huge tables in order to map between the register and memory forms of each instruction. This TableGen Backend automatically generated all these tables with the appropriate flags for each entry. Differential Revision: https://reviews.llvm.org/D32684 llvm-svn: 304088
* LivePhysRegs: Rework constructor + documentation; NFCMatthias Braun2017-05-261-4/+4
| | | | | | | - Take reference instead of pointer to a TRI that cannot be nullptr. - Improve documentation comments. llvm-svn: 304038
* [X86] Adding vpopcntd and vpopcntq instructionsOren Ben Simhon2017-05-251-0/+6
| | | | | | | | | AVX512_VPOPCNTDQ is a new feature set that was published by Intel. The patch represents the LLVM side of the addition of two new intrinsic based instructions (vpopcntd and vpopcntq). Differential Revision: https://reviews.llvm.org/D33169 llvm-svn: 303858
* Strip trailing whitespace. NFCI.Simon Pilgrim2017-05-241-2/+2
| | | | llvm-svn: 303736
* [Target/X86] Remove unneeded return. NFCI.Davide Italiano2017-05-181-3/+1
| | | | llvm-svn: 303323
* [x86, SSE] AVX1 PR28129 (256-bit all-ones rematerialization)Simon Pilgrim2017-05-131-2/+12
| | | | | | | | | | | | | | | | | | | | Further perf tests on Jaguar indicate that: vxorps %ymm0, %ymm0, %ymm0 vcmpps $15, %ymm0, %ymm0, %ymm0 is consistently faster (by about 9%) than: vpcmpeqd %xmm0, %xmm0, %xmm0 vinsertf128 $1, %xmm0, %ymm0, %ymm0 Testing equivalent code on a SandyBridge (E5-2640) puts it slightly (~3%) faster as well. Committed on behalf of @dtemirbulatov Differential Revision: https://reviews.llvm.org/D32416 llvm-svn: 302989
* [X86] Move getX86ConditionCode() from X86FastISel.cpp to X86InstrInfo.cpp. NFCIgor Breger2017-05-111-0/+38
| | | | | | | | | | | | | | | | Summary: Move getX86ConditionCode() from X86FastISel.cpp to X86InstrInfo.cpp so it can be used by GloabalIsel instruction selector. This is a pre-commit for a patch I'm working on to support G_ICMP. NFC. Reviewers: zvi, guyblank, delena Reviewed By: guyblank, delena Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33038 llvm-svn: 302767
* [X86][LWP] Add stack folding mappings and tests for LWPINS/LWPVAL instructionsSimon Pilgrim2017-05-031-0/+6
| | | | llvm-svn: 302049
* Move value type list from TargetRegisterClass to TargetRegisterInfoKrzysztof Parzyszek2017-04-241-2/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D31937 llvm-svn: 301234
* Revert r301231: Accidentally committed stale filesKrzysztof Parzyszek2017-04-241-2/+2
| | | | | | I forgot to commit local changes before commit. llvm-svn: 301232
* Move value type list from TargetRegisterClass to TargetRegisterInfoKrzysztof Parzyszek2017-04-241-2/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D31937 llvm-svn: 301231
* Move size and alignment information of regclass to TargetRegisterInfoKrzysztof Parzyszek2017-04-241-17/+32
| | | | | | | | | | | | | | | 1. RegisterClass::getSize() is split into two functions: - TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const; - TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const; 2. RegisterClass::getAlignment() is replaced by: - TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const; This will allow making those values depend on subtarget features in the future. Differential Revision: https://reviews.llvm.org/D31783 llvm-svn: 301221
* [X86][MPX] Add load & store instructions of bnd values to ↵Ayman Musa2017-04-231-22/+30
| | | | | | | | | | getLoadStoreRegOpcode function. This is needed for a follow up patch that generates the memory folding tables. Differential Revision: https://reviews.llvm.org/D32232 llvm-svn: 301109
* Re-commit r301040 "X86: Don't emit zero-byte functions on Windows"Hans Wennborg2017-04-211-1/+1
| | | | | | | | | In addition to the original commit, tighten the condition for when to pad empty functions to COFF Windows. This avoids running into problems when targeting e.g. Win32 AMDGPU, which caused test failures when this was committed initially. llvm-svn: 301047
* Revert r301040 "X86: Don't emit zero-byte functions on Windows"Hans Wennborg2017-04-211-1/+1
| | | | | | This broke almost all bots. Reverting while fixing. llvm-svn: 301041
* X86: Don't emit zero-byte functions on WindowsHans Wennborg2017-04-211-1/+1
| | | | | | | | | | | | | | | | | | Empty functions can lead to duplicate entries in the Guard CF Function Table of a binary due to multiple functions sharing the same RVA, causing the kernel to refuse to load that binary. We had a terrific bug due to this in Chromium. It turns out we were already doing this for Mach-O in certain situations. This patch expands the code for that in AsmPrinter::EmitFunctionBody() and renames TargetInstrInfo::getNoopForMachoTarget() to simply getNoop() since it seems it was used for not just Mach-O anyway. Differential Revision: https://reviews.llvm.org/D32330 llvm-svn: 301040
* Use methods to access data stored with frame instructionsSerge Pavlov2017-04-131-11/+6
| | | | | | | | | | | | | Instructions CALLSEQ_START..CALLSEQ_END and their target dependent counterparts keep data like frame size, stack adjustment etc. These data are accessed by getOperand using hard coded indices. It is error prone way. This change implements the access by special methods, which improve readability and allow changing data representation without massive changes of index values. Differential Revision: https://reviews.llvm.org/D31953 llvm-svn: 300196
* Fix the bootstrap failure caused by r299986.Easwaran Raman2017-04-121-0/+4
| | | | llvm-svn: 300069
* [x86] Relax the check in areLoadsFromSameBasePtrEaswaran Raman2017-04-111-19/+16
| | | | | | | | | Check if the scale operand is identical (doesn't have to be 1) and do not check the chaain operand. Differential revision: https://reviews.llvm.org/D31833 llvm-svn: 299986
* [AVX-512] Fix accidental uses of AH/BH/CH/DH after copies to/from mask registersCraig Topper2017-03-281-22/+0
| | | | | | | | | | | | | | | | We've had several bugs(PR32256, PR32241) recently that resulted from usages of AH/BH/CH/DH either before or after a copy to/from a mask register. This ultimately occurs because we create COPY_TO_REGCLASS with VK1 and GR8. Then in CopyToFromAsymmetricReg in X86InstrInfo we find a 32-bit super register for the GR8 to emit the KMOV with. But as these tests are demonstrating, its possible for the GR8 register to be a high register and we end up doing an accidental extra or insert from bits 15:8. I think the best way forward is to stop making copies directly between mask registers and GR8/GR16. Instead I think we should restrict to only copies between mask registers and GR32/GR64 and use EXTRACT_SUBREG/INSERT_SUBREG to handle the conversion from GR32 to GR16/8 or vice versa. Unfortunately, this complicates fastisel a bit more now to create the subreg extracts where we used to create GR8 copies. We can probably make a helper function to bring down the repitition. This does result in KMOVD being used for copies when BWI is available because we don't know the original mask register size. This caused a lot of deltas on tests because we have to split the checks for KMOVD vs KMOVW based on BWI. Differential Revision: https://reviews.llvm.org/D30968 llvm-svn: 298928
* ExecutionDepsFix: Normalize names; NFCMatthias Braun2017-03-181-3/+3
| | | | | | | Normalize ExeDepsFix, execution-fix, ExecutionDependencyFix and ExecutionDepsFix to the last one. llvm-svn: 298183
* TargetInstrInfo: Provide default implementation of isTailCall().Matthias Braun2017-03-161-22/+0
| | | | | | | | | | In fact this default implementation should be the only implementation, keep it virtual for now to accomodate targets that don't model flags correctly. Differential Revision: https://reviews.llvm.org/D30747 llvm-svn: 297980
* [Outliner] Add tail call supportJessica Paquette2017-03-131-17/+58
| | | | | | | | | | | | | | This commit adds tail call support to the MachineOutliner pass. This allows the outliner to insert jumps rather than calls in areas where tail calling is possible. Outlined tail calls include the return or terminator of the basic block being outlined from. Tail call support allows the outliner to take returns and terminators into consideration while finding candidates to outline. It also allows the outliner to save more instructions. For example, in the X86-64 outliner, a tail called outlined function saves one instruction since no return has to be inserted. llvm-svn: 297653
* [AVX-512] Fix a bad use of a high GR8 register after copying from a mask ↵Craig Topper2017-03-121-0/+2
| | | | | | | | | | register during fast isel. This ends up extracting from bits 15:8 instead of the lower bits of the mask. I'm pretty sure there are more problems lurking here. But I think this fixes PR32241. I've added the test case from that bug and added asserts that will fail if we ever try to copy between high registers and mask registers again. llvm-svn: 297574
* [Outliner] Fixed Asan bot failure in r296418Jessica Paquette2017-03-061-0/+80
| | | | | | | | Fixed the asan bot failure which led to the last commit of the outliner being reverted. The change is in lib/CodeGen/MachineOutliner.cpp in the SuffixTree's constructor. LeafVector is no longer initialized using reserve but just a standard constructor. llvm-svn: 297081
* Revert "Add MIR-level outlining pass"Matthias Braun2017-02-281-80/+0
| | | | | | | | Revert Machine Outliner for now, as it breaks the asan bot. This reverts commit r296418. llvm-svn: 296426
* Add MIR-level outlining passMatthias Braun2017-02-281-0/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | This is a patch for the outliner described in the RFC at: http://lists.llvm.org/pipermail/llvm-dev/2016-August/104170.html The outliner is a code-size reduction pass which works by finding repeated sequences of instructions in a program, and replacing them with calls to functions. This is useful to people working in low-memory environments, where sacrificing performance for space is acceptable. This adds an interprocedural outliner directly before printing assembly. For reference on how this would work, this patch also includes X86 target hooks and an X86 test. The outliner is run like so: clang -mno-red-zone -mllvm -enable-machine-outliner file.c Patch by Jessica Paquette<jpaquette@apple.com>! rdar://29166825 Differential Revision: https://reviews.llvm.org/D26872 llvm-svn: 296418
* [X86][AVX] Disable VCVTSS2SD & VCVTSD2SS memory folding and fix the register ↵Ayman Musa2017-02-231-4/+0
| | | | | | | | class of their first input when creating node in fast-isel. (Quick fix to buildbot failure after rL295940 commit). llvm-svn: 295970
* [X86][AVX512] Remove VCVTSS2SDZ & VCVTSD2SSZ from memory folding tables as ↵Ayman Musa2017-02-231-4/+0
| | | | | | | | they introduce new read dependency when folding. (Quick fix to buildbot fail). llvm-svn: 295946
* [X86][AVX512] Change VCVTSS2SD and VCVTSD2SS node types to keep consistency ↵Ayman Musa2017-02-231-2/+10
| | | | | | | | | | between VEX/EVEX versions. AVX versions of the converts work on f32/f64 types, while AVX512 version work on vectors. Differential Revision: https://reviews.llvm.org/D29988 llvm-svn: 295940
* [AVX-512] Add broadcast VPTERNLOG instructions to special case commuting switch.Craig Topper2017-02-191-13/+37
| | | | | | | | The instructions are marked commutable, but without special handling we don't get the immediate correct. While here also remove the masked memory forms that aren't commutable. llvm-svn: 295602
* [X86][XOP] Reduce the size of a multiclass by moving more stuff to ↵Craig Topper2017-02-181-4/+4
| | | | | | | | parameters instead of doing 128-bit and 256-bit simultaneously. This requires some instructions to be renamed to move the Y earlier in the instruction name. The new names are more consistent with other instructions. llvm-svn: 295579
* Recommit "[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR."Craig Topper2017-02-181-2/+2
| | | | | | Clang has now been fixed to not use these intrinsics. llvm-svn: 295571
* Revert "[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR."Craig Topper2017-02-181-2/+2
| | | | | | This reverts r295564. I missed that clang was still using the intrinsics despite our half implemented autoupgrade support. llvm-svn: 295565
* [X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR.Craig Topper2017-02-181-2/+2
| | | | | | It seems we were already upgrading 128-bit VPCMOV, but the intrinsic was still defined and being used in isel patterns. While I was here I also simplified the tablegen multiclasses. llvm-svn: 295564
* Re-apply r282920 "X86: Allow conditional tail calls in Win64 "leaf" ↵Hans Wennborg2017-02-161-4/+3
| | | | | | | | | | functions (PR26302)" The original commit was reverted in r283329 due to a miscompile in Chromium. That turned out to be the same issue as PR31257, which was fixed in r295262. llvm-svn: 295357
* [X86] Re-enable conditional tail calls and fix PR31257.Hans Wennborg2017-02-161-0/+90
| | | | | | | | | | | This reverts r294348, which removed support for conditional tail calls due to the PR above. It fixes the PR by marking live registers as implicitly used and defined by the now predicated tailcall. This is similar to how IfConversion predicates instructions. Differential Revision: https://reviews.llvm.org/D29856 llvm-svn: 295262
* [AVX-512] Add PACKSS/PACKUS instructions to load folding tables.Craig Topper2017-02-151-0/+36
| | | | llvm-svn: 295154
* [AVX-512] Add PAVGB/PAVGW to load folding tables.Craig Topper2017-02-141-0/+18
| | | | llvm-svn: 295035
* [AVX-512] Add various EVEX move instructions to load folding tables using ↵Craig Topper2017-02-121-4/+10
| | | | | | the VEX equivalents as a guide. llvm-svn: 294908
* [AVX-512] Add VPEXTRD/Q to load folding tables.Craig Topper2017-02-121-0/+2
| | | | llvm-svn: 294905
* [AVX-512] Add VPMINS/MINU/MAXS/MAXU instructions to load folding tables.Craig Topper2017-02-111-0/+136
| | | | llvm-svn: 294858
OpenPOWER on IntegriCloud