path: root/llvm/lib/Target/X86/X86FastISel.cpp
Commit message | Author | Age | Files | Lines
...
* [X86] Remove AVX512 early out from X86FastISel::X86SelectCmp.Craig Topper2017-10-301-3/+0
| | | | | | This shouldn't be needed anymore since i1 isn't a legal type. llvm-svn: 316912
* [X86] Use the extended vector register classes in fast isel with AVX512F/VL.Craig Topper2017-10-291-10/+10
| | | | llvm-svn: 316857
* [X86] Add AVX512 support to X86FastISel::X86SelectFPExt and ↵Craig Topper2017-10-291-4/+12
| | | | | | X86FastISel::X86SelectFPTrunc. llvm-svn: 316856
* [X86] Add AVX512 support to X86FastISel::X86MaterializeFPCraig Topper2017-10-291-2/+6
| | | | llvm-svn: 316853
* [X86] Replace some default cases in X86SelectShift with llvm_unreachable.Craig Topper2017-10-281-3/+3
| | | | llvm-svn: 316839
* [X86] Remove unneeded MVT::i1 related code from fast isel.Craig Topper2017-10-281-10/+0
| | | | llvm-svn: 316825
* [X86] Remove fast-isel code for handling i8 shifts. This is handled by auto ↵Craig Topper2017-10-271-14/+7
| | | | | | generated code. llvm-svn: 316797
* [X86] Teach fastisel to use VLX VMOVNTDQA for v4f64 and 256-bit integers ↵Craig Topper2017-10-271-2/+2
| | | | | | | | when available. This looks to have been missed from r280682. llvm-svn: 316790
* [X86] Enable extended comparison predicate support for SETUEQ/SETONE when ↵Craig Topper2017-10-091-3/+3
| | | | | | | | | | targeting AVX instructions. We believe that, despite AMD's documentation, they really do support all 32 comparison predicates under AVX. Differential Revision: https://reviews.llvm.org/D38609 llvm-svn: 315201
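The two predicates named in this commit can be read at the value level as follows. This is only a sketch in plain C++ of the LLVM IR `fcmp ueq` and `fcmp one` semantics; the function names `cmpUEQ` and `cmpONE` are hypothetical, and nothing here reflects the actual FastISel code or the AVX predicate encodings.

```cpp
#include <cmath>

// Hypothetical helpers: a value-level model of two floating-point predicates.
// "Unordered" means that at least one operand is NaN.

// ueq: true if either operand is NaN, or if the operands compare equal.
inline bool cmpUEQ(double A, double B) {
  return std::isnan(A) || std::isnan(B) || A == B;
}

// one: true only if neither operand is NaN and the operands differ.
inline bool cmpONE(double A, double B) {
  return !std::isnan(A) && !std::isnan(B) && A != B;
}
```

In this model `cmpONE` is the logical complement of `cmpUEQ` for every pair of inputs, which is why the two predicates tend to be handled together.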
* [X86] Fix copy pasto in X86FastISel::fastEmitInst_rrrr.Craig Topper2017-10-021-1/+1
| | | | | | The 4th operand was not being constrained and the third operand was being constrained twice. llvm-svn: 314648
* [X86] Fix register class name in a comment. NFCCraig Topper2017-09-261-1/+1
| | | | llvm-svn: 314250
* [X86] Don't emit COPY_TO_REG to ABCD registers before EXTRACT_SUBREG of sub_8bitCraig Topper2017-09-181-14/+1
| | | | | | | | This is similar to D37843, but for sub_8bit. This fixes all of the patterns except for the 2 that emit only an EXTRACT_SUBREG. That causes a verifier error with global isel because global isel doesn't know to issue the ABCD when doing this extract on 32-bit targets. Differential Revision: https://reviews.llvm.org/D37890 llvm-svn: 313558
* [X86] Teach fastisel to handle zext/sext i8->i16 and sext i1->i8/i16/i32/i64Craig Topper2017-09-021-0/+59
| | | | | | | | | | | | | | | | | | | | | Summary: ZExt and SExt from i8 to i16 aren't implemented in the autogenerated fast isel table because normal isel does a zext/sext to 32-bits and a subreg extract to avoid a partial register write or false dependency on the upper bits of the destination. This means without handling in fast isel we end up triggering a fast isel abort. We had no custom sign extend handling at all so while I was there I went ahead and implemented sext i1->i8/i16/i32/i64 which was also missing. This generates an i1->i8 sign extend using a mask with 1, then an 8-bit negate, then continues with a sext from i8. A better sequence would be a wider and/negate, but would require more custom code. Fast isel tests are a mess and I couldn't find a good home for the tests so I created a new one. The test pr34381.ll had to have fast-isel removed because it was relying on a fast isel abort to hit the bug. The test case still seems valid with fast-isel disabled though some of the instructions changed. Reviewers: spatel, zvi, igorb, guyblank, RKSimon Reviewed By: guyblank Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37320 llvm-svn: 312422
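The i1 sign-extension sequence described in this commit (mask with 1, 8-bit negate, then sign extend from i8) can be sketched at the value level as below. The helper names are hypothetical and the code models only the arithmetic, not the machine instructions FastISel emits.

```cpp
#include <cstdint>

// Hypothetical value-level model of "sext i1 -> i8": the i1 lives in the low
// bit of an 8-bit register whose upper bits may hold garbage.
constexpr std::int8_t sextI1ToI8(std::uint8_t Reg) {
  std::uint8_t Masked = Reg & 1u;                             // AND with 1
  std::uint8_t Negated = static_cast<std::uint8_t>(0u - Masked); // 8-bit NEG
  return static_cast<std::int8_t>(Negated);                   // 0x00 or 0xFF
}

// The wider results then continue with an ordinary sign extension from i8.
constexpr std::int32_t sextI1ToI32(std::uint8_t Reg) {
  return static_cast<std::int32_t>(sextI1ToI8(Reg));
}
```

An input with the low bit set yields all ones (that is, -1) at every width, and any input with the low bit clear yields 0, regardless of the upper bits.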
* [X86] Remove some code from fast isel that is no longer needed with i1 being ↵Craig Topper2017-08-301-31/+0
| | | | | | an illegal type. llvm-svn: 312190
* [X86] Remove unneeded AVX512 check from fast isel.Craig Topper2017-08-301-2/+1
| | | | | | This is no longer necessary now that i1 is illegal. llvm-svn: 312146
* [AVX512] Remove leftover code for when i1 was a legal type from the fast ↵Craig Topper2017-08-141-14/+0
| | | | | | | | | | | | | | | | | | | isel load/store code. Summary: I don't think we need this code anymore. It only existed because i1 used to be legal. There's probably more unneeded code in fast isel still. Reviewers: guyblank, zvi Reviewed By: guyblank Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36652 llvm-svn: 310843
* [X86] Teach fastisel to select calls to dllimport functionsReid Kleckner2017-08-051-8/+14
| | | | | | | | | | | | | | Summary: Direct calls to dllimport functions are very common on Windows. We should add them to the -O0 fast path. Reviewers: rafael Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D36197 llvm-svn: 310152
* [AArch64] Extend CallingConv::X86_64_Win64 to AArch64 as wellMartin Storsjo2017-07-171-2/+2
| | | | | | | | | | | | Rename the enum value from X86_64_Win64 to plain Win64. The symbol exposed in the textual IR is changed from 'x86_64_win64cc' to 'win64cc', but the numeric value is kept, keeping support for old bitcode. Differential Revision: https://reviews.llvm.org/D34474 llvm-svn: 308208
* [X86/FastIsel] Fall-back to SelectionDAG when lowering soft-floats.Davide Italiano2017-07-121-0/+3
| | | | | | | | | | FastIsel can't handle them, so we would end up crashing during register class selection. Fixes PR26522. Differential Revision: https://reviews.llvm.org/D35272 llvm-svn: 307797
* [X86][AVX1] Split 256-bit vector non-temporal FastISel loads to keep it ↵Simon Pilgrim2017-06-061-0/+6
| | | | | | | | non-temporal (PR32744) Extension to D33728 llvm-svn: 304798
* [X86][AVX512] Make i1 illegal in the CodeGenGuy Blank2017-05-191-7/+0
| | | | | | | | | | This patch defines the i1 type as illegal in the X86 backend for AVX512. For DAG operations on <N x i1> types (build vector, extract vector element, ...) i8 is used, and should be truncated/extended. This should produce better scalar code for i1 types since GPRs will be used instead of mask registers. Differential Revision: https://reviews.llvm.org/D32273 llvm-svn: 303421
* [X86] Move getX86ConditionCode() from X86FastISel.cpp to X86InstrInfo.cpp. NFCIgor Breger2017-05-111-42/+4
| | | | | | | | | | | | | | | | Summary: Move getX86ConditionCode() from X86FastISel.cpp to X86InstrInfo.cpp so it can be used by the GlobalISel instruction selector. This is a pre-commit for a patch I'm working on to support G_ICMP. NFC. Reviewers: zvi, guyblank, delena Reviewed By: guyblank, delena Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33038 llvm-svn: 302767
* Add extra operand to CALLSEQ_START to keep frame part set up previouslySerge Pavlov2017-05-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using arguments with attribute inalloca creates problems for verification of machine representation. This attribute instructs the backend that the argument is prepared in stack prior to CALLSEQ_START..CALLSEQ_END sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size stored in CALLSEQ_START in this case does not count the size of this argument. However CALLSEQ_END still keeps total frame size, as caller can be responsible for cleanup of entire frame. So CALLSEQ_START and CALLSEQ_END keep different frame size and the difference is treated by MachineVerifier as stack error. Currently there is no way to distinguish this case from actual errors. This patch adds additional argument to CALLSEQ_START and its target-specific counterparts to keep size of stack that is set up prior to the call frame sequence. This argument allows MachineVerifier to calculate actual frame size associated with frame setup instruction and correctly process the case of inalloca arguments. The changes made by the patch are: - Frame setup instructions get the second mandatory argument. It affects all targets that use frame pseudo instructions and touched many files although the changes are uniform. - Access to frame properties are implemented using special instructions rather than calls getOperand(N).getImm(). For X86 and ARM such replacement was made previously. - Changes that reflect appearance of additional argument of frame setup instruction. These involve proper instruction initialization and methods that access instruction arguments. - MachineVerifier retrieves frame size using method, which reports sum of frame parts initialized inside frame instruction pair and outside it. The patch implements approach proposed by Quentin Colombet in https://bugs.llvm.org/show_bug.cgi?id=27481#c1. It fixes 9 tests failed with machine verifier enabled and listed in PR27481. 
Differential Revision: https://reviews.llvm.org/D32394 llvm-svn: 302527
* [X86] Support of no_caller_saved_registers attributeOren Ben Simhon2017-05-031-0/+9
| | | | | | | | | This patch implements the LLVM part for no_caller_saved_registers attribute as appears here: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=5ed3cc7b66af4758f7849ed6f65f4365be8223be. In order to implement the attribute, we use the dynamic CSR mechanism to remove returned/passed arguments from the function regmask/CSR list. Differential Revision: https://reviews.llvm.org/D31876 llvm-svn: 302020
* Use Argument::hasAttribute and AttributeList::ReturnIndex moreReid Kleckner2017-04-281-9/+6
| | | | | | | | | | | This eliminates many extra 'Idx' induction variables in loops over arguments in CodeGen/ and Target/. It also reduces the number of places where we assume that ReturnIndex is 0 and that we should add one to argument numbers to get the corresponding attribute list index. NFC llvm-svn: 301666
* Move size and alignment information of regclass to TargetRegisterInfoKrzysztof Parzyszek2017-04-241-1/+2
| | | | | | | | | | | | | | | 1. RegisterClass::getSize() is split into two functions: - TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const; - TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const; 2. RegisterClass::getAlignment() is replaced by: - TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const; This will allow making those values depend on subtarget features in the future. Differential Revision: https://reviews.llvm.org/D31783 llvm-svn: 301221
* [IR] Make paramHasAttr to use arg indices instead of attr indicesReid Kleckner2017-04-141-2/+2
| | | | | | | | | This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern. Previously we were testing return value attributes with index 0, so I introduced hasReturnAttr() for that use case. llvm-svn: 300367
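The indexing convention this commit hides can be illustrated with a toy model. `ToyAttrList` below is hypothetical and is not the LLVM `AttributeList` API; it only assumes, as the neighbouring commit messages describe, that slot 0 holds return-value attributes and that argument I is stored at slot I + 1.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model only, not the LLVM API.
// Slot 0: return-value attributes. Slot I + 1: attributes of argument I.
struct ToyAttrList {
  std::vector<std::set<std::string>> Slots;

  // Old-style query: the caller passes the raw slot index and must remember
  // to add one for arguments.
  bool hasAttrAtSlot(std::size_t Slot, const std::string &Name) const {
    return Slot < Slots.size() && Slots[Slot].count(Name) != 0;
  }

  // New-style queries: the +1 offset is applied in exactly one place.
  bool paramHasAttr(std::size_t ArgNo, const std::string &Name) const {
    return hasAttrAtSlot(ArgNo + 1, Name);
  }
  bool hasReturnAttr(const std::string &Name) const {
    return hasAttrAtSlot(0, Name);
  }
};
```

With accessors shaped like this, call sites no longer need the `ArgNo + 1` pattern that the commit message calls confusing.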
* [X86][MMX] Add fast-isel support for MMX non-temporal writesSimon Pilgrim2017-04-101-0/+4
| | | | | | Differential Revision: https://reviews.llvm.org/D31754 llvm-svn: 299852
* [AVX-512] Fix bad comment from r299112. NFCCraig Topper2017-03-301-1/+2
| | | | llvm-svn: 299114
* [AVX-512] Fix another case where fastisel was generating a GR8 to VK1 copy. ↵Craig Topper2017-03-301-2/+12
| | | | | | | | This time after calls returning i1. Fixes PR32472. llvm-svn: 299112
* [AVX-512] Punt on fast-isel of truncates to i1 when AVX512 is enabled.Craig Topper2017-03-281-1/+2
| | | | | | | | | | We should be masking the value and emitting a register copy like we do in non-fast isel. Instead we were just updating the value map and emitting nothing. After r298928 we started seeing cases where we would create a copy from GR8 to GR32 because the source register in a VK1 to GR32 copy was replaced by the GR8 going into a truncate. This fixes PR32451. llvm-svn: 298957
* [AVX-512] Fix accidental uses of AH/BH/CH/DH after copies to/from mask registersCraig Topper2017-03-281-13/+45
| | | | | | | | | | | | | | | | We've had several bugs (PR32256, PR32241) recently that resulted from usages of AH/BH/CH/DH either before or after a copy to/from a mask register. This ultimately occurs because we create COPY_TO_REGCLASS with VK1 and GR8. Then in CopyToFromAsymmetricReg in X86InstrInfo we find a 32-bit super register for the GR8 to emit the KMOV with. But as these tests are demonstrating, it's possible for the GR8 register to be a high register and we end up doing an accidental extract or insert from bits 15:8. I think the best way forward is to stop making copies directly between mask registers and GR8/GR16. Instead I think we should restrict to only copies between mask registers and GR32/GR64 and use EXTRACT_SUBREG/INSERT_SUBREG to handle the conversion from GR32 to GR16/8 or vice versa. Unfortunately, this complicates fastisel a bit more now to create the subreg extracts where we used to create GR8 copies. We can probably make a helper function to bring down the repetition. This does result in KMOVD being used for copies when BWI is available because we don't know the original mask register size. This caused a lot of deltas on tests because we have to split the checks for KMOVD vs KMOVW based on BWI. Differential Revision: https://reviews.llvm.org/D30968 llvm-svn: 298928
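Why a high 8-bit register gives the wrong answer here can be seen in a small value-level model. The function names are hypothetical; the only facts assumed are the x86 subregister layout (AL/BL/CL/DL are bits 7:0 of the 32-bit register, AH/BH/CH/DH are bits 15:8) and that a 1-bit mask copied into a 32-bit GPR lands in bit 0.

```cpp
#include <cstdint>

// Models reading the low 8-bit subregister (AL/BL/CL/DL): bits 7:0.
constexpr std::uint8_t lowByte(std::uint32_t Reg) {
  return static_cast<std::uint8_t>(Reg & 0xFFu);
}

// Models reading the high 8-bit subregister (AH/BH/CH/DH): bits 15:8.
constexpr std::uint8_t highByte(std::uint32_t Reg) {
  return static_cast<std::uint8_t>((Reg >> 8) & 0xFFu);
}

// Hypothetical model of copying a 1-bit mask value into a 32-bit GPR:
// the mask bit ends up in bit 0 and the remaining bits are zero.
constexpr std::uint32_t copyMaskBitToGPR32(bool MaskBit) {
  return MaskBit ? 1u : 0u;
}
```

Reading the low byte after the copy recovers the mask bit, while reading the high byte always sees zero in this model, which is the kind of accidental access to bits 15:8 that the commit describes.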
* [AVX-512] Pre-emptively fix more places in fastisel where we might copy a ↵Craig Topper2017-03-141-9/+28
| | | | | | VK1 register into an AH/BH/CH/DH register. llvm-svn: 297704
* [AVX-512] Fix another case where we are copying from a mask register using ↵Craig Topper2017-03-131-1/+2
| | | | | | | | AH/BH/CH/DH with fastisel. Fixes PR32256. Still planning to do an audit for other possible cases. llvm-svn: 297678
* [AVX-512] Fix a bad use of a high GR8 register after copying from a mask ↵Craig Topper2017-03-121-0/+11
| | | | | | | | | | register during fast isel. This ends up extracting from bits 15:8 instead of the lower bits of the mask. I'm pretty sure there are more problems lurking here. But I think this fixes PR32241. I've added the test case from that bug and added asserts that will fail if we ever try to copy between high registers and mask registers again. llvm-svn: 297574
* [X86] Fix creating vreg def after use. Ayman Musa2017-03-011-5/+10
| | | | llvm-svn: 296601
* [X86][AVX] Disable VCVTSS2SD & VCVTSD2SS memory folding and fix the register ↵Ayman Musa2017-02-231-2/+7
| | | | | | | | class of their first input when creating node in fast-isel. (Quick fix to buildbot failure after rL295940 commit). llvm-svn: 295970
* [X86] Remove scalar logical op alias instructions. Just use ↵Craig Topper2016-12-061-6/+10
| | | | | | | | | | | | | | | | | | | COPY_FROM/TO_REGCLASS and the normal packed instructions instead Summary: This patch removes the scalar logical operation alias instructions. We can just use reg class copies and use the normal packed instructions instead. This removes the need for putting these instructions in the execution domain fixing tables as was done recently. I removed the loadf64_128 and loadf32_128 patterns as DAG combine creates a narrower load for (extractelt (loadv4f32)) before we ever get to isel. I plan to add similar patterns for AVX512DQ in a future commit to allow use of the larger register class when available. Reviewers: spatel, delena, zvi, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27401 llvm-svn: 288771
* [X86] Remove unnecessary explicit uses of .SimpleTy just to do an equality ↵Craig Topper2016-12-051-11/+11
| | | | | | comparison. MVT's operator== already takes care of this. NFCI llvm-svn: 288646
* [AVX-512] Teach fast isel to handle 512-bit vector bitcasts.Craig Topper2016-12-051-2/+8
| | | | llvm-svn: 288641
* [AVX-512] Teach fast isel to use masked compare and movss for handling ↵Craig Topper2016-12-051-4/+69
| | | | | | scalar cmp and select sequence when AVX-512 is enabled. This matches the behavior of normal isel. llvm-svn: 288636
* IR: Change the gep_type_iterator API to avoid always exposing the "current" ↵Peter Collingbourne2016-12-021-1/+1
| | | | | | | | | | | | | type. Instead, expose whether the current type is an array or a struct, if an array what the upper bound is, and if a struct the struct type itself. This is in preparation for a later change which will make PointerType derive from Type rather than SequentialType. Differential Revision: https://reviews.llvm.org/D26594 llvm-svn: 288458
* [X86][FastISel] Assert that we are dealing with arithmetic with overflow ↵Zvi Rackover2016-11-151-0/+3
| | | | | | intrinsics. NFC llvm-svn: 286961
* [X86][FastISel] Fix lowering of overflow result on AVX512 targetsZvi Rackover2016-11-151-2/+2
| | | | | | | | | | | | | | | | Summary: Fix a case where the overflow value of type i1, which is legal on AVX512, was assigned to a VK1 register class. We always want this value to be assigned to a GPR since the overflow return value is lowered to a SETO instruction. Fixes pr30981. Reviewers: mkuper, igorb, craig.topper, guyblank, qcolombet Subscribers: qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D26620 llvm-svn: 286958
* [X86][FastISel] Use a COPY from K register to a GPR instead of a K operationGuy Blank2016-09-281-27/+31
| | | | | | | | | | | The KORTEST was introduced due to a bug where a TEST instruction used a K register, but it turns out that the opposite case of KORTEST using a GPR is now happening. The change removes the KORTEST flow and adds a COPY instruction from the K reg to a GPR. Differential Revision: https://reviews.llvm.org/D24953 llvm-svn: 282580
* [AVX-512] Teach fastisel load/store handling to use EVEX encoded ↵Craig Topper2016-09-051-42/+81
| | | | | | | | instructions for 128/256-bit vectors and scalar single/double. Still need to fix the register classes to allow the extended range of registers. llvm-svn: 280682
* [X86] Make some static arrays of opcodes const and shrink to uint16_t. NFCCraig Topper2016-09-051-6/+6
| | | | llvm-svn: 280649
* [AVX512][FastISel] Do not use K registers in TEST instructionsGuy Blank2016-08-211-6/+31
| | | | | | | | | In some cases, FastIsel was emitting a TEST instruction with a K reg input, which is illegal. Changed to using KORTEST when dealing with K regs. Differential Revision: https://reviews.llvm.org/D23163 llvm-svn: 279393
* Replace a few more "fall through" comments with LLVM_FALLTHROUGHJustin Bogner2016-08-171-2/+2
| | | | | | Follow up to r278902. I had missed "fall through", with a space. llvm-svn: 278970
* Replace "fallthrough" comments with LLVM_FALLTHROUGHJustin Bogner2016-08-171-12/+15
| | | | | | | This is a mechanical change of comments in switches like fallthrough, fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead. llvm-svn: 278902
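The kind of rewrite this commit describes, replacing a fall-through comment with an explicit annotation, looks roughly like the sketch below. `TOY_FALLTHROUGH` is a hypothetical stand-in defined here for illustration (it is not the LLVM macro), and C++17's `[[fallthrough]]` attribute is assumed to be available.

```cpp
// Hypothetical stand-in for a fallthrough annotation macro (assumes C++17).
#define TOY_FALLTHROUGH [[fallthrough]]

// Counts how many switch cases run when execution enters at Start and falls
// through to the end of the chain.
inline int stepsFrom(int Start) {
  int Steps = 0;
  switch (Start) {
  case 0:
    ++Steps;
    TOY_FALLTHROUGH; // previously a "fall through" comment
  case 1:
    ++Steps;
    TOY_FALLTHROUGH; // previously a "fall through" comment
  case 2:
    ++Steps;
    break;
  default:
    break;
  }
  return Steps;
}
```

Unlike a comment, the annotation is visible to the compiler, so an unannotated fall-through elsewhere can be diagnosed without flagging the intentional ones.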