path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* [CostModel][X86] Masked load legalization requires a binary-shuffle not a select (PR39812) | Simon Pilgrim | 2019-04-07 | 1 | -2/+2
| Expansion/truncation is better described by SK_PermuteTwoSrc than SK_Select.
| llvm-svn: 357864
* [X86][SSE] SimplifyDemandedBitsForTargetNode - Add initial PACKSS support | Simon Pilgrim | 2019-04-07 | 1 | -0/+19
| In the case where we only want the sign bit (e.g. when using PACKSS truncation of comparison results for MOVMSK) then we can just demand the sign bit of the source operands.
| This makes use of the fact that PACKSS saturates out of range values to the min/max int values, so the sign bit is always preserved.
| Differential Revision: https://reviews.llvm.org/D60333
| llvm-svn: 357859
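The sign-bit argument in the PACKSS commit above can be checked with a small model. This is only a sketch: `packss_lane` and `truncate_lane` are hypothetical helper names modelling a single 16-bit to 8-bit lane, not LLVM code.

```python
# Illustrative model of one PACKSSWB lane: signed saturation from 16 to 8 bits.
def packss_lane(value: int) -> int:
    return max(-128, min(127, value))


# Plain truncation keeps only the low 8 bits, reinterpreted as a signed byte.
def truncate_lane(value: int) -> int:
    return ((value & 0xFF) ^ 0x80) - 0x80


def sign_bit(value: int) -> int:
    return 1 if value < 0 else 0


I16_RANGE = range(-32768, 32768)

# Saturation maps every negative input to a negative output and every
# non-negative input to a non-negative output.
saturation_keeps_sign = all(
    sign_bit(packss_lane(v)) == sign_bit(v) for v in I16_RANGE
)

# Truncation does not: for example, 128 becomes -128.
truncation_keeps_sign = all(
    sign_bit(truncate_lane(v)) == sign_bit(v) for v in I16_RANGE
)
```

Because saturation preserves the sign for every 16-bit input, a consumer that only reads the sign bit of the packed result needs only the sign bit of each source lane.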
* [X86] When converting (x << C1) AND C2 to (x AND (C2>>C1)) << C1 during isel, try using andl over andq by favoring 32-bit unsigned immediates. | Craig Topper | 2019-04-06 | 1 | -6/+13
| llvm-svn: 357848
* [X86] combineBitcastvxi1 - provide dst VT and src SDValue directly. NFCI. | Simon Pilgrim | 2019-04-06 | 1 | -19/+17
| Prep work to make it easier to reuse the BITCAST->MOVMSK combine in other cases.
| llvm-svn: 357847
* [X86] Use a signed mask in foldMaskedShiftToScaledMask to enable a shorter immediate encoding. | Craig Topper | 2019-04-06 | 1 | -2/+6
| This function reorders AND and SHL to enable the SHL to fold into an LEA. The upper bits of the AND will be shifted out by the SHL so it doesn't matter what mask value we use for these bits. By using sign bits from the original mask in these upper bits we might enable a shorter immediate encoding to be used.
| llvm-svn: 357846
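A worked example may make the signed-mask reasoning above concrete. It is a sketch under assumptions: 64-bit wrapping arithmetic is modelled by hand, and all function names are hypothetical rather than taken from the X86 backend.

```python
MASK64 = (1 << 64) - 1


def to_signed64(value: int) -> int:
    """Reinterpret the low 64 bits of value as a signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def and_of_shift(x: int, shift: int, mask: int) -> int:
    """Original form: (x << shift) & mask, in 64-bit arithmetic."""
    return ((x & MASK64) << shift) & mask & MASK64


def shift_of_and(x: int, shift: int, new_mask: int) -> int:
    """Reordered form: (x & new_mask) << shift, in 64-bit arithmetic."""
    return ((x & new_mask & MASK64) << shift) & MASK64


def logical_mask(mask: int, shift: int) -> int:
    """New mask built with a logical shift: upper bits become zero."""
    return (mask & MASK64) >> shift


def signed_mask(mask: int, shift: int) -> int:
    """New mask built with an arithmetic shift: upper bits copy the sign."""
    return (to_signed64(mask) >> shift) & MASK64


def fits_simm8(value: int) -> bool:
    return -128 <= to_signed64(value) <= 127


def fits_simm32(value: int) -> bool:
    return -(1 << 31) <= to_signed64(value) <= (1 << 31) - 1


ORIGINAL_MASK = 0xFFFFFFFFFFFFFFF0
SHIFT = 2
SAMPLES = [0, 1, 0x1234_5678_9ABC_DEF0, MASK64, 1 << 63, 0x7FFF_FFFF_FFFF_FFFF]

# Both new masks give the same result as the original expression, because the
# bits in which they differ are shifted out of the 64-bit value.
both_masks_agree = all(
    and_of_shift(x, SHIFT, ORIGINAL_MASK)
    == shift_of_and(x, SHIFT, logical_mask(ORIGINAL_MASK, SHIFT))
    == shift_of_and(x, SHIFT, signed_mask(ORIGINAL_MASK, SHIFT))
    for x in SAMPLES
)
```

With these sample values the logical-shift mask is `0x3FFFFFFFFFFFFFFC`, which does not fit a sign-extended 32-bit immediate, while the arithmetic-shift mask is -4, which fits in 8 bits.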
* Fix spelling mistake. NFCI. | Simon Pilgrim | 2019-04-06 | 1 | -1/+1
| llvm-svn: 357843
* [AMDGPU] Sort out and rename multiple CI/VI predicates | Stanislav Mekhanoshin | 2019-04-06 | 14 | -85/+82
| Differential Revision: https://reviews.llvm.org/D60346
| llvm-svn: 357835
* [X86] Enable tail calls for CallingConv::Swift | Francis Visoiu Mistrih | 2019-04-05 | 1 | -0/+2
| It's currently only enabled on AArch64 (enabled in r281376).
| llvm-svn: 357809
* [X86] Preserve operand flag when expanding TCRETURNri | Francis Visoiu Mistrih | 2019-04-05 | 3 | -2/+11
| The expansion of TCRETURNri(64) would not keep operand flags like undef/renamable/etc. which can result in machine verifier issues.
| Also add plumbing to be able to use `-run-pass=x86-pseudo`.
| llvm-svn: 357808
* [AMDGPU] Add MachineDCE pass after RenameIndependentSubregs | Stanislav Mekhanoshin | 2019-04-05 | 1 | -0/+9
| Detect dead lanes can create some dead defs. Then RenameIndependentSubregs will break a REG_SEQUENCE which may use these dead defs. At this point a dead instruction can be removed but we do not run a DCE anymore.
| MachineDCE was only running before live variable analysis. The patch adds a means to preserve LiveIntervals and SlotIndexes in case it works past this.
| Differential Revision: https://reviews.llvm.org/D59626
| llvm-svn: 357805
* [X86] Merge the different Jcc instructions for each condition code into single instructions that store the condition code as an operand. | Craig Topper | 2019-04-05 | 21 | -232/+177
| Summary: This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between Jcc instructions and condition codes.
| Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser.
| Reviewers: spatel, lebedev.ri, courbet, gchatelet, RKSimon
| Reviewed By: RKSimon
| Subscribers: MatzeB, qcolombet, eraman, hiraditya, arphaman, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D60228
| llvm-svn: 357802
* [X86] Merge the different SETcc instructions for each condition code into single instructions that store the condition code as an operand. | Craig Topper | 2019-04-05 | 20 | -256/+290
| Summary: This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between SETcc instructions and condition codes.
| Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser.
| Reviewers: andreadb, courbet, RKSimon, spatel, lebedev.ri
| Reviewed By: andreadb
| Subscribers: hiraditya, lebedev.ri, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D60138
| llvm-svn: 357801
* [X86] Merge the different CMOV instructions for each condition code into single instructions that store the condition code as an immediate. | Craig Topper | 2019-04-05 | 31 | -422/+460
| Summary: Reorder the condition code enum to match their encodings. Move it to MC layer so it can be used by the scheduler models.
| This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between CMOV instructions and condition codes.
| Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser.
| This does complicate the scheduler models a little since we can't assign the A and BE instructions to a separate class now.
| I plan to make similar changes for SETcc and Jcc.
| Reviewers: RKSimon, spatel, lebedev.ri, andreadb, courbet
| Reviewed By: RKSimon
| Subscribers: gchatelet, hiraditya, kristina, lebedev.ri, jdoerfert, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D60041
| llvm-svn: 357800
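The three commits above rely on the condition code being a 4-bit field that can be combined with a base opcode. The sketch below illustrates that idea using the standard x86 encodings (`70+cc` for short Jcc, `0F 90+cc` for SETcc, `0F 40+cc` for CMOVcc). The `CondCode` enum and the helper functions are illustrative names chosen for this example, not the definitions used in LLVM.

```python
from enum import IntEnum


class CondCode(IntEnum):
    """x86 condition codes, numbered by their 4-bit hardware encoding."""
    O = 0
    NO = 1
    B = 2
    AE = 3
    E = 4
    NE = 5
    BE = 6
    A = 7
    S = 8
    NS = 9
    P = 10
    NP = 11
    L = 12
    GE = 13
    LE = 14
    G = 15


def jcc_rel8_opcode(cc: CondCode) -> bytes:
    """Opcode byte of the short (8-bit displacement) conditional jump."""
    return bytes([0x70 | cc])


def setcc_opcode(cc: CondCode) -> bytes:
    """Two-byte opcode of SETcc."""
    return bytes([0x0F, 0x90 | cc])


def cmovcc_opcode(cc: CondCode) -> bytes:
    """Two-byte opcode of CMOVcc."""
    return bytes([0x0F, 0x40 | cc])


def mnemonic(prefix: str, cc: CondCode) -> str:
    """Build the printed mnemonic from a prefix and the condition code."""
    return prefix + cc.name.lower()
```

With the enum ordered this way, one instruction definition plus a condition-code operand covers all sixteen variants, and no per-condition translation table is needed when encoding or printing.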
* [AMDGPU] predicate and feature refactoring | Stanislav Mekhanoshin | 2019-04-05 | 18 | -196/+245
| We have done some predicate and feature refactoring lately but did not upstream it. This is to sync.
| Differential revision: https://reviews.llvm.org/D60292
| llvm-svn: 357791
* Change some dyn_cast to more appropriate isa. NFC | Fangrui Song | 2019-04-05 | 3 | -3/+3
| llvm-svn: 357773
* AMDGPU/GlobalISel: Fix non-power-of-2 select | Matt Arsenault | 2019-04-05 | 1 | -0/+1
| llvm-svn: 357762
* [DAGCombiner][x86] scalarize splatted vector FP ops | Sanjay Patel | 2019-04-05 | 1 | -0/+6
| There are a variety of vector patterns that may be profitably reduced to a scalar op when scalar ops are performed using a subset (typically, the first lane) of the vector register file.
| For x86, this is true for float/double ops and element 0 because insert/extract is just a sub-register rename.
| Other targets should likely enable the hook in a similar way.
| Differential Revision: https://reviews.llvm.org/D60150
| llvm-svn: 357760
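The scalarization idea above can be illustrated with a toy model in which a vector is a Python list. This is a sketch only: `splat`, `vector_fadd` and `scalarized_fadd` are hypothetical names, and the real transform operates on SelectionDAG nodes rather than lists.

```python
def splat(value: float, lanes: int = 4) -> list:
    """Build a vector whose lanes all hold the same value."""
    return [value] * lanes


def vector_fadd(a: list, b: list) -> list:
    """Lane-wise floating-point add: one operation per lane."""
    return [x + y for x, y in zip(a, b)]


def scalarized_fadd(a: list, b: list) -> list:
    """Assumes a and b are splats: add element 0 once, then re-splat."""
    return splat(a[0] + b[0], len(a))
```

When both operands are splats, the single scalar add on element 0 produces the same vector as the full lane-wise add, which is why the reduction can be profitable on targets where extracting element 0 is free.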
* [SelectionDAG] Compute known bits of CopyFromReg | Piotr Sobczak | 2019-04-05 | 1 | -3/+3
| Summary: Teach SelectionDAG how to compute known bits of ISD::CopyFromReg if the virtual reg used has one def only.
| This can be particularly useful when calling isBaseWithConstantOffset() with the ISD::CopyFromReg argument, as more optimizations may get enabled in the result.
| Also add a missing truncation on X86, found by testing of this patch.
| Change-Id: Id1c9fceec862d118c54a5b53adf72ada5d6daefa
| Reviewers: bogner, craig.topper, RKSimon
| Reviewed By: RKSimon
| Subscribers: lebedev.ri, nemanjai, jvesely, nhaehnle, javed.absar, jsji, jdoerfert, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D59535
| llvm-svn: 357745
* [X86] Promote i16 SRA instructions to i32 | Craig Topper | 2019-04-05 | 1 | -0/+2
| We already promote SRL and SHL to i32.
| This will sometimes introduce sign extends, which might be harder to deal with than the zero extends we use for promoting SRL. I ran this through some of our internal benchmark lists and didn't see any major regressions.
| I think there might be some DAG combine improvement opportunities in the test changes here.
| Differential Revision: https://reviews.llvm.org/D60278
| llvm-svn: 357743
* [IR] Refactor attribute methods in Function class (NFC) | Evandro Menezes | 2019-04-04 | 31 | -63/+63
| Rename the functions that query the optimization kind attributes.
| Differential revision: https://reviews.llvm.org/D60287
| llvm-svn: 357731
* Revert [X86] When using Win64 ABI, exit with error if SSE is disabled for varargs | James Y Knight | 2019-04-04 | 1 | -3/+0
| It unnecessarily breaks previously-working code which used varargs, but didn't pass any float/double arguments (such as EDK2).
| Also revert the fixup on top of that: Revert [X86] Fix a test from r357317
| This reverts r357317 (git commit d413f41de6baf500e5d20c638375447e18777db2)
| This reverts r357380 (git commit 7af32444b9b17719ebabb6bee6eb52465acc8507)
| llvm-svn: 357718
* [WebAssembly] Add new explicit relocation types for PIC relocations | Sam Clegg | 2019-04-04 | 4 | -26/+63
| See https://github.com/WebAssembly/tool-conventions/pull/106
| Differential Revision: https://reviews.llvm.org/D59907
| llvm-svn: 357710
* [x86] eliminate unnecessary broadcast of horizontal op | Sanjay Patel | 2019-04-04 | 1 | -4/+14
| This is another pattern that comes up if we more aggressively scalarize FP ops.
| llvm-svn: 357703
* [RISCV] Support assembling TLS add and associated modifiers | Lewis Revill | 2019-04-04 | 9 | -11/+202
| This patch adds support in the MC layer for parsing and assembling the 4-operand add instruction needed for TLS addressing. This also involves parsing the %tprel_hi, %tprel_lo and %tprel_add operand modifiers.
| Differential Revision: https://reviews.llvm.org/D55341
| llvm-svn: 357698
* [SystemZ] Bugfix in isFusableLoadOpStorePattern() | Jonas Paulsson | 2019-04-04 | 1 | -15/+16
| This function is responsible for checking the legality of fusing an instance of load -> op -> store into a single operation.
| In the SystemZ backend the check was incomplete and a test case emerged with a cycle in the instruction selection DAG as a result.
| Instead of using the NodeIds to determine node relationships, hasPredecessorHelper() is now used just like in the X86 backend. This handled the failing tests and also gave a few additional transformations on benchmarks.
| The SystemZ isFusableLoadOpStorePattern() is now a very near copy of the X86 function, and it seems this could be made a utility function in common code instead.
| Review: Ulrich Weigand https://reviews.llvm.org/D60255
| llvm-svn: 357688
* [ARM GlobalISel] Support DBG_VALUE | Diana Picus | 2019-04-04 | 1 | -0/+7
| Make sure we can map and select DBG_VALUE.
| llvm-svn: 357681
* [AArch64][AsmParser] Fix .arch_extension directive parsing | Sander de Smalen | 2019-04-04 | 1 | -8/+2
| This patch fixes .arch_extension directive parsing to handle a wider range of architecture extension options. The existing parser was parsing extensions as an identifier, which breaks for extensions containing a "-", such as the "tlb-rmi" extension.
| The extension is now parsed as a string. This is consistent with the extension parsing in the .arch and .cpu directive parsing.
| Patch by Cullen Rhodes (c-rhodes)
| Reviewed By: SjoerdMeijer
| Differential Revision: https://reviews.llvm.org/D60118
| llvm-svn: 357677
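The difference between parsing the operand as an identifier and as a string can be sketched as follows. This is an illustration, not the AsmParser code: the identifier regular expression is an assumed approximation of an assembler identifier token, and `KNOWN_EXTENSIONS` is an example set.

```python
import re

# Assumed approximation of an assembler identifier token: no '-' allowed.
IDENTIFIER = re.compile(r"[A-Za-z_.][A-Za-z0-9_.$]*")

# Example extension names; only "tlb-rmi" is taken from the commit message.
KNOWN_EXTENSIONS = {"crc", "tlb-rmi"}


def parse_as_identifier(operand: str) -> str:
    """Old behaviour in this sketch: stop at the first non-identifier char."""
    match = IDENTIFIER.match(operand.strip())
    return match.group(0) if match else ""


def parse_as_string(operand: str) -> str:
    """New behaviour in this sketch: take the rest of the statement."""
    return operand.strip()


def is_known(name: str) -> bool:
    return name.lower() in KNOWN_EXTENSIONS
```

In this model the identifier parse of `tlb-rmi` yields only `tlb`, which is not a known extension, while the string parse yields the full name.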
* [X86] Use INSERT_SUBREG rather than SUBREG_TO_REG when creating LEA64_32 during isel. | Craig Topper | 2019-04-04 | 1 | -13/+8
| SUBREG_TO_REG is supposed to be used to assert that we know the upper bits are zero. But that isn't the case here. We've done no analysis of the inputs.
| llvm-svn: 357673
* [WebAssembly] EmscriptenEHSjLj: Don't abort if __THREW__ is defined | Sam Clegg | 2019-04-04 | 1 | -4/+5
| This allows __THREW__ to be defined in the current module, although it is still required to be a GlobalVariable.
| In emscripten we want to be able to compile the source code that defines this symbol.
| Previously we were avoiding this by not running this pass when building that compiler-rt library, but I have changed it to build using the normal compiler path: https://github.com/emscripten-core/emscripten/pull/8391
| Differential Revision: https://reviews.llvm.org/D60232
| llvm-svn: 357665
* [X86] Remove CustomInserters for RDPKRU/WRPKRU. Use some custom lowering and new ISD opcodes instead. | Craig Topper | 2019-04-04 | 4 | -52/+37
| These inserters inserted some instructions to zero some registers and copied from virtual registers to physical registers.
| This change instead inserts the zeros directly into the DAG at lowering time using new ISD opcodes that take the extra zeroes as inputs. The zeros will then go through isel on their own to select the MOV32r0 pseudo. Then we just need to mention the physical registers directly in the isel patterns and the isel table and InstrEmitter will take care of inserting the necessary copies to/from physical registers.
| llvm-svn: 357659
* [X86] Remove CustomInserter pseudos for MONITOR/MONITORX/CLZERO. Use custom instruction selection instead. | Craig Topper | 2019-04-03 | 5 | -84/+78
| This custom inserter existed so we could do a weird thing where we pretended that the instructions support a full address mode instead of taking a pointer in EAX/RAX. I think this was largely so we could be pointer size agnostic in the isel pattern.
| To make this work we would then put the address into an LEA into EAX/RAX in front of the instruction after isel. But the LEA is overkill when we just have a base pointer. So we end up using the LEA as a slower MOV instruction.
| With this change we now just do custom selection during isel instead and just assign the incoming address of the intrinsic into EAX/RAX based on its size. After the intrinsic is selected, we can let isel take care of selecting an LEA or other operation to do any address computation needed in this basic block.
| I've also split the instruction into a 32-bit mode version and a 64-bit mode version so the implicit use is properly sized based on the pointer. Without this we get comments in the assembly output about killing eax and defing rax or vice versa depending on whether we define the instruction to use EAX/RAX.
| llvm-svn: 357652
* [x86] fold shuffles of h-ops that have an undef operand | Sanjay Patel | 2019-04-03 | 1 | -2/+2
| If an operand is undef, we can assume it's the same as the other operand.
| llvm-svn: 357644
* [x86] eliminate movddup of horizontal op | Sanjay Patel | 2019-04-03 | 1 | -2/+11
| This pattern would show up as a regression if we more aggressively convert vector FP ops to scalar ops.
| There's still a missed optimization for the v4f64 legal case (AVX) because we create that h-op with an undef operand. We should probably just duplicate the operands for that pattern to avoid trouble.
| llvm-svn: 357642
* [IR] Create new method in `Function` class (NFC) | Evandro Menezes | 2019-04-03 | 2 | -2/+2
| Create method `optForNone()` testing for the function level equivalent of `-O0` and refactor appropriately.
| Differential revision: https://reviews.llvm.org/D59852
| llvm-svn: 357638
* AMDGPU: Split block for si_end_cf | Matt Arsenault | 2019-04-03 | 5 | -17/+128
| Relying on no spill or other code being inserted before this was precarious. It relied on code diligently checking isBasicBlockPrologue which is likely to be forgotten.
| Ideally this could be done earlier, but this doesn't work because of phis. Any other instruction can't be placed before them, so we have to accept the position being incorrect during SSA.
| This avoids regressions in the fast register allocator rewrite from inverting the direction.
| llvm-svn: 357634
* [X86] Extend boolean arguments to inline-asm according to getBooleanType | Krzysztof Parzyszek | 2019-04-03 | 1 | -2/+7
| Differential Revision: https://reviews.llvm.org/D60208
| llvm-svn: 357615
* [X86][AVX] combineHorizontalPredicateResult - split any/allof v16i16/v32i8 reduction on AVX1 | Simon Pilgrim | 2019-04-03 | 1 | -1/+8
| Perform the 2 x 128-bit lo/hi OR/AND on the vectors before calling PMOVMSKB on the 128-bit result.
| llvm-svn: 357611
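The split reduction described above can be modelled with byte lists. This is a sketch under assumptions: a v32i8 comparison result is represented as 32 bytes that are each 0x00 or 0xFF, `movmsk` models PMOVMSKB by collecting the top bit of each byte, and the function names are hypothetical.

```python
def movmsk(lanes: list) -> int:
    """Model of PMOVMSKB: gather the top bit of every byte lane."""
    mask = 0
    for index, lane in enumerate(lanes):
        mask |= ((lane >> 7) & 1) << index
    return mask


def any_of_split(lanes32: list) -> bool:
    """any-of: OR the low and high halves, then one 16-lane movmsk."""
    lo, hi = lanes32[:16], lanes32[16:]
    merged = [a | b for a, b in zip(lo, hi)]
    return movmsk(merged) != 0


def all_of_split(lanes32: list) -> bool:
    """all-of: AND the low and high halves, then one 16-lane movmsk."""
    lo, hi = lanes32[:16], lanes32[16:]
    merged = [a & b for a, b in zip(lo, hi)]
    return movmsk(merged) == 0xFFFF


NONE_SET = [0x00] * 32
ONE_SET = [0x00] * 20 + [0xFF] + [0x00] * 11
ALL_SET = [0xFF] * 32
```

Merging the halves first means a single 128-bit mask extraction is enough, instead of one per half followed by a scalar combine.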
* [X86][AVX] combineHorizontalPredicateResult - support v16i16/v32i8 reduction on AVX1 | Simon Pilgrim | 2019-04-03 | 1 | -6/+3
| Use getPMOVMSKB helper which splits v32i8 MOVMSK calls on pre-AVX2 targets.
| llvm-svn: 357608
* [AArch64][GlobalISel] Legalize G_FEXP2 | Jessica Paquette | 2019-04-03 | 1 | -1/+1
| Same as G_EXP. Add a test, and update legalizer-info-validation.mir and f16-instructions.ll.
| Differential Revision: https://reviews.llvm.org/D60165
| llvm-svn: 357605
* Test commit: Remove double variable assignment | Lewis Revill | 2019-04-03 | 1 | -1/+1
| llvm-svn: 357601
* [SystemZ] Improve codegen for certain SADDO-immediate cases | Ulrich Weigand | 2019-04-03 | 2 | -0/+28
| When performing an add-with-overflow with an immediate in the range -2G ... -4G, code currently loads the immediate into a register, which generally takes two instructions.
| In this particular case, it is preferable to load the negated immediate into a register instead, which always only requires one instruction, and then perform a subtract.
| llvm-svn: 357597
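The rewrite above depends on `x + imm` and `x - (-imm)` producing the same result and the same overflow flag. The sketch below models 64-bit signed add and subtract with overflow to check that. The function names are hypothetical, and the model assumes `-imm` is itself representable, which holds for the immediate range mentioned in the commit.

```python
INT64_MIN = -(1 << 63)
INT64_MAX = (1 << 63) - 1


def wrap64(value: int) -> int:
    """Wrap an exact integer result into the signed 64-bit range."""
    return ((value - INT64_MIN) % (1 << 64)) + INT64_MIN


def saddo(a: int, b: int) -> tuple:
    """Signed add with overflow: (wrapped result, overflow flag)."""
    exact = a + b
    return wrap64(exact), not (INT64_MIN <= exact <= INT64_MAX)


def ssubo(a: int, b: int) -> tuple:
    """Signed subtract with overflow: (wrapped result, overflow flag)."""
    exact = a - b
    return wrap64(exact), not (INT64_MIN <= exact <= INT64_MAX)


# Example immediate inside the range named in the commit message.
IMMEDIATE = -3 * (1 << 30)
SAMPLES = [0, -1, 12345, INT64_MIN, INT64_MIN + 5, INT64_MAX]

add_matches_sub = all(
    saddo(x, IMMEDIATE) == ssubo(x, -IMMEDIATE) for x in SAMPLES
)
```

Here `-IMMEDIATE` is a positive value that fits in 32 bits, which is consistent with the commit's point that the negated immediate is cheaper to materialise than the original.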
* [MIPS GlobalISel] Select floating point arithmetic operations | Petar Avramovic | 2019-04-03 | 2 | -5/+20
| Select 32 and 64 bit floating point add, sub, mul and div for MIPS32.
| Differential Revision: https://reviews.llvm.org/D60191
| llvm-svn: 357584
* [AArch64] Update v8.5a MTE LDG/STG instructions | Javed Absar | 2019-04-03 | 1 | -12/+12
| The latest MTE specification adds register Xt to the STG instruction family:
|   STG [Xn, #offset] -> STG Xt, [Xn, #offset]
| The tag written to memory is taken from Xt rather than Xn.
| The LDG instruction was also changed to read return address from Xt:
|   LDG Xt, [Xn, #offset]
| This patch includes those changes and tests.
| Specification is at: https://developer.arm.com/docs/ddi0596/c
| Differential Revision: https://reviews.llvm.org/D60188
| llvm-svn: 357583
* [mips] Remove unused FGRH32 register class. NFC | Simon Atanasyan | 2019-04-03 | 2 | -32/+0
| If we need this class in the future we will easily restore it.
| Differential Revision: http://reviews.llvm.org/D60132
| llvm-svn: 357570
* [X86] Make the post machine scheduler macrofusion-aware. | Clement Courbet | 2019-04-03 | 1 | -0/+7
| Summary: Given that X86 does not use this currently, this is an NFC. I'll experiment with enabling and will report numbers.
| Reviewers: andreadb, lebedev.ri
| Subscribers: hiraditya, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D60185
| llvm-svn: 357568
* AMDGPU: Assume ECC is enabled by default if supported | Matt Arsenault | 2019-04-03 | 4 | -6/+32
| The test should really be checking for the property directly in the code object headers, but there are problems with this. I don't see this directly represented in the text form, and for the binary emission this is depending on a function level subtarget feature to emit a global flag.
| llvm-svn: 357558
* [WebAssembly] Remove unneeded target operand flags | Sam Clegg | 2019-04-03 | 7 | -50/+32
| This change is in preparation for the addition of new target operand flags for new relocation types. Having a symbol type as part of the flag set makes it harder to use and AFAICT these are serving no purpose.
| Differential Revision: https://reviews.llvm.org/D60014
| llvm-svn: 357548
* AMDGPU: Remove unnecessary subtarget get | Matt Arsenault | 2019-04-03 | 1 | -1/+0
| llvm-svn: 357542
* AMDGPU: Fix names for generation features | Matt Arsenault | 2019-04-03 | 4 | -10/+17
| We should overall stop using these, but the uppercase name didn't work. Any feature string is converted to lowercase, so these could never be found in the table.
| llvm-svn: 357541
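The lookup failure described above is easy to reproduce in a toy model. The sketch assumes a feature table keyed by name and a lookup that lowercases the requested string first. The table contents and the `find_feature` helper are hypothetical, not the AMDGPU subtarget tables.

```python
from typing import Optional


def find_feature(table: dict, requested: str) -> Optional[int]:
    """Look up a feature after normalising the request to lowercase."""
    return table.get(requested.lower())


# Hypothetical tables: the same feature registered under two spellings.
UPPERCASE_TABLE = {"EXAMPLE_GENERATION": 1}
LOWERCASE_TABLE = {"example-generation": 1}

# The uppercase key can never match a lowercased request.
missed = find_feature(UPPERCASE_TABLE, "EXAMPLE_GENERATION")

# A lowercase key matches regardless of how the request was capitalised.
found = find_feature(LOWERCASE_TABLE, "Example-Generation")
```

Because normalisation happens on the request side only, every key in the table has to be stored in its already-normalised form.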
* [X86] Mark the default case of the X86InstrInfo::convertToThreeAddress switch as unreachable. | Craig Topper | 2019-04-02 | 1 | -1/+1
| This function should only be called with instructions that are really convertible. And all convertible instructions need to be handled by the switch. So nothing should use the default.
| llvm-svn: 357529