path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* Test commit access | Oliver Stannard | 2019-04-11 | 1 | -0/+1
    llvm-svn: 358162
* [RISCV] Put data smaller than eight bytes to small data section | Shiva Chen | 2019-04-11 | 2 | -0/+120
    Because gp = sdata_start_address + 0x800, gp plus a signed twelve-bit offset can cover most of the small data section. Linker relaxation can then turn multi-instruction data accesses into an instruction that uses gp as the base with a signed twelve-bit offset.
    Differential Revision: https://reviews.llvm.org/D57493
    llvm-svn: 358150
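The reach argument in this commit message can be checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the only facts taken from the message are the `gp = sdata_start_address + 0x800` bias and the signed twelve-bit offset.

```python
# Hypothetical helper names; only the 0x800 bias and the signed 12-bit
# offset come from the commit message.
GP_BIAS = 0x800


def simm12_range():
    """Inclusive range of a signed 12-bit immediate."""
    return -(1 << 11), (1 << 11) - 1


def reachable_window(sdata_start):
    """Lowest and highest address reachable as gp + simm12."""
    gp = sdata_start + GP_BIAS
    lo, hi = simm12_range()
    return gp + lo, gp + hi


def gp_reachable(sdata_start, addr):
    """True if addr can be addressed as gp plus a signed 12-bit offset."""
    lo, hi = reachable_window(sdata_start)
    return lo <= addr <= hi
```

With the bias at 0x800, the window starts exactly at the start of the section and spans 4096 bytes, which is why the offset covers "most of" a small data section rather than an arbitrary amount of data.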
* [AArch64][GlobalISel] Make <2 x p0> = G_BUILD_VECTOR legal. | Amara Emerson | 2019-04-10 | 1 | -0/+1
    The existing isel support already works for p0 once the legalizer accepts it.
    llvm-svn: 358144
* [AArch64][GlobalISel] Add legalizer support for <8 x s16> and <16 x s8> G_ADD. | Amara Emerson | 2019-04-10 | 1 | -1/+1
    llvm-svn: 358143
* [AArch64][GlobalISel] Scalarize vector SDIV. | Amara Emerson | 2019-04-10 | 1 | -1/+2
    llvm-svn: 358142
* [X86] Teach foldMaskedShiftToScaledMask to look through an any_extend from i32 to i64 between the and & shl | Craig Topper | 2019-04-10 | 1 | -22/+44
    foldMaskedShiftToScaledMask tries to reorder and & shl to enable the shl to fold into an LEA. But if there is an any_extend between them it doesn't work.
    This patch modifies the code to look through any_extend from i32 to i64 when the and mask only uses bits that weren't from the extended part.
    This will prevent a regression from D60358 caused by 64-bit SHL being narrowed to 32-bits when their upper bits aren't demanded.
    Differential Revision: https://reviews.llvm.org/D60532
    llvm-svn: 358139
* [X86] Make _Int instructions the preferred instruction for the assembly parser and disassembly parser to remove inconsistencies between VEX and EVEX. | Craig Topper | 2019-04-10 | 6 | -147/+152
    Many of our instructions have both a _Int form used by intrinsics and a form used by other IR constructs. In the EVEX space the _Int versions usually cover all the capabilities, including broadcasting and rounding, while the other version only covers simple register/register or register/load forms. For this reason in EVEX, the non intrinsic form is usually marked isCodeGenOnly=1.
    In the VEX encoding space we were less consistent, but usually the _Int version was the isCodeGenOnly version.
    This commit makes the VEX instructions match the EVEX instructions. This was done by manually studying the AsmMatcher table so it's possible I missed some cases, but we should be closer now.
    I'm thinking about using the isCodeGenOnly bit to simplify the EVEX2VEX tablegen code that disambiguates the _Int and non _Int versions. Currently it checks register class sizes and the Record the memory operands come from. I have some other changes I was looking into for D59266 that may break the memory check.
    I had to make a few scheduler hacks to keep the _Int versions from being treated differently than the non _Int version.
    Differential Revision: https://reviews.llvm.org/D60441
    llvm-svn: 358138
* [X86] Replace some if statements in isel address matching that should never be true with asserts. And move them earlier before we looked through operands that don't change size. NFC | Craig Topper | 2019-04-10 | 1 | -8/+10
    These ifs were ensuring we don't have to handle types larger than 64 bits, probably because we use getZExtValue in several places below them. None of the callers of this code pass types larger than 64 bits, so we can just assert instead of branching in release code.
    I've also moved them earlier since we're just looking through operations that don't affect bit width.
    This is prep work for some refactoring I plan to do to the (and (shl)) handling code.
    llvm-svn: 358123
* [X86AsmPrinter] refactor to limit use of Modifier. NFC | Nick Desaulniers | 2019-04-10 | 1 | -47/+57
    Summary: The Modifier memory operand is used in 2 cases of memory references (H & P ExtraCodes). Rather than pass around the likely nullptr Modifier, refactor the handling of the Modifier out from printOperand().
    The refactorings in this patch:
    - Don't forward declare printOperand, move its definition up.
    - The diff makes it look like there's a change to printPCRelImm (narrator: there's not).
    - Create printModifiedOperand()
    - Move logic for Modifier to there from printOperand
    - Use printModifiedOperand in 3 call sites that actually create Modifiers.
    - Remove now unused Modifier parameter from printOperand
    - Remove default parameter from printLeaMemReference as it only has 1 call site that explicitly passes a parameter.
    - Remove default parameter from printMemReference, make the lone call site explicitly pass nullptr.
    - Drop Modifier parameter from printIntelMemReference, as Intel style memory references don't support the Modifiers in question.
    This will allow future changes to printOperand() to make it a pure virtual method on the base AsmPrinter class, allowing for more generic handling of some architecture generic constraints. X86AsmPrinter was the only derived class of AsmPrinter to have additional parameters on its printOperand function.
    Reviewers: craig.topper, echristo
    Reviewed By: echristo
    Subscribers: hiraditya, llvm-commits, srhines
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D60526
    llvm-svn: 358122
* [X86] X86ScheduleBdVer2: use !listsplat operator to cleanup loadres calculation | Roman Lebedev | 2019-04-10 | 1 | -4/+7
    The problem is that one can't concatenate an empty list (implied all-ones) with a non-empty list here. The result will be the non-empty list, and it won't match the length of the ExePorts list.
    The problems begin when LoadRes != 1 here, which is the case in PdWriteResYMMPair, and more importantly I think it will be the case for PdWriteResExPair.
    llvm-svn: 358118
* Revert rL357745: [SelectionDAG] Compute known bits of CopyFromReg | David Green | 2019-04-10 | 1 | -3/+3
    Certain optimisations from ConstantHoisting and CGP rely on Selection DAG not seeing through to the constant in other blocks. Revert this patch while we come up with a better way to handle that.
    I will try to follow this up with some better tests.
    llvm-svn: 358113
* [AArch64] Teach getTestBitOperand to look through ANY_EXTENDS | Craig Topper | 2019-04-10 | 1 | -0/+6
    This patch teaches getTestBitOperand to look through ANY_EXTENDs when the extended bits aren't used. The test case changed here is based on what D60358 did to test16 in tbz-tbnz.ll. So this patch will avoid that regression.
    Differential Revision: https://reviews.llvm.org/D60482
    llvm-svn: 358108
* [AsmPrinter] refactor to remove AsmVariant. NFC | Nick Desaulniers | 2019-04-10 | 25 | -154/+87
    Summary: The InlineAsm::AsmDialect is only required for X86; no other architecture makes use of it, and as such it gets passed around between arch-specific and general code while being unused for all architectures but X86.
    Since the AsmDialect is queried from a MachineInstr, which we also pass around, remove the additional AsmDialect parameter and query for it deep in the X86AsmPrinter only when needed/as late as possible.
    This refactor should help later planned refactors to AsmPrinter, as this difference in the X86AsmPrinter makes it harder to make AsmPrinter more generic.
    Reviewers: craig.topper
    Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, llvm-commits, peter.smith, srhines
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D60488
    llvm-svn: 358101
* [X86][AVX] getTargetConstantBitsFromNode - extract bits from X86ISD::SUBV_BROADCAST | Simon Pilgrim | 2019-04-10 | 1 | -0/+13
    llvm-svn: 358096
* [AArch64] Add lowering pattern for scalar fp16 facge and facgt | Diogo N. Sampaio | 2019-04-10 | 1 | -0/+10
    Summary: The fp16 scalar versions of facge and facgt require custom pattern matching, as the result type is not the same width as the operands.
    Reviewers: olista01, javed.absar, pbarrio
    Reviewed By: javed.absar
    Subscribers: kristof.beyls, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D60212
    llvm-svn: 358083
* [ARM] [FIX] Add missing f16 vector operations lowering | Diogo N. Sampaio | 2019-04-10 | 2 | -1/+6
    Summary: Add the missing <8xhalf> shufflevector pattern when using the concat_vector dag node. Also allow <8xhalf> and <4xhalf> vldup1 operations.
    These instructions are required for v8.2a fp16 lowering of vmul_n_f16, vmulq_n_f16 and vmulq_lane_f16 intrinsics.
    Reviewers: olista01, pbarrio, LukeGeeson, efriedma
    Reviewed By: efriedma
    Subscribers: efriedma, javed.absar, kristof.beyls, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D60319
    llvm-svn: 358081
* [NFC] Fix unused variable warning. | Clement Courbet | 2019-04-10 | 1 | -3/+0
    llvm-svn: 358080
* Fixup r358063 | Diana Picus | 2019-04-10 | 1 | -2/+2
    Fix warning/error about mixed signedness.
    llvm-svn: 358065
* [ARM GlobalISel] Add some asserts. NFC. | Diana Picus | 2019-04-10 | 1 | -0/+2
    Make sure some arm opcodes don't unintentionally sneak into thumb mode.
    llvm-svn: 358064
* [ARM GlobalISel] Select G_FCONSTANT for VFP3 | Diana Picus | 2019-04-10 | 2 | -10/+59
    Make it possible to TableGen code for FCONSTS and FCONSTD.
    We need to make two changes to the TableGen descriptions of vfp_f32imm and vfp_f64imm respectively:
    * add GISelPredicateCode to check that the immediate fits in 8 bits;
    * extract the SDNodeXForms into separate definitions and create a GISDNodeXFormEquiv and a custom renderer function for each of them.
    There's a lot of boilerplate to get the actual value of the immediate, but it basically just boils down to calling ARM_AM::getFP32Imm or ARM_AM::getFP64Imm.
    llvm-svn: 358063
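To make "the immediate fits in 8 bits" concrete, here is a minimal sketch of the VFP 8-bit floating-point immediate format as described in the ARM architecture (one sign bit, a three-bit exponent covering 2^-3 to 2^4, and a four-bit fraction). It is an independent illustration, not a copy of ARM_AM::getFP32Imm, and the function name is hypothetical.

```python
import struct


def vfp_encode_f32(value):
    """Return the 8-bit VFP immediate for value, or None if it does not fit.

    Assumed layout: bit 7 = sign, bits 6..4 = biased exponent with the top
    bit inverted, bits 3..0 = top four fraction bits.
    """
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    sign = bits >> 31
    exp = ((bits >> 23) & 0xFF) - 127  # unbiased exponent
    frac = bits & 0x7FFFFF             # 23-bit fraction

    # Only the top 4 fraction bits may be set.
    if frac & 0x7FFFF:
        return None
    # Only exponents -3..4 are representable.
    if not -3 <= exp <= 4:
        return None

    frac4 = frac >> 19
    exp3 = ((exp + 3) & 0x7) ^ 0x4
    return (sign << 7) | (exp3 << 4) | frac4
```

Values such as 1.0, 0.5 and 31.0 encode, while 0.0 and 0.1 do not, which is why the commit that follows this one in the series falls back to constant pools.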
* [ARM GlobalISel] Select G_FCONSTANT into pools | Diana Picus | 2019-04-10 | 1 | -0/+21
    Put all floating point constants into constant pools and load their values from there.
    llvm-svn: 358062
* [ARM GlobalISel] Map G_FCONSTANT | Diana Picus | 2019-04-10 | 1 | -0/+8
    llvm-svn: 358061
* [X86] Move the 2 byte VEX optimization for MOV instructions back to the X86AsmParser::processInstruction where it used to be. Block when {vex3} prefix is present. | Craig Topper | 2019-04-10 | 3 | -54/+63
    Years ago I moved this to an InstAlias using VR128H/VR128L. But now that we support {vex3} pseudo prefix, we need to block the optimization when it is set to match gas behavior.
    llvm-svn: 358046
* [Sparc] Fix incorrect MI insertion position for spilling f128. | Jim Lin | 2019-04-10 | 1 | -2/+2
    Summary: The newly built MIs (sethi+add or sethi+xor+add) that construct a large offset should be inserted before the newly created MI that stores the even register into memory. So the insertion position should be *StMI instead of II.
    Before the fix:
        std %f0, [%g1+80]
        sethi 4, %g1        <<<
        add %g1, %sp, %g1   <<< these two instructions should be put before "std %f0, [%g1+80]".
        sethi 4, %g1
        add %g1, %sp, %g1
        std %f2, [%g1+88]
    After the fix:
        sethi 4, %g1
        add %g1, %sp, %g1
        std %f0, [%g1+80]
        sethi 4, %g1
        add %g1, %sp, %g1
        std %f2, [%g1+88]
    Reviewers: venkatra, jyknight
    Reviewed By: jyknight
    Subscribers: jyknight, fedor.sergeev, jrtc27, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D60397
    llvm-svn: 358042
* [X86] Support the EVEX versions vcvt(t)ss2si and vcvt(t)sd2si with the {evex} pseudo prefix in the assembler. | Craig Topper | 2019-04-10 | 2 | -32/+27
    The EVEX versions are ambiguous with the VEX versions based on operands alone so we had explicitly dropped them from the AsmMatcher table. Unfortunately, when we add them they incorrectly show in the table before their VEX counterparts. This is different from how the prioritization normally works. To fix this we have to explicitly reject the instructions unless the {evex} prefix has been seen.
    llvm-svn: 358041
* [X86] Add VEX_LIG to scalar VEX/EVEX instructions that were missing it. | Craig Topper | 2019-04-09 | 2 | -39/+40
    Scalar VEX/EVEX instructions don't use the L bit and don't look at it for decoding either. So we should ignore it in our disassembler.
    The missing instructions here were found by grepping the raw tablegen class definitions in the tablegen debug output.
    llvm-svn: 358040
* [X86] Fix a dangling StringRef issue introduced in r358029. | Craig Topper | 2019-04-09 | 1 | -3/+4
    I was attempting to convert mnemonics to lower case after processing a pseudo prefix. But the ParseOperands just holds a StringRef for tokens so there is nowhere to allocate the memory.
    Add FIXMEs for the lower case issue which also exists in the prefix parsing code.
    llvm-svn: 358036
* [AArch64][GlobalISel] Add isel support for vector G_ICMP and G_ASHR & G_SHL | Amara Emerson | 2019-04-09 | 1 | -2/+259
    The selection for G_ICMP is unfortunately not currently importable from SDAG due to the use of custom SDNodes. To support this, this selection method has an opcode table which has been generated by a script, indexed by various instruction properties. Ideally in future we will have GISel-native selection patterns that we can write in tablegen to improve on this.
    For selection of some types we also need support for G_ASHR and G_SHL which are generated as a result of legalization. This patch also adds support for them, generating the same code as SelectionDAG currently does.
    Differential Revision: https://reviews.llvm.org/D60436
    llvm-svn: 358035
* [AArch64][GlobalISel] Legalize vector G_ICMP. | Amara Emerson | 2019-04-09 | 1 | -2/+27
    Selection support will be coming in a later patch.
    Differential Revision: https://reviews.llvm.org/D60435
    llvm-svn: 358034
* [AArch64][GlobalISel] Add legalization for some vector G_SHL and G_ASHR. | Amara Emerson | 2019-04-09 | 1 | -4/+6
    This is needed for some future support for vector ICMP.
    Differential Revision: https://reviews.llvm.org/D60433
    llvm-svn: 358033
* [GlobalISel][AArch64] Allow CallLowering to handle types which are normally required to be passed as different register types. | Amara Emerson | 2019-04-09 | 3 | -7/+60
    E.g. <2 x i16> may need to be passed as a larger <2 x i32> type, so formal arg lowering needs to be able to truncate it back. Likewise, when dealing with returns of these types, they need to be widened back in the appropriate way.
    Differential Revision: https://reviews.llvm.org/D60425
    llvm-svn: 358032
* [X86] Add support for {vex2}, {vex3}, and {evex} to the assembler to match gas. Use {evex} to improve one of our 32-bit AVX512 tests. | Craig Topper | 2019-04-09 | 3 | -9/+111
    These can be used to force the encoding used for instructions.
    {vex2} will fail if the instruction is not VEX encoded, but otherwise won't do anything since we prefer vex2 when possible. Might need to skip use of the _REV MOV instructions for this too, but I haven't done that yet.
    {vex3} will force the instruction to use the 3 byte VEX encoding or fail if there is no VEX form.
    {evex} will force the instruction to use the EVEX version or fail if there is no EVEX version.
    Differential Revision: https://reviews.llvm.org/D59266
    llvm-svn: 358029
* [PowerPC] fix trivial typos in comment, NFC | Hiroshi Inoue | 2019-04-09 | 3 | -6/+6
    llvm-svn: 357981
* [X86] Have EVEX2VEX tablegenerator use HasVEX_L and HasEVEX_L2 fields instead of the composite EVEX_LL field. Remove the EVEX_LL field. NFCI | Craig Topper | 2019-04-09 | 1 | -4/+1
    The composite existed to simplify some other tablegen code and not really in an important way. Remove the combined field and just calculate the vector size using two ifs.
    llvm-svn: 357972
* [X86] Use VEX_WIG for VPINSRB/W and VPEXTRB/W to match what is done for EVEX. | Craig Topper | 2019-04-09 | 1 | -5/+5
    The instructions document this as W0 for the VEX encoding. But there's a footnote mentioning that VEX.W is ignored in 64-bit mode. And the main VEX encoding description says the VEX.W bit is ignored for instructions that are equivalent to a legacy SSE instruction that uses REX.W to select a GPR, which would apply here.
    By making this match EVEX we can remove a special case of allowing EVEX2VEX to turn an EVEX.WIG instruction into VEX.W0.
    llvm-svn: 357971
* [X86] Split the VEX_WPrefix in X86Inst tablegen class into 3 separate fields with clear meanings. | Craig Topper | 2019-04-09 | 1 | -8/+8
    llvm-svn: 357970
* AMDGPU/GlobalISel: Implement call lowering for shaders returning values | Tom Stellard | 2019-04-09 | 1 | -3/+73
    Reviewers: arsenm, nhaehnle
    Subscribers: kzhuravl, jvesely, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, volkan, llvm-commits
    Differential Revision: https://reviews.llvm.org/D57166
    llvm-svn: 357964
* [PowerPC] initialize SchedModel according to platform. | Chen Zheng | 2019-04-09 | 1 | -0/+1
    Differential Revision: https://reviews.llvm.org/D60177
    llvm-svn: 357962
* [X86] Derive ssmem and sdmem from X86MemOperand. NFCI | Craig Topper | 2019-04-09 | 1 | -12/+2
    This changes the operand type from v4f32/v2f64 to iPTR which seems more correct. But that doesn't seem to do anything other than change the comments in X86GenDAGISel.inc. Probably because we use a ComplexPattern to do the matching so there's no autogenerated code to change.
    llvm-svn: 357959
* [X86] Fix a couple lowering functions that called ReplaceAllUsesOfValueWith for the newly created code and then return SDValue(). Use MERGE_VALUES instead. | Craig Topper | 2019-04-08 | 1 | -6/+5
    Returning SDValue() makes the caller think custom lowering was unsuccessful and then it will fall back to trying to expand the original node. This expanded code will end up with no users and end up being pruned later. But it was unnecessary work to create it.
    Instead return a MERGE_VALUES with all the results so the caller knows something changed. The caller can handle the replacements.
    For one of the cases I had to use UNDEF as a dummy value for a result we know is unused. This should get pruned later.
    llvm-svn: 357935
* [x86] make 8-bit shl undesirable | Sanjay Patel | 2019-04-08 | 1 | -3/+7
    I was looking at a potential DAGCombiner fix for 1 of the regressions in D60278, and it caused severe regression test pain because x86 TLI lies about the desirability of 8-bit shift ops.
    We've hinted at making all 8-bit ops undesirable for the reason in the code comment:
        // TODO: Almost no 8-bit ops are desirable because they have no actual
        // size/speed advantages vs. 32-bit ops, but they do have a major
        // potential disadvantage by causing partial register stalls.
    ...but that leads to massive diffs and exposes all kinds of optimization holes itself.
    Differential Revision: https://reviews.llvm.org/D60286
    llvm-svn: 357912
* [X86] Make LowerOperationWrapper more robust. Remove now unnecessary ReplaceAllUsesWith from LowerMSCATTER. | Craig Topper | 2019-04-08 | 1 | -6/+10
    Previously LowerOperationWrapper took the number of results from the original node and counted that many results from the new node. This was intended to drop chain operands from FP_TO_SINT lowering that uses X87 with memory operations to stack temporaries. The final load had an extra chain output that needs to be ignored.
    Unfortunately, it didn't work with scatter which has 2 result operands, the mask output which is discarded and a chain output. The chain output is the one that is needed but it comes second and it would be dropped by the previous logic here. To work around this we were doing a ReplaceAllUses in the lowering code so that the generic legalization code wouldn't see any uses to replace since it had been given the wrong result/type.
    After this change we take the LowerOperation result directly if the original node has one result. This allows us to directly return the chain from scatter or the load data from the FP_TO_SINT case. When the original node has multiple results we'll ensure the returned node has the same number and copy them over. For cases where the original node has multiple results and the new code for some reason has even more results, MERGE_VALUES can be used to pass only the needed results.
    llvm-svn: 357887
* [X86] Use (SUBREG_TO_REG (MOV32rm)) for extloadi64i8/extloadi64i16 when the load is 4 byte aligned or better and not volatile. | Craig Topper | 2019-04-07 | 2 | -3/+17
    Summary: Previously we would use MOVZXrm8/MOVZXrm16, but those are longer encodings.
    This is similar to what we do in the loadi32 predicate.
    Reviewers: RKSimon, spatel
    Reviewed By: RKSimon
    Subscribers: hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D60341
    llvm-svn: 357875
* Reapply [ValueTracking] Support min/max selects in computeConstantRange() | Nikita Popov | 2019-04-07 | 1 | -2/+9
    Add support for min/max flavor selects in computeConstantRange(), which allows us to fold comparisons of a min/max against a constant in InstSimplify. This fixes an infinite InstCombine loop, with the test case taken from D59378.
    Relative to the previous iteration, this contains some adjustments for AMDGPU med3 tests: The AMDGPU target runs InstSimplify prior to codegen, which ends up constant folding some existing med3 tests after this change. To preserve these tests a hidden -amdgpu-scalar-ir-passes option is added, which allows disabling scalar IR passes (that use InstSimplify) for testing purposes.
    Differential Revision: https://reviews.llvm.org/D59506
    llvm-svn: 357870
* [CostModel][X86] Masked load legalization requires a binary-shuffle not a select (PR39812) | Simon Pilgrim | 2019-04-07 | 1 | -2/+2
    Expansion/truncation is better described by SK_PermuteTwoSrc than SK_Select
    llvm-svn: 357864
* [X86][SSE] SimplifyDemandedBitsForTargetNode - Add initial PACKSS support | Simon Pilgrim | 2019-04-07 | 1 | -0/+19
    In the case where we only want the sign bit (e.g. when using PACKSS truncation of comparison results for MOVMSK) then we can just demand the sign bit of the source operands.
    This makes use of the fact that PACKSS saturates out of range values to the min/max int values - so the sign bit is always preserved.
    Differential Revision: https://reviews.llvm.org/D60333
    llvm-svn: 357859
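The sign-preservation claim is easy to check exhaustively for one lane. The sketch below models a single lane of a signed-saturating pack from 16 to 8 bits; the function names are hypothetical and the model ignores lane ordering entirely.

```python
def packss_lane(x, src_bits=16):
    """Signed-saturate one source lane to half its width (illustrative model)."""
    dst_bits = src_bits // 2
    lo = -(1 << (dst_bits - 1))
    hi = (1 << (dst_bits - 1)) - 1
    return max(lo, min(hi, x))


def is_negative(x):
    """Stand-in for reading the sign bit of a two's complement lane."""
    return x < 0


def sign_preserved_for_all_i16():
    """Exhaustively compare input and output sign over every 16-bit value."""
    return all(
        is_negative(packss_lane(x)) == is_negative(x)
        for x in range(-(1 << 15), 1 << 15)
    )
```

Because clamping never moves a value across zero, a consumer that only reads the sign bit of the packed result only needs the sign bit of each source lane.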
* [X86] When converting (x << C1) AND C2 to (x AND (C2>>C1)) << C1 during isel, try using andl over andq by favoring 32-bit unsigned immediates. | Craig Topper | 2019-04-06 | 1 | -6/+13
    llvm-svn: 357848
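The rewrite named in the subject line is a plain bit identity on 64-bit values, and the shifted mask can fit in 32 unsigned bits even when the original does not. A minimal sketch follows; the helper names are hypothetical and this is not the isel code itself.

```python
MASK64 = (1 << 64) - 1


def shl64(x, c):
    """64-bit left shift with wraparound."""
    return (x << c) & MASK64


def original_form(x, c1, c2):
    """(x << C1) AND C2"""
    return shl64(x, c1) & c2


def reordered_form(x, c1, c2):
    """(x AND (C2 >> C1)) << C1, using a logical right shift of the mask."""
    return shl64(x & (c2 >> c1), c1)


def fits_u32(v):
    """True if v is representable as a 32-bit unsigned immediate."""
    return 0 <= v <= 0xFFFFFFFF
```

For example, with C1 = 8 and C2 = 0xFF00000000 the original mask needs more than 32 bits, while the shifted mask 0xFF000000 does not.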
* [X86] combineBitcastvxi1 - provide dst VT and src SDValue directly. NFCI. | Simon Pilgrim | 2019-04-06 | 1 | -19/+17
    Prep work to make it easier to reuse the BITCAST->MOVMSK combine in other cases.
    llvm-svn: 357847
* [X86] Use a signed mask in foldMaskedShiftToScaledMask to enable a shorter immediate encoding. | Craig Topper | 2019-04-06 | 1 | -2/+6
    This function reorders AND and SHL to enable the SHL to fold into an LEA. The upper bits of the AND will be shifted out by the SHL so it doesn't matter what mask value we use for these bits. By using sign bits from the original mask in these upper bits we might enable a shorter immediate encoding to be used.
    llvm-svn: 357846
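The "upper bits do not matter" argument can be illustrated by shifting the mask two ways: logically (zero fill) and arithmetically (sign fill). The sketch below uses hypothetical helper names and assumes 64-bit operands; it shows that both masks give the same result after the shift left, while only the sign-filled one fits a sign-extended 8-bit immediate in this example.

```python
MASK64 = (1 << 64) - 1


def to_signed64(v):
    """Reinterpret a 64-bit pattern as a signed integer."""
    v &= MASK64
    return v - (1 << 64) if v >> 63 else v


def shl64(x, c):
    """64-bit left shift with wraparound."""
    return (x << c) & MASK64


def logical_mask(c2, c1):
    """Mask shifted right with zero fill."""
    return (c2 & MASK64) >> c1


def signed_mask(c2, c1):
    """Mask shifted right with sign fill (Python's >> is arithmetic on ints)."""
    return (to_signed64(c2) >> c1) & MASK64


def fits_simm8(v):
    """True if the 64-bit pattern v is a sign-extended 8-bit immediate."""
    return -128 <= to_signed64(v) <= 127
```

With C2 = 0xFFFFFFFFFFFFFFF0 and C1 = 2, the zero-filled mask is 0x3FFFFFFFFFFFFFFC, whereas the sign-filled mask is the pattern for -4.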
* Fix spelling mistake. NFCI. | Simon Pilgrim | 2019-04-06 | 1 | -1/+1
    llvm-svn: 357843