path: root/llvm/lib/Target/X86/X86ISelLowering.cpp
Commit message | Author | Age | Files | Lines
...
* [X86] Remove unreachable code from LowerTRUNCATE. NFCCraig Topper2019-08-121-16/+4
| | | | | | | | All three 256->128 bit cases were already handled above. Noticed while looking at the coverage report. llvm-svn: 368609
* [X86] Add a paranoia type check to the code that detects AVG patterns from ↵Craig Topper2019-08-121-5/+6
| | | | | | | | | | | | | truncating stores. If we're after type legalization, we should make sure we won't create a store with an illegal type when we separate the AVG pattern from the truncating store. I don't know of a way for this to fail today. Just noticed while I was in the vicinity. llvm-svn: 368608
* [X86] Simplify creation of saturating truncating stores.Craig Topper2019-08-121-41/+11
| | | | | | | We just need to check if the truncating store is legal instead of going through isSATValidOnAVX512Subtarget. llvm-svn: 368607
* [X86] Replace call to isTruncStoreLegalOrCustom with isTruncStoreLegal. NFCCraig Topper2019-08-121-1/+1
| | | | | | We have no custom trunc stores on X86. llvm-svn: 368606
* [X86] Disable use of zmm registers for varargs musttail calls under ↵Craig Topper2019-08-121-1/+1
| | | | | | | | | prefer-vector-width=256 and min-legal-vector-width=256. Under this config, the v16f32 type we try to use isn't mapped to a register class, so the getRegClassFor call will fail. llvm-svn: 368594
* [X86][SSE] ComputeKnownBits - add basic PSADBW handlingSimon Pilgrim2019-08-121-2/+11
| | | | llvm-svn: 368558
* [X86] Simplify some of the type checks in combineSubToSubus.Craig Topper2019-08-111-5/+10
| | | | | | | If we have SSE2 we can handle any i8/i16 type and let type legalization deal with it. llvm-svn: 368538
* [X86] Don't use SplitOpsAndApply for ISD::USUBSAT.Craig Topper2019-08-111-10/+4
| | | | | | | Target independent type legalization and custom lowering should be able to handle it. llvm-svn: 368537
* [X86] Remove some more code from combineShuffle that is no longer needed ↵Craig Topper2019-08-111-47/+0
| | | | | | with widening legalization. llvm-svn: 368523
* [X86] Remove some code from combineShuffle that seems largely unnecessary ↵Craig Topper2019-08-111-60/+0
| | | | | | | | | | | with widening legalization. The test case that changed is probably better served through allowing combineTruncatedArithmetic to create narrow vectors. It also appears InstCombine would have simplified this test case to remove the zext and trunc anyway. llvm-svn: 368522
* [X86][SSE] Lower shuffle as ANY_EXTEND_VECTOR_INREGSimon Pilgrim2019-08-101-3/+3
| | | | | | | | | | On SSE41+ targets we always lower vector shuffles to ZERO_EXTEND_VECTOR_INREG, even if we don't need the extended bits. This patch relaxes this so that we lower to ANY_EXTEND_VECTOR_INREG if we can, meaning that shuffle combines have a better idea of what elements need to be kept zero. This helps the multiple reduction code as we can now combine away a lot more of the pack+extend codes. Differential Revision: https://reviews.llvm.org/D65741 llvm-svn: 368515
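The difference between the two extensions can be sketched as a shuffle mask. This is a toy model, not the backend's code: the sentinel values `UNDEF` and `ZERO` and the helper `extend_in_reg_mask` are names chosen here for illustration. A zero-extend pins the padding lanes to zero, while an any-extend leaves them undefined, which gives later shuffle combines more freedom.

```python
UNDEF = -1  # illustrative sentinel: lane content does not matter
ZERO = -2   # illustrative sentinel: lane must be zero


def extend_in_reg_mask(num_lanes, scale, any_extend):
    """Build a lane mask that extends the low source elements by `scale`.

    Each source element i is followed by (scale - 1) padding lanes.
    """
    fill = UNDEF if any_extend else ZERO
    mask = []
    for i in range(num_lanes // scale):
        mask.append(i)
        mask.extend([fill] * (scale - 1))
    return mask


print(extend_in_reg_mask(8, 2, any_extend=False))  # zero-extend style mask
print(extend_in_reg_mask(8, 2, any_extend=True))   # any-extend style mask
```

Only the padding lanes differ between the two masks; the positions of the real source elements are identical.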
* [X86] Match the IR pattern for movmsk on SSE1 only targets where v4i32 ↵Craig Topper2019-08-101-3/+22
| | | | | | | | | | | | | | | | | | | isn't legal Summary: This patch adds a special DAG combine for SSE1 to recognize the IR pattern InstCombine gives us for movmsk. This only does the recognition for a few cases where it's obvious the input won't be scalarized, resulting in building a vector just to do the movmsk. I've made it separate from our existing matching for movmsk since that's called in multiple places and I didn't spend time to see if the other callers would make sense here. Plus the restrictions and additional checks would complicate that. This fixes the case from PR42870. But it's probably still broken in the presence of logic ops feeding the movmsk pattern, which would further hide the v4f32 type. Reviewers: spatel, RKSimon, xbolva00 Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65689 llvm-svn: 368506
* [X86] Fix stack probe issue on windows32.Luo, Yuanke2019-08-101-0/+13
| | | | | | | | | | | | | | | | | | | | | Summary: On Windows, if the frame size exceeds 4096 bytes, the compiler needs to generate a call to _alloca_probe. The X86CallFrameOptimization pass changes the reserved stack size, which causes the stack probe function not to be inserted. This patch fixes the issue by detecting the call frame size and, if it exceeds 4096 bytes, dropping X86CallFrameOptimization. Reviewers: craig.topper, wxiao3, annita.zhang, rnk, RKSimon Reviewed By: rnk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65923 llvm-svn: 368503
* Remove variable only used in an assert.Eric Christopher2019-08-091-2/+1
| | | | llvm-svn: 368486
* [X86] Remove custom handling for extloads from LowerLoad.Craig Topper2019-08-091-183/+1
| | | | | | We don't appear to need this with widening legalization. llvm-svn: 368479
* [X86][SSE] Swap X86ISD::BLENDV inputs with an inverted selection mask (PR42825)Simon Pilgrim2019-08-091-0/+6
| | | | | | | | As discussed on PR42825, if we are inverting the selection mask we can just swap the inputs and avoid the inversion. Differential Revision: https://reviews.llvm.org/D65522 llvm-svn: 368438
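The identity behind this change can be checked with a small model. This is a sketch under assumptions: `blendv` here is a plain per-lane select written for illustration, with boolean mask lanes, and it does not model the real node's operand order or its sign-bit mask semantics.

```python
def blendv(mask, a, b):
    """Per-lane select: take a[i] where mask[i] is set, otherwise b[i]."""
    return [x if m else y for m, x, y in zip(mask, a, b)]


def invert(mask):
    """Flip every lane of the selection mask."""
    return [not m for m in mask]


mask = [True, False, False, True]
a = [1, 2, 3, 4]
b = [10, 20, 30, 40]

# Selecting with the inverted mask equals selecting with the original
# mask and swapped inputs, so the inversion itself is never needed.
print(blendv(invert(mask), a, b) == blendv(mask, b, a))
```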
* [X86] Remove code that expands truncating stores from combineStore.Craig Topper2019-08-091-76/+1
| | | | | | | We shouldn't form trunc stores that need to be expanded now that we are using widening legalization. llvm-svn: 368400
* [X86] Remove stale FIXME from combineMaskedStore. NFCCraig Topper2019-08-091-4/+0
| | | | | | | I believe PR34584 was tracking that FIXME, but it's since been closed and a test case was added. llvm-svn: 368397
* [X86] Remove DAG combine expansion of extending masked load and truncating ↵Craig Topper2019-08-091-181/+24
| | | | | | | | | | masked store. The only way to generate these was through promoting legalization of narrow vectors, but we widen those types now. So we shouldn't produce these nodes. llvm-svn: 368396
* [X86] Remove handler for (U/S)(ADD/SUB)SAT from ReplaceNodeResults. Remove ↵Craig Topper2019-08-091-9/+4
| | | | | | | | TypeWidenVector check from code that handles X86ISD::VPMADDWD and X86ISD::AVG. More unneeded code since we now legalize narrow vectors by widening. llvm-svn: 368395
* [X86] Remove ISD::SETCC handling from ReplaceNodeResults.Craig Topper2019-08-091-27/+0
| | | | | | This is no longer needed since we widen v2i32 instead of promoting. llvm-svn: 368394
* [X86] Simplify ISD::LOAD handling in ReplaceNodeResults and ISD::STORE ↵Craig Topper2019-08-091-12/+10
| | | | | | handling in LowerStore now that v2i32 is widened to v4i32. llvm-svn: 368390
* [X86] Merge v2f32 and v2i32 gather/scatter handling in ↵Craig Topper2019-08-091-86/+12
| | | | | | ReplaceNodeResults/LowerMSCATTER now that v2i32 is also widened like v2f32. llvm-svn: 368389
* [X86] Remove now unreachable handling for f64->v2i32/v4i16/v8i8 bitcasts from ↵Craig Topper2019-08-091-14/+0
| | | | | | | | ReplaceNodeResults. We rely on the generic type legalizer for this now. llvm-svn: 368388
* [X86] Simplify ReplaceNodeResults handling for FP_TO_SINT/UINT for vectors ↵Craig Topper2019-08-091-44/+10
| | | | | | to only handle widening. llvm-svn: 368387
* [X86] Simplify ReplaceNodeResults handling for ↵Craig Topper2019-08-091-4/+5
| | | | | | SIGN_EXTEND/ZERO_EXTEND/TRUNCATE for vectors to only handle widening. llvm-svn: 368386
* [X86] Simplify ReplaceNodeResults handling for UDIV/UREM/SDIV/SREM for ↵Craig Topper2019-08-091-12/+3
| | | | | | vectors to only handle widening. llvm-svn: 368385
* [X86] Remove vector promotion handling from the ReplaceNodeResults ISD::MUL ↵Craig Topper2019-08-091-28/+14
| | | | | | | | handling code. We now widen illegal vector types so we don't need this anymore. llvm-svn: 368384
* [X86] Improve codegen of v8i64->v8i16 and v16i32->v16i8 truncate with ↵Craig Topper2019-08-081-1/+22
| | | | | | | | | | | | avx512vl, avx512bw, min-legal-vector-width<=256 and prefer-vector-width=256 Under this configuration we'll want to split the v8i64 or v16i32 into two vectors. The default legalization will try to truncate each of those 256-bit pieces one step to 128-bit, concatenate those, then truncate one more time from the new 256 to 128 bits. With this patch we now truncate the two splits to 64-bits then concatenate those. We have to do this two different ways depending on whether we have widening legalization enabled. Without widening legalization we have to manually construct X86ISD::VTRUNC to prevent the ISD::TRUNCATE with a narrow result being promoted to 128 bits with a larger element type than what we want followed by something like a pshufb to grab the lower half of each element to finish the job. With widening legalization we just get the right thing. When we switch to widening by default we can just delete the other code path. Differential Revision: https://reviews.llvm.org/D65626 llvm-svn: 368349
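The two truncation orders described above can be compared with a small model for the v8i64->v8i16 case. This is only a sketch: vectors are Python lists of unsigned lane values, and the function names are hypothetical. It shows that both orders produce the same lanes, while the second needs one fewer truncation step.

```python
def trunc(vec, bits):
    """Truncate every lane to its low `bits` bits."""
    return [x & ((1 << bits) - 1) for x in vec]


def default_split_truncate(v8i64):
    """Three steps: two 256->128 truncates, concat, one more 256->128 truncate."""
    lo, hi = v8i64[:4], v8i64[4:]
    v8i32 = trunc(lo, 32) + trunc(hi, 32)
    return trunc(v8i32, 16)


def direct_split_truncate(v8i64):
    """Two steps: truncate each 256-bit half straight to 64 bits, then concat."""
    lo, hi = v8i64[:4], v8i64[4:]
    return trunc(lo, 16) + trunc(hi, 16)


sample = [0x1122334455667788 + i for i in range(8)]
print(default_split_truncate(sample) == direct_split_truncate(sample))
```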
* [X86] XFormVExtractWithShuffleIntoLoad - handle shuffle mask scalingSimon Pilgrim2019-08-081-13/+27
| | | | | | | | If the target shuffle mask is from a wider type, attempt to scale the mask so that the extraction can attempt to peek through. Fixes the regression mentioned in rL368307 llvm-svn: 368308
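Mask scaling of this kind can be sketched as follows. The helper name `scale_shuffle_mask` and the use of negative values as sentinels are assumptions made for this illustration, not the backend's exact code. Each index into a wide element becomes `scale` consecutive indices into the narrower elements, and sentinel lanes are repeated unchanged.

```python
def scale_shuffle_mask(scale, mask):
    """Rewrite a mask over wide elements as a mask over elements `scale` times narrower."""
    scaled = []
    for m in mask:
        if m < 0:
            # Sentinel lane (for example undef): repeat it for every narrow lane.
            scaled.extend([m] * scale)
        else:
            # Wide element m covers narrow elements scale*m .. scale*m + scale - 1.
            scaled.extend(scale * m + j for j in range(scale))
    return scaled


print(scale_shuffle_mask(2, [1, -1, 0]))
```

With the mask expressed in the narrower element type, an extract of a single narrow element can be traced back to the source lane it reads.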
* [X86] SimplifyDemandedVectorElts - attempt to recombine target shuffle using ↵Simon Pilgrim2019-08-081-0/+17
| | | | | | | | | | | | DemandedElts mask If we don't demand all elements, then attempt to combine to a simpler shuffle. At the moment we can only do this if Depth == 0 as combineX86ShufflesRecursively uses Depth to track whether the shuffle has really changed or not - we'll need to change this before we can properly start merging combineX86ShufflesRecursively into SimplifyDemandedVectorElts. The insertps-combine.ll regression is because XFormVExtractWithShuffleIntoLoad can't see through shuffles of different widths - this will be fixed in a follow-up commit. llvm-svn: 368307
* [X86][SSE] matchBinaryPermuteShuffle - split INSERTPS combinesSimon Pilgrim2019-08-081-8/+17
| | | | | | We need to prefer INSERTPS with zeros over SHUFPS, but fall back to INSERTPS if that fails. llvm-svn: 368292
* [X86] Remove -x86-experimental-vector-widening-legalization command line ↵Craig Topper2019-08-081-354/+45
| | | | | | | | | option and all its uses. This option is now defaulted to true and we don't want to support turning it off so remove the option. llvm-svn: 368258
* [X86] Add CMOV_FR32X and CMOV_FR64X to the isCMOVPseudo function.Craig Topper2019-08-081-0/+2
| | | | llvm-svn: 368250
* Recommit "[MS] Emit S_HEAPALLOCSITE debug info in Selection DAG"Amy Huang2019-08-071-0/+5
| | | | | | with a fix to clear the SDNode map when SelectionDAG is cleared. llvm-svn: 368230
* [X86] Allow pack instructions to be used for 512->256 truncates when ↵Craig Topper2019-08-071-2/+9
| | | | | | | | | | -mprefer-vector-width=256 is causing 512-bit vectors to be split If we're splitting the 512-bit vector anyway and we have zero/sign bits, then we might as well use pack instructions to concat and truncate at once. Differential Revision: https://reviews.llvm.org/D65904 llvm-svn: 368210
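Why known zero bits make a pack usable as a truncate can be seen in a toy model. This is a sketch under assumptions: `packus` models a generic unsigned-saturating pack of two inputs into one result and ignores the per-128-bit-lane behaviour of the real wide instructions. When every input lane already fits in the destination width, saturation never triggers and the pack equals truncate-then-concatenate.

```python
def packus(a, b, dst_bits=8):
    """Unsigned-saturating pack: clamp each lane to [0, 2**dst_bits - 1], then concatenate."""
    hi = (1 << dst_bits) - 1
    return [min(max(x, 0), hi) for x in a + b]


def trunc_concat(a, b, dst_bits=8):
    """Plain truncation of each lane followed by concatenation."""
    mask = (1 << dst_bits) - 1
    return [x & mask for x in a + b]


# Upper bits known to be zero: both operations agree.
a = [0, 1, 127, 255]
b = [200, 17, 3, 42]
print(packus(a, b) == trunc_concat(a, b))

# Without that guarantee they differ, because the pack saturates.
print(packus([300], []) == trunc_concat([300], []))
```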
* Recommit r367901 "[X86] Enable ↵Craig Topper2019-08-071-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | -x86-experimental-vector-widening-legalization by default." The assert that caused this to be reverted should be fixed now. Original commit message: This patch changes our default legalization behavior for 16, 32, and 64 bit vectors with i8/i16/i32/i64 scalar types from promotion to widening. For example, v8i8 will now be widened to v16i8 instead of promoted to v8i16. This keeps the element widths the same and pads with undef elements. We believe this is a better legalization strategy. But it carries some issues due to the fragmented vector ISA. For example, i8 shifts and multiplies get widened and then later have to be promoted/split into vXi16 vectors. This has the potential to cause regressions so we wanted to get it in early in the 10.0 cycle so we have plenty of time to address them. Next steps will be to merge tests that explicitly test the command line option. And then we can remove the option and its associated code. llvm-svn: 368183
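The two legalization strategies named in this message can be sketched as simple type arithmetic. This is an illustration only: the functions `promote`, `widen`, and `type_name` are hypothetical, and the model assumes a 128-bit target register and integer element types.

```python
def promote(num_elts, elt_bits, reg_bits=128):
    """Promotion: keep the element count, grow each element to fill the register."""
    return num_elts, reg_bits // num_elts


def widen(num_elts, elt_bits, reg_bits=128):
    """Widening: keep the element width, pad with undef elements to fill the register."""
    return reg_bits // elt_bits, elt_bits


def type_name(num_elts, elt_bits):
    """Format a vector type in the vNiM style used in the commit messages."""
    return f"v{num_elts}i{elt_bits}"


for n, bits in [(8, 8), (2, 32), (4, 16)]:
    print(type_name(n, bits),
          "promote ->", type_name(*promote(n, bits)),
          "widen ->", type_name(*widen(n, bits)))
```

For v8i8 the model gives v8i16 under promotion and v16i8 under widening, matching the example in the message.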
* [X86] EltsFromConsecutiveLoads - early out for non-byte sized memory (PR42909)Simon Pilgrim2019-08-071-0/+3
| | | | | | Don't attempt to merge loads for types that aren't modulo 8-bits. llvm-svn: 368165
* Revert "[X86] Enable -x86-experimental-vector-widening-legalization by default."Mitch Phillips2019-08-061-2/+3
| | | | | | | | | This reverts commit 3de33245d2c992c9e0af60372043540b60f3a810. This commit broke the MSan buildbots. See https://reviews.llvm.org/rL367901 for more information. llvm-svn: 368107
* [X86] Don't allow combineSIntToFP to create v2i32 vectors after type ↵Craig Topper2019-08-061-4/+14
| | | | | | | | | | | | legalization. If we're after type legalization we should only be trying to turn v2i64 into v2i32. So bitcast to v4i32, shuffle the even elements together. Then use X86ISD::CVTSI2P. The alternative is to leave the v2i64 type alone and let it be scalarized. Hopefully keeping it packed is better. Fixes PR42905. llvm-svn: 368091
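The bitcast-and-shuffle step can be modelled with byte packing. This is a sketch, assuming little-endian lane layout as on x86 and input values that fit in 32 bits; the helper names are hypothetical. Picking the even v4i32 elements yields the low half of each i64, which equals the original value when it fits.

```python
import struct


def bitcast_v2i64_to_v4i32(v2i64):
    """Reinterpret two signed 64-bit lanes as four signed 32-bit lanes (little-endian)."""
    raw = struct.pack("<2q", *v2i64)
    return list(struct.unpack("<4i", raw))


def even_elements(v4i32):
    """Shuffle the even elements together: these are the low halves of the i64 lanes."""
    return [v4i32[0], v4i32[2]]


values = [-5, 123456]
print(even_elements(bitcast_v2i64_to_v4i32(values)))
```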
* [X86][SSE] Call SimplifyMultipleUseDemandedBits on PACKSS/PACKUS arguments.Simon Pilgrim2019-08-061-4/+24
| | | | | | This mainly helps to replace unused arguments with UNDEF in the case where they have multiple users. llvm-svn: 368026
* [X86] SimplifyMultipleUseDemandedBits - target shuffles might not be identitySimon Pilgrim2019-08-061-2/+3
| | | | | | | | If we don't demand any non-undef shuffle elements then the assert will fail as all shuffle inputs would still be flagged as 'identity' safe. Exposed by an incoming patch. llvm-svn: 368022
* [X86][SSE] Enable min/max partial reductionSimon Pilgrim2019-08-061-1/+1
| | | | | | As mentioned on D65047 / rL366933 the plan is to enable partial reduction handling wherever possible. llvm-svn: 368016
* [SelectionDAG] Extend base addressing modes supported by MGATHER/MSCATTERCullen Rhodes2019-08-061-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Before this patch MGATHER/MSCATTER is capable of representing all common addressing modes, but only when illegal types are used. This patch adds an IndexType property so more representations are available when using legal types only. Original modes: vector of bases; base + vector of signed scaled offsets. New modes: base + vector of signed unscaled offsets; base + vector of unsigned scaled offsets; base + vector of unsigned unscaled offsets. The current behaviour of addressing modes for gather/scatter remains unchanged. Patch by Paul Walker. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D65636 llvm-svn: 368008
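The four base-plus-offset modes listed above can be sketched as address arithmetic. This is a model under assumptions, not the SelectionDAG representation: the function `gather_addresses`, its parameters, and the 32-bit index width are chosen here for illustration.

```python
def gather_addresses(base, indices, elt_size, signed, scaled, index_bits=32):
    """Compute per-lane addresses for base + vector-of-offsets addressing.

    signed: interpret each index as two's complement when True, unsigned otherwise.
    scaled: multiply each index by the element size when True.
    """
    addrs = []
    for raw in indices:
        raw &= (1 << index_bits) - 1               # keep the stored bit pattern
        if signed and raw >= 1 << (index_bits - 1):
            raw -= 1 << index_bits                 # reinterpret as negative
        addrs.append(base + raw * (elt_size if scaled else 1))
    return addrs


idx = [1, -1]
print(gather_addresses(1000, idx, 4, signed=True, scaled=True))
print(gather_addresses(1000, idx, 4, signed=True, scaled=False))
print(gather_addresses(1000, idx, 4, signed=False, scaled=True))
print(gather_addresses(1000, idx, 4, signed=False, scaled=False))
```

The same index bit pattern yields very different addresses depending on the mode, which is why the index interpretation has to be carried as an explicit property.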
* [X86] Enable -x86-experimental-vector-widening-legalization by default.Craig Topper2019-08-051-3/+2
| | | | | | | | | | | | | | | | | | | | | This patch changes our default legalization behavior for 16, 32, and 64 bit vectors with i8/i16/i32/i64 scalar types from promotion to widening. For example, v8i8 will now be widened to v16i8 instead of promoted to v8i16. This keeps the element widths the same and pads with undef elements. We believe this is a better legalization strategy. But it carries some issues due to the fragmented vector ISA. For example, i8 shifts and multiplies get widened and then later have to be promoted/split into vXi16 vectors. This has the potential to cause regressions so we wanted to get it in early in the 10.0 cycle so we have plenty of time to address them. Next steps will be to merge tests that explicitly test the command line option. And then we can remove the option and its associated code. llvm-svn: 367901
* [DAGCombiner][x86] prevent infinite loop from truncate/extend transformsSanjay Patel2019-08-051-2/+0
| | | | | | | | | | | | | | | | | | | | | | | The test case is based on the example from the post-commit thread for: https://reviews.llvm.org/rGc9171bd0a955 This replaces the x86-specific simple-type check from: rL367766 with a check in the DAGCombiner. Adding the check isn't strictly necessary after the fix from: rL367768 ...but it seems likely that we're heading for trouble if we are creating weird types in this transform. I combined the earlier legality check into the initial clause to simplify the code. So we should only try the trunc/sext transform at the earliest combine stage, but we limit the transform to simple types anyway because the TLI hook is probably too lax about what it considers a free truncate. llvm-svn: 367834
* [LLVM][Alignment] Introduce Alignment TypeGuillaume Chatelet2019-08-051-3/+3
| | | | | | | | | | | | | | | | | | | Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, jfb, jakehehrlich Reviewed By: jfb Subscribers: wuzish, jholewinski, arsenm, dschuff, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65514 llvm-svn: 367828
* [X86] Fix a bad early out in combineExtInVec that prevented recursive ↵Craig Topper2019-08-051-5/+3
| | | | | | shuffle combining from running with -x86-experimental-vector-widening-legalization. llvm-svn: 367798
* [TargetLowering][X86] Teach SimplifyDemandedVectorElts to replace the base ↵Craig Topper2019-08-041-17/+0
| | | | | | | | | | | | | | | | | | | | | | | | vector of INSERT_SUBVECTOR with undef if none of the elements are demanded even if the node has other users. Summary: The SimplifyDemandedVectorElts function can replace with undef when no elements are demanded, but due to how it interacts with TargetLoweringOpts, it can only do this when the node has no other users. Remove a now unneeded DAG combine from the X86 backend. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65713 llvm-svn: 367788
* [X86] lowerShuffleAsSpecificZeroOrAnyExtend - use undef PSHUFB mask indices ↵Simon Pilgrim2019-08-041-2/+6
| | | | | | for ANY_EXTEND shuffles llvm-svn: 367784