path: root/llvm/lib/Target/X86
Commit message | Author | Age | Files | Lines
...
* [Cost][X86] Add v2i64 truncation costs | Simon Pilgrim | 2019-09-22 | 1 | -0/+4
  We are missing costs for a lot of truncation cases; I'm hoping to address all the 'zero cost' cases in trunc.ll. I thought this was a vector widening side effect, but even before this we had some interesting LV decisions (notably over indvars) being made due to these zero costs.
  llvm-svn: 372498
* [X86] Use sse_load_f32/f64 and timm in patterns for memory form of vgetmantss/sd. | Craig Topper | 2019-09-21 | 1 | -4/+3
  Previously we only matched scalar_to_vector and scalar load, but we should be able to narrow a vector load or match vzload. Also need to match TargetConstant instead of Constant. The register patterns were previously updated, but not the memory patterns.
  llvm-svn: 372458
* Fix missed case of switching getConstant to getTargetConstant. Try 2. | Sterling Augustine | 2019-09-20 | 1 | -1/+1
  Summary: This fixes a crasher introduced by r372338.
  Reviewers: echristo, arsenm
  Subscribers: wdng, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D67850
  llvm-svn: 372434
* Use llvm::StringLiteral instead of StringRef in a few places | Fangrui Song | 2019-09-20 | 1 | -1/+1
  llvm-svn: 372395
* Revert r372366 "Use getTargetConstant for BLENDI, and add a test to catch it." | Nico Weber | 2019-09-20 | 1 | -1/+1
  This reverts commit 52621307bcab2013e8833f3317cebd63a6db3885.
  Tests have been failing all night with
      [0/2] ACTION //llvm/test:check-llvm(//llvm/utils/gn/build/toolchain:unix)
      -- Testing: 33647 tests, 64 threads --
      Testing: 0 .. 10..
      UNRESOLVED: LLVM :: CodeGen/AMDGPU/GlobalISel/isel-blendi-gettargetconstant.ll (6943 of 33647)
      ******************** TEST 'LLVM :: CodeGen/AMDGPU/GlobalISel/isel-blendi-gettargetconstant.ll' FAILED ********************
      Test has no run line!
      ********************
  Since there were other concerns on https://reviews.llvm.org/D67785, I'm just reverting for now.
  llvm-svn: 372383
* [X86] Convert tbm_bextri_u32/tbm_bextri_u64 intrinsics TargetConstant argument to a regular Constant during lowering. | Craig Topper | 2019-09-20 | 2 | -3/+13
  We reuse an ISD opcode here that can be reached from BMI that doesn't require it to be an immediate. Our isel patterns to match the TBM immediate form require a Constant and not a TargetConstant. We were accidentally getting the Constant due to a quirk of combineBEXTR calling SimplifyDemandedBits. The call to SimplifyDemandedBits ended up constant folding the TargetConstant to a regular Constant. But we should probably instead be asserting if SimplifyDemandedBits is called on a TargetConstant, so we shouldn't rely on this behavior.
  llvm-svn: 372373
* [X86] Use timm in MMX pinsrw/pextrw isel patterns. Add missing test cases. | Craig Topper | 2019-09-20 | 1 | -3/+3
  This fixes an isel failure after r372338.
  llvm-svn: 372371
* Use getTargetConstant for BLENDI, and add a test to catch it. | Sterling Augustine | 2019-09-20 | 1 | -1/+1
  Summary: This fixes a crasher introduced by r372338.
  Reviewers: echristo, arsenm
  Subscribers: jvesely, wdng, nhaehnle, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D67785
  Tighten up the test case.
  llvm-svn: 372366
* [X86] Remove the special isBuildVectorOfConstantSDNodes handling from LowerBUILD_VECTORvXi1. | Craig Topper | 2019-09-20 | 1 | -26/+2
  The later code that generates a constant when there are some non-const elements works basically the same and doesn't require there to be any non-const elements.
  llvm-svn: 372365
* Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics" | Matt Arsenault | 2019-09-19 | 8 | -371/+367
  This reverts r372314, reapplying r372285 and the commits which depend on it (r372286-r372293, and r372296-r372297).
  This was missing one switch to getTargetConstant in an untested case.
  llvm-svn: 372338
* [DAG][X86] Convert isNegatibleForFree/GetNegatedExpression to a target hook (PR42863) | Simon Pilgrim | 2019-09-19 | 2 | -17/+129
  This patch converts the DAGCombine isNegatibleForFree/GetNegatedExpression into overridable TLI hooks and includes a demonstration X86 implementation.
  The intention is to let us extend existing FNEG combines to work more generally with negatible float ops, allowing them to work with target specific combines and opcodes (e.g. X86's FMA variants).
  Unlike SimplifyDemandedBits, we can't just handle target nodes through a Target callback; we need to do this as an override to allow targets to handle generic opcodes as well. This does mean that the target implementations have to duplicate some checks (recursion depth etc.).
  I've only begun to replace X86's FNEG handling here, handling FMADDSUB/FMSUBADD negation and some low impact codegen changes (some FMA negation propagation). We can build on this in future patches.
  Differential Revision: https://reviews.llvm.org/D67557
  llvm-svn: 372333
* Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics" | Hans Wennborg | 2019-09-19 | 8 | -366/+370
  This broke the Chromium build, causing it to fail with e.g.
      fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15>
  See llvm-commits thread of r372285 for details.
  This also reverts r372286, r372287, r372288, r372289, r372290, r372291, r372292, r372293, r372296, and r372297, which seemed to depend on the main commit.
  > Encode them directly as an imm argument to G_INTRINSIC*.
  >
  > Since intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU.
  >
  > This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics with immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time.
  >
  > SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them.
  >
  > Most of the work here is to clean up target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory.
  >
  > The GlobalISelEmitter needs to treat timm as a special case of a leaf node, similar to MachineBasicBlock operands. This should also enable handling of patterns for some G_* instructions with immediates, like G_FENCE or G_EXTRACT.
  >
  > This does include a workaround for a crash in GlobalISelEmitter when ARM tries to use "imm" in an output with a "timm" pattern source.
  llvm-svn: 372314
* [X86] Prevent crash in LowerBUILD_VECTORvXi1 for v64i1 vectors on 32-bit targets when the vector is a mix of constants and non-constants. | Craig Topper | 2019-09-19 | 1 | -6/+14
  We need to materialize the constants as two 32-bit values that are cast to v32i1 and then concatenated.
  llvm-svn: 372304
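The splitting described in this commit can be pictured on a plain integer mask. The following is only a minimal sketch in standalone C++, not the actual LowerBUILD_VECTORvXi1 code: `MaskHalves`, `splitMask64` and `concatMask` are hypothetical names introduced here, and the sketch assumes element 0 of the v64i1 vector maps to bit 0 of the immediate.

```cpp
#include <cstdint>

// Hypothetical helper type: the two 32-bit halves of a 64-element i1 mask.
struct MaskHalves {
  uint32_t lo; // assumed to hold elements 0..31
  uint32_t hi; // assumed to hold elements 32..63
};

// Split a 64-bit constant mask into two 32-bit values, each of which fits
// in a general-purpose register on a 32-bit target.
inline MaskHalves splitMask64(uint64_t imm) {
  return MaskHalves{static_cast<uint32_t>(imm),
                    static_cast<uint32_t>(imm >> 32)};
}

// Concatenate the two halves again, mirroring the "cast to v32i1 and
// concatenate" step at the integer level.
inline uint64_t concatMask(MaskHalves h) {
  return (static_cast<uint64_t>(h.hi) << 32) | h.lo;
}
```

Splitting and then concatenating is lossless, which is the property the lowering relies on.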
* [X86] Change a SmallVector& argument to SmallVectorImpl&. NFC | Craig Topper | 2019-09-19 | 1 | -1/+1
  Avoids repeating the size.
  llvm-svn: 372302
* [X86] Remove unused argument from a helper function. NFC | Craig Topper | 2019-09-19 | 1 | -4/+3
  llvm-svn: 372301
* GlobalISel: Don't materialize immarg arguments to intrinsics | Matt Arsenault | 2019-09-19 | 8 | -370/+366
  Encode them directly as an imm argument to G_INTRINSIC*.
  Since intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU.
  This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics with immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time.
  SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them.
  Most of the work here is to clean up target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory.
  The GlobalISelEmitter needs to treat timm as a special case of a leaf node, similar to MachineBasicBlock operands. This should also enable handling of patterns for some G_* instructions with immediates, like G_FENCE or G_EXTRACT.
  This does include a workaround for a crash in GlobalISelEmitter when ARM tries to use "imm" in an output with a "timm" pattern source.
  llvm-svn: 372285
* Add AutoUpgrade function to add new address space datalayout string to existing datalayouts. | Amy Huang | 2019-09-18 | 2 | -16/+3
  Summary: Add function to AutoUpgrade to change the datalayout of old X86 datalayout strings. This adds "-p270:32:32-p271:32:32-p272:64:64" to X86 datalayouts that are otherwise valid and don't already contain it.
  This also removes the compatibility changes in https://reviews.llvm.org/D66843.
  Datalayout change in https://reviews.llvm.org/D64931.
  Reviewers: rnk, echristo
  Subscribers: hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D67631
  llvm-svn: 372267
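As a rough illustration of the upgrade rule stated in this commit (add the address-space string only when it is not already present), here is a small standalone C++ sketch. It is not the AutoUpgrade implementation: `upgradeX86DataLayoutSketch` is a hypothetical name, the validity and target checks are reduced to a non-empty test, and appending at the end of the string is an assumption made for simplicity, since the commit message does not say where the string is placed.

```cpp
#include <string>

// Hypothetical sketch: add the three X86 pointer address spaces to a
// datalayout string unless they are already there.
// ASSUMPTION: the new component is appended at the end; the real placement
// inside the datalayout string is not described in the commit message.
inline std::string upgradeX86DataLayoutSketch(const std::string &dl) {
  static const std::string addrSpaces = "-p270:32:32-p271:32:32-p272:64:64";

  // Leave empty strings and already-upgraded strings untouched.
  if (dl.empty() || dl.find(addrSpaces) != std::string::npos)
    return dl;

  return dl + addrSpaces;
}
```

Because the function checks for the string before adding it, applying it twice gives the same result as applying it once.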
* [Alignment][NFC] Remove LogAlignment functions | Guillaume Chatelet | 2019-09-18 | 1 | -1/+1
  Summary: This patch is part of a series to introduce an Alignment type.
  See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
  See this patch for the introduction of the type: https://reviews.llvm.org/D64790
  Reviewers: courbet
  Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D67620
  llvm-svn: 372231
* [Alignment][NFC] Use Align::None instead of 1 | Guillaume Chatelet | 2019-09-18 | 1 | -3/+3
  Summary: This patch is part of a series to introduce an Alignment type.
  See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
  See this patch for the introduction of the type: https://reviews.llvm.org/D64790
  Reviewers: courbet
  Subscribers: sdardis, nemanjai, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D67704
  llvm-svn: 372230
* [X86] Break non-power of 2 vXi1 vectors into scalars for argument passing with avx512. | Craig Topper | 2019-09-18 | 1 | -10/+15
  This generates worse code, but matches what is done for avx2 and prevents crashes when more arguments are passed than we have registers for.
  llvm-svn: 372200
* [X86] Prevent assertion when calling a function that returns double with -mno-sse2 on x86-64. | Craig Topper | 2019-09-18 | 1 | -0/+4
  As seen in the most recent updates to PR10498.
  llvm-svn: 372197
* [X86] Use APInt::operator<<= and APInt::lshrInPlace. NFC | Craig Topper | 2019-09-17 | 1 | -4/+4
  llvm-svn: 372159
* [X86] Simplify b2b KSHIFTL+KSHIFTR using demanded elts. | Craig Topper | 2019-09-17 | 1 | -13/+66
  llvm-svn: 372155
* [X86] Call SimplifyDemandedVectorElts on KSHIFTL/KSHIFTR nodes during DAG combine. | Craig Topper | 2019-09-17 | 1 | -0/+16
  llvm-svn: 372154
* [X86] Simplify some code in LowerBUILD_VECTORvXi1. NFCI | Craig Topper | 2019-09-17 | 1 | -15/+8
  The case where Immediate is 0 and HasConstElts is true should never happen since that would mean the constant elts were all zero. But we check for an all-zero build vector earlier. So just use HasConstElts and blindly take Immediate without checking if it's 0.
  Move the code that bitcasts and extracts the immediate into the HasConstElts case since the other code just creates an undef with the right type. No casting needed.
  llvm-svn: 372153
* [X86] X86DAGToDAGISel::tryFoldLoad - assert root/parent pointers are non-null. NFCI. | Simon Pilgrim | 2019-09-17 | 1 | -0/+1
  Silences a static analyzer warning.
  llvm-svn: 372118
* Hide implementation details in namespaces. | Benjamin Kramer | 2019-09-17 | 1 | -2/+2
  llvm-svn: 372113
* [X86] Use APInt::getLowBitsSet helper. NFCI. | Simon Pilgrim | 2019-09-17 | 1 | -1/+2
  Also avoids a static analyzer warning about out of range shifts.
  llvm-svn: 372103
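The out-of-range shift concern behind this commit can be shown without LLVM. The sketch below is a standalone 64-bit stand-in for the idea of a "low bits set" helper, not APInt's implementation; `lowBitsSet64` is a hypothetical name. Writing the mask as `(1 << n) - 1` is undefined when `n` equals the word width, so the sketch shifts an all-ones value right instead and handles zero separately.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 64-bit analogue of a "low bits set" helper.
// Returns a value whose lowest `loBits` bits are 1 and the rest 0.
inline uint64_t lowBitsSet64(unsigned loBits) {
  assert(loBits <= 64 && "cannot set more bits than the word holds");
  if (loBits == 0)
    return 0;
  // For loBits in 1..64 the shift amount is in 0..63, so it is always
  // in range, unlike (uint64_t{1} << loBits) - 1 when loBits == 64.
  return ~uint64_t{0} >> (64 - loBits);
}
```

For example, `lowBitsSet64(8)` yields `0xFF`, and `lowBitsSet64(64)` yields an all-ones word without ever shifting by 64.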
* [SVE][MVT] Fixed-length vector MVT ranges | Graham Hunter | 2019-09-17 | 1 | -4/+4
    * Reordered MVT simple types to group scalable vector types together.
    * New range functions in MachineValueType.h to only iterate over the fixed-length int/fp vector types.
    * Stopped backends which don't support scalable vector types from iterating over scalable types.
  Reviewers: sdesmalen, greened
  Reviewed By: greened
  Differential Revision: https://reviews.llvm.org/D66339
  llvm-svn: 372099
* [X86] Split oversized vXi1 vector arguments and return values into scalars on avx512 targets. | Craig Topper | 2019-09-17 | 2 | -0/+34
  Previously we tried to split them into narrower v64i1 or v16i1 pieces that each got promoted to vXi8 and then passed in a zmm or xmm register. But this crashes when you need to pass more pieces than available registers reserved for argument passing. The scalarizing done here generates much longer and slower code, but is consistent with the behavior of avx2 and earlier targets for these types.
  Fixes PR43323.
  llvm-svn: 372069
* [X86] Allow masked VBROADCAST instructions to be turned into BLENDM with a broadcast load to avoid a copy. | Craig Topper | 2019-09-17 | 2 | -78/+158
  The BLENDM instructions allow two sources and an independent destination, while masked VBROADCAST has the destination tied to the source.
  llvm-svn: 372068
* [X86] Add support for commuting EVEX VCMP instructions with any immediate value. | Craig Topper | 2019-09-17 | 1 | -6/+33
  Previously we limited to the EQ/NE/TRUE/FALSE/ORD/UNORD immediates.
  llvm-svn: 372067
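To make the idea of commuting with "any immediate value" concrete: swapping the two sources of a compare only preserves the result if the predicate immediate is adjusted for the asymmetric predicates. The following standalone C++ sketch is based on the 5-bit AVX compare-predicate encoding (for example LT_OS = 0x01 and GT_OS = 0x0E); it is not copied from this commit, and `swappedVcmpImmSketch` is a hypothetical name.

```cpp
#include <cstdint>

// Hypothetical helper: given a 5-bit VCMP predicate immediate, return the
// immediate that yields the same result after the two sources are swapped.
inline uint8_t swappedVcmpImmSketch(uint8_t imm) {
  switch (imm & 0x3) {
  case 0x0:
  case 0x3:
    // EQ/NE/TRUE/FALSE/ORD/UNORD style predicates are symmetric, so the
    // immediate is unchanged when the operands are commuted.
    return imm;
  default:
    // LT/LE/NLT/NLE map to GT/GE/NGT/NGE and back: toggle bits 3:0 and
    // leave bit 4 (the signaling/quiet selector) alone.
    return static_cast<uint8_t>(imm ^ 0xF);
  }
}
```

Under this encoding `swappedVcmpImmSketch(0x01)` (LT_OS) gives `0x0E` (GT_OS), and applying the function twice returns the original immediate.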
* [X86] Enable commuting of EVEX VCMP for all immediate values during isel. | Craig Topper | 2019-09-17 | 3 | -14/+39
  llvm-svn: 372065
* [X86][AVX] matchShuffleWithSHUFPD - add support for zeroable operands | Simon Pilgrim | 2019-09-16 | 1 | -15/+41
  Determine if all of the uses of LHS/RHS operands can be replaced with a zero vector.
  llvm-svn: 372013
* [X86][NFC] Add a `use-aa` feature. | Clement Courbet | 2019-09-16 | 2 | -0/+8
  Summary: This allows enabling useaa on the command-line and will allow enabling the feature on a per-CPU basis where benchmarking shows improvements.
  This is modelled after the ARM/AArch64 target.
  Reviewers: RKSimon, andreadb, craig.topper
  Subscribers: javed.absar, kristof.beyls, hiraditya, ychen, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D67266
  llvm-svn: 371989
* [X86] Use incDecVectorConstant to simplify the min/max code in LowerVSETCC. | Craig Topper | 2019-09-13 | 1 | -14/+12
  incDecVectorConstant is used for a similar reason in LowerVSETCCWithSUBUS so we might as well share the code.
  llvm-svn: 371861
* [X86] negateFMAOpcode - extend to support FMADDSUB/FMSUBADD and output negation. NFCI. | Simon Pilgrim | 2019-09-13 | 1 | -27/+40
  Some prep work for PR42863, this change allows us to move all the FMA opcode mappings into the negateFMAOpcode helper.
  For the FMADDSUB/FMSUBADD cases, we can only negate the accumulator - any other negations will result in an error.
  llvm-svn: 371840
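The opcode mapping described in this commit follows from the arithmetic of each FMA form. The sketch below is a standalone C++ model, not the X86 negateFMAOpcode helper: the `FmaOp` enum, the `Invalid` sentinel (standing in for the "error" the message mentions) and `negateFmaSketch` are all hypothetical. It assumes the usual meanings FMADD = a*b+c, FMSUB = a*b-c, FNMADD = -(a*b)+c, FNMSUB = -(a*b)-c.

```cpp
// Hypothetical opcode set for illustration only.
enum class FmaOp { FMADD, FMSUB, FNMADD, FNMSUB, FMADDSUB, FMSUBADD, Invalid };

// Return the opcode that computes the same value after negating the
// product (negMul), the accumulator (negAcc) and/or the result (negRes).
inline FmaOp negateFmaSketch(FmaOp op, bool negMul, bool negAcc, bool negRes) {
  switch (op) {
  case FmaOp::FMADDSUB:
  case FmaOp::FMSUBADD:
    // Only the accumulator can be negated for the alternating forms;
    // anything else is reported through the Invalid sentinel.
    if (negMul || negRes)
      return FmaOp::Invalid;
    if (!negAcc)
      return op;
    return op == FmaOp::FMADDSUB ? FmaOp::FMSUBADD : FmaOp::FMADDSUB;
  case FmaOp::Invalid:
    return FmaOp::Invalid;
  default:
    break;
  }

  // Model the remaining four opcodes as two sign flags.
  bool mulNeg = (op == FmaOp::FNMADD || op == FmaOp::FNMSUB);
  bool accNeg = (op == FmaOp::FMSUB || op == FmaOp::FNMSUB);

  // Negating the result flips both signs; the other two flip one each.
  mulNeg = mulNeg != (negMul != negRes);
  accNeg = accNeg != (negAcc != negRes);

  if (mulNeg)
    return accNeg ? FmaOp::FNMSUB : FmaOp::FNMADD;
  return accNeg ? FmaOp::FMSUB : FmaOp::FMADD;
}
```

For instance, negating the result of FMADD gives FNMSUB, since -(a*b+c) = -(a*b)-c, while negating only the accumulator of FMADDSUB gives FMSUBADD.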
* DAG/GlobalISel: Correct type profile of bitcount ops | Matt Arsenault | 2019-09-13 | 1 | -6/+6
  The result integer does not need to be the same width as the input. AMDGPU, NVPTX, and Hexagon all have patterns working around the types matching. GlobalISel defines these as being different type indexes.
  llvm-svn: 371797
* Rename nonvolatile_load/store to simple_load/store [NFC] | Philip Reames | 2019-09-12 | 4 | -17/+17
  Implement the TODO from D66318.
  llvm-svn: 371789
* [DAGCombiner][X86] Pass the CmpOpVT to reduceSelectOfFPConstantLoads so X86 can exclude fp128 compares. | Craig Topper | 2019-09-12 | 2 | -2/+3
  The X86 decision assumes the compare will produce a result in an XMM register, but that can't happen for an fp128 compare since those go to a libcall that returns an i32. Pass the VT so X86 can check the type.
  llvm-svn: 371775
* [X86] Move negateFMAOpcode helper earlier to help future patch. NFCI. | Simon Pilgrim | 2019-09-12 | 1 | -32/+32
  llvm-svn: 371770
* AArch64: support arm64_32, an ILP32 slice for watchOS. | Tim Northover | 2019-09-12 | 1 | -0/+1
  This is the main CodeGen patch to support the arm64_32 watchOS ABI in LLVM. FastISel is mostly disabled for now since it would generate incorrect code for ILP32.
  llvm-svn: 371722
* [X86] Enable -mprefer-vector-width=256 by default for Skylake-avx512 and later Intel CPUs. | Craig Topper | 2019-09-11 | 1 | -0/+2
  AVX512 instructions can cause a frequency drop on these CPUs. This can negate the performance gains from using wider vectors. Enabling prefer-vector-width=256 will prevent generation of zmm registers unless explicit 512 bit operations are used in the original source code.
  I believe gcc and icc both do something similar to this by default.
  Differential Revision: https://reviews.llvm.org/D67259
  llvm-svn: 371694
* [X86] Fix latent bugs in 32-bit CMPXCHG8B inserter | Reid Kleckner | 2019-09-11 | 2 | -7/+11
  I found three issues:
    1. the loop over E[ABCD]X copies run over BB start
    2. the direct address of cmpxchg8b could be a frame index
    3. the displacement of cmpxchg8b could be a global instead of an immediate
  These were all introduced together in r287875, and should be fixed with this change.
  Issue reported by Zachary Turner.
  llvm-svn: 371678
* [X86] Move x86_64 fp128 conversion to libcalls from type legalization to DAG legalization | Craig Topper | 2019-09-11 | 2 | -20/+157
  fp128 is considered a legal type for a register, but has almost no legal operations so everything needs to be converted to a libcall. Previously this was implemented by tricking type legalization into softening the operations with various checks for "is legal in hardware register" to change the behavior to still use f128 as the resulting type instead of converting to i128.
  This patch abandons this approach and instead moves the libcall conversions to LegalizeDAG. This is the approach taken by AArch64 where they also have a legal fp128 type, but no legal operations. I think this is more in spirit with how SelectionDAG's phases are supposed to work.
  I had to make some hacks for STRICT_FP_ROUND because some of the strict FP handling checks if ISD::FP_ROUND is Legal for a given result type, but I had to make ISD::FP_ROUND Custom to allow making a libcall when the input is f128. For all other types the Custom handler just returns the original node. These hacks are incomplete and don't work for a strict truncate from f128, but I don't think it worked before either since LegalizeFloatTypes doesn't know about strict ops yet. I've also raised PR43209 against AArch64 which currently crashes on a strict ftrunc from f64->f32 because of FP_ROUND being marked Custom for the same reason there.
  Differential Revision: https://reviews.llvm.org/D67128
  llvm-svn: 371672
* [Alignment][NFC] use llvm::Align for AsmPrinter::EmitAlignment | Guillaume Chatelet | 2019-09-11 | 1 | -2/+2
  Summary: This patch is part of a series to introduce an Alignment type.
  See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
  See this patch for the introduction of the type: https://reviews.llvm.org/D64790
  Reviewers: courbet
  Subscribers: dschuff, sdardis, nemanjai, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D67443
  llvm-svn: 371616
* Reland "Change the X86 datalayout to add three address spaces for 32 bit signed, 32 bit unsigned, and 64 bit pointers." | Amy Huang | 2019-09-10 | 2 | -2/+18
  This reverts 57076d3199fc2b0af4a3736b7749dd5462cacda5.
  Original review at https://reviews.llvm.org/D64931.
  Review for added fix at https://reviews.llvm.org/D66843.
  llvm-svn: 371568
* [X86] Updated target specific selection dag code to conservatively check for isAtomic in addition to isVolatile | Philip Reames | 2019-09-10 | 3 | -20/+20
  See D66309 for context. This is the first sweep of x86 target specific code to add isAtomic bailouts where appropriate. The intention here is to have the switch from AtomicSDNode to LoadSDNode/StoreSDNode be close to NFC; that is, I'm not looking to allow additional optimizations at this time.
  Sorry for the lack of tests. As discussed in the review, most of these are vector tests (for which atomicity is not well defined) and I couldn't figure out how to exercise the anyextend cases which aren't vector specific.
  Differential Revision: https://reviews.llvm.org/D66322
  llvm-svn: 371547
* [X86] Add broadcast load unfolding support for VCMPPS/PD. | Craig Topper | 2019-09-10 | 1 | -0/+6
  llvm-svn: 371487
* [Windows] Replace TrapUnreachable with an int3 insertion pass | Reid Kleckner | 2019-09-09 | 4 | -11/+125
  This is an alternative to D66980, which was reverted. Instead of inserting a pseudo instruction that optionally expands to nothing, add a pass that inserts int3 when appropriate after basic block layout.
  Reviewers: hans
  Differential Revision: https://reviews.llvm.org/D67201
  llvm-svn: 371466