path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* [llvm] Migrate llvm::make_unique to std::make_unique | Jonas Devlieghere | 2019-08-15 | 75 | -244/+244
| | | | | | | | Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013
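A minimal sketch of what the mechanical replacement looks like at a single call site. The `Diagnostic` type and the `makeDiagnostic` helper are hypothetical and exist only for illustration; they are not taken from the patch.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical type used only for illustration; it is not part of LLVM.
struct Diagnostic {
  std::string Message;
  int Line;
  Diagnostic(std::string Msg, int L) : Message(std::move(Msg)), Line(L) {}
};

// Before the migration the body would have read:
//   return llvm::make_unique<Diagnostic>(Msg, Line);
// With C++14 available, the standard library spelling is used instead.
std::unique_ptr<Diagnostic> makeDiagnostic(const std::string &Msg, int Line) {
  assert(Line > 0 && "line numbers are 1-based in this sketch");
  return std::make_unique<Diagnostic>(Msg, Line);
}
```

The call-site semantics are the same either way: both spellings forward their arguments to the constructor and return a `std::unique_ptr`.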
* [PowerPC] Use xxleqv to set all one vector IMM(-1). | Jinsong Ji | 2019-08-15 | 4 | -15/+33
| | | | | | | | | | | | | | | | | | Summary: xxspltib/vspltisb are 3 cycle PM instructions, xxleqv is 2 cycle ALU instruction. We should use xxleqv to set all one vectors. Reviewers: hfinkel, nemanjai, steven.zhang Subscribers: hiraditya, kbarton, MaskRay, shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65529 llvm-svn: 369006
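The trick works because xxleqv computes a bitwise logical equivalence (XNOR), and XNOR of any value with itself sets every bit. The sketch below models that on a single 64-bit word; the function names `eqv` and `allOnesFrom` are illustrative and do not come from the PowerPC backend.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of a logical-equivalence (XNOR) operation on one 64-bit word.
constexpr uint64_t eqv(uint64_t A, uint64_t B) { return ~(A ^ B); }

// Feeding the same value to both operands sets every bit, whatever the input.
constexpr uint64_t allOnesFrom(uint64_t AnyValue) {
  return eqv(AnyValue, AnyValue);
}

static_assert(allOnesFrom(0x1234) == UINT64_MAX,
              "XNOR of a value with itself is all ones");
```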
* [ARM] Fix alignment checks for BE VLDRH | David Green | 2019-08-15 | 1 | -2/+2
| | | | | | | | | | | | We need to allow any alignment at least 2, not just exactly 2, so that the big endian loads and stores can be selected successfully. I've also added extra BE testing for the load and store tests. Thanks to Oliver for the report. Differential Revision: https://reviews.llvm.org/D66222 llvm-svn: 368996
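The difference between the two checks can be sketched as a pair of predicates. Both names are hypothetical and the alignment is assumed to be expressed in bytes; the real check in the ARM backend is not shown in this log.

```cpp
#include <cassert>

// Hypothetical predicates; the names are not taken from the ARM backend.
// Alignments are expressed in bytes.

// Old behaviour as described: only an alignment of exactly 2 is accepted.
constexpr bool isAlignedExactly2(unsigned AlignBytes) {
  return AlignBytes == 2;
}

// New behaviour as described: any alignment of at least 2 is accepted.
constexpr bool isAlignedAtLeast2(unsigned AlignBytes) {
  return AlignBytes >= 2;
}
```

A 4-byte-aligned access is rejected by the first predicate and accepted by the second, which is the case the commit message says must now be selectable.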
* [SDAG][x86] check for relaxed math when matching an FP reduction | Sanjay Patel | 2019-08-15 | 1 | -2/+2
| | | | | | | | | | | | | | | | If the last step in an FP add reduction allows reassociation and doesn't care about -0.0, then we are free to recognize that computation as a reduction that may reorder the intermediate steps. This is requested directly by PR42705: https://bugs.llvm.org/show_bug.cgi?id=42705 and solves PR42947 (if horizontal math instructions are actually faster than the alternative): https://bugs.llvm.org/show_bug.cgi?id=42947 Differential Revision: https://reviews.llvm.org/D66236 llvm-svn: 368995
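Reordering the intermediate steps of a floating-point sum is only safe under relaxed math because FP addition is not associative. The sketch below compares a strict left-to-right sum with a pairwise sum of four floats. The pairwise order is just one possible reordering, chosen for illustration; it is not claimed to be the order any particular x86 instruction sequence uses.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Strict left-to-right sum: the order a plain scalar loop would use.
float sequentialSum(const std::array<float, 4> &V) {
  float Sum = 0.0f;
  for (std::size_t I = 0; I < V.size(); ++I)
    Sum += V[I];
  return Sum;
}

// Pairwise sum: adjacent lanes are combined first, then the partial sums.
float pairwiseSum(const std::array<float, 4> &V) {
  float Lo = V[0] + V[1];
  float Hi = V[2] + V[3];
  return Lo + Hi;
}
```

With IEEE-754 single precision and the input `{1.0e8f, 1.0f, -1.0e8f, 1.0f}`, the two functions return different values, because `1.0e8f + 1.0f` rounds back to `1.0e8f`. That difference is what the reassociation check guards against.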
* [ARM] MVE predicate store patterns | David Green | 2019-08-15 | 1 | -0/+7
| | | | | | | | | Stack loads and stores were already working, but direct stores were not. This adds the patterns for them, same as predicate loads. Differential Revision: https://reviews.llvm.org/D66213 llvm-svn: 368988
* [AArch64] Change location of frame-record within callee-save area. | Sander de Smalen | 2019-08-15 | 3 | -26/+88

This patch changes the location of the frame-record (FP, LR) to the bottom of the callee-saved area. According to the AAPCS the location of the frame-record within the stackframe is unspecified (section 5.2.3 The Frame Pointer), so the compiler should be free to choose a different location.

The reason for changing the location of the frame-record is to prepare the frame for allocating an SVE area below the callee-saves. This way the compiler can use the VL-scaled addressing modes to directly access SVE objects from the frame-pointer.

          :                         :
      | stack |                 | stack |
      | args  |                 | args  |
      +-------+                 +-------+
      | x30   |                 | x19   |
      | x29   |                 | x20   |
FP -> |- - - -|                 | x21   |
      | x19   |       ==>       | x22   |
      | x20   |                 |- - - -|
      | x21   |                 | x30   |
      | x22   |                 | x29   |
      +-------+                 +-------+ <- FP
      |///////|                 |///////| // realignment gap
      |- - - -|                 |- - - -|
      |spills/|                 |spills/|
      | locals|                 | locals|
SP -> +-------+                 +-------+ <- SP

Things to point out:
- The algorithm to find a paired register should be prevented from accidentally pairing some callee-saved register with LR that is not FP, since they should always be paired together when the frame has a frame-record.
- For Darwin platforms the location of the frame-record is unchanged, since the unwind encoding does not allow for encoding this position dynamically and other tools currently depend on the former layout.

Reviewers: efriedma, rovka, rengolin, thegameg, greened, t.p.northover
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D65653
llvm-svn: 368987
* [ARM] MVE trunc to i1 vectors | David Green | 2019-08-15 | 1 | -0/+7
| | | | | | | | | This adds patterns for selecting trunc instructions from full vectors to i1's vectors. Differential Revision: https://reviews.llvm.org/D66201 llvm-svn: 368981
* [X86] Add isel pattern to match VZEXT_MOVL and a v2i64 scalar_to_vector ↵ | Craig Topper | 2019-08-15 | 1 | -0/+4
| | | | | | | | | bitcasted from x86mmx to MOVQ2DQ. We already had the pattern for just the scalar to vector and bitcast, but not the case where we wanted zeroes in the high half of the xmm. llvm-svn: 368972
* [X86] Make sure load is non-volatile in the MMX_X86movdq2q (loadv2i64) isel ↵ | Craig Topper | 2019-08-15 | 1 | -1/+1
| | | | | | | | | pattern. This pattern will narrow the load so we should make sure it's not volatile. llvm-svn: 368971
* [X86] Remove unneeded isel pattern for v4f32->v4i32 fp_to_sint and ↵ | Craig Topper | 2019-08-15 | 1 | -3/+0
| | | | | | | | | | conversion to MMX. fp_to_sint is turned into X86cvttp2si during isel preprocessing. The other redundant isel patterns were removed previously, but I missed this one because it's in the MMX td file. llvm-svn: 368968
* [X86] Disable custom type legalization for v2i32/v4i16/v8i8->i64. | Craig Topper | 2019-08-15 | 1 | -2/+1
| | | | | | The default legalization can take care of this. llvm-svn: 368967
* [X86] Disable custom type legalization for v2i32/v4i16/v8i8->f64 bitcast. | Craig Topper | 2019-08-15 | 1 | -1/+2
| | | | | | | The generic legalization handles this in the same way so just use that. llvm-svn: 368966
* [X86] Remove some unreachable code from LowerBITCAST. | Craig Topper | 2019-08-15 | 1 | -42/+26
| | | | llvm-svn: 368965
* [X86] Remove some dead code and combine some repeated code that's left. | Craig Topper | 2019-08-15 | 1 | -17/+3
| | | | | | | | If the width is 256 bits, then we must have AVX so the else here was unnecessary. Once that's removed then the >= 256 bit code is identical to the 128 bit code with a different VT so combine them. llvm-svn: 368956
* [AArch64][GlobalISel] Custom selection for s8 load acquire. | Amara Emerson | 2019-08-14 | 1 | -1/+8
| | | | | | | | | Implement this single atomic load instruction so that we can compile stack protector code. Differential Revision: https://reviews.llvm.org/D66245 llvm-svn: 368923
* InferAddressSpaces: Move target intrinsic handling to TTI | Matt Arsenault | 2019-08-14 | 2 | -0/+45
| | | | | | | | I'm planning on handling intrinsics that will benefit from checking the address space enums. Don't bother moving the address collection for now, since those won't need the enums. llvm-svn: 368895
* [WebAssembly] Stop unrolling SIMD shifts since they are fixed in V8 | Thomas Lively | 2019-08-14 | 1 | -5/+0
| | | | | | | | | | | | | | | | | Summary: Fixes PR42973. Tests don't change because simd-arith.ll tests behavior on unimplemented-simd128, which does not include any temporary workarounds such as the one removed in this revision. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66166 llvm-svn: 368868
* [X86] Use PSADBW for v8i8 addition reductions. | Craig Topper | 2019-08-14 | 1 | -2/+12
| | | | | | | | Improves the 8 byte case from PR42674. Differential Revision: https://reviews.llvm.org/D66069 llvm-svn: 368864
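PSADBW sums the absolute differences of eight unsigned bytes, so comparing a vector against zero yields the sum of its bytes. The sketch below is a scalar model of that idea, not the actual lowering code; the helper names `sadBytes` and `reduceAddV8I8` are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of one 64-bit lane of a sum-of-absolute-differences operation:
// the sum of |A[i] - B[i]| over eight unsigned bytes (at most 8 * 255).
uint16_t sadBytes(const std::array<uint8_t, 8> &A,
                  const std::array<uint8_t, 8> &B) {
  uint16_t Sum = 0;
  for (std::size_t I = 0; I < 8; ++I) {
    unsigned Diff = A[I] > B[I] ? A[I] - B[I] : B[I] - A[I];
    Sum = static_cast<uint16_t>(Sum + Diff);
  }
  return Sum;
}

// An i8 add reduction wraps modulo 256, so only the low byte of the sum of
// absolute differences against zero is kept.
uint8_t reduceAddV8I8(const std::array<uint8_t, 8> &V) {
  const std::array<uint8_t, 8> Zero{};
  return static_cast<uint8_t>(sadBytes(V, Zero));
}
```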
* [NFC][AIX] Change assertion | Xiangling Liao | 2019-08-14 | 1 | -1/+1
| | | | | | | | | Address one left comment on https://reviews.llvm.org/D63547. A minor change for assertion. Differential Revision: https://reviews.llvm.org/D63547 llvm-svn: 368860
* [X86][CostModel] Adjust the costs of ZERO_EXTEND/SIGN_EXTEND with less than ↵ | Craig Topper | 2019-08-14 | 1 | -10/+12
| | | | | | | | | | | | 128-bit inputs Now that we legalize by widening, the element types here won't change. Previously these were modeled as the elements being widened and then the instruction might become an AND or SHL/ASHR pair. But now they'll become something like a ZERO_EXTEND_VECTOR_INREG/SIGN_EXTEND_VECTOR_INREG. For AVX2, when the destination type is legal it's clear the cost should be 1 since we have extend instructions that can produce 256 bit vectors from less than 128 bit vectors. I'm a little less sure about AVX1 costs, but I think the ones I changed were definitely too high, but they might still be too high. Differential Revision: https://reviews.llvm.org/D66169 llvm-svn: 368858
* [X86] Add llvm_unreachable to a switch that covers all expected values. | Craig Topper | 2019-08-14 | 1 | -0/+1
| | | | llvm-svn: 368857
* [PowerPC][NFC] Consolidate duplicate XX3Form_SetZero and XX3Form_Zero. | Jinsong Ji | 2019-08-14 | 2 | -11/+4
| | | | | | Rename one to XX3Form_SameOp, remove the other one. llvm-svn: 368856
* [AIX] Add call lowering for parameters that could pass onto FPRs | Jason Liu | 2019-08-14 | 2 | -4/+26
| | | | | | | | | | Summary: This patch adds call lowering functionality to enable passing parameters onto floating point registers when needed. Differential Revision: https://reviews.llvm.org/D63654 llvm-svn: 368855
* [AArch64][GlobalISel] RBS: Treat s128s like vectors when unmerging. | Amara Emerson | 2019-08-13 | 1 | -1/+1
| | | | | | | | The destinations should be FPRs (for now). Differential Revision: https://reviews.llvm.org/D66184 llvm-svn: 368775
* [AArch64] Remove incorrect usage of MONonTemporal. | Eli Friedman | 2019-08-13 | 1 | -2/+1
| | | | | | | This has no effect at the moment, but might matter if we try to implement non-temporal loads in the future. llvm-svn: 368770
* [AIX]Lowering global address for 32/64bit small/large code models | Xiangling Liao | 2019-08-13 | 6 | -46/+119
| | | | | | | | | | | This patch implements global address lowering for 32/64 bit with small/large code models. 1. For the 32-bit large code model on AIX, there are newly added pseudo opcodes LWZtocL and ADDIStocHA32, support for which in the MC layer will be provided by future patches. 2. The default code model on AIX should be the small code model. 3. Since AIX does not have a medium code model, call "report_fatal_error" when users specify it. Differential Revision: https://reviews.llvm.org/D63547 llvm-svn: 368744
* [AMDGPU] Fix to 'Fold readlane from copy of SGPR or imm' | Tim Renouf | 2019-08-13 | 1 | -0/+3
| | | | | | | | | That change (r363670) could leave a copy from vgpr to sgpr. Fixed. Differential Revision: https://reviews.llvm.org/D66133 Change-Id: I00c3fe6fda2e8e1e36f53195b881b1449c777ea4 llvm-svn: 368736
* [ARM] Add MVE beats vector cost model | David Green | 2019-08-13 | 4 | -21/+87
| | | | | | | | | | | | | | | | | | | | | | | | The MVE architecture has the idea of "beats", where a vector instruction can be executed over several ticks of the architecture. This adds a similar system into the Arm backend cost model, multiplying the cost of all vector instructions by a factor. This factor essentially becomes the expected difference between scalar code and vector code, on average. MVE Vector instructions can also overlap so the true cost of them is often lower. But equally scalar instructions can in some situations be dual issued, or have other optimisations such as unrolling or make use of dsp instructions. The default is chosen as 2. This should not prevent vectorisation in most cases (as the vector instructions will still be doing at least 4 times the work), but it will help prevent over vectorising in cases where the benefits are less likely. This adds things so far to the obvious places in ARMTargetTransformInfo, and updates a few related costs like not treating float instructions as cost 2 just because they are floats. Differential Revision: https://reviews.llvm.org/D66005 llvm-svn: 368733
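The scaling idea can be sketched as a multiplication applied only to vector instructions. All names below are hypothetical and do not mirror the interface of ARMTargetTransformInfo; only the default factor of 2 comes from the commit message, and the profitability comparison is a deliberately simplified assumption.

```cpp
#include <cassert>

// Default factor stated in the commit message; the constant name is made up.
constexpr unsigned DefaultVectorCostFactor = 2;

// Vector instructions have their base cost multiplied by the factor;
// scalar instructions are left unchanged.
constexpr unsigned instructionCost(unsigned BaseCost, bool IsVector,
                                   unsigned Factor = DefaultVectorCostFactor) {
  return IsVector ? BaseCost * Factor : BaseCost;
}

// Simplified assumption: vectorising looks profitable when one scaled vector
// operation is cheaper than the scalar operations it replaces.
constexpr bool vectorisationLooksProfitable(unsigned Lanes,
                                            unsigned ScalarCostPerLane,
                                            unsigned VectorBaseCost) {
  return instructionCost(VectorBaseCost, true) <
         Lanes * instructionCost(ScalarCostPerLane, false);
}
```

With four lanes of unit-cost work, the scaled vector cost of 2 still beats the scalar cost of 4, which matches the claim that the factor should not block vectorisation in most cases. With only two lanes the comparison no longer favours the vector form.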
* Use Register over unsigned in LateEHPrepare (NFC) | Heejin Ahn | 2019-08-13 | 1 | -1/+1
| | | | | | | | | | | | | | | | Summary: While D65962 is pending for review, I landed D65475 that added one more use of `unsigned`. Changed it to `Register`. Reviewers: dsanders Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66064 llvm-svn: 368727
* Reland r368691: "[AIX] Implement LR prolog/epilog save/restore" | Hubert Tong | 2019-08-13 | 2 | -6/+34
| | | | | | | | | | | | | | | | | | | | | | Trying again with the code changes (and not just the new test). Summary: This patch fixes the offsets of fields in the stack frame linkage save area for AIX. Reviewers: sfertile, hubert.reinterpretcast, jasonliu, Xiangling_L, xingxue, ZarkoCA, daltenty Reviewed By: hubert.reinterpretcast Subscribers: wuzish, nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64424 Patch by Chris Bowler! llvm-svn: 368721
* GlobalISel: Partially implement fewerElementsVector G_UNMERGE_VALUES | Matt Arsenault | 2019-08-13 | 1 | -1/+10
| | | | | | Odd sized vectors aren't handled yet. llvm-svn: 368713
* [ARM] Fix detection of duplicates when parsing reg list operands | Momchil Velikov | 2019-08-13 | 1 | -19/+43
| | | | | | Differential Revision: https://reviews.llvm.org/D65957 llvm-svn: 368712
* [ARM] Fix encoding of APSR in CLRM instruction | Momchil Velikov | 2019-08-13 | 2 | -16/+7
| | | | | | | | | The APSR is encoded by setting bit 15 in the register list of the CLRM instruction (cf. https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf). Differential Revision: https://reviews.llvm.org/D65873 llvm-svn: 368711
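A rough sketch of building such a register-list mask. Only the use of bit 15 for APSR comes from the commit message; the mapping of general-purpose register rN to bit N, and the names `APSRBit` and `encodeRegList`, are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Bit 15 of the register list selects APSR, per the commit message.
constexpr uint16_t APSRBit = 1u << 15;

// Assumption: general-purpose register rN occupies bit N of the list.
uint16_t encodeRegList(const std::vector<unsigned> &GPRs, bool IncludeAPSR) {
  uint16_t Mask = 0;
  for (unsigned N : GPRs) {
    assert(N < 15 && "bit 15 is reserved for APSR in this sketch");
    Mask = static_cast<uint16_t>(Mask | (1u << N));
  }
  if (IncludeAPSR)
    Mask = static_cast<uint16_t>(Mask | APSRBit);
  return Mask;
}
```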
* GlobalISel: Implement lower for G_SHUFFLE_VECTOR | Matt Arsenault | 2019-08-13 | 1 | -0/+3
| | | | llvm-svn: 368709
* GlobalISel: Change representation of shuffle masks | Matt Arsenault | 2019-08-13 | 2 | -44/+8
| | | | | | | | | | | | | | | | | | Currently shufflemasks get emitted as any other constant, and you end up with a bunch of virtual registers of G_CONSTANT with a G_BUILD_VECTOR. The AArch64 selector then asserts on anything that doesn't fit this pattern. This isn't an ideal representation, and should avoid legalization and have fewer opportunities for a representational error. Rather than invent a new shuffle mask operand type, similar to what ShuffleVectorSDNode does, just track the original IR Constant mask operand. I don't completely like the idea of adding another link to the IR, but MIR is already quite dependent on IR constants, and this will allow sharing the shuffle mask utility functions with the IR. llvm-svn: 368704
* [X86] XFormVExtractWithShuffleIntoLoad - handle shuffle mask scaling | Simon Pilgrim | 2019-08-13 | 1 | -13/+27
| | | | | | | | | | If the target shuffle mask is from a wider type, attempt to scale the mask so that the extraction can attempt to peek through. Fixes the regression mentioned in rL368662 Reapplying this as rL368308 had to be reverted as part of rL368660 to revert rL368276 llvm-svn: 368663
* [X86] SimplifyDemandedVectorElts - attempt to recombine target shuffle using ↵ | Simon Pilgrim | 2019-08-13 | 1 | -0/+17
| | | | | | | | | | | | | | DemandedElts mask (reapplied) If we don't demand all elements, then attempt to combine to a simpler shuffle. At the moment we can only do this if Depth == 0 as combineX86ShufflesRecursively uses Depth to track whether the shuffle has really changed or not - we'll need to change this before we can properly start merging combineX86ShufflesRecursively into SimplifyDemandedVectorElts. The insertps-combine.ll regression is because XFormVExtractWithShuffleIntoLoad can't see through shuffles of different widths - this will be fixed in a follow-up commit. Reapplying this as rL368307 had to be reverted as part of rL368660 to revert rL368276 llvm-svn: 368662
* Revert r368276 "[TargetLowering] SimplifyDemandedBits - call ↵ | Hans Wennborg | 2019-08-13 | 1 | -44/+13
| | | | | | | | | | | | | | | | | | | | | | SimplifyMultipleUseDemandedBits for ISD::EXTRACT_VECTOR_ELT" This introduced a false positive MemorySanitizer warning about use of uninitialized memory in a vectorized crc function in Chromium. That suggests maybe something is not right with this transformation. See https://crbug.com/992853#c7 for a reproducer. This also reverts the follow-up commits r368307 and r368308 which depended on this. > This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract. > > In particular this helps remove some unnecessary scalar->vector->scalar patterns. > > The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen are to be refactored in the near-term and isn't considered a major issue. > > Differential Revision: https://reviews.llvm.org/D65887 llvm-svn: 368660
* [PowerPC] Fix ICE when truncating some vectors | Qiu Chaofan | 2019-08-13 | 1 | -1/+3
| | | | | | | | | | | | The legalizer would hit an assertion on PowerPC platform when truncating a vector whose size is not power of 2. This patch is to add a check to prevent vectors with such odd-size elements from being custom lowered. Reviewed By: Hal Finkel Differential Revision: https://reviews.llvm.org/D65261 llvm-svn: 368654
* [AArch64][GlobalISel] Replace explicit vreg creation with implicit using ↵ | Amara Emerson | 2019-08-13 | 1 | -3/+4
| | | | | | SrcOp. NFC. llvm-svn: 368653
* [GlobalISel] Make the InstructionSelector instance non-const, allowing state ↵ | Amara Emerson | 2019-08-13 | 15 | -64/+62
| | | | | | | | | | | | | | | | to be maintained. Currently we can't keep any state in the selector object that we get from subtarget. As a result we have to plumb through all our variables through multiple functions. This change makes it non-const and adds a virtual init() method to allow further state to be captured for each target. AArch64 makes use of this in this patch to cache a call to hasFnAttribute() which is expensive to call, and is used on each selection of G_BRCOND. Differential Revision: https://reviews.llvm.org/D65984 llvm-svn: 368652
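The caching pattern described here can be sketched with stand-in classes. `MockFunction`, `SelectorBase`, and `CachingSelector` are hypothetical and much simpler than the real GlobalISel types; the sketch only shows a virtual `init()` capturing an expensive query once so that a non-const `select()` can reuse it.

```cpp
#include <cassert>

// Hypothetical stand-in for a function whose attribute lookup is costly.
// It counts lookups so the effect of caching is observable.
struct MockFunction {
  bool HasAttr = false;
  mutable unsigned Lookups = 0;
  bool hasFnAttribute() const {
    ++Lookups;
    return HasAttr;
  }
};

class SelectorBase {
public:
  virtual ~SelectorBase() = default;
  // Called once per function; targets override it to capture state.
  virtual void init(const MockFunction &) {}
  // Non-const, so implementations may read and update member state.
  virtual bool select() = 0;
};

class CachingSelector : public SelectorBase {
  bool CachedAttr = false;

public:
  void init(const MockFunction &F) override { CachedAttr = F.hasFnAttribute(); }
  // Uses the cached value instead of querying the function again.
  bool select() override { return CachedAttr; }
};
```

However many times `select()` runs after `init()`, the lookup counter stays at one, which is the saving the commit message describes for each selection of G_BRCOND.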
* [AMDGPU] Fix msan failure in printf lowering | Stanislav Mekhanoshin | 2019-08-13 | 1 | -5/+3
| | | | llvm-svn: 368645
* [AMDGPU] removed unused functions from printf lowering | Stanislav Mekhanoshin | 2019-08-12 | 1 | -21/+0
| | | | | | Differential Revision: https://reviews.llvm.org/D66117 llvm-svn: 368633
* [WinEH] Fix catch block parent frame pointer offset | Reid Kleckner | 2019-08-12 | 1 | -3/+8
| | | | | | | | | | | | r367088 made it so that funclets store XMM registers into their local frame instead of storing them to the parent frame. However, that change forgot to update the parent frame pointer offset for catch blocks. This change does that. Fixes crashes when an exception is rethrown in a catch block that saves XMMs, as described in https://crbug.com/992860. llvm-svn: 368631
* [risc-v] Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM | Daniel Sanders | 2019-08-12 | 6 | -49/+49
| | | | | | | | | | | | | | | | | | | Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Depends on D65919 Reviewers: lenary Subscribers: jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision for full review was: https://reviews.llvm.org/D65962 llvm-svn: 368629
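The kind of declaration this check rewrites can be sketched with a simplified stand-in for `llvm::Register`. The `sketch::Register` class and the `createVirtualRegister` helper below are hypothetical; the real class has more functionality, and only the implicit conversion to `unsigned` described in the commit message is modelled.

```cpp
#include <cassert>

namespace sketch {

// Simplified stand-in for llvm::Register: wraps an unsigned value and
// converts to unsigned implicitly.
class Register {
  unsigned Reg;

public:
  constexpr Register(unsigned Val = 0) : Reg(Val) {}
  constexpr operator unsigned() const { return Reg; }
  constexpr bool isValid() const { return Reg != 0; }
};

// Hypothetical API that returns a register.
inline Register createVirtualRegister(unsigned Index) {
  return Register(Index + 1);
}

inline bool example() {
  // What the check flags: the initializer implicitly casts Register to
  // unsigned, discarding the type.
  unsigned Old = createVirtualRegister(4);
  // What the check rewrites it to: the variable keeps the Register type.
  Register New = createVirtualRegister(4);
  return New.isValid() && Old == New;
}

} // namespace sketch
```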
* [aarch64] Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM | Daniel Sanders | 2019-08-12 | 21 | -159/+159
| | | | | | | | | | | | | | | | | | | | | | | Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Manual fixups in: AArch64InstrInfo.cpp - genFusedMultiply() now takes a Register* instead of unsigned* AArch64LoadStoreOptimizer.cpp - Ternary operator was ambiguous between Register/MCRegister. Settled on Register Depends on D65919 Reviewers: aemerson Subscribers: jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision for full review was: https://reviews.llvm.org/D65962 llvm-svn: 368628
* [webassembly] Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM | Daniel Sanders | 2019-08-12 | 13 | -45/+45
| | | | | | | | | | | | | | | | | Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Reviewers: aheejin Subscribers: jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision for whole review: https://reviews.llvm.org/D65962 llvm-svn: 368627
* [AMDGPU] Use PredicateControl in MIMGBaseOpcode. NFC. | Stanislav Mekhanoshin | 2019-08-12 | 1 | -2/+2
| | | | | | | | | | | | This is infrastructural, will be needed for future work. For some reason it was only used in MIMG_NoSampler, while needed everywhere we use MIMGBaseOpcode if we want to use predicates. Differential Revision: https://reviews.llvm.org/D66115 llvm-svn: 368626
* [X86] Allow combineTruncateWithSat to use pack instructions for i16->i8 ↵ | Craig Topper | 2019-08-12 | 1 | -1/+2
| | | | | | | | | | | without AVX512BW. We need AVX512BW to be able to truncate an i16 vector. If we don't have that we have to extend i16->i32, then trunc, i32->i8. But we won't be able to remove the min/max if we do that. At least not without more special handling. llvm-svn: 368623
* [X86] Remove unreachable code from LowerTRUNCATE. NFC | Craig Topper | 2019-08-12 | 1 | -16/+4
| | | | | | | | All three 256->128 bit cases were already handled above. Noticed while looking at the coverage report. llvm-svn: 368609