summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [PGO] Update ICP pass for recent byval type changesReid Kleckner2019-07-011-0/+9
| | | | | | | | | | Fixes verifier errors encountered in PR42413. Reviewers: xur, t.p.northover, inglorion, gbiv, george.burgess.iv Differential Revision: https://reviews.llvm.org/D63842 llvm-svn: 364861
* AMDGPU: Correct properties for adjcallstack* pseudosMatt Arsenault2019-07-011-0/+4
| | | | | | | These should be SALU writes, and these are lowered to instructions that def SCC. llvm-svn: 364859
* [InstCombine] reduce more checks for power-of-2-or-zero using ctpopSanjay Patel2019-07-011-1/+7
| | | | | | | | | Extends the transform from: rL364341 ...to include another (more common?) pattern that tests whether a value is a power-of-2 (including or excluding zero). llvm-svn: 364856
* [X86] Use v4i32 vzloads instead of v2i64 for vpmovzx/vpmovsx patterns where ↵Craig Topper2019-07-013-9/+7
| | | | | | | | | | | | only 32-bits are loaded. v2i64 vzload defines a 64-bit memory access. It doesn't look like we have any coverage for this either way. Also remove some vzload usages where the instruction loads only 16-bits. llvm-svn: 364851
* [mips] Add missing schedinfo for MIPSeh_return[32|64] instructionsSimon Atanasyan2019-07-011-1/+1
| | | | llvm-svn: 364850
* [mips] Add virtualization ASE to P5600 scheduling definitionsSimon Atanasyan2019-07-011-0/+5
| | | | llvm-svn: 364849
* [mips] Add missing schedinfo for LONG_BRANCH_* instructionsSimon Atanasyan2019-07-012-11/+27
| | | | llvm-svn: 364848
* [X86] Remove several bad load folding isel patterns for VPMOVZX/VPMOVSX.Craig Topper2019-07-012-12/+0
| | | | | | | These patterns all matched a v2i64 vzload which only loads 64-bits to instructions that load a full 128-bits. llvm-svn: 364847
* Revert [SLP] Look-ahead operand reordering heuristic.Jordan Rupprecht2019-07-011-236/+46
| | | | | | | | This reverts r364478 (git commit 574cb0eb3a7ac95e62d223a60bef891171dfe321) The patch is causing compilation timeouts. llvm-svn: 364846
* Testing commit access through minor formatting changeNilanjana Basu2019-07-011-2/+3
| | | | llvm-svn: 364843
* GlobalISel: Try to widen merges with other mergesMatt Arsenault2019-07-011-2/+28
| | | | | | | | If the requested source type an be used as a merge source type, create a merge of merges. This avoids creating large, illegal extensions and bit-ops directly to the result type. llvm-svn: 364841
* [X86] Correct v4f32->v2i64 cvt(t)ps2(u)qq memory isel patternsCraig Topper2019-07-012-2/+93
| | | | | | | | | | | | | | | These instructions only read 64-bits of memory so we shouldn't allow a full vector width load to be pattern matched in case it is marked volatile. Instead allow vzload or scalar_to_vector+load. Also add a DAG combine to turn full vector loads into vzload when used by one of these instructions if the load isn't volatile. This fixes another case for PR42079 llvm-svn: 364838
* AMDGPU/GlobalISel: Handle more input argument intrinsicsMatt Arsenault2019-07-012-41/+72
| | | | llvm-svn: 364836
* AMDGPU/GlobalISel: Lower kernarg segment ptr intrinsicsMatt Arsenault2019-07-013-24/+48
| | | | llvm-svn: 364835
* AMDGPU/GlobalISel: Legalize workgroup ID intrinsicsMatt Arsenault2019-07-012-0/+36
| | | | llvm-svn: 364834
* AMDGPU/GlobalISel: Legalize workitem ID intrinsicsMatt Arsenault2019-07-013-0/+127
| | | | | | | | | Tests don't cover the masked input path since non-kernel arguments aren't lowered yet. Test is copied directly from the existing test, with 2 additions. llvm-svn: 364833
* AMDGPU/GlobalISel: Custom lower control flow intrinsicsMatt Arsenault2019-07-012-0/+68
| | | | | | | | Replace the brcond for the 2 cases that act as branches. For now follow how the current system works, although I think we can eventually get rid of the pseudos. llvm-svn: 364832
* AMDGPU/GlobalISel: Handle 16-bit SALU min/maxMatt Arsenault2019-07-011-5/+19
| | | | | | | | | This needs to be extended to s32, and expanded into cmp+select. This is relying on the fact that widenScalar happens to leave the instruction in place, but this isn't a guaranteed property of LegalizerHelper. llvm-svn: 364831
* AMDGPU/GlobalISel: Lower SALU min/max to cmp+selectMatt Arsenault2019-07-011-6/+41
| | | | | | | Use a change observer to apply a register bank to the newly created intermediate result register. llvm-svn: 364830
* [X86] Avoid SFB - Fix inconsistent codegen with/without debug info(2)Robert Lougher2019-07-011-0/+4
| | | | | | | | | | | | | | The function findPotentialBlockers may consider debug info instructions as potential blockers and may stop searching for a store-load pair prematurely. This patch corrects this and tests the cases where the store is separated from the load by more than InspectionLimit debug instructions. Patch by Chris Dawson. Differential Revision: https://reviews.llvm.org/D62408 llvm-svn: 364829
* AMDGPU/GlobalISel: Legalize s16 add/sub/mulMatt Arsenault2019-07-012-2/+85
| | | | | | | If this is scalar, promote to s32. Use a new observer class to assign the register bank of newly created registers. llvm-svn: 364827
* AMDGPU/GlobalISel: Fix allowing non-boolean conditions for G_SELECTMatt Arsenault2019-07-011-9/+20
| | | | | | | | | The condition register bank must be scc or vcc so that a copy will be inserted, which will be lowered to a compare. Currently greedy unnecessarily forces using a VCC select. llvm-svn: 364825
* GlobalISel: Verify G_MERGE_VALUES operand sizesMatt Arsenault2019-07-011-0/+10
| | | | llvm-svn: 364822
* [GlobalISel]: Allow backends to custom legalize IntrinsicsAditya Nandakumar2019-07-012-0/+10
| | | | | | | | | https://reviews.llvm.org/D31359 Add a hook "legalizeInstrinsic" to allow backends to override this and custom lower/legalize intrinsics. llvm-svn: 364821
* AMDGPU/GlobalISel: RegBankSelect for sendmsg/sendmsghaltMatt Arsenault2019-07-011-3/+29
| | | | llvm-svn: 364819
* AMDGPU/GlobalISel: Legalize s16 fcmpMatt Arsenault2019-07-011-1/+9
| | | | llvm-svn: 364817
* GlobalISel: Implement lower for min/maxMatt Arsenault2019-07-011-0/+36
| | | | llvm-svn: 364816
* AMDGPU/GFX10: implement ds_ordered_count changesNicolai Haehnle2019-07-011-1/+22
| | | | | | | | | | | | | | | | | | | Summary: ds_ordered_count can now simultaneously operate on up to 4 dwords in a single instruction, which are taken from (and returned to) lanes 0..3 of a single VGPR. Change-Id: I19b6e7b0732b617c10a779a7f9c0303eec7dd276 Reviewers: mareko, arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63716 llvm-svn: 364815
* AMDGPU: Support GDS atomicsNicolai Haehnle2019-07-018-54/+97
| | | | | | | | | | | | | | | | | Summary: Original patch by Marek Olšák Change-Id: Ia97d5d685a63a377d86e82942436d1fe6e429bab Reviewers: mareko, arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, jfb, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63452 llvm-svn: 364814
* AMDGPU/GlobalISel: RegBankSelect for DS ordered add/swapMatt Arsenault2019-07-011-2/+31
| | | | llvm-svn: 364811
* AArch64/GlobalISel: Fix trying to select invalid MIRMatt Arsenault2019-07-011-18/+15
| | | | | | Physical registers are not allowed to be a phi operand. llvm-svn: 364810
* AMDGPU/GlobalISel: RegBankSelect for amdgcn.writelaneMatt Arsenault2019-07-011-5/+58
| | | | llvm-svn: 364808
* AMDGPU/GlobalISel: Fail instead of assert when selecting loadsMatt Arsenault2019-07-011-5/+11
| | | | llvm-svn: 364807
* AMDGPU/GlobalISel: Complete implementation of G_GEPMatt Arsenault2019-07-013-53/+79
| | | | | | | | Also works around tablegen defect in selecting add with unused carry, but if we have to manually select GEP, might as well handle add manually. llvm-svn: 364806
* AMDGPU/GlobalISel: Select G_PHIMatt Arsenault2019-07-012-0/+41
| | | | llvm-svn: 364805
* AMDGPU/GlobalISel: Try to select VOP3 form of addMatt Arsenault2019-07-011-0/+20
| | | | | | | | | | | There are several things broken, but at least emit the right thing for gfx9. The import of the pattern with the unused carry out seems to not work. Needs a special class for clamp, because OperandWithDefaultOps doesn't really work. llvm-svn: 364804
* [X86] Add widenSubVector to size in bits helper. NFCI.Simon Pilgrim2019-07-011-4/+16
| | | | | | | | We can already widenSubVector to a specific type (of the same scalar type) - this variant just specifies the target vector size. This will be useful when CombineShuffleWithExtract relaxes the need to have the same scalar type for all shuffle operand subvector sources. llvm-svn: 364803
* AMDGPU/GlobalISel: RegBankSelect for readlane/readfirstlaneMatt Arsenault2019-07-012-0/+82
| | | | llvm-svn: 364801
* AMDGPU/GlobalISel: Implement select for 32-bit G_ADDTom Stellard2019-07-012-2/+7
| | | | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: hiraditya, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58804 llvm-svn: 364797
* [ARM] Fix MVE_VQxDMLxDH instruction classMikhail Maltsev2019-07-011-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: According to the ARMARM, the VQDMLADH, VQRDMLADH, VQDMLSDH and VQRDMLSDH instructions handle their results as follows: "The base variant writes the results into the lower element of each pair of elements in the destination register, whereas the exchange variant writes to the upper element in each pair". I.e., the initial content of the output register affects the result, as usual, we model this with an additional input. Also, for 32-bit variants Qd is not allowed to be the same register as Qm and Qn, we use @earlyclobber to indicate this. This patch also changes vpred_r to vpred_n because the instructions don't have an explicit 'inactive' operand. Reviewers: dmgreen, ostannard, simon_tatham Reviewed By: simon_tatham Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64007 llvm-svn: 364796
* AMDGPU/GlobalISel: Select G_BRCOND for vccMatt Arsenault2019-07-012-25/+44
| | | | llvm-svn: 364795
* [ARM] MVE: support QQPRRegClass and QQQQPRRegClassMikhail Maltsev2019-07-011-2/+3
| | | | | | | | | | | | | | | | | | | Summary: QQPRRegClass and QQQQPRRegClass are used by the interleaving/deinterleaving loads/stores to represent sequences of consecutive SIMD registers. Reviewers: ostannard, simon_tatham, dmgreen Reviewed By: simon_tatham Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64009 llvm-svn: 364794
* [InstCombine] (Y + ~X) + 1 --> Y - X fold (PR42459)Roman Lebedev2019-07-011-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: To be noted, this pattern is not unhandled by instcombine per-se, it is somehow does end up being folded when one runs opt -O3, but not if it's just -instcombine. Regardless, that fold is indirect, depends on some other folds, and is thus blind when there are extra uses. This does address the regression being exposed in D63992. https://godbolt.org/z/7DGltU https://rise4fun.com/Alive/EPO0 Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42459 | PR42459 ]] Reviewers: spatel, nikic, huihuiz Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63993 llvm-svn: 364792
* [InstCombine] Shift amount reassociation in bittest (PR42399)Roman Lebedev2019-07-011-0/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Given pattern: `icmp eq/ne (and ((x shift Q), (y oppositeshift K))), 0` we should move shifts to the same hand of 'and', i.e. rewrite as `icmp eq/ne (and (x shift (Q+K)), y), 0` iff `(Q+K) u< bitwidth(x)` It might be tempting to not restrict this to situations where we know we'd fold two shifts together, but i'm not sure what rules should there be to avoid endless combine loops. We pick the same shift that was originally used to shift the variable we picked to shift: https://rise4fun.com/Alive/6x1v Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42399 | PR42399]]. Reviewers: spatel, nikic, RKSimon Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63829 llvm-svn: 364791
* [Hexagon] Custom-lower UADDO(x, 1) and USUBO(x, 1)Krzysztof Parzyszek2019-07-012-2/+42
| | | | llvm-svn: 364790
* AMDGPU/GlobalISel: Select G_FRAME_INDEXMatt Arsenault2019-07-012-0/+19
| | | | llvm-svn: 364789
* AMDGPU/GFX10: fix scratch resource descriptorNicolai Haehnle2019-07-011-2/+2
| | | | | | | | | | | | | | | | | | | | Summary: The stride should depend on the wave size, not the hardware generation. Also, the 32_FLOAT format is 0x16, not 16; though that shouldn't be relevant. Change-Id: I088f93bf6708974d085d1c50967f119061da6dc6 Reviewers: arsenm, rampitec, mareko Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63808 llvm-svn: 364788
* AMDGPU/GlobalISel: Make s16 select legalMatt Arsenault2019-07-012-7/+9
| | | | | | | This is easy to handle and avoids legalization artifacts which are likely to obscure combines. llvm-svn: 364787
* AMDGPU/GlobalISel: Select G_BRCOND for scc conditionsMatt Arsenault2019-07-012-0/+34
| | | | llvm-svn: 364786
* AMDGPU/GlobalISel: Tolerate copies with no type setMatt Arsenault2019-07-011-3/+6
| | | | | | | isVCC has the same bug, but isn't used in a context where it can cause a problem. llvm-svn: 364784
OpenPOWER on IntegriCloud