path: root/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
Commit message (Author, Age, Files changed, Lines changed)
...
* [InstSimplify] Move masked.gather w/no active lanes handling to InstSimplify from InstCombine (Philip Reames, 2019-04-22, 1 file, -5/+0)
  In the process, use the existing masked.load combine, which is slightly stronger, and handles a mix of zero and undef elements in the mask.
  llvm-svn: 358913
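  An illustrative sketch of the no-active-lanes fold (not from the commit; names are hypothetical):

    ; the mask is all-false, so nothing is loaded and the gather folds to its passthru
    %r = call <2 x i32> @llvm.masked.gather.v2i32.v2p0i32(<2 x i32*> %ptrs, i32 4, <2 x i1> zeroinitializer, <2 x i32> %passthru)
    ; -> %r is replaced by %passthru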
* [InstCombine] Factor out unreachable inst idiom creation [NFC] (Philip Reames, 2019-04-17, 1 file, -10/+3)
  In InstCombine, we use an idiom of "store i1 true, i1* undef" to indicate we've found a path which we've proven unreachable. We can't actually insert the unreachable instruction, since that would require changing the CFG. We leave that to SimplifyCFG later. This just factors out that idiom creation so we don't duplicate the same, mostly undocumented, idiom creation in multiple places.
  llvm-svn: 358600
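  A minimal sketch of the idiom itself (illustrative, not from the commit):

    ; marker for a provably-unreachable path: a store through an undef pointer is
    ; immediate UB, and SimplifyCFG later rewrites the block to end in 'unreachable'
    store i1 true, i1* undef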
* [InstCombine] Prune fshl/fshr with masked operands (Nikita Popov, 2019-04-16, 1 file, -0/+4)
  If a constant shift amount is used, then only some of the LHS/RHS operand bits are demanded and we may be able to simplify based on that. InstCombineSimplifyDemanded already had the necessary support for that; we just weren't calling it with fshl/fshr as root. In particular, this allows us to relax some masked funnel shifts into simple shifts, as shown in the tests.
  Patch by Shawn Landden.
  Differential Revision: https://reviews.llvm.org/D60660
  llvm-svn: 358515
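  A hypothetical before/after of a masked funnel shift relaxing to a plain shift (not from the commit):

    ; %y is masked to its low 16 bits, so the half fshl demands here (the high 16) is known zero
    %ym = and i32 %y, 65535
    %r  = call i32 @llvm.fshl.i32(i32 %x, i32 %ym, i32 16)
    ; fshl(x, y, 16) == (x << 16) | (y >> 16), so this relaxes to:
    ; %r = shl i32 %x, 16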
* [PGO] Profile guided code size optimization. (Hiroshi Yamauchi, 2019-04-15, 1 file, -1/+1)
  Summary: Enable some of the existing size optimizations for cold code under PGO; this gives a ~5% code size saving in a big internal app under PGO. The way it gets BFI/PSI is discussed in the RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2019-March/130894.html
  Note it doesn't currently touch loop passes.
  Reviewers: davidxl, eraman
  Reviewed By: eraman
  Subscribers: mgorny, javed.absar, smeenai, mehdi_amini, eraman, zzheng, steven_wu, dexonsmith, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D59514
  llvm-svn: 358422
* [InstCombine] ssubo X, C -> saddo X, -C (Nikita Popov, 2019-04-10, 1 file, -0/+21)
  ssubo X, C is equivalent to saddo X, -C. Make the transformation in InstCombine and allow the logic implemented for saddo to fold prior usages of add nsw or sub nsw with constants.
  Patch by Dan Robertson.
  Differential Revision: https://reviews.llvm.org/D60061
  llvm-svn: 358099
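  An illustrative instance (not from the commit); the fold needs -C to be representable, i.e. C must not be the minimum signed value:

    %r = call { i32, i1 } @llvm.ssub.with.overflow.i32(i32 %x, i32 7)
    ; -> %r = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %x, i32 -7)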
* [InstCombine][X86] Expand MOVMSK to generic IR (PR39927) (Simon Pilgrim, 2019-04-08, 1 file, -40/+14)
  First step towards removing the MOVMSK intrinsics completely. This patch expands MOVMSK to a generic pattern, e.g. for PMOVMSKB(v16i8 x):
    %cmp = icmp slt <16 x i8> %x, zeroinitializer
    %int = bitcast <16 x i1> %cmp to i16
    %res = zext i16 %int to i32
  which is correctly handled by ISel and FastISel (give or take an annoying movzx): https://godbolt.org/z/rkrSFW
  Differential Revision: https://reviews.llvm.org/D60256
  llvm-svn: 357909
* [InstCombine] Simplify ctpop with bitreverse/bswap (David Bolvansky, 2019-04-03, 1 file, -0/+8)
  Summary: Fixes PR41337.
  Reviewers: spatel
  Reviewed By: spatel
  Subscribers: llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D60148
  llvm-svn: 357564
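  A hypothetical example of the fold (not from the commit): permuting bits or bytes leaves the population count unchanged.

    %b = call i32 @llvm.bitreverse.i32(i32 %x)
    %p = call i32 @llvm.ctpop.i32(i32 %b)
    ; -> %p = call i32 @llvm.ctpop.i32(i32 %x)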
* [InstCombine] Simplify ctlz/cttz with bitreverse (David Bolvansky, 2019-04-02, 1 file, -1/+9)
  Summary: Fixes PR41273.
  Reviewers: spatel
  Reviewed By: spatel
  Subscribers: llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D60096
  llvm-svn: 357521
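  Illustrative sketch (not from the commit): bit-reversal swaps leading and trailing zero counts.

    %b = call i32 @llvm.bitreverse.i32(i32 %x)
    %z = call i32 @llvm.ctlz.i32(i32 %b, i1 false)
    ; -> %z = call i32 @llvm.cttz.i32(i32 %x, i1 false)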
* [instcombine] Add some todos, and arrange code for readability (Philip Reames, 2019-03-21, 1 file, -32/+34)
  llvm-svn: 356642
* Simplify operands of masked stores and scatters based on demanded elements (Philip Reames, 2019-03-20, 1 file, -8/+49)
  If we know we're not storing a lane, we don't need to compute the lane. This could be improved by using the undef element result to further prune the mask, but I want to separate that into its own change since it's relatively likely to expose other problems.
  Differential Revision: https://reviews.llvm.org/D57247
  llvm-svn: 356590
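  A sketch of the idea on a masked store (illustrative only):

    ; lane 1 of the mask is known false, so lane 1 of %v is never stored and
    ; whatever computes that lane is undemanded (and can potentially be deleted)
    call void @llvm.masked.store.v2i32.p0v2i32(<2 x i32> %v, <2 x i32>* %p, i32 4, <2 x i1> <i1 true, i1 false>)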
* [InstCombine] Fold add nuw + uadd.with.overflow (Nikita Popov, 2019-03-20, 1 file, -6/+9)
  Fold add nuw and uadd.with.overflow with constants if the addition does not overflow.
  Part of https://bugs.llvm.org/show_bug.cgi?id=38146.
  Patch by Dan Robertson.
  Differential Revision: https://reviews.llvm.org/D59471
  llvm-svn: 356584
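  Hypothetical before/after (not from the commit):

    %a = add nuw i32 %x, 5
    %r = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %a, i32 10)
    ; the constants sum without wrapping, so this folds to:
    ; %r = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %x, i32 15)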
* [instcombine] Add todos describing missing transforms for masked.* intrinsics (Philip Reames, 2019-03-20, 1 file, -0/+17)
  llvm-svn: 356536
* [InstCombine] allow general vector constants for funnel shift to shift transforms (Sanjay Patel, 2019-03-18, 1 file, -20/+13)
  Follow-up to rL356338 and rL356369. We can calculate an arbitrary vector constant minus the bitwidth, so there's no need to limit this transform to scalars and splats.
  llvm-svn: 356372
* [InstCombine] extend rotate-left-by-constant canonicalization to funnel shift (Sanjay Patel, 2019-03-18, 1 file, -9/+10)
  Follow-up to rL356338. Rotates are a special case of funnel shift where the 2 input operands are the same value, but that does not need to be a restriction for the canonicalization when the shift amount is a constant.
  llvm-svn: 356369
* [InstCombine] canonicalize rotate right by constant to rotate left (Sanjay Patel, 2019-03-17, 1 file, -7/+20)
  This was noted as a backend problem (https://bugs.llvm.org/show_bug.cgi?id=41057) and subsequently fixed for x86 in rL356121, but we should canonicalize these in IR for the benefit of all targets and to improve IR analyses such as CSE.
  llvm-svn: 356338
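  Sketch of the canonicalization (illustrative, not from the commit):

    ; rotate right by 8 == rotate left by bitwidth - 8
    %r = call i32 @llvm.fshr.i32(i32 %x, i32 %x, i32 8)
    ; -> %r = call i32 @llvm.fshl.i32(i32 %x, i32 %x, i32 24)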
* [SimplifyDemandedVec] Strengthen handling all undef lanes (particularly GEPs) (Philip Reames, 2019-03-15, 1 file, -2/+1)
  A change of two parts:
  1) A generic enhancement for all callers of SDVE to exploit the fact that if all lanes are undef, the result is undef.
  2) A GEP-specific piece to strengthen/fix the vector index undef element handling, and call into the generic infrastructure when visiting the GEP.
  The result is that we replace a vector GEP with at least one undef in each lane with undef. We can also do the same for vector intrinsics. Once the masked.load patch (D57372) has landed, I'll update to include call tests as well.
  Differential Revision: https://reviews.llvm.org/D57468
  llvm-svn: 356293
* [InstCombine] canonicalize funnel shift constant shift amount to be modulo bitwidth (Sanjay Patel, 2019-03-14, 1 file, -2/+13)
  The shift argument is defined to be modulo the bitwidth, so if that argument is a constant, we can always reduce the constant to its minimal form to allow better CSE and other follow-on transforms. We need to be careful to ignore constant expressions here, or we will likely infinite loop. I'm adding a general vector constant query for that case.
  Differential Revision: https://reviews.llvm.org/D59374
  llvm-svn: 356192
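  Illustrative example (not from the commit):

    ; the shift amount is defined modulo the bitwidth, so 33 reduces to 1
    %r = call i32 @llvm.fshl.i32(i32 %x, i32 %y, i32 33)
    ; -> %r = call i32 @llvm.fshl.i32(i32 %x, i32 %y, i32 1)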
* IR: Add immarg attribute (Matt Arsenault, 2019-03-12, 1 file, -13/+6)
  This indicates an intrinsic parameter is required to be a constant, and should not be replaced with a non-constant value. Add the attribute to all AMDGPU and generic intrinsics that comments indicate it should apply to. I scanned other target intrinsics, but I don't see any obvious comments indicating which arguments are intended to be only immediates.
  This breaks one questionable testcase for the autoupgrade. I'm unclear on whether the autoupgrade is supposed to really handle declarations which were never valid. The verifier fails because the attributes now refer to a parameter past the end of the argument list.
  llvm-svn: 355981
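  For example (a sketch, not from the commit; signature from the typed-pointer era), memset's volatile flag must remain an immediate:

    ; the trailing i1 flag must stay a literal constant, hence immarg
    declare void @llvm.memset.p0i8.i64(i8* nocapture writeonly, i8, i64, i1 immarg)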
* [InstCombine] Fold add nsw + sadd.with.overflow (Nikita Popov, 2019-03-06, 1 file, -10/+40)
  Fold `add nsw` and `sadd.with.overflow` with constants if the addition does not overflow.
  Part of https://bugs.llvm.org/show_bug.cgi?id=38146.
  Patch by Dan Robertson.
  Differential Revision: https://reviews.llvm.org/D58881
  llvm-svn: 355530
* [InstCombine] Fix an unused variable warning. (Craig Topper, 2019-02-10, 1 file, -1/+1)
  llvm-svn: 353630
* Implementation of asm-goto support in LLVM (Craig Topper, 2019-02-08, 1 file, -18/+39)
  This patch accompanies the RFC posted here: http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html
  This patch adds a new CallBr IR instruction to support asm-goto inline assembly like GCC's, as used by the Linux kernel. This instruction is both a call instruction and a terminator instruction with multiple successors. Only inline assembly usage is supported today.
  This also adds a new INLINEASM_BR opcode to SelectionDAG and MachineIR to represent an INLINEASM block that is also considered a terminator instruction.
  There will likely be more bug fixes and optimizations to follow this, but we felt it had reached a point where we would like to switch to an incremental development model.
  Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii
  Differential Revision: https://reviews.llvm.org/D53765
  llvm-svn: 353563
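  A minimal sketch of the new instruction (an assumed shape based on the LangRef of that era, not from the commit):

    ; a call that can fall through to %normal or branch to the asm-goto label %fail
    callbr void asm sideeffect "jmp ${0:l}", "X"(i8* blockaddress(@f, %fail))
        to label %normal [label %fail]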
* [opaque pointer types] Pass function type for CallBase::setCalledFunction. (James Y Knight, 2019-02-01, 1 file, -8/+7)
  Differential Revision: https://reviews.llvm.org/D57174
  llvm-svn: 352914
* [opaque pointer types] Pass value type to LoadInst creation. (James Y Knight, 2019-02-01, 1 file, -7/+8)
  This cleans up all LoadInst creation in LLVM to explicitly pass the value type rather than deriving it from the pointer's element type.
  Differential Revision: https://reviews.llvm.org/D57172
  llvm-svn: 352911
* [opaque pointer types] Pass function types to CallInst creation. (James Y Knight, 2019-02-01, 1 file, -12/+16)
  This cleans up all CallInst creation in LLVM to explicitly pass a function type rather than deriving it from the pointer's element type.
  Differential Revision: https://reviews.llvm.org/D57170
  llvm-svn: 352909
* [InstCombine] reduce duplicate code; NFC (Sanjay Patel, 2019-02-01, 1 file, -2/+1)
  An unused variable problem was introduced with rL352870 and stubbed out with rL352871, but we can make a better fix by actually using the local variable in code rather than just in the assert.
  llvm-svn: 352873
* [InstCombine] Fix -Wunused-variable when -DLLVM_ENABLE_ASSERTIONS=off (Fangrui Song, 2019-02-01, 1 file, -0/+1)
  llvm-svn: 352871
* [InstCombine] try to reduce x86 addcarry to generic uaddo intrinsic (Sanjay Patel, 2019-02-01, 1 file, -0/+33)
  If we can reduce the x86-specific intrinsic to the generic op, it allows existing simplifications and value tracking folds. AFAICT, this always results in identical x86 codegen in the non-reduced case... which should be true because we semi-generically (too aggressively IMO) convert to llvm.uadd.with.overflow in CGP, so the DAG/isel must already combine/lower this intrinsic as expected.
  This isn't quite what was requested in https://bugs.llvm.org/show_bug.cgi?id=40486, but we want to have these kinds of folds early for efficiency and to enable greater simplifications. For the case in the bug report where we have _addcarry_u64(0, ahi, 0, &ahi), this gets completely simplified away in IR.
  Differential Revision: https://reviews.llvm.org/D57453
  llvm-svn: 352870
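  A hedged sketch of the reduction (not from the commit; the { i8, i64 } return shape of the x86 intrinsic is an assumption here):

    ; with a zero carry-in, addcarry is just an unsigned overflowing add
    %s = call { i8, i64 } @llvm.x86.addcarry.64(i8 0, i64 %a, i64 %b)
    ; -> %u = call { i64, i1 } @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
    ;    (with the carry-out repackaged as an i8)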
* [CallSite removal] Remove CallSite uses from InstCombine. (Craig Topper, 2019-01-31, 1 file, -87/+81)
  Reviewers: chandlerc
  Reviewed By: chandlerc
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D57494
  llvm-svn: 352771
* Add a 'dynamic' parameter to the objectsize intrinsic (Erik Pilkington, 2019-01-30, 1 file, -3/+2)
  This is meant to be used with clang's __builtin_dynamic_object_size. When 'true' is passed to this parameter, the intrinsic has the potential to be folded into instructions that will be evaluated at run time. When 'false', the objectsize intrinsic behaviour is unchanged.
  rdar://32212419
  Differential revision: https://reviews.llvm.org/D56761
  llvm-svn: 352664
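  A sketch of the extended signature (illustrative; typed-pointer syntax of that era):

    ; args: pointer, min-or-max on unknown size, null-is-unknown, and the new 'dynamic' flag
    declare i64 @llvm.objectsize.i64.p0i8(i8*, i1, i1, i1)
    %sz = call i64 @llvm.objectsize.i64.p0i8(i8* %p, i1 false, i1 true, i1 true)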
* SimplifyDemandedVectorElts for all intrinsics (Philip Reames, 2019-01-30, 1 file, -32/+15)
  The point is that this simplifies integration of new intrinsics into SimplifyDemandedVectorElts, and ensures we don't miss any existing ones.
  This is intended to be NFC-ish, but as seen from the diffs, it can produce slightly different output. This is due to the order of transforms within InstCombine resulting in two slightly different fixed points. That's something we should fix, but it isn't a problem with this patch per se.
  Differential Revision: https://reviews.llvm.org/D57398
  llvm-svn: 352653
* [X86] Auto upgrade VPCOM/VPCOMU intrinsics to generic integer comparisons (Simon Pilgrim, 2019-01-20, 1 file, -55/+0)
  This causes a couple of changes in the upgrade tests, as signed/unsigned eq/ne are equivalent and we constant-fold true/false codes; these changes are the same as what we already do for AVX512 cmp/ucmp.
  Noticed while cleaning up vector integer comparison costs for PR40376.
  llvm-svn: 351697
* Update the file headers across all of the LLVM projects in the monorepo to reflect the new license (Chandler Carruth, 2019-01-19, 1 file, -4/+3)
  We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
  Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
  llvm-svn: 351636
* [InstCombine] Avoid introduction of unaligned mem access (Serguei Katkov, 2019-01-16, 1 file, -3/+20)
  InstCombine is able to transform a mem transfer intrinsic into a lone store or a store/load pair. This might result in generation of an unaligned atomic load/store, which the backend will later transform into a libcall. That is not an evident gain, so it is better to keep the intrinsic as is and handle it in the backend.
  Reviewers: reames, anna, apilipenko, mkazantsev
  Reviewed By: reames
  Subscribers: t.p.northover, jfb, llvm-commits
  Differential Revision: https://reviews.llvm.org/D56582
  llvm-svn: 351295
* AMDGPU: Add a fast path for icmp.i1(src, false, NE) (Marek Olsak, 2019-01-15, 1 file, -0/+5)
  Summary: This allows moving the condition from the intrinsic to the standard ICmp opcode, so that LLVM can do simplifications on it. The icmp.i1 intrinsic is an identity for retrieving the SGPR mask, and we can also get the mask from and i1, or i1, xor i1.
  Reviewers: arsenm, nhaehnle
  Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D52060
  llvm-svn: 351150
* [X86][SSE] Auto upgrade PADDS/PSUBS intrinsics to SADD_SAT/SSUB_SAT generic intrinsics (llvm) (Simon Pilgrim, 2018-12-21, 1 file, -78/+0)
  This auto-upgrades the signed SSE saturated math intrinsics to the SADD_SAT/SSUB_SAT generic intrinsics.
  Clang counterpart: https://reviews.llvm.org/D55890
  Differential Revision: https://reviews.llvm.org/D55894
  llvm-svn: 349892
* Introduce llvm.loop.parallel_accesses and llvm.access.group metadata. (Michael Kruse, 2018-12-20, 1 file, -0/+5)
  The current llvm.mem.parallel_loop_access metadata has a problem in that it uses LoopIDs. LoopID unfortunately is not a loop identifier: it is neither unique (there's even a regression test assigning the same LoopID to multiple loops; it can otherwise happen if passes such as LoopVersioning make copies of entire loops) nor persistent (every time a property is removed from or added to a LoopID's MDNode, it will also receive a new LoopID; this happens e.g. when calling Loop::setLoopAlreadyUnrolled()). Since most loop transformation passes change the loop attributes (even if just to mark that a loop should not be processed again, as llvm.loop.isvectorized does for the versioned and unversioned loop), the parallel access information is lost for any subsequent pass.
  This patch unlinks LoopIDs and parallel accesses. llvm.mem.parallel_loop_access metadata on an instruction is replaced by llvm.access.group metadata. llvm.access.group points to a distinct MDNode with no operands (avoiding the problem of ever needing to add/remove operands), called an "access group". Alternatively, it can point to a list of access groups. The LoopID then has an attribute llvm.loop.parallel_accesses with all the access groups that are parallel (no dependencies carried by this loop).
  This intentionally avoids any kind of "ID". Loops that are cloned or have their attributes modified retain the llvm.loop.parallel_accesses attribute. Access instructions that are cloned point to the same access group. It is not necessary for each access to have its own "ID" MDNode; memory access instructions with the same behavior can be grouped together.
  The behavior of llvm.mem.parallel_loop_access is not changed by this patch, but should be considered deprecated.
  Differential Revision: https://reviews.llvm.org/D52116
  llvm-svn: 349725
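  A sketch of the new metadata shape (illustrative, not from the commit):

    %v = load i32, i32* %p, !llvm.access.group !2
    br i1 %cond, label %loop, label %exit, !llvm.loop !0
    ...
    !0 = distinct !{!0, !1}                          ; the LoopID
    !1 = !{!"llvm.loop.parallel_accesses", !2}       ; these access groups carry no loop dependencies
    !2 = distinct !{}                                ; an access group: a distinct node with no operands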
* [InstCombine] try to convert x86 movmsk intrinsic to generic IR (PR39927) (Sanjay Patel, 2018-12-11, 1 file, -20/+38)
  call iM movmsk(sext <N x i1> X) --> zext (bitcast <N x i1> X to iN) to iM
  This has the potential to create less-than-8-bit scalar types, as shown in some of the test diffs, but it looks like the backend knows how to deal with that in these patterns.
  This is the simple part of the fix suggested in https://bugs.llvm.org/show_bug.cgi?id=39927.
  Differential Revision: https://reviews.llvm.org/D55529
  llvm-svn: 348862
* [InstCombine] Support ssub.sat canonicalization for non-splats (Nikita Popov, 2018-12-01, 1 file, -5/+4)
  Extend the ssub.sat(X, C) -> sadd.sat(X, -C) canonicalization to also support non-splat vector constants. This is done by generalizing the implementation of the isNotMinSignedValue() helper to return true for constants that are non-splat but don't contain any signed-min elements.
  Differential Revision: https://reviews.llvm.org/D55011
  llvm-svn: 348072
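  A hypothetical non-splat instance (not from the commit); no element is the signed minimum:

    %r = call <2 x i8> @llvm.ssub.sat.v2i8(<2 x i8> %x, <2 x i8> <i8 3, i8 -7>)
    ; -> %r = call <2 x i8> @llvm.sadd.sat.v2i8(<2 x i8> %x, <2 x i8> <i8 -3, i8 7>)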
* [InstCombine] Combine saturating add/sub with constant operands (Nikita Popov, 2018-11-28, 1 file, -0/+34)
  Combine sat(sat(X + C1) + C2) -> sat(X + (C1+C2)) and sat(sat(X - C1) - C2) -> sat(X - (C1+C2)) if the sign of C1 and C2 matches.
  In the unsigned case we can compute C1+C2 with saturating arithmetic, and InstSimplify will reduce this just to the saturation value. For the signed case, we cannot perform the simplification if the result of the addition overflows.
  This change is part of https://reviews.llvm.org/D54534.
  llvm-svn: 347773
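  A hypothetical instance of the unsigned fold (not from the commit):

    %a = call i8 @llvm.uadd.sat.i8(i8 %x, i8 30)
    %r = call i8 @llvm.uadd.sat.i8(i8 %a, i8 40)
    ; -> %r = call i8 @llvm.uadd.sat.i8(i8 %x, i8 70)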
* [InstCombine] Canonicalize ssub.sat to sadd.sat (Nikita Popov, 2018-11-28, 1 file, -0/+11)
  Canonicalize ssub.sat(X, C) to sadd.sat(X, -C) if C is constant and not the signed minimum. This will help further optimizations to apply.
  This change is part of https://reviews.llvm.org/D54534.
  llvm-svn: 347772
* [InstCombine] Use known overflow information for saturating add/sub (Nikita Popov, 2018-11-28, 1 file, -0/+38)
  If ValueTracking can determine that the add/sub can never overflow, replace it with the corresponding nuw/nsw add/sub. Additionally, for the unsigned case, if ValueTracking determines that the add/sub always overflows, replace the result with the saturation value.
  This change is part of https://reviews.llvm.org/D54534.
  llvm-svn: 347770
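  Illustrative sketch (not from the commit), assuming ValueTracking proves the top bit of %x is clear:

    %r = call i32 @llvm.uadd.sat.i32(i32 %x, i32 1)
    ; %x <= 0x7fffffff, so the add cannot saturate:
    ; %r = add nuw i32 %x, 1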
* [InstCombine] Canonicalize const arg for saturating adds (Nikita Popov, 2018-11-28, 1 file, -0/+6)
  If a saturating add intrinsic has one constant argument, make sure it is on the RHS. This will simplify further transformations.
  This change is part of https://reviews.llvm.org/D54534.
  llvm-svn: 347769
* [InstCombine] add helper function to reduce code duplication; NFC (Sanjay Patel, 2018-11-26, 1 file, -24/+19)
  llvm-svn: 347604
* [InstCombine] Simplify funnel shift with zero/undef operand to shift (Nikita Popov, 2018-11-23, 1 file, -0/+23)
  The following simplifications are implemented:
  * `fshl(X, 0, C) -> shl X, C%BW`
  * `fshl(X, undef, C) -> shl X, C%BW` (assuming undef = 0)
  * `fshl(0, X, C) -> lshr X, BW-C%BW`
  * `fshl(undef, X, C) -> lshr X, BW-C%BW` (assuming undef = 0)
  * `fshr(X, 0, C) -> shl X, BW-C%BW`
  * `fshr(X, undef, C) -> shl X, BW-C%BW` (assuming undef = 0)
  * `fshr(0, X, C) -> lshr X, C%BW`
  * `fshr(undef, X, C) -> lshr X, C%BW` (assuming undef = 0)
  The simplification is only performed if the shift amount C is constant, because we can explicitly compute C%BW and BW-C%BW in this case.
  Differential Revision: https://reviews.llvm.org/D54778
  llvm-svn: 347505
* [InstCombine] fold funnel shift amount based on demanded bits (Sanjay Patel, 2018-11-13, 1 file, -0/+14)
  The shift amount of a funnel shift is modulo the scalar bitwidth (http://llvm.org/docs/LangRef.html#llvm-fshl-intrinsic), so we can use demanded-bits analysis on that operand to simplify it when we have a power-of-2 bitwidth. This is another step towards canonicalizing {shift/shift/or} to the intrinsics in IR.
  Differential Revision: https://reviews.llvm.org/D54478
  llvm-svn: 346814
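  Hypothetical example (not from the commit):

    ; only the low 5 bits of a 32-bit funnel-shift amount are demanded, so the mask is redundant
    %amt = and i32 %a, 31
    %r   = call i32 @llvm.fshl.i32(i32 %x, i32 %y, i32 %amt)
    ; -> %r = call i32 @llvm.fshl.i32(i32 %x, i32 %y, i32 %a)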
* [GC][InstCombine] Fix a potential iteration issue (Philip Reames, 2018-11-12, 1 file, -1/+4)
  Noticed via inspection. Appears to be largely innocuous in practice, but a slight code change could have resulted in either visit-order-dependent missed optimizations or infinite loops. May be a minor compile-time problem today.
  llvm-svn: 346698
* InstCombine: Avoid introducing poison values when lowering llvm.amdgcn.[us]bfe (Tom Stellard, 2018-11-08, 1 file, -13/+5)
  Summary: When the 3rd argument to these intrinsics is zero, lowering them to shift instructions produces poison values, since we end up with shift amounts equal to the number of bits in the shifted value. This means we can only lower these intrinsics if we can prove that the 3rd argument is not zero.
  Reviewers: arsenm
  Reviewed By: arsenm
  Subscribers: bnieuwenhuizen, jvesely, wdng, nhaehnle, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53739
  llvm-svn: 346422
* [InstCombine] Combine nested min/max intrinsics with constants (Volkan Keles, 2018-10-31, 1 file, -1/+35)
  Reviewers: arsenm, spatel
  Reviewed By: spatel
  Subscribers: lebedev.ri, wdng, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53774
  llvm-svn: 345751
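  A plausible shape of the fold (illustrative only; llvm.minnum is an assumed example here):

    %m1 = call float @llvm.minnum.f32(float %x, float 1.0)
    %m2 = call float @llvm.minnum.f32(float %m1, float 2.0)
    ; min(min(x, 1), 2) == min(x, 1):
    ; %m2 = call float @llvm.minnum.f32(float %x, float 1.0)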
* [InstCombine] InstCombine and InstSimplify for minimum and maximum (Thomas Lively, 2018-10-19, 1 file, -5/+22)
  Summary: Depends on D52765
  Reviewers: aheejin, dschuff
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D52766
  llvm-svn: 344799
* [TI removal] Make variables declared as `TerminatorInst` and initialized by `getTerminator()` calls instead be declared as `Instruction` (Chandler Carruth, 2018-10-15, 1 file, -1/+1)
  This is the biggest remaining chunk of the usage of `getTerminator()` that insists on the narrow type, and so is an easy batch of updates. Several files saw more extensive updates where this would cascade to requiring API updates within the file to use `Instruction` instead of `TerminatorInst`. All of these were trivial in nature (pervasively using `Instruction` instead just worked).
  llvm-svn: 344502