summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/InstCombine
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU: Rename intrinsics to use amdgcn prefixMatt Arsenault2016-01-221-1/+1
| | | | | | | | | | | The intrinsic target prefix should match the target name as it appears in the triple. This is not yet complete, but gets most of the important ones. llvm.AMDGPU.* intrinsics used by mesa and libclc are still handled for compatability for now. llvm-svn: 258557
* [opaque pointer types] [NFC] FindAvailableLoadedValue: take LoadInst instead ↵Eduard Burtescu2016-01-221-1/+1
| | | | | | | | | | | | of just the pointer. Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16422 llvm-svn: 258477
* don't repeat function names in comments; NFCSanjay Patel2016-01-201-19/+14
| | | | llvm-svn: 258360
* 80-cols; NFCSanjay Patel2016-01-201-2/+2
| | | | llvm-svn: 258323
* remove outdated comment; NFCSanjay Patel2016-01-191-4/+0
| | | | llvm-svn: 258147
* [opaque pointer types] [NFC] GEP: replace get(Pointer)ElementType uses with ↵Eduard Burtescu2016-01-192-16/+13
| | | | | | | | | | | | | | | | | | get{Source,Result}ElementType. Summary: GEPOperator: provide getResultElementType alongside getSourceElementType. This is made possible by adding a result element type field to GetElementPtrConstantExpr, which GetElementPtrInst already has. GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType. Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16275 llvm-svn: 258145
* combine clauses with same output ; NFCISanjay Patel2016-01-181-8/+3
| | | | llvm-svn: 258062
* use m_OneUse ; NFCISanjay Patel2016-01-181-4/+2
| | | | llvm-svn: 258059
* fix variable names, typos ; NFCSanjay Patel2016-01-181-36/+36
| | | | llvm-svn: 258058
* fix typo; NFCSanjay Patel2016-01-181-1/+1
| | | | llvm-svn: 258057
* [opaque pointer types] [breaking-change] [NFC] SimplifyGEPInst: take the ↵Manuel Jacob2016-01-171-1/+1
| | | | | | | | | | | | | | source element type of the GEP as an argument. Patch by Eduard Burtescu. Reviewers: dblaikie, mjacob Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16281 llvm-svn: 258024
* GlobalValue: use getValueType() instead of getType()->getPointerElementType().Manuel Jacob2016-01-162-3/+2
| | | | | | | | | | | | Reviewers: mjacob Subscribers: jholewinski, arsenm, dsanders, dblaikie Patch by Eduard Burtescu. Differential Revision: http://reviews.llvm.org/D16260 llvm-svn: 257999
* Re-commit r257064, after it was reverted in r257340.Silviu Baranga2016-01-151-3/+320
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This contains a fix for the issue that caused the revert: we no longer assume that we can insert instructions after the instruction that produces the base pointer. We previously assumed that this would be ok, because the instruction produces a value and therefore is not a terminator. This is false for invoke instructions. We will now insert these new instruction directly at the location of the users. Original commit message: [InstCombine] Look through PHIs, GEPs, IntToPtrs and PtrToInts to expose more constants when comparing GEPs Summary: When comparing two GEP instructions which have the same base pointer and one of them has a constant index, it is possible to only compare indices, transforming it to a compare with a constant. This removes one use for the GEP instruction with the constant index, can reduce register pressure and can sometimes lead to removing the comparisson entirely. InstCombine was already doing this when comparing two GEPs if the base pointers were the same. However, in the case where we have complex pointer arithmetic (GEPs applied to GEPs, PHIs of GEPs, conversions to or from integers, etc) the value of the original base pointer will be hidden to the optimizer and this transformation will be disabled. This change detects when the two sides of the comparison can be expressed as GEPs with the same base pointer, even if they don't appear as such in the IR. The transformation will convert all the pointer arithmetic to arithmetic done on indices and all the relevant uses of GEPs to GEPs with a common base pointer. The GEP comparison will be converted to a comparison done on indices. Reviewers: majnemer, jmolloy Subscribers: hfinkel, jevinskie, jmolloy, aadg, llvm-commits Differential Revision: http://reviews.llvm.org/D15146 llvm-svn: 257897
* Change isSafeToLoadUnconditionally arguments order. Separated from ↵Artur Pilipenko2016-01-151-2/+2
| | | | | | http://reviews.llvm.org/D10920. llvm-svn: 257894
* [InstCombine] Rewrite bswap/bitreverse handling completely.James Molloy2016-01-151-179/+8
| | | | | | | | | | | | | | There are several requirements that ended up with this design; 1. Matching bitreversals is too heavyweight for InstCombine and doesn't really need to be done so early. 2. Bitreversals and byteswaps are very related in their matching logic. 3. We want to implement support for matching more advanced bswap/bitreverse patterns like partial bswaps/bitreverses. 4. Bswaps are best matched early in InstCombine. The result of these is that a new utility function is created in Transforms/Utils/Local.h that can be configured to search for bswaps, bitreverses or both. InstCombine uses it to find only bswaps, CGP uses it to find only bitreversals. We can then extend the matching logic in one place only. llvm-svn: 257875
* function names start with a lower case letter ; NFCSanjay Patel2016-01-122-6/+6
| | | | llvm-svn: 257496
* Revert r257164 - it has caused spec2k6 failures in LTO modeSilviu Baranga2016-01-111-322/+3
| | | | llvm-svn: 257340
* InstCombineCompares.cpp: Fix a warning. [-Wbraced-scalar-init]NAKAMURA Takumi2016-01-081-1/+1
| | | | llvm-svn: 257167
* Re-commit r257064, this time with a fixed assertSilviu Baranga2016-01-081-3/+322
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In setInsertionPoint if the value is not a PHI, Instruction or Argument it should be a Constant, not a ConstantExpr. Original commit message: [InstCombine] Look through PHIs, GEPs, IntToPtrs and PtrToInts to expose more constants when comparing GEPs Summary: When comparing two GEP instructions which have the same base pointer and one of them has a constant index, it is possible to only compare indices, transforming it to a compare with a constant. This removes one use for the GEP instruction with the constant index, can reduce register pressure and can sometimes lead to removing the comparisson entirely. InstCombine was already doing this when comparing two GEPs if the base pointers were the same. However, in the case where we have complex pointer arithmetic (GEPs applied to GEPs, PHIs of GEPs, conversions to or from integers, etc) the value of the original base pointer will be hidden to the optimizer and this transformation will be disabled. This change detects when the two sides of the comparison can be expressed as GEPs with the same base pointer, even if they don't appear as such in the IR. The transformation will convert all the pointer arithmetic to arithmetic done on indices and all the relevant uses of GEPs to GEPs with a common base pointer. The GEP comparison will be converted to a comparison done on indices. Reviewers: majnemer, jmolloy Subscribers: hfinkel, jevinskie, jmolloy, aadg, llvm-commits Differential Revision: http://reviews.llvm.org/D15146 llvm-svn: 257164
* [InstCombine] insert a new shuffle in a safe place (PR25999)Sanjay Patel2016-01-081-10/+7
| | | | | | | | Limit this transform to a basic block and guard against PHIs. Hopefully, this fixes the remaining failures in PR25999: https://llvm.org/bugs/show_bug.cgi?id=25999 llvm-svn: 257133
* Revert r257064. It caused failures in some sanitizer tests.Silviu Baranga2016-01-071-322/+3
| | | | llvm-svn: 257069
* Fix build after r257064: we should be returning false, not nullptrSilviu Baranga2016-01-071-2/+2
| | | | llvm-svn: 257067
* [InstCombine] Look through PHIs, GEPs, IntToPtrs and PtrToInts to expose ↵Silviu Baranga2016-01-071-3/+322
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | more constants when comparing GEPs Summary: When comparing two GEP instructions which have the same base pointer and one of them has a constant index, it is possible to only compare indices, transforming it to a compare with a constant. This removes one use for the GEP instruction with the constant index, can reduce register pressure and can sometimes lead to removing the comparisson entirely. InstCombine was already doing this when comparing two GEPs if the base pointers were the same. However, in the case where we have complex pointer arithmetic (GEPs applied to GEPs, PHIs of GEPs, conversions to or from integers, etc) the value of the original base pointer will be hidden to the optimizer and this transformation will be disabled. This change detects when the two sides of the comparison can be expressed as GEPs with the same base pointer, even if they don't appear as such in the IR. The transformation will convert all the pointer arithmetic to arithmetic done on indices and all the relevant uses of GEPs to GEPs with a common base pointer. The GEP comparison will be converted to a comparison done on indices. Reviewers: majnemer, jmolloy Subscribers: hfinkel, jevinskie, jmolloy, aadg, llvm-commits Differential Revision: http://reviews.llvm.org/D15146 llvm-svn: 257064
* fix typo; NFCSanjay Patel2016-01-061-1/+1
| | | | llvm-svn: 256883
* [InstCombine] insert a new shuffle before its uses (PR26015)Sanjay Patel2016-01-051-8/+21
| | | | | | | | | | | | | | | | Although this solves the test case in PR26015: https://llvm.org/bugs/show_bug.cgi?id=26015 And may solve PR25999: https://llvm.org/bugs/show_bug.cgi?id=25999 ...I suspect this is not the best solution. I think we want to insert the new shuffle just ahead of the earliest ExtractElementInst that we're replacing, but I don't know how that should be implemented. Differential Revision: http://reviews.llvm.org/D15878 llvm-svn: 256857
* [Statepoints] Refactor GCRelocateOperands into an intrinsic wrapper. NFC.Manuel Jacob2016-01-051-2/+1
| | | | | | | | | | | | | | | Summary: This commit renames GCRelocateOperands to GCRelocateInst and makes it an intrinsic wrapper, similar to e.g. MemCpyInst. Also, all users of GCRelocateOperands were changed to use the new intrinsic wrapper instead. Reviewers: sanjoy, reames Subscribers: reames, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D15762 llvm-svn: 256811
* [InstructionCombining] prepareICWorklistFromFunction halts in infinite loop ↵Chen Li2016-01-041-3/+2
| | | | | | | | | | | | | | with instructions of token type Summary: This patch fixes a bug in prepareICWorklistFromFunction, where the loop becomes infinite with instructions of token type. The patch checks if the instruction is token type, and if so it updates EndInst with the current instruction. Reviewers: reames, majnemer Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D15859 llvm-svn: 256792
* fix formatting; NFCSanjay Patel2015-12-301-8/+8
| | | | llvm-svn: 256645
* [InstCombine] transform more extract/insert pairs into shuffles (PR2109)Sanjay Patel2015-12-241-3/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is an extension of the shuffle combining from r203229: http://reviews.llvm.org/rL203229 The idea is to widen a short input vector with undef elements so the existing shuffle transform for extract/insert can kick in. The motivation is to finally solve PR2109: https://llvm.org/bugs/show_bug.cgi?id=2109 For that example, the IR becomes: %1 = bitcast <2 x i32>* %P to <2 x float>* %ld1 = load <2 x float>, <2 x float>* %1, align 8 %2 = shufflevector <2 x float> %ld1, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef> %i2 = shufflevector <4 x float> %A, <4 x float> %2, <4 x i32> <i32 0, i32 1, i32 4, i32 5> ret <4 x float> %i2 And x86 SSE output improves from: movq (%rdi), %xmm1 ## xmm1 = mem[0],zero movdqa %xmm1, %xmm2 shufps $229, %xmm2, %xmm2 ## xmm2 = xmm2[1,1,2,3] shufps $48, %xmm0, %xmm1 ## xmm1 = xmm1[0,0],xmm0[3,0] shufps $132, %xmm1, %xmm0 ## xmm0 = xmm0[0,1],xmm1[0,2] shufps $32, %xmm0, %xmm2 ## xmm2 = xmm2[0,0],xmm0[2,0] shufps $36, %xmm2, %xmm0 ## xmm0 = xmm0[0,1],xmm2[2,0] retq To the almost optimal: movhpd (%rdi), %xmm0 Note: There's a tension in the existing transform related to generating arbitrary shufflevector masks. We avoid that in other places in InstCombine because we're scared that codegen can't handle strange masks, but it looks like we're ok with producing those here. I purposely chose weird insert/extract indexes for the regression tests to see the effect in these cases. For PowerPC+Altivec, AArch64, and X86+SSE/AVX, I think the codegen is equal or better for these examples. Differential Revision: http://reviews.llvm.org/D15096 llvm-svn: 256394
* [OperandBundles] Have InstCombine play nice with operand bundlesDavid Majnemer2015-12-231-4/+6
| | | | | | | Don't assume a call's use corresponds to an argument operand, it might correspond to a bundle operand. llvm-svn: 256327
* [InstCombine] Fix indentation. NFC.Craig Topper2015-12-211-2/+2
| | | | llvm-svn: 256131
* [InstCombine] Extend peephole DSE to handle unordered atomicsPhilip Reames2015-12-171-6/+11
| | | | | | | | | | | | This extends the same line of reasoning used in EarlyCSE w/http://reviews.llvm.org/D15352 to the DSE implementation in InstCombine. Key points: * We only remove unordered or simple stores. * The loads producing values consumed by dead stores don't influence whether the store is dead. Differential Revision: http://reviews.llvm.org/D15354 llvm-svn: 255932
* [InstCombine] Adding "\n" to debug output. NFC.Weiming Zhao2015-12-171-2/+2
| | | | | | | | | | | | | | | Summary: [InstCombine] Adding '\n' to debug output. NFC. Patch by Zhaoshi Zheng <zhaoshiz@codeaurora.org> Reviewers: apazos, majnemer, weimingz Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15403 llvm-svn: 255920
* InstCombineLoadStoreAlloca.cpp: Avoid instantiating Twine.NAKAMURA Takumi2015-12-151-4/+9
| | | | llvm-svn: 255637
* Instcombine: destructor loads of structs that do not contains paddingMehdi Amini2015-12-152-5/+59
| | | | | | | | | | | | | | | | | | | | For non padded structs, we can just proceed and deaggregate them. We don't want ot do this when there is padding in the struct as to not lose information about this padding (the subsequents passes would then try hard to preserve the padding, which is undesirable). Also update extractvalue.ll and cast.ll so that they use structs with padding. Remove the FIXME in the extractvalue of laod case as the non padded case is handled when processing the load, and we don't want to do it on the padded case. Patch by: Amaury SECHET <deadalnix@gmail.com> Differential Revision: http://reviews.llvm.org/D14483 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255600
* getParent() ^ 3 == getModule() ; NFCISanjay Patel2015-12-145-20/+14
| | | | llvm-svn: 255511
* [InstCombine] fold trunc ([lshr] (bitcast vector) ) --> extractelement (PR25543)Sanjay Patel2015-12-141-55/+47
| | | | | | | | | | | | | | | | | | | | | | | This is a fix for PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543 The idea is to take the existing fold of: bitcast ( trunc ( lshr ( bitcast X))) --> extractelement (bitcast X) ( http://reviews.llvm.org/rL112232 ) And break it into less specific transforms so we'll catch more cases such as the example in the bug report: bitcast ( trunc ( lshr ( bitcast X))) --> bitcast ( extractelement (bitcast X)) --> extractelement (bitcast X) Enabling patches for this change: http://reviews.llvm.org/rL255399 (combine bitcasts) http://reviews.llvm.org/rL255433 (canonicalize extractelement(bitcast X)) Differential Revision: http://reviews.llvm.org/D15392 llvm-svn: 255504
* [InstCombine] canonicalize (bitcast (extractelement X)) --> ↵Sanjay Patel2015-12-121-28/+17
| | | | | | | | | | | | | | | | | | | | | (extractelement(bitcast X)) This change was discussed in D15392. It allows us to remove the fold that was added in: http://reviews.llvm.org/r255261 ...and it will allow us to generalize this fold: http://reviews.llvm.org/rL112232 while preserving the order of bitcast + extract that it produces and testing shows is better handled by the backend. Note that the existing check for "isVectorTy()" wasn't strong enough in general and specifically because: x86_mmx. It's not a vector, but it's not vectorizable either. So here we check VectorType::isValidElementType() directly before proceeding with the transform. llvm-svn: 255433
* [InstCombine] Make MatchBSwap also match bit reversalsJames Molloy2015-12-112-103/+136
| | | | | | MatchBSwap has most of the functionality to match bit reversals already. If we switch it from looking at bytes to individual bits and remove a few early exits, we can extend the main recursive function to match any sequence of ORs, ANDs and shifts that assemble a value from different parts of another, base value. Once we have this bit->bit mapping, we can very simply detect if it is appropriate for a bswap or bitreverse. llvm-svn: 255334
* [InstCombine] fold bitcasts around an extractelement (3rd try)Sanjay Patel2015-12-101-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a redo of r255137 (reverted at r255227) which was a redo of r255124 (reverted at r255126) with a fixed check for a scalar source type and an added test for the failure that caused the revert. Original commit message: Example: bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float ---> extractelement <2 x float> %X, i32 1 This is part of fixing PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543 The next step will be to generalize this fold: trunc ( lshr ( bitcast X) ) -> extractelement (X) Ie, I'm hoping to replace the existing transform of: bitcast ( trunc ( lshr ( bitcast X))) added by: http://reviews.llvm.org/rL112232 with 2 less specific transforms to catch the case in the bug report. Differential Revision: http://reviews.llvm.org/D14879 llvm-svn: 255261
* Revert r255137.Akira Hatanaka2015-12-101-39/+0
| | | | | | This commit broke apple's internal bot. llvm-svn: 255227
* [InstCombine] fold bitcasts around an extractelement (2nd try)Sanjay Patel2015-12-091-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a redo of r255124 (reverted at r255126) with an added check for a scalar destination type and an added test for the failure seen in Clang's test/CodeGen/vector.c. The extra test shows a different missing optimization. Original commit message: Example: bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float ---> extractelement <2 x float> %X, i32 1 This is part of fixing PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543 The next step will be to generalize this fold: trunc ( lshr ( bitcast X) ) -> extractelement (X) Ie, I'm hoping to replace the existing transform of: bitcast ( trunc ( lshr ( bitcast X))) added by: http://reviews.llvm.org/rL112232 with 2 less specific transforms to catch the case in the bug report. Differential Revision: http://reviews.llvm.org/D14879 llvm-svn: 255137
* Revert "[InstCombine] fold bitcasts around an extractelement"Mehdi Amini2015-12-091-37/+0
| | | | | | | | | This reverts commit r255124. Broke http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/4193/steps/test/logs/stdio From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255126
* [InstCombine] fold bitcasts around an extractelementSanjay Patel2015-12-091-0/+37
| | | | | | | | | | | | | | | | | | | | | | | | Example: bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float ---> extractelement <2 x float> %X, i32 1 This is part of fixing PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543 The next step will be to generalize this fold: trunc ( lshr ( bitcast X) ) -> extractelement (X) Ie, I'm hoping to replace the existing transform of: bitcast ( trunc ( lshr ( bitcast X))) added by: http://reviews.llvm.org/rL112232 with 2 less specific transforms to catch the case in the bug report. Differential Revision: http://reviews.llvm.org/D14879 llvm-svn: 255124
* [InstCombine] Call getCmpPredicateForMinMax only with a valid SPFSanjoy Das2015-12-051-1/+5
| | | | | | | | | | | | | | | | Summary: There are `SelectPatternFlavor`s that don't represent min or max idioms, and we should not be passing those to `getCmpPredicateForMinMax`. Fixes PR25745. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15249 llvm-svn: 254869
* Move EH-specific helper functions to a more appropriate placeDavid Majnemer2015-12-021-1/+1
| | | | | | No functionality change is intended. llvm-svn: 254562
* Do (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1 rather than (A == C1 || A ↵David Majnemer2015-12-021-4/+4
| | | | | | | | | | == C2) -> (A | (C1 ^ C2)) == C2 when C1 ^ C2 is a power of 2. Differential Revision: http://reviews.llvm.org/D14223 Patch by Amaury SECHET! llvm-svn: 254518
* [AttributeSet] Overload AttributeSet::addAttribute to reduce compileAkira Hatanaka2015-12-021-7/+14
| | | | | | | | | | | | | | | | | | | | time. The new overloaded function is used when an attribute is added to a large number of slots of an AttributeSet (for example, to function parameters). This is much faster than calling AttributeSet::addAttribute once per slot, because AttributeSet::getImpl (which calls FoldingSet::FIndNodeOrInsertPos) is called only once per function instead of once per slot. With this commit, clang compiles a file which used to take over 22 minutes in just 13 seconds. rdar://problem/23581000 Differential Revision: http://reviews.llvm.org/D15085 llvm-svn: 254491
* fix typos in comments; NFCSanjay Patel2015-11-291-6/+8
| | | | llvm-svn: 254266
* [OperandBundles] Extract duplicated code into a helper function, NFCSanjoy Das2015-11-251-5/+1
| | | | llvm-svn: 254047
OpenPOWER on IntegriCloud