path: root/llvm/lib/Transforms/InstCombine
Commit log (author, date; files changed, lines -/+):
* [InstCombine] Don't sink an instr after a catchswitch (David Majnemer, 2016-04-01; 1 file, -1/+5)
      A catchswitch is a terminator; instructions cannot be inserted after it.
      llvm-svn: 265158
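A minimal IR sketch of the constraint (function names are hypothetical): the catchswitch is itself the terminator of its block, so there is no legal point at which InstCombine could sink an instruction after it.
      define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        invoke void @g() to label %cont unwind label %dispatch
      dispatch:
        ; The catchswitch terminates this block; nothing may follow it.
        %cs = catchswitch within none [label %handler] unwind to caller
      handler:
        %cp = catchpad within %cs []
        catchret from %cp to label %cont
      cont:
        ret void
      }
      declare void @g()
      declare i32 @__CxxFrameHandler3(...)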
* AMDGPU: Add frexp_exp intrinsic (Matt Arsenault, 2016-03-30; 1 file, -5/+16)
      llvm-svn: 264944
* AMDGPU: Constant folding for frexp_mant (Matt Arsenault, 2016-03-30; 1 file, -0/+14)
      llvm-svn: 264943
* Minor code cleanup. NFC. (Junmo Park, 2016-03-23; 1 file, -1/+1)
      llvm-svn: 264124
* [InstCombine] Don't insert instructions before a catch switch (David Majnemer, 2016-03-19; 1 file, -0/+3)
      CatchSwitches are not splittable; we cannot insert casts, etc. before
      them. This fixes PR26992.
      llvm-svn: 263874
* [InstCombine] Combine A->B->A BitCast (Guozhi Wei, 2016-03-17; 2 files, -0/+107)
      This patch enhances InstCombine to handle the following case:
          A -> B    bitcast
          PHI
          B -> A    bitcast
      llvm-svn: 263734
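A sketch of the pattern in IR (names hypothetical): values of type A are bitcast to B on each incoming edge, merged by a phi of type B, and bitcast back to A, so the phi can be rewritten in the original type and all the bitcasts removed.
      define i32 @f(i1 %c, i32 %a, i32 %b) {
      entry:
        %fa = bitcast i32 %a to float
        br i1 %c, label %then, label %merge
      then:
        %fb = bitcast i32 %b to float
        br label %merge
      merge:
        %phi = phi float [ %fa, %entry ], [ %fb, %then ]
        %r = bitcast float %phi to i32
        ret i32 %r
      }
      ; After the combine, the merge block is simply:
      ;   %phi = phi i32 [ %a, %entry ], [ %b, %then ]
      ;   ret i32 %phi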
* Also handle the new Rust pers fn to isCatchAll() (Bjorn Steinbrink, 2016-03-15; 1 file, -2/+3)
      llvm-svn: 263585
* [attrs] Handle convergent CallSites. (Justin Lebar, 2016-03-14; 1 file, -1/+10)
      Summary: Previously we had a notion of convergent functions but not of
      convergent calls. This is insufficient to correctly analyze calls where
      the target is unknown, e.g. indirect calls.
      Now a call is convergent if it targets a known-convergent function, or
      if it's explicitly marked as convergent. As usual, we can remove
      convergent where we can prove that no convergent operations are
      performed in the call.
      Originally landed as r261544, then reverted in r261549 for (incidental)
      build breakage. Re-landed here with no changes.
      Reviewers: chandlerc, jingyue
      Subscribers: llvm-commits, tra, jhen, hfinkel
      Differential Revision: http://reviews.llvm.org/D17739
      llvm-svn: 263481
* Remove PreserveNames template parameter from IRBuilder (Mehdi Amini, 2016-03-13; 2 files, -4/+4)
      This reapplies r263258, which was reverted in r263321 because of issues
      on the Clang side.
      From: Mehdi Amini <mehdi.amini@apple.com>
      llvm-svn: 263393
* [x86, InstCombine] delete x86 SSE2 masked store with zero mask (Sanjay Patel, 2016-03-12; 1 file, -0/+6)
      This follows up on the related AVX instruction transforms, but this one
      is too strange to do anything more with. Intel's behavioral description
      of this instruction in its Software Developer's Manual is tragi-comic.
      llvm-svn: 263340
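A sketch of the fold, assuming the SSE2 maskmovdqu intrinsic: a constant all-zero mask writes no bytes, so the whole call is dead and can be erased.
      declare void @llvm.x86.sse2.maskmov.dqu(<16 x i8>, <16 x i8>, i8*)

      define void @dead_masked_store(<16 x i8> %v, i8* %p) {
        ; Every mask element is zero: no byte is written, so InstCombine
        ; can delete this call outright.
        call void @llvm.x86.sse2.maskmov.dqu(<16 x i8> %v, <16 x i8> zeroinitializer, i8* %p)
        ret void
      }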
* Temporarily revert: (Eric Christopher, 2016-03-12; 2 files, -4/+4)
      commit ae14bf6488e8441f0f6d74f00455555f6f3943ac
      Author: Mehdi Amini <mehdi.amini@apple.com>
      Date: Fri Mar 11 17:15:50 2016 +0000
          Remove PreserveNames template parameter from IRBuilder
          Summary: Following r263086, we are now relying on a flag on the
          Context to discard Value names in release builds.
          Reviewers: chandlerc
          Subscribers: mzolotukhin, llvm-commits
          Differential Revision: http://reviews.llvm.org/D18023
          From: Mehdi Amini <mehdi.amini@apple.com>
          git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263258 91177308-0d34-0410-b5e6-96231b3b80d8
      until we can figure out what to do about clang and Release build testing.
      This reverts commit 263258.
      llvm-svn: 263321
* Remove PreserveNames template parameter from IRBuilder (Mehdi Amini, 2016-03-11; 2 files, -4/+4)
      Summary: Following r263086, we are now relying on a flag on the Context
      to discard Value names in release builds.
      Reviewers: chandlerc
      Subscribers: mzolotukhin, llvm-commits
      Differential Revision: http://reviews.llvm.org/D18023
      From: Mehdi Amini <mehdi.amini@apple.com>
      llvm-svn: 263258
* [PM] Make the AnalysisManager parameter to run methods a reference. (Chandler Carruth, 2016-03-11; 1 file, -5/+5)
      This was originally a pointer to support pass managers which didn't use
      AnalysisManagers. However, that doesn't realistically come up much, and
      the complexity of supporting it doesn't really make sense. In fact,
      *many* parts of the pass manager were just assuming the pointer was
      never null already. This at least makes it much more explicit and clear.
      llvm-svn: 263219
* [InstCombine] Use Twines to generate names. (Benjamin Kramer, 2016-03-11; 1 file, -15/+5)
      Since the names are used in a loop, this does more work in debug builds.
      In release builds value names are generally discarded, so we don't have
      to do the concatenation at all. It's also simpler code; no functional
      change intended.
      llvm-svn: 263215
* [ValueTracking] Extract isKnownPositive [NFCI] (Philip Reames, 2016-03-09; 1 file, -2/+2)
      Extract out a generic interface from a recently landed patch and
      document a TODO in case compile time becomes a problem.
      llvm-svn: 263062
* [InstCombine] (icmp sgt smin(PosA, B) 0) -> (icmp sgt B 0) (Philip Reames, 2016-03-09; 1 file, -0/+13)
      When checking whether an smin is positive, we can move the comparison
      to one of the inputs if the other is known positive. If the known
      positive one is the min, then the other can't be negative. If the other
      is the min, then we compute the min.
      Differential Revision: http://reviews.llvm.org/D17873
      llvm-svn: 263059
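A sketch of the fold in IR (names hypothetical; smin is written as the usual icmp+select idiom): because %posA is provably positive, smin(%posA, %b) > 0 holds exactly when %b > 0.
      define i1 @smin_is_positive(i32 %x, i32 %b) {
        %a    = and i32 %x, 255        ; %a is in [0, 255]
        %posA = or i32 %a, 1           ; %posA is in [1, 255]: known positive
        %cmp  = icmp slt i32 %posA, %b
        %min  = select i1 %cmp, i32 %posA, i32 %b   ; smin(%posA, %b)
        ; If %b > 0, the min of two positives is positive; if %b <= 0, the
        ; min is <= %b <= 0. So this folds to: icmp sgt i32 %b, 0
        %res  = icmp sgt i32 %min, 0
        ret i1 %res
      }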
* InstCombine: Restrict computeKnownBits() on all Values to OptLevel > 2 (Matthias Braun, 2016-03-09; 2 files, -11/+22)
      As part of r251146, InstCombine was extended to call computeKnownBits
      on every value in the function to determine whether it happens to be
      constant. This increases typical compile time by 1-3% (5% in irgen+opt
      time) in my measurements. On the other hand, this case did not trigger
      once in the whole llvm-testsuite.
      This patch introduces the notion of ExpensiveCombines, which are only
      enabled for OptLevel > 2. I removed the check in InstructionSimplify,
      as that is called from various places where the OptLevel is not known,
      but given the rarity of the situation I think a check in InstCombine
      is enough.
      Differential Revision: http://reviews.llvm.org/D16835
      llvm-svn: 263047
* Reland r262337 "calculate builtin_object_size if arg is a removable pointer" (Petar Jovanovic, 2016-03-09; 1 file, -8/+25)
      Original commit message:
          calculate builtin_object_size if argument is a removable pointer
          This patch fixes calculating the correct value for the
          builtin_object_size function when the pointer is used only in a
          builtin_object_size call and never after that.
          Patch by Strahinja Petrovic.
          Differential Revision: http://reviews.llvm.org/D17337
      Relands the original change with a small modification (first do a null
      check and then do the cast) to satisfy ubsan.
      llvm-svn: 263011
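A sketch of the scenario this enables (names hypothetical; assumes the standard llvm.objectsize intrinsic and malloc declaration): the pointer's only use is the objectsize call, so the allocation is removable yet the intrinsic can still fold to the allocation size.
      declare i8* @malloc(i64)
      declare i64 @llvm.objectsize.i64.p0i8(i8*, i1)

      define i64 @known_size() {
        ; %p has no use other than the objectsize call, so the allocation
        ; is removable and the intrinsic folds to the allocation size.
        %p = call i8* @malloc(i64 32)
        %n = call i64 @llvm.objectsize.i64.p0i8(i8* %p, i1 false)
        ret i64 %n                     ; folds to: ret i64 32
      }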
* Revert "[InstCombine] Combine A->B->A BitCast"Junmo Park2016-03-082-104/+0
| | | | | | This reverts commit r262670 due to compile failure. llvm-svn: 262916
* [InstCombine] Combine A->B->A BitCastGuozhi Wei2016-03-032-0/+104
| | | | | | | | | | This patch enhances InstCombine to handle following case: A -> B bitcast PHI B -> A bitcast llvm-svn: 262670
* [InstCombine] transform bitcasted bitwise logic ops with constants (PR26702)Sanjay Patel2016-03-031-7/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Given that we're not actually reducing the instruction count in the included regression tests, I think we would call this a canonicalization step. The motivation comes from the example in PR26702: https://llvm.org/bugs/show_bug.cgi?id=26702 If we hoist the bitwise logic ahead of the bitcast, the previously unoptimizable example of: define <4 x i32> @is_negative(<4 x i32> %x) { %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31> %not = xor <4 x i32> %lobit, <i32 -1, i32 -1, i32 -1, i32 -1> %bc = bitcast <4 x i32> %not to <2 x i64> %notnot = xor <2 x i64> %bc, <i64 -1, i64 -1> %bc2 = bitcast <2 x i64> %notnot to <4 x i32> ret <4 x i32> %bc2 } Simplifies to the expected: define <4 x i32> @is_negative(<4 x i32> %x) { %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31> ret <4 x i32> %lobit } Differential Revision: http://reviews.llvm.org/D17583 llvm-svn: 262645
* Explode store of arrays in instcombineAmaury Sechet2016-03-021-1/+33
| | | | | | | | | | | | Summary: This is the last step toward supporting aggregate memory access in instcombine. This explodes stores of arrays into a serie of stores for each element, allowing them to be optimized. Reviewers: joker.eph, reames, hfinkel, majnemer, mgrang Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17828 llvm-svn: 262530
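A sketch of the store side (names hypothetical): an aggregate store of [2 x i32] becomes one scalar store per element.
      define void @explode_store([2 x i32] %a, [2 x i32]* %p) {
        store [2 x i32] %a, [2 x i32]* %p
        ret void
      }
      ; After instcombine, roughly:
      ;   %e0 = extractvalue [2 x i32] %a, 0
      ;   %g0 = getelementptr [2 x i32], [2 x i32]* %p, i64 0, i64 0
      ;   store i32 %e0, i32* %g0
      ;   %e1 = extractvalue [2 x i32] %a, 1
      ;   %g1 = getelementptr [2 x i32], [2 x i32]* %p, i64 0, i64 1
      ;   store i32 %e1, i32* %g1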
* Unpack array of all sizes in InstCombine (Amaury Sechet, 2016-03-02; 1 file, -5/+38)
      Summary: This is another step toward improving fca support. This
      unpacks loads of arrays into a series of loads of the array's elements.
      Reviewers: chandlerc, joker.eph, majnemer, reames, hfinkel
      Subscribers: llvm-commits
      Differential Revision: http://reviews.llvm.org/D15890
      llvm-svn: 262521
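The matching load side, again as a hypothetical sketch: the aggregate load is split into element loads and reassembled with insertvalue.
      ; Before:  %v = load [2 x i32], [2 x i32]* %p
      ; After, roughly:
      ;   %g0 = getelementptr [2 x i32], [2 x i32]* %p, i64 0, i64 0
      ;   %l0 = load i32, i32* %g0
      ;   %v0 = insertvalue [2 x i32] undef, i32 %l0, 0
      ;   %g1 = getelementptr [2 x i32], [2 x i32]* %p, i64 0, i64 1
      ;   %l1 = load i32, i32* %g1
      ;   %v  = insertvalue [2 x i32] %v0, i32 %l1, 1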
* revert r262424 because there's a *clang test* for AArch64 that checks -O3 asm output that is broken by this change (Sanjay Patel, 2016-03-02; 1 file, -17/+5)
      llvm-svn: 262440
* [InstCombine] convert 'isPositive' and 'isNegative' vector comparisons to shifts (PR26701) (Sanjay Patel, 2016-03-01; 1 file, -5/+17)
      As noted in the code comment, I don't think we can do the same
      transform that we do for *scalar* integer comparisons for *vector*
      integer comparisons, because it might pessimize the general case.
      Exhibit A for an incomplete integer comparison ISA remains x86
      SSE/AVX: it only has EQ and GT for integer vectors.
      But we should now recognize all the variants of this construct and
      produce the optimal code for the cases shown in:
      https://llvm.org/bugs/show_bug.cgi?id=26701
      llvm-svn: 262424
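A sketch of the 'isNegative' variant (names hypothetical): the vector compare-plus-sign-extend idiom becomes a single arithmetic shift.
      define <4 x i32> @is_negative(<4 x i32> %x) {
        %cmp = icmp slt <4 x i32> %x, zeroinitializer
        %ext = sext <4 x i1> %cmp to <4 x i32>
        ret <4 x i32> %ext
      }
      ; folds to:
      ;   %ext = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
      ;   ret <4 x i32> %ext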
* Perform InstructionCombiningPass before SampleProfile pass. (Dehao Chen, 2016-03-01; 1 file, -20/+0)
      Summary: The SampleProfile pass needs to be performed after
      InstructionCombiningPass, which helps eliminate un-inlinable function
      calls.
      Reviewers: davidxl, dnovillo
      Subscribers: llvm-commits
      Differential Revision: http://reviews.llvm.org/D17742
      llvm-svn: 262419
* Fix an issue where fast math flags were dropped during scalarization. (Owen Anderson, 2016-03-01; 1 file, -2/+4)
      Most portions of InstCombine properly propagate fast math flags, but
      apparently the vector scalarization section was overlooked.
      llvm-svn: 262376
* Revert "calculate builtin_object_size if argument is a removable pointer" (Petar Jovanovic, 2016-03-01; 1 file, -19/+6)
      Revert r262337, as the "check-llvm ubsan" step failed on the
      sanitizer-x86_64-linux-fast buildbot.
      llvm-svn: 262349
* calculate builtin_object_size if argument is a removable pointer (Petar Jovanovic, 2016-03-01; 1 file, -6/+19)
      This patch fixes calculating the correct value for the
      builtin_object_size function when the pointer is used only in a
      builtin_object_size call and never after that.
      Patch by Strahinja Petrovic.
      Differential Revision: http://reviews.llvm.org/D17337
      llvm-svn: 262337
* [x86, InstCombine] transform more x86 masked loads to LLVM intrinsics (Sanjay Patel, 2016-02-29; 1 file, -1/+7)
      Continuation of: http://reviews.llvm.org/rL262269
      llvm-svn: 262273
* [x86, InstCombine] transform x86 AVX masked loads to LLVM intrinsics (Sanjay Patel, 2016-02-29; 1 file, -1/+39)
      The intended effect of this patch in conjunction with:
          http://reviews.llvm.org/rL259392
          http://reviews.llvm.org/rL260145
      is that customers using the AVX intrinsics in C will benefit from
      combines when the load mask is constant:
          __m128 mload_zeros(float *f) {
            return _mm_maskload_ps(f, _mm_set1_epi32(0));
          }
          __m128 mload_fakeones(float *f) {
            return _mm_maskload_ps(f, _mm_set1_epi32(1));
          }
          __m128 mload_ones(float *f) {
            return _mm_maskload_ps(f, _mm_set1_epi32(0x80000000));
          }
          __m128 mload_oneset(float *f) {
            return _mm_maskload_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0));
          }
      ...so none of the above will actually generate a masked load for
      optimized code.
      This is the masked load counterpart to:
      http://reviews.llvm.org/rL262064
      llvm-svn: 262269
* [InstCombine] Be more conservative about removing stackrestore (Reid Kleckner, 2016-02-27; 1 file, -1/+7)
      We ended up removing a save/restore pair around an inalloca call,
      leading to a miscompile in Chromium.
      llvm-svn: 262095
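A hypothetical sketch of the kind of pattern that must now be preserved: the stackrestore frees the inalloca argument allocated after the stacksave, so the pair around the call is not removable here.
      declare void @g(i32* inalloca)
      declare i8* @llvm.stacksave()
      declare void @llvm.stackrestore(i8*)

      define void @keep_the_pair() {
        %ss = call i8* @llvm.stacksave()
        %arg = alloca inalloca i32
        store i32 0, i32* %arg
        call void @g(i32* inalloca %arg)
        ; This restore frees %arg; erasing the save/restore pair would
        ; change the stack behavior around the inalloca call.
        call void @llvm.stackrestore(i8* %ss)
        ret void
      }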
* [x86, InstCombine] transform x86 AVX2 masked stores to LLVM intrinsics (Sanjay Patel, 2016-02-26; 1 file, -1/+4)
      Replicate everything for integers...because x86.
      Continuation of: http://reviews.llvm.org/rL262064
      llvm-svn: 262077
* [x86, InstCombine] transform x86 AVX masked stores to LLVM intrinsics (Sanjay Patel, 2016-02-26; 1 file, -0/+49)
      The intended effect of this patch in conjunction with:
          http://reviews.llvm.org/rL259392
          http://reviews.llvm.org/rL260145
      is that customers using the AVX intrinsics in C will benefit from
      combines when the store mask is constant:
          void mstore_zero_mask(float *f, __m128 v) {
            _mm_maskstore_ps(f, _mm_set1_epi32(0), v);
          }
          void mstore_fake_ones_mask(float *f, __m128 v) {
            _mm_maskstore_ps(f, _mm_set1_epi32(1), v);
          }
          void mstore_ones_mask(float *f, __m128 v) {
            _mm_maskstore_ps(f, _mm_set1_epi32(0x80000000), v);
          }
          void mstore_one_set_elt_mask(float *f, __m128 v) {
            _mm_maskstore_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0), v);
          }
      ...so none of the above will actually generate a masked store for
      optimized code.
      Differential Revision: http://reviews.llvm.org/D17485
      llvm-svn: 262064
* [InstCombine] enable optimization of casted vector xor instructions (Sanjay Patel, 2016-02-24; 1 file, -18/+8)
      This is part of the payoff for the refactoring in:
          http://reviews.llvm.org/rL261649
          http://reviews.llvm.org/rL261707
      In addition to removing a pile of duplicated code, the xor case was
      missing the optimization for vector types because it checked
      "SrcTy->isIntegerTy()" rather than "SrcTy->isIntOrIntVectorTy()" like
      'and' and 'or' were already doing.
      This solves part of: https://llvm.org/bugs/show_bug.cgi?id=26702
      llvm-svn: 261750
* NFC. Move isDereferenceable to Loads.h/cpp (Artur Pilipenko, 2016-02-24; 1 file, -0/+1)
      This is a part of the refactoring to unify isSafeToLoadUnconditionally
      and isDereferenceablePointer functions. In a subsequent change I'm
      going to eliminate isDereferenceableAndAlignedPointer from the Loads
      API, leaving isSafeToLoadSpeculatively as the only function for
      checking whether a load instruction can be speculated.
      Reviewed By: hfinkel
      Differential Revision: http://reviews.llvm.org/D16180
      llvm-svn: 261736
* [InstCombine] refactor visitOr() to use foldCastedBitwiseLogic() (Sanjay Patel, 2016-02-23; 1 file, -47/+31)
      Note: The 'and' case in foldCastedBitwiseLogic() is inheriting one
      extra check from the nearly identical 'or' case:
          if ((!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src))
      But I'm not sure how to expose that difference in a regression test.
      Without that check, the 'or' path will infinite-loop on
      test/Transforms/InstCombine/zext-or-icmp.ll because the zext-or-icmp
      fold is attempting a reverse transform.
      The refactoring should extend to the 'xor' case next to solve part of
      PR26702.
      llvm-svn: 261707
* [InstCombine] improve readability; NFCI (Sanjay Patel, 2016-02-23; 1 file, -30/+36)
      Less indenting, named local variables, more descriptive names.
      llvm-svn: 261659
* [InstCombine] less indenting; NFC (Sanjay Patel, 2016-02-23; 1 file, -31/+32)
      llvm-svn: 261652
* [InstCombine] add helper function to foldCastedBitwiseLogic(); NFCI (Sanjay Patel, 2016-02-23; 2 files, -29/+41)
      This is a straight cut and paste of the existing code and is intended
      to be the first step in solving part of PR26702:
      https://llvm.org/bugs/show_bug.cgi?id=26702
      We should be able to reuse most of this and delete the nearly identical
      existing code in visitOr(). Then, we can enhance visitXor() to use the
      same code too.
      llvm-svn: 261649
* Revert "[attrs] Handle convergent CallSites." (Justin Lebar, 2016-02-22; 1 file, -10/+1)
      This reverts r261544, which was causing a test failure in
      Transforms/FunctionAttrs/readattrs.ll.
      llvm-svn: 261549
* [attrs] Handle convergent CallSites. (Justin Lebar, 2016-02-22; 1 file, -1/+10)
      Summary: Previously we had a notion of convergent functions but not of
      convergent calls. This is insufficient to correctly analyze calls where
      the target is unknown, e.g. indirect calls.
      Now a call is convergent if it targets a known-convergent function, or
      if it's explicitly marked as convergent. As usual, we can remove
      convergent where we can prove that no convergent operations are
      performed in the call.
      Reviewers: chandlerc, jingyue
      Subscribers: hfinkel, jhen, tra, llvm-commits
      Differential Revision: http://reviews.llvm.org/D17317
      llvm-svn: 261544
* fix inaccurate comment; NFC (Sanjay Patel, 2016-02-21; 1 file, -2/+1)
      llvm-svn: 261484
* [InstCombine] add getNegativeIsTrueBoolVec() helper function; NFC (Sanjay Patel, 2016-02-21; 1 file, -22/+20)
      Originally part of: http://reviews.llvm.org/D17485
      We need this when simplifying masked memory ops too.
      llvm-svn: 261483
* [InstCombine] SSE/SSE2 (u)comiss/(u)comisd comparison intrinsics only use the lowest vector element (Simon Pilgrim, 2016-02-20; 1 file, -0/+40)
      llvm-svn: 261460
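A sketch of what this enables, assuming the SSE comieq intrinsic: only element 0 of each operand is demanded, so writes to any other lane are dead.
      declare i32 @llvm.x86.sse.comieq.ss(<4 x float>, <4 x float>)

      define i32 @comi_lane0_only(<4 x float> %a, <4 x float> %b, float %x) {
        ; comiss reads only element 0, so this insert into lane 1 is dead
        ; and can be stripped from the operand.
        %a1 = insertelement <4 x float> %a, float %x, i32 1
        %r = call i32 @llvm.x86.sse.comieq.ss(<4 x float> %a1, <4 x float> %b)
        ret i32 %r
      }
      ; folds to: call i32 @llvm.x86.sse.comieq.ss(<4 x float> %a, <4 x float> %b)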
* [AA] Preserve the AA results wrapper pass as well as BasicAA in a few more places to prevent gratuitous re-"runs" of these passes (Chandler Carruth, 2016-02-19; 1 file, -0/+4)
      The passes themselves don't do any work when run, but we keep spending
      time scheduling and running these needlessly when we really don't need
      to do so.
      This is the first patch towards fixing the really horrible loop pass
      pipeline fragmentation pointed out by Sanjoy in PR24804.
      llvm-svn: 261302
* Remove uses of builtin comma operator. (Richard Trieu, 2016-02-18; 4 files, -20/+43)
      Cleanup for the upcoming Clang warning -Wcomma. No functionality
      change intended.
      llvm-svn: 261270
* NFC: Fix formatting (Amaury Sechet, 2016-02-17; 1 file, -4/+4)
      llvm-svn: 261156
* Fix load alignment when unpacking aggregate structs (Amaury Sechet, 2016-02-17; 1 file, -12/+26)
      Summary: Stores and loads unpacked by instcombine do not always have
      the right alignment. This explicitly computes the alignment and sets
      it.
      Reviewers: dblaikie, majnemer, reames, hfinkel, joker.eph
      Subscribers: llvm-commits
      Differential Revision: http://reviews.llvm.org/D17326
      llvm-svn: 261139
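A sketch of the alignment issue (names hypothetical): the second element of the pair sits at byte offset 4, so its unpacked load may claim at most align 4 even though the aggregate load was align 8.
      define i32 @second_elt({ i32, i32 }* %p) {
        %v = load { i32, i32 }, { i32, i32 }* %p, align 8
        %e = extractvalue { i32, i32 } %v, 1
        ret i32 %e
      }
      ; Unpacked correctly, the element load's alignment is derived from the
      ; base alignment and the element offset (min(8, 4) = 4 here):
      ;   %g1 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 1
      ;   %e  = load i32, i32* %g1, align 4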
* [InstCombine] Don't aggressively replace xor with icmp (David Majnemer, 2016-02-12; 1 file, -17/+20)
      For some cases, InstCombine replaces the sequence of an xor/sub
      instruction followed by a cmp instruction with a single cmp
      instruction. However, this replacement may produce a suboptimal result
      when the xor/sub has more than one use, as discussed in bug 26465
      (https://llvm.org/bugs/show_bug.cgi?id=26465).
      This patch makes the replacement happen only when the xor/sub has a
      single use.
      Differential Revision: http://reviews.llvm.org/D16915
      Patch by Taewook Oh!
      llvm-svn: 260695
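A sketch of the multi-use case that now blocks the fold (names hypothetical): the store keeps %sub alive, so rewriting the compare would not eliminate the sub and could leave more work overall.
      define i1 @multi_use(i32 %a, i32* %p) {
        %sub = sub i32 0, %a
        store i32 %sub, i32* %p        ; second use keeps %sub alive
        ; With a single use, InstCombine would rewrite this compare purely
        ; in terms of %a and delete %sub; with the extra use, %sub survives
        ; anyway, so the fold is no longer a clear win and is skipped.
        %cmp = icmp sgt i32 %sub, -1
        ret i1 %cmp
      }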