path: root/llvm/lib/Transforms/InstCombine
* Revert "[InstCombine] Combine A->B->A BitCast"Junmo Park2016-03-082-104/+0
| | | | | | This reverts commit r262670 due to compile failure. llvm-svn: 262916
* [InstCombine] Combine A->B->A BitCast (Guozhi Wei, 2016-03-03; 2 files, -0/+104)
  This patch enhances InstCombine to handle the following case:
    A -> B bitcast
    PHI
    B -> A bitcast
  llvm-svn: 262670
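  A minimal IR sketch of the pattern (hypothetical values, not taken from the patch's tests):

    ; before: values are bitcast from float to i32 only to feed a PHI, then cast back
    %b0 = bitcast float %x to i32
    %b1 = bitcast float %y to i32
    ; (in the join block)
    %p  = phi i32 [ %b0, %bb0 ], [ %b1, %bb1 ]
    %r  = bitcast i32 %p to float

    ; after: the PHI is rebuilt on the original type and all three bitcasts go away
    %p  = phi float [ %x, %bb0 ], [ %y, %bb1 ]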
* [InstCombine] transform bitcasted bitwise logic ops with constants (PR26702) (Sanjay Patel, 2016-03-03; 1 file, -7/+28)
  Given that we're not actually reducing the instruction count in the included regression
  tests, I think we would call this a canonicalization step. The motivation comes from the
  example in PR26702:
  https://llvm.org/bugs/show_bug.cgi?id=26702
  If we hoist the bitwise logic ahead of the bitcast, the previously unoptimizable example of:

    define <4 x i32> @is_negative(<4 x i32> %x) {
      %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
      %not = xor <4 x i32> %lobit, <i32 -1, i32 -1, i32 -1, i32 -1>
      %bc = bitcast <4 x i32> %not to <2 x i64>
      %notnot = xor <2 x i64> %bc, <i64 -1, i64 -1>
      %bc2 = bitcast <2 x i64> %notnot to <4 x i32>
      ret <4 x i32> %bc2
    }

  simplifies to the expected:

    define <4 x i32> @is_negative(<4 x i32> %x) {
      %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
      ret <4 x i32> %lobit
    }

  Differential Revision: http://reviews.llvm.org/D17583
  llvm-svn: 262645
* Explode store of arrays in instcombine (Amaury Sechet, 2016-03-02; 1 file, -1/+33)
  Summary: This is the last step toward supporting aggregate memory access in instcombine.
  This explodes stores of arrays into a series of stores, one for each element, allowing
  them to be optimized.
  Reviewers: joker.eph, reames, hfinkel, majnemer, mgrang
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D17828
  llvm-svn: 262530
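  A hypothetical before/after sketch of the transform (illustrative only; the exact GEP
  shapes InstCombine emits may differ):

    ; before
    store [2 x i32] %agg, [2 x i32]* %p

    ; after: one store per array element
    %e0 = extractvalue [2 x i32] %agg, 0
    %p0 = getelementptr inbounds [2 x i32], [2 x i32]* %p, i64 0, i64 0
    store i32 %e0, i32* %p0
    %e1 = extractvalue [2 x i32] %agg, 1
    %p1 = getelementptr inbounds [2 x i32], [2 x i32]* %p, i64 0, i64 1
    store i32 %e1, i32* %p1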
* Unpack array of all sizes in InstCombine (Amaury Sechet, 2016-03-02; 1 file, -5/+38)
  Summary: This is another step toward improving fca support. This unpacks a load of an
  array into a series of loads of the array's elements.
  Reviewers: chandlerc, joker.eph, majnemer, reames, hfinkel
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D15890
  llvm-svn: 262521
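  The load-side counterpart, again as a hypothetical sketch rather than output from the patch:

    ; before
    %agg = load [2 x i32], [2 x i32]* %p

    ; after: one load per element, reassembled with insertvalue
    %p0  = getelementptr inbounds [2 x i32], [2 x i32]* %p, i64 0, i64 0
    %e0  = load i32, i32* %p0
    %v0  = insertvalue [2 x i32] undef, i32 %e0, 0
    %p1  = getelementptr inbounds [2 x i32], [2 x i32]* %p, i64 0, i64 1
    %e1  = load i32, i32* %p1
    %agg = insertvalue [2 x i32] %v0, i32 %e1, 1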
* revert r262424 because there's a *clang test* for AArch64 that checks -O3 asm output
  that is broken by this change (Sanjay Patel, 2016-03-02; 1 file, -17/+5)
  llvm-svn: 262440
* [InstCombine] convert 'isPositive' and 'isNegative' vector comparisons to shifts (PR26701)
  (Sanjay Patel, 2016-03-01; 1 file, -5/+17)
  As noted in the code comment, I don't think we can do the same transform that we do for
  *scalar* integer comparisons on *vector* integer comparisons because it might pessimize
  the general case. Exhibit A for an incomplete integer comparison ISA remains x86 SSE/AVX:
  it only has EQ and GT for integer vectors.
  But we should now recognize all the variants of this construct and produce the optimal
  code for the cases shown in:
  https://llvm.org/bugs/show_bug.cgi?id=26701
  llvm-svn: 262424
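  One shape this covers, sketched here from the sext(icmp) form described in PR26701
  (an assumed, illustrative example; not copied from the tests):

    ; before: 'isNegative' written as compare-then-sign-extend
    %cmp = icmp slt <4 x i32> %x, zeroinitializer
    %ext = sext <4 x i1> %cmp to <4 x i32>

    ; after: a single arithmetic shift broadcasts the sign bit
    %ext = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>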
* Perform InstructionCombiningPass before SampleProfile pass. (Dehao Chen, 2016-03-01; 1 file, -20/+0)
  Summary: The SampleProfile pass needs to be performed after InstructionCombiningPass,
  which helps eliminate un-inlinable function calls.
  Reviewers: davidxl, dnovillo
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D17742
  llvm-svn: 262419
* Fix an issue where fast math flags were dropped during scalarization. (Owen Anderson, 2016-03-01; 1 file, -2/+4)
  Most portions of InstCombine properly propagate fast math flags, but apparently the
  vector scalarization section was overlooked.
  llvm-svn: 262376
* Revert "calculate builtin_object_size if argument is a removable pointer" (Petar Jovanovic, 2016-03-01; 1 file, -19/+6)
  Revert r262337 as the "check-llvm ubsan" step failed on the sanitizer-x86_64-linux-fast
  buildbot.
  llvm-svn: 262349
* calculate builtin_object_size if argument is a removable pointer (Petar Jovanovic, 2016-03-01; 1 file, -6/+19)
  This patch fixes calculation of the correct value for the builtin_object_size function
  when the pointer is used only in the builtin_object_size call and never after that.
  Patch by Strahinja Petrovic.
  Differential Revision: http://reviews.llvm.org/D17337
  llvm-svn: 262337
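  A hypothetical IR example of the case being handled (assuming the pointer has no uses
  other than the @llvm.objectsize call; names invented here):

    %p  = call i8* @malloc(i64 30)
    %sz = call i64 @llvm.objectsize.i64.p0i8(i8* %p, i1 false)
    ; %p is never used afterwards, so %sz can be folded to 30 even though the
    ; allocation itself may later be removed.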
* [x86, InstCombine] transform more x86 masked loads to LLVM intrinsics (Sanjay Patel, 2016-02-29; 1 file, -1/+7)
  Continuation of:
  http://reviews.llvm.org/rL262269
  llvm-svn: 262273
* [x86, InstCombine] transform x86 AVX masked loads to LLVM intrinsics (Sanjay Patel, 2016-02-29; 1 file, -1/+39)
  The intended effect of this patch in conjunction with:
  http://reviews.llvm.org/rL259392
  http://reviews.llvm.org/rL260145
  is that customers using the AVX intrinsics in C will benefit from combines when the
  load mask is constant:

    __m128 mload_zeros(float *f) {
      return _mm_maskload_ps(f, _mm_set1_epi32(0));
    }

    __m128 mload_fakeones(float *f) {
      return _mm_maskload_ps(f, _mm_set1_epi32(1));
    }

    __m128 mload_ones(float *f) {
      return _mm_maskload_ps(f, _mm_set1_epi32(0x80000000));
    }

    __m128 mload_oneset(float *f) {
      return _mm_maskload_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0));
    }

  ...so none of the above will actually generate a masked load for optimized code.
  This is the masked load counterpart to:
  http://reviews.llvm.org/rL262064
  llvm-svn: 262269
* [InstCombine] Be more conservative about removing stackrestore (Reid Kleckner, 2016-02-27; 1 file, -1/+7)
  We ended up removing a save/restore pair around an inalloca call, leading to a
  miscompile in Chromium.
  llvm-svn: 262095
* [x86, InstCombine] transform x86 AVX2 masked stores to LLVM intrinsics (Sanjay Patel, 2016-02-26; 1 file, -1/+4)
  Replicate everything for integers...because x86.
  Continuation of:
  http://reviews.llvm.org/rL262064
  llvm-svn: 262077
* [x86, InstCombine] transform x86 AVX masked stores to LLVM intrinsics (Sanjay Patel, 2016-02-26; 1 file, -0/+49)
  The intended effect of this patch in conjunction with:
  http://reviews.llvm.org/rL259392
  http://reviews.llvm.org/rL260145
  is that customers using the AVX intrinsics in C will benefit from combines when the
  store mask is constant:

    void mstore_zero_mask(float *f, __m128 v) {
      _mm_maskstore_ps(f, _mm_set1_epi32(0), v);
    }

    void mstore_fake_ones_mask(float *f, __m128 v) {
      _mm_maskstore_ps(f, _mm_set1_epi32(1), v);
    }

    void mstore_ones_mask(float *f, __m128 v) {
      _mm_maskstore_ps(f, _mm_set1_epi32(0x80000000), v);
    }

    void mstore_one_set_elt_mask(float *f, __m128 v) {
      _mm_maskstore_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0), v);
    }

  ...so none of the above will actually generate a masked store for optimized code.
  Differential Revision: http://reviews.llvm.org/D17485
  llvm-svn: 262064
* [InstCombine] enable optimization of casted vector xor instructions (Sanjay Patel, 2016-02-24; 1 file, -18/+8)
  This is part of the payoff for the refactoring in:
  http://reviews.llvm.org/rL261649
  http://reviews.llvm.org/rL261707
  In addition to removing a pile of duplicated code, the xor case was missing the
  optimization for vector types because it checked "SrcTy->isIntegerTy()" rather than
  "SrcTy->isIntOrIntVectorTy()" like 'and' and 'or' were already doing.
  This solves part of:
  https://llvm.org/bugs/show_bug.cgi?id=26702
  llvm-svn: 261750
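  With the scalar-only check gone, the cast-hoisting that already fired for 'and'/'or'
  applies to vector xor as well. A hypothetical example of that general fold (values
  invented here):

    ; before: xor of two bitcasts from a common source type
    %a32 = bitcast <2 x i64> %a to <4 x i32>
    %b32 = bitcast <2 x i64> %b to <4 x i32>
    %x   = xor <4 x i32> %a32, %b32

    ; after: the logic op is performed on the source type and cast once
    %x64 = xor <2 x i64> %a, %b
    %x   = bitcast <2 x i64> %x64 to <4 x i32>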
* NFC. Move isDereferenceable to Loads.h/cpp (Artur Pilipenko, 2016-02-24; 1 file, -0/+1)
  This is a part of the refactoring to unify the isSafeToLoadUnconditionally and
  isDereferenceablePointer functions. In a subsequent change I'm going to eliminate
  isDereferenceableAndAlignedPointer from the Loads API, leaving isSafeToLoadSpeculatively
  as the only function to check whether a load instruction can be speculated.
  Reviewed By: hfinkel
  Differential Revision: http://reviews.llvm.org/D16180
  llvm-svn: 261736
* [InstCombine] refactor visitOr() to use foldCastedBitwiseLogic() (Sanjay Patel, 2016-02-23; 1 file, -47/+31)
  Note: The 'and' case in foldCastedBitwiseLogic() is inheriting one extra check from the
  nearly identical 'or' case:
    if ((!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src))
  But I'm not sure how to expose that difference in a regression test. Without that check,
  the 'or' path will infinite loop on:
    test/Transforms/InstCombine/zext-or-icmp.ll
  because the zext-or-icmp fold is attempting a reverse transform.
  The refactoring should extend to the 'xor' case next to solve part of PR26702.
  llvm-svn: 261707
* [InstCombine] improve readability ; NFCI (Sanjay Patel, 2016-02-23; 1 file, -30/+36)
  Less indenting, named local variables, more descriptive names.
  llvm-svn: 261659
* [InstCombine] less indenting; NFC (Sanjay Patel, 2016-02-23; 1 file, -31/+32)
  llvm-svn: 261652
* [InstCombine] add helper function to foldCastedBitwiseLogic() ; NFCI (Sanjay Patel, 2016-02-23; 2 files, -29/+41)
  This is a straight cut and paste of the existing code and is intended to be the first
  step in solving part of PR26702:
  https://llvm.org/bugs/show_bug.cgi?id=26702
  We should be able to reuse most of this and delete the nearly identical existing code
  in visitOr(). Then, we can enhance visitXor() to use the same code too.
  llvm-svn: 261649
* Revert "[attrs] Handle convergent CallSites."Justin Lebar2016-02-221-10/+1
| | | | | | | This reverts r261544, which was causing a test failure in Transforms/FunctionAttrs/readattrs.ll. llvm-svn: 261549
* [attrs] Handle convergent CallSites.Justin Lebar2016-02-221-1/+10
| | | | | | | | | | | | | | | | | | | | Summary: Previously we had a notion of convergent functions but not of convergent calls. This is insufficient to correctly analyze calls where the target is unknown, e.g. indirect calls. Now a call is convergent if it targets a known-convergent function, or if it's explicitly marked as convergent. As usual, we can remove convergent where we can prove that no convergent operations are performed in the call. Reviewers: chandlerc, jingyue Subscribers: hfinkel, jhen, tra, llvm-commits Differential Revision: http://reviews.llvm.org/D17317 llvm-svn: 261544
* fix inaccurate comment; NFC (Sanjay Patel, 2016-02-21; 1 file, -2/+1)
  llvm-svn: 261484
* [InstCombine] add getNegativeIsTrueBoolVec() helper function; NFC (Sanjay Patel, 2016-02-21; 1 file, -22/+20)
  Originally part of:
  http://reviews.llvm.org/D17485
  We need this when simplifying masked memory ops too.
  llvm-svn: 261483
* [InstCombine] SSE/SSE2 (u)comiss/(u)comisd comparison intrinsics only use the lowest
  vector element (Simon Pilgrim, 2016-02-20; 1 file, -0/+40)
  llvm-svn: 261460
* [AA] Preserve the AA results wrapper pass as well as BasicAA in a few more places to
  prevent gratuitous re-"runs" of these passes. (Chandler Carruth, 2016-02-19; 1 file, -0/+4)
  The passes themselves don't do any work when run, but we keep spending time scheduling
  and running these needlessly when we really don't need to do so.
  This is the first patch towards fixing the really horrible loop pass pipeline
  fragmentation pointed out by Sanjoy in PR24804.
  llvm-svn: 261302
* Remove uses of builtin comma operator. (Richard Trieu, 2016-02-18; 4 files, -20/+43)
  Cleanup for upcoming Clang warning -Wcomma. No functionality change intended.
  llvm-svn: 261270
* NFC: Fix formatting (Amaury Sechet, 2016-02-17; 1 file, -4/+4)
  llvm-svn: 261156
* Fix load alignment when unpacking aggregate structs (Amaury Sechet, 2016-02-17; 1 file, -12/+26)
  Summary: Stores and loads unpacked by instcombine do not always have the right alignment.
  This explicitly computes the alignment and sets it.
  Reviewers: dblaikie, majnemer, reames, hfinkel, joker.eph
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D17326
  llvm-svn: 261139
* [InstCombine] Don't aggressively replace xor with icmp (David Majnemer, 2016-02-12; 1 file, -17/+20)
  For some cases, InstCombine replaces a sequence of xor/sub instructions followed by a
  cmp instruction with a single cmp instruction. However, this replacement may produce a
  suboptimal result, especially when the xor/sub has more than one use, as discussed in
  bug 26465 (https://llvm.org/bugs/show_bug.cgi?id=26465).
  This patch makes the replacement happen only when the xor/sub has only one use.
  Differential Revision: http://reviews.llvm.org/D16915
  Patch by Taewook Oh!
  llvm-svn: 260695
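  One illustrative shape of the fold and the multi-use problem (a hedged example, not
  taken from the bug report):

    %x2 = xor i32 %x, -2147483648        ; flip the sign bit
    %c  = icmp sgt i32 %x2, -1           ; (x ^ INT_MIN) > -1
    ; folds to:  %c = icmp slt i32 %x, 0
    ; If %x2 has other users, both the xor and the new compare survive, which is why
    ; the fold is now restricted to the one-use case.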
* Re-apply r238452, the bug was in clang and was fixed in r260567. (Quentin Colombet, 2016-02-11; 1 file, -5/+10)
  Original commit message:
  [InstCombine] Fold IntToPtr and PtrToInt into preceding loads.
  Currently we only fold a BitCast into a Load when the BitCast is its only user.
  Do the same for any no-op cast.
  Patch by Philip Pfaffe!
  Differential Revision: http://reviews.llvm.org/D9152
  llvm-svn: 260612
* Set load alignment on aggregate loads. (Pete Cooper, 2016-02-11; 1 file, -1/+2)
  When optimizing an extractvalue(load), we generate a load from the aggregate type. This
  load didn't have its alignment set and so would get the alignment of the type. This
  breaks when the type is packed and so the alignment should be lower.
  For example, loading { int, int } would give us alignment of 4, but the original load
  from this type may have an alignment of 1 if packed.
  Reviewed by David Majnemer
  Differential revision: http://reviews.llvm.org/D17158
  llvm-svn: 260587
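  A hypothetical IR rendering of the packed { int, int } example above (sketch only):

    ; before: packed aggregate loaded with align 1
    %v = load <{ i32, i32 }>, <{ i32, i32 }>* %p, align 1
    %x = extractvalue <{ i32, i32 }> %v, 1

    ; after: the narrowed scalar load keeps the original align 1,
    ; not the natural align 4 of i32
    %gep = getelementptr inbounds <{ i32, i32 }>, <{ i32, i32 }>* %p, i64 0, i32 1
    %x   = load i32, i32* %gep, align 1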
* Fixed typo in r260530 (Jun Bum Lim, 2016-02-11; 1 file, -5/+5)
  llvm-svn: 260541
* [InstCombine] Simplify a known nonzero incoming value of PHI (Jun Bum Lim, 2016-02-11; 1 file, -0/+36)
  Summary: When a PHI is used only to be compared with zero, it is possible to replace an
  incoming value with any non-zero constant if the incoming value can be proved to be a
  known nonzero value. For example, in the code below, we can replace the incoming value
  %v with any non-zero constant based on the fact that the PHI is only used to be compared
  with zero and %v is a known non-zero value:

    %v = select %cond, 1, 2
    %p = phi [%v, BB] ...
    %c = icmp eq, %p, 0

  Reviewers: mcrosier, jmolloy, sanjoy
  Subscribers: hfinkel, mcrosier, majnemer, llvm-commits, haicheng, bmakam, mssimpso, gberry
  Differential Revision: http://reviews.llvm.org/D16240
  llvm-svn: 260530
* Don't propagate dereferenceable attribute through gc.relocate in InstCombine (Artur Pilipenko, 2016-02-11; 1 file, -6/+0)
  Reviewed By: reames
  Differential Revision: http://reviews.llvm.org/D16143
  llvm-svn: 260509
* [InstCombine][GC] Handle gc.relocations of vector type (Philip Reames, 2016-02-09; 1 file, -25/+22)
  We introduced gc.relocates of vector-of-pointer types a couple of weeks back. Somehow, I
  missed updating the InstCombine rule to account for this. If we hit this code path with
  a vector-of-pointers gc.relocate, we'd crash on a cast<PointerType>.
  I also took the chance to do a bit of code style cleanup.
  llvm-svn: 260279
* [InstCombine] Revert r238452: Fold IntToPtr and PtrToInt into preceding loads. (Quentin Colombet, 2016-02-03; 1 file, -10/+5)
  According to git bisect, this is the root cause of a miscompile for Regex in
  libLLVMSupport. I am still working on reducing a test case.
  The actual bug may be elsewhere and this commit just exposed it.
  Anyway, at the moment, to reproduce, follow these steps:
  1. Build clang and libLTO in release mode.
  2. Create a new build directory <stage2> and cd into it.
  3. Use clang and libLTO from #1 to build llvm-extract in Release mode + asserts using -O2 -flto
  4. Run llvm-extract -ralias '.*bar' -S test/Other/extract-alias.ll
  Result:
    program doesn't contain global named '.*bar'!
  Expected result:
    @a0a0bar = alias void ()* @bar
    @a0bar = alias void ()* @bar
    declare void @bar()
  Note: In step #3, if you don't use lto or asserts, the miscompile disappears.
  llvm-svn: 259674
* Fix Clang-tidy readability-redundant-control-flow warnings; other minor fixes. (Eugene Zelenko, 2016-02-02; 1 file, -2/+0)
  Differential revision: http://reviews.llvm.org/D16793
  llvm-svn: 259539
* function names start with a lowercase letter; NFC (Sanjay Patel, 2016-02-01; 14 files, -324/+324)
  llvm-svn: 259425
* [InstCombine] simplify masked scatter/gather intrinsics with zero masks (Sanjay Patel, 2016-02-01; 1 file, -4/+22)
  A masked scatter with a zero mask means there's no store.
  A masked gather with a zero mask means the passthru arg is returned.
  This is a continuation of:
  http://reviews.llvm.org/rL259369
  http://reviews.llvm.org/rL259392
  llvm-svn: 259421
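  A sketch of the gather case (the intrinsic is spelled per the 2016-era signature; treat
  the exact name mangling and operands as an assumption):

    %g = call <4 x i32> @llvm.masked.gather.v4i32(<4 x i32*> %ptrs, i32 4,
                                                  <4 x i1> zeroinitializer,
                                                  <4 x i32> %passthru)
    ; no lanes are active, so %g simplifies to %passthru; a masked scatter with a
    ; zeroinitializer mask stores nothing and can be erased outright.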
* [InstCombine] simplify masked store intrinsics with all ones or zeros masks (Sanjay Patel, 2016-02-01; 1 file, -1/+21)
  A masked store with a zero mask means there's no store.
  A masked store with an allOnes mask means it's a normal vector store.
  This is a continuation of:
  http://reviews.llvm.org/rL259369
  llvm-svn: 259392
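  A sketch of both mask extremes (argument order per the 2016-era intrinsic; treat the
  details as an assumption):

    call void @llvm.masked.store.v4f32(<4 x float> %v, <4 x float>* %p, i32 16,
                                       <4 x i1> <i1 1, i1 1, i1 1, i1 1>)
    ; all-ones mask: becomes   store <4 x float> %v, <4 x float>* %p, align 16
    ; zeroinitializer mask: nothing is stored, so the call is removed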
* [InstCombine] Don't transform (X+INT_MAX)>=(Y+INT_MAX) -> (X<=Y) (David Majnemer, 2016-02-01; 1 file, -1/+1)
  This miscompile came about because we tried to use a transform which was only appropriate
  for xor operators when addition was present.
  This fixes PR26407.
  llvm-svn: 259375
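  An illustrative counterexample (not taken from the PR; assuming the signed-compare form):
  with 32-bit values, X = INT_MIN and Y = 0 give X + INT_MAX = -1 and Y + INT_MAX = INT_MAX,
  so the original compare -1 >= INT_MAX is false while the rewritten X <= Y is true.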
* [InstCombine] simplify masked load intrinsics with all ones or zeros masks (Sanjay Patel, 2016-02-01; 1 file, -0/+30)
  A masked load with a zero mask means there's no load.
  A masked load with an allOnes mask means it's a normal vector load.
  Differential Revision: http://reviews.llvm.org/D16691
  llvm-svn: 259369
* add helper function for minnum/maxnum ; NFC (Sanjay Patel, 2016-01-31; 1 file, -74/+80)
  llvm-svn: 259326
* use range-based for loop; NFC (Sanjay Patel, 2016-01-31; 1 file, -3/+3)
  llvm-svn: 259325
* fix formatting; NFC (Sanjay Patel, 2016-01-31; 1 file, -13/+13)
  llvm-svn: 259324
* simplify; NFC (Sanjay Patel, 2016-01-31; 1 file, -8/+5)
  llvm-svn: 259323
* InstCombine: fabs(x) * fabs(x) -> x * x (Matt Arsenault, 2016-01-30; 1 file, -4/+15)
  llvm-svn: 259295
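  A minimal IR sketch of the fold named in the title (hypothetical values):

    %fx = call float @llvm.fabs.f32(float %x)
    %m  = fmul float %fx, %fx
    ; |x| * |x| == x * x, so this becomes:  %m = fmul float %x, %x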