path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
...
* Revert "calculate builtin_object_size if argument is a removable pointer" | Petar Jovanovic | 2016-03-01 | 1 | -19/+6
  Revert r262337 as the "check-llvm ubsan" step failed on the sanitizer-x86_64-linux-fast buildbot.
  llvm-svn: 262349
* calculate builtin_object_size if argument is a removable pointer | Petar Jovanovic | 2016-03-01 | 1 | -6/+19
  This patch fixes the value computed for the builtin_object_size function when the pointer is used only in the builtin_object_size call and never after that.
  Patch by Strahinja Petrovic.
  Differential Revision: http://reviews.llvm.org/D17337
  llvm-svn: 262337
* [x86, InstCombine] transform more x86 masked loads to LLVM intrinsics | Sanjay Patel | 2016-02-29 | 1 | -1/+7
  Continuation of: http://reviews.llvm.org/rL262269
  llvm-svn: 262273
* [LLE] Fix a comment | Adam Nemet | 2016-02-29 | 1 | -3/+3
  llvm-svn: 262270
* [x86, InstCombine] transform x86 AVX masked loads to LLVM intrinsics | Sanjay Patel | 2016-02-29 | 1 | -1/+39
  The intended effect of this patch in conjunction with:
    http://reviews.llvm.org/rL259392
    http://reviews.llvm.org/rL260145
  is that customers using the AVX intrinsics in C will benefit from combines when the load mask is constant:

    __m128 mload_zeros(float *f) {
      return _mm_maskload_ps(f, _mm_set1_epi32(0));
    }

    __m128 mload_fakeones(float *f) {
      return _mm_maskload_ps(f, _mm_set1_epi32(1));
    }

    __m128 mload_ones(float *f) {
      return _mm_maskload_ps(f, _mm_set1_epi32(0x80000000));
    }

    __m128 mload_oneset(float *f) {
      return _mm_maskload_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0));
    }

  ...so none of the above will actually generate a masked load for optimized code.
  This is the masked load counterpart to: http://reviews.llvm.org/rL262064
  llvm-svn: 262269
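  As a rough illustration of why a constant mask makes these calls foldable, the following is a scalar model in plain C. It assumes the documented behaviour of the AVX masked load, where only the sign bit of each 32-bit mask element selects a lane and unselected lanes read as zero. The names `model_maskload_ps` and `classify_mask` are hypothetical and exist only for this sketch; they are not LLVM or intrinsic APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of a 4-lane masked float load (not the real
   intrinsic): lane i is read only when bit 31 of mask[i] is set; every
   other lane becomes 0.0f and its memory is never touched. */
void model_maskload_ps(float out[4], const float *src, const uint32_t mask[4]) {
    assert(out && src && mask);
    for (int i = 0; i < 4; ++i)
        out[i] = (mask[i] & 0x80000000u) ? src[i] : 0.0f;
}

/* What a constant mask reveals: no lane selected means the result is all
   zeros, every lane selected means a plain load, and anything else is left
   to the generic masked-load handling (not modelled here). */
enum mask_kind { MASK_NONE, MASK_ALL, MASK_MIXED };

enum mask_kind classify_mask(const uint32_t mask[4]) {
    int set = 0;
    assert(mask);
    for (int i = 0; i < 4; ++i)
        set += (int)((mask[i] >> 31) & 1u);
    if (set == 0) return MASK_NONE;
    if (set == 4) return MASK_ALL;
    return MASK_MIXED;
}
```

  Under this model the "fake ones" mask built from `_mm_set1_epi32(1)` classifies as `MASK_NONE` because bit 31 is clear in every element, while `_mm_set1_epi32(0x80000000)` classifies as `MASK_ALL`.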
* [LLE] Fix SingleSource/Benchmarks/Polybench/stencils/jacobi-2d-imper with Polly | Adam Nemet | 2016-02-29 | 1 | -0/+5
  We can actually have dependences between accesses with different underlying types. Bail in this case.
  A test will follow shortly.
  llvm-svn: 262267
* Enable LoopLoadElimination by default | Adam Nemet | 2016-02-29 | 1 | -2/+2
  Summary:
  I re-benchmarked this and results are similar to original results in D13259:

  On ARM64:
    SingleSource/Benchmarks/Polybench/linear-algebra/solvers/dynprog -59.27%
    SingleSource/Benchmarks/Polybench/stencils/adi -19.78%
  On x86:
    SingleSource/Benchmarks/Polybench/linear-algebra/solvers/dynprog -27.14%

  And of course the original ~20% gain on SPECint_2006/456.hmmer with Loop Distribution.

  In terms of compile time, there is ~5% increase on both SingleSource/Benchmarks/Misc/oourafft and SingleSource/Benchmarks/Linpack/linpack-pc. These are both very tiny loop-intensive programs where SCEV computations dominate compile time.

  The reason that time spent in SCEV increases has to do with the design of the old pass manager. If a transform pass does not preserve an analysis we *invalidate* the analysis even if there was *no* modification made by the transform pass. This means that currently we don't take advantage of LLE and LV sharing the same analysis (LAA) and unfortunately we recompute LAA *and* SCEV for LLE.

  (There should be a way to work around this limitation in the case of SCEV and LAA since both compute things on demand and internally cache their result. Thus we could pretend that transform passes preserve these analyses and manually invalidate them upon actual modification. On the other hand the new pass manager is supposed to solve this, so I am not sure if this is worthwhile.)

  Reviewers: hfinkel, dberlin
  Subscribers: dberlin, reames, mssimpso, aemerson, joker.eph, llvm-commits
  Differential Revision: http://reviews.llvm.org/D16300
  llvm-svn: 262250
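  The invalidation cost described in this message can be sketched with a toy model in C. Everything here is hypothetical (`struct analysis`, `struct pass`, `run_pipeline` and the two policy names are invented for illustration and are not the LLVM pass manager API). The first policy drops a cached analysis whenever a pass does not declare it preserved; the second drops it only when the pass actually modified the IR, which is the workaround the message suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an on-demand, cached analysis such as SCEV or LAA. */
struct analysis {
    bool valid;
    int computations; /* how many times the result had to be (re)computed */
};

/* Return the cached result, computing it first if the cache is stale. */
void analysis_get(struct analysis *a) {
    if (!a->valid) {
        a->computations++;
        a->valid = true;
    }
}

/* Hypothetical transform pass that uses the analysis once per run. */
struct pass {
    bool preserves_analysis; /* what the pass declares up front */
    bool modifies_ir;        /* what actually happened on this run */
};

enum policy { INVALIDATE_IF_NOT_PRESERVED, INVALIDATE_ON_MODIFICATION };

/* Run the passes in order and report how often the analysis was computed. */
int run_pipeline(const struct pass *passes, int n, enum policy policy) {
    struct analysis a = { false, 0 };
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        analysis_get(&a);
        bool invalidate = (policy == INVALIDATE_IF_NOT_PRESERVED)
                              ? !passes[i].preserves_analysis
                              : passes[i].modifies_ir;
        if (invalidate)
            a.valid = false;
    }
    return a.computations;
}
```

  For two passes that neither declare preservation nor change anything, the first policy computes the analysis twice and the second only once, which mirrors the LV followed by LLE situation described above.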
* Minor code cleanup. NFC | Rong Xu | 2016-02-29 | 1 | -16/+15
  llvm-svn: 262242
* Move discriminator assignment to the right place. | Dehao Chen | 2016-02-29 | 1 | -4/+7
  Summary: Now discriminator is assigned per-function instead of per-module.
  Reviewers: davidxl, dnovillo
  Subscribers: dblaikie, llvm-commits
  Differential Revision: http://reviews.llvm.org/D17664
  llvm-svn: 262240
* [PGO] Remove redundant counter copies for avail_extern functions. | Xinliang David Li | 2016-02-27 | 1 | -3/+32
  Differential Revision: http://reviews.llvm.org/D17654
  llvm-svn: 262157
* Revert "[sancov] do not instrument nodes that are full pre-dominators" | Renato Golin | 2016-02-27 | 1 | -22/+11
  This reverts commit r262103, as it broke all ARM and AArch64 bots.
  llvm-svn: 262139
* [instrprof] Use __{start,stop}_SECNAME on PS4 too. | Sean Silva | 2016-02-27 | 1 | -1/+2
  Summary: The PS4 linker seems to handle this fine.

  Hi David, it seems that indeed most ELF linkers support __{start,stop}_SECNAME, as our proprietary linker does as well.

  This follows the pattern of r250679 w.r.t. the testing.

  Maggie, Phillip, Paul: I've tested this with the PS4 SDK 3.5 toolchain prerelease and it seems to work fine.

  Reviewers: davidxl
  Subscribers: probinson, phillip.power, MaggieYi
  Differential Revision: http://reviews.llvm.org/D17672
  llvm-svn: 262112
* [sancov] properly initializing pass. | Mike Aizatsky | 2016-02-27 | 1 | -1/+6
  llvm-svn: 262111
* [libFuzzer] don't emit callbacks to sanitizer run-time in -fsanitize-coverage=trace-pc mode; update libFuzzer doc for previous commit | Kostya Serebryany | 2016-02-27 | 1 | -12/+14
  llvm-svn: 262110
* [LICM] Teach LICM how to handle cases where the alias set tracker was merged into a loop that was subsequently unrolled (or otherwise nuked). | Chandler Carruth | 2016-02-27 | 1 | -20/+32
  In this case it can't merge in the ASTs for any remaining nested loops; it needs to re-add their instructions directly.

  The fix is very isolated, but I've pulled the code for merging blocks into the AST into a single place in the process. The only behavior change is in the case which would have crashed before.

  This fixes a crash reported by Mikael Holmen on the list after r261316 restored much of the loop pass pipelining and allowed us to actually do this kind of nested transformation sequence. I've taken that test case and further reduced it into the somewhat twisty maze of loops in the included test case. This does in fact trigger the bug even in this reduced form.
  llvm-svn: 262108
* [sancov] do not instrument nodes that are full pre-dominators | Mike Aizatsky | 2016-02-27 | 1 | -11/+22
  Summary:
  Without tree pruning clang has 2,667,552 points.
  With only dominators pruning: 1,515,586.
  With both dominators & predominators pruning: 1,340,534.
  Differential Revision: http://reviews.llvm.org/D17671
  llvm-svn: 262103
* [InstCombine] Be more conservative about removing stackrestore | Reid Kleckner | 2016-02-27 | 1 | -1/+7
  We ended up removing a save/restore pair around an inalloca call, leading to a miscompile in Chromium.
  llvm-svn: 262095
* [x86, InstCombine] transform x86 AVX2 masked stores to LLVM intrinsics | Sanjay Patel | 2016-02-26 | 1 | -1/+4
  Replicate everything for integers...because x86.
  Continuation of: http://reviews.llvm.org/rL262064
  llvm-svn: 262077
* [x86, InstCombine] transform x86 AVX masked stores to LLVM intrinsics | Sanjay Patel | 2016-02-26 | 1 | -0/+49
  The intended effect of this patch in conjunction with:
    http://reviews.llvm.org/rL259392
    http://reviews.llvm.org/rL260145
  is that customers using the AVX intrinsics in C will benefit from combines when the store mask is constant:

    void mstore_zero_mask(float *f, __m128 v) {
      _mm_maskstore_ps(f, _mm_set1_epi32(0), v);
    }

    void mstore_fake_ones_mask(float *f, __m128 v) {
      _mm_maskstore_ps(f, _mm_set1_epi32(1), v);
    }

    void mstore_ones_mask(float *f, __m128 v) {
      _mm_maskstore_ps(f, _mm_set1_epi32(0x80000000), v);
    }

    void mstore_one_set_elt_mask(float *f, __m128 v) {
      _mm_maskstore_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0), v);
    }

  ...so none of the above will actually generate a masked store for optimized code.
  Differential Revision: http://reviews.llvm.org/D17485
  llvm-svn: 262064
* [JumpThreading] Simplify Instructions first in ComputeValueKnownInPredecessors() | Haicheng Wu | 2016-02-26 | 1 | -20/+35
  This change tries to find more opportunities to thread over basic blocks.
  llvm-svn: 261981
* [LoopUnrollAnalyzer] Check that we're using SCEV for the same loop we're simulating. | Michael Zolotukhin | 2016-02-26 | 1 | -1/+1
  Summary: Check that we're using SCEV for the same loop we're simulating. Otherwise, we might try to use the iteration number of the current loop in SCEV expressions for inner/outer loops IVs, which is clearly incorrect.
  Reviewers: chandlerc, hfinkel
  Subscribers: sanjoy, llvm-commits, mzolotukhin
  Differential Revision: http://reviews.llvm.org/D17632
  llvm-svn: 261958
* [sancov] Pruning full dominator blocks from instrumentation. | Mike Aizatsky | 2016-02-26 | 1 | -4/+32
  Summary: This is the first simple attempt to reduce the number of coverage-instrumented blocks.
  If a basic block dominates all its successors, then its coverage information is useless to us. Ignore such blocks if the sanitizer-coverage-prune-tree option is set.
  Differential Revision: http://reviews.llvm.org/D17626
  llvm-svn: 261949
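  The pruning rule stated in this message can be sketched on a toy control-flow graph in C. This is only an illustration of the rule as worded here, not the code from the patch: the bitmask CFG encoding and the names `compute_dominators` and `should_instrument` are hypothetical, every block is assumed reachable from block 0, and keeping blocks that have no successors is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BLOCKS 32

/* succ[i] is a bitmask of the successors of block i; block 0 is the entry.
   On return, dom[i] is the bitmask of blocks that dominate block i.
   Plain iterative data-flow: dom[b] = {b} | intersection of dom[p] over preds p. */
void compute_dominators(const uint32_t succ[], int n, uint32_t dom[]) {
    assert(n > 0 && n <= MAX_BLOCKS);
    uint32_t all = (n == 32) ? 0xFFFFFFFFu : ((1u << n) - 1u);
    dom[0] = 1u;
    for (int i = 1; i < n; ++i)
        dom[i] = all;
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int b = 1; b < n; ++b) {
            uint32_t meet = all;
            for (int p = 0; p < n; ++p)
                if (succ[p] & (1u << b))
                    meet &= dom[p];
            uint32_t next = meet | (1u << b);
            if (next != dom[b]) {
                dom[b] = next;
                changed = 1;
            }
        }
    }
}

/* Returns 0 when block b dominates all of its successors (prunable under the
   rule above) and 1 otherwise. Blocks without successors are kept, which is
   an assumption of this sketch. */
int should_instrument(const uint32_t succ[], const uint32_t dom[], int n, int b) {
    assert(b >= 0 && b < n);
    if (succ[b] == 0)
        return 1;
    for (int s = 0; s < n; ++s)
        if ((succ[b] & (1u << s)) && !(dom[s] & (1u << b)))
            return 1;
    return 0;
}
```

  On a diamond-shaped graph (0 branches to 1 and 2, which both jump to 3), the entry block dominates both of its successors and is pruned, while blocks 1 and 2 are kept because neither dominates the join block 3.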
* [asan] Do not instrument globals in the special "LLVM" sections | Anna Zaks | 2016-02-24 | 1 | -1/+1
  llvm-svn: 261794
* [SimplifyCFG] Use a more elegant solution than r261731 | David Majnemer | 2016-02-24 | 1 | -11/+9
  The cleanupret instruction has an invariant that its 'from' operand be a cleanuppad. This invariant was violated when we removed a dead block which removed a cleanuppad leaving behind a cleanupret with an undef 'from' operand.

  This was solved in r261731 by staving off the removal of the dead block to a later pass. However, it occurred to me that we do not need to do this. Instead, we can simply avoid processing the cleanupret if it has an undef 'from' operand because we know that it will be removed soon.
  llvm-svn: 261754
* [InstCombine] enable optimization of casted vector xor instructions | Sanjay Patel | 2016-02-24 | 1 | -18/+8
  This is part of the payoff for the refactoring in:
    http://reviews.llvm.org/rL261649
    http://reviews.llvm.org/rL261707

  In addition to removing a pile of duplicated code, the xor case was missing the optimization for vector types because it checked "SrcTy->isIntegerTy()" rather than "SrcTy->isIntOrIntVectorTy()" like 'and' and 'or' were already doing.

  This solves part of: https://llvm.org/bugs/show_bug.cgi?id=26702
  llvm-svn: 261750
* NFC. Move isDereferenceable to Loads.h/cpp | Artur Pilipenko | 2016-02-24 | 2 | -0/+2
  This is a part of the refactoring to unify the isSafeToLoadUnconditionally and isDereferenceablePointer functions. In a subsequent change I'm going to eliminate isDereferenceableAndAlignedPointer from the Loads API, leaving isSafeToLoadSpeculatively as the only function to check whether a load instruction can be speculated.
  Reviewed By: hfinkel
  Differential Revision: http://reviews.llvm.org/D16180
  llvm-svn: 261736
* [SimplifyCFG] Do not blindly remove unreachable blocks | David Majnemer | 2016-02-24 | 1 | -3/+11
  DeleteDeadBlock was called indiscriminately, leading to cleanuprets with undef cleanuppad references.

  Instead, try to drain the BB of most of its instructions if it is unreachable. We can then remove the BB if it solely consists of a terminator (and maybe some phis).
  llvm-svn: 261731
* [InstCombine] refactor visitOr() to use foldCastedBitwiseLogic() | Sanjay Patel | 2016-02-23 | 1 | -47/+31
  Note: The 'and' case in foldCastedBitwiseLogic() is inheriting one extra check from the nearly identical 'or' case:

    if ((!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src))

  But I'm not sure how to expose that difference in a regression test. Without that check, the 'or' path will infinite loop on:
    test/Transforms/InstCombine/zext-or-icmp.ll
  because the zext-or-icmp fold is attempting a reverse transform.

  The refactoring should extend to the 'xor' case next to solve part of PR26702.
  llvm-svn: 261707
* [InstCombine] improve readability ; NFCI | Sanjay Patel | 2016-02-23 | 1 | -30/+36
  Less indenting, named local variables, more descriptive names.
  llvm-svn: 261659
* [WinEH] Don't inline an 'unwinds to caller' cleanupret into funclets which locally unwind | David Majnemer | 2016-02-23 | 1 | -0/+21
  It is problematic if the inlinee has a cleanupret which unwinds to caller and we inline it into a call site which doesn't unwind. If the funclet unwinds anywhere other than to the caller, then we will give the funclet two unwind destinations. This will result in a verifier failure.

  Seeing as how the caller wasn't an invoke (which would locally unwind) and that the funclet cannot unwind to caller, we must conclude that an 'unwind to caller' cleanupret is dynamically unreachable.

  This fixes PR26698.
  Differential Revision: http://reviews.llvm.org/D17536
  llvm-svn: 261656
* [InstCombine] less indenting; NFC | Sanjay Patel | 2016-02-23 | 1 | -31/+32
  llvm-svn: 261652
* [InstCombine] add helper function to foldCastedBitwiseLogic() ; NFCI | Sanjay Patel | 2016-02-23 | 2 | -29/+41
  This is a straight cut and paste of the existing code and is intended to be the first step in solving part of PR26702: https://llvm.org/bugs/show_bug.cgi?id=26702

  We should be able to reuse most of this and delete the nearly identical existing code in visitOr(). Then, we can enhance visitXor() to use the same code too.
  llvm-svn: 261649
* Follow up for r261597: Add the * to the auto. | Michael Zolotukhin | 2016-02-23 | 1 | -1/+1
  llvm-svn: 261600
* Follow-up for r261595: use range loop. | Michael Zolotukhin | 2016-02-23 | 1 | -4/+2
  llvm-svn: 261597
* [LoopUnroll] Avoid unnecessary DT recomputation. | Michael Zolotukhin | 2016-02-23 | 1 | -8/+54
  Summary: When we completely unroll a loop, it's pretty easy to update DT in-place and thus avoid rebuilding it. DT recalculation is one of the most time-consuming tasks in loop-unroll, so avoiding it at least in case of full unroll should be beneficial.

  On some extreme (but still real-world) tests this patch improves compile time by ~2x.

  Reviewers: escha, jmolloy, hfinkel, sanjoy, chandlerc
  Subscribers: joker.eph, sanjoy, llvm-commits
  Differential Revision: http://reviews.llvm.org/D17473
  llvm-svn: 261595
* Set function entry count as 0 if sample profile is not found for the function. | Dehao Chen | 2016-02-22 | 1 | -0/+1
  Summary: This change makes the sample profile's behavior consistent with instr profile.
  Reviewers: davidxl, eraman, dnovillo
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D17522
  llvm-svn: 261587
* [LoopDataPrefetch] Make it testable with opt | Adam Nemet | 2016-02-22 | 1 | -0/+1
  Summary: Since this is an IR pass it's nice to be able to write tests without llc. This is the counterpart of the llc test under CodeGen/PowerPC/loop-data-prefetch.ll.
  Reviewers: hfinkel
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D17464
  llvm-svn: 261578
* [LoopUnrolling] Fix a bug introduced in r259869 (PR26688). | Michael Zolotukhin | 2016-02-22 | 1 | -2/+6
  The issue was that we only required LCSSA rebuilding if the immediate parent-loop had values used outside of it. The fix is to enable the same logic for all outer loops, not only the immediate parent.
  llvm-svn: 261575
* [RS4GC] "Constant fold" the rs4gc-split-vector-values flag | Philip Reames | 2016-02-22 | 1 | -156/+0
  This flag was part of a migration to a new means of handling vectors-of-pointers which was described in the llvm-dev thread "FYI: Relocating vector of pointers". The old code path has been off by default for a while without complaints, so it is time to clean up.
  llvm-svn: 261569
* [RS4GC] Revert optimization attempt due to memory corruption | Philip Reames | 2016-02-22 | 1 | -63/+3
  This change reverts "246133 [RewriteStatepointsForGC] Reduce the number of new instructions for base pointers" and a follow on bugfix 12575.

  As pointed out in pr25846, this code suffers from a memory corruption bug. Since I'm (empirically) not going to get back to this any time soon, simply reverting the problematic change is the right answer.
  llvm-svn: 261565
* Revert "[attrs] Handle convergent CallSites." | Justin Lebar | 2016-02-22 | 2 | -35/+38
  This reverts r261544, which was causing a test failure in Transforms/FunctionAttrs/readattrs.ll.
  llvm-svn: 261549
* [attrs] Handle convergent CallSites. | Justin Lebar | 2016-02-22 | 2 | -38/+35
  Summary: Previously we had a notion of convergent functions but not of convergent calls. This is insufficient to correctly analyze calls where the target is unknown, e.g. indirect calls.

  Now a call is convergent if it targets a known-convergent function, or if it's explicitly marked as convergent. As usual, we can remove convergent where we can prove that no convergent operations are performed in the call.

  Reviewers: chandlerc, jingyue
  Subscribers: hfinkel, jhen, tra, llvm-commits
  Differential Revision: http://reviews.llvm.org/D17317
  llvm-svn: 261544
* Fix some abuse of auto flagged by clang's -Wrange-loop-analysis. | Benjamin Kramer | 2016-02-22 | 1 | -1/+1
  llvm-svn: 261524
* Allow setting MaxRerollIterations above 16 | Elena Demikhovsky | 2016-02-22 | 1 | -5/+4
  By Ayal Zaks.
  Differential Revision: http://reviews.llvm.org/D17258
  llvm-svn: 261517
* ADT: Remove == and != comparisons between ilist iterators and pointers | Duncan P. N. Exon Smith | 2016-02-21 | 6 | -8/+8
  I missed == and != when I removed implicit conversions between iterators and pointers in r252380 since they were defined outside ilist_iterator.

  Since they depend on getNodePtrUnchecked(), they indirectly rely on UB. This commit removes all uses of these operators. (I'll delete the operators themselves in a separate commit so that it can be easily reverted if necessary.)

  There should be NFC here.
  llvm-svn: 261498
* TransformUtils: Avoid getNodePtrUnchecked() in integer division, NFC | Duncan P. N. Exon Smith | 2016-02-21 | 1 | -2/+7
  Stop relying on `getNodePtrUnchecked()` being useful on invalid iterators. This function is documented to be for internal use only, and the pointer type will eventually have to change to remove UB from ilist_iterator.

  Instead, check the iterator before it has been invalidated.
  llvm-svn: 261497
* fix inaccurate comment; NFC | Sanjay Patel | 2016-02-21 | 1 | -2/+1
  llvm-svn: 261484
* [InstCombine] add getNegativeIsTrueBoolVec() helper function; NFC | Sanjay Patel | 2016-02-21 | 1 | -22/+20
  Originally part of: http://reviews.llvm.org/D17485
  We need this when simplifying masked memory ops too.
  llvm-svn: 261483
* [LoopDeletion] Add an assert that verifies LCSSA | Sanjoy Das | 2016-02-21 | 1 | -1/+3
  This is inspired by PR24804: had this assert been there before, isolating the root cause for PR24804 would have been far easier.
  llvm-svn: 261481
* [InstCombine] SSE/SSE2 (u)comiss/(u)comisd comparison intrinsics only use the lowest vector element | Simon Pilgrim | 2016-02-20 | 1 | -0/+40
  llvm-svn: 261460
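  The property named in this subject line can be illustrated with a small scalar model in C. The functions `model_comilt_ss` and `clobber_upper_lanes` are hypothetical names for this sketch, not intrinsics or LLVM APIs, and the model only covers ordered (non-NaN) inputs. Because the result depends on lane 0 alone, whatever feeds the upper lanes is presumably the kind of work the combine can discard.

```c
#include <assert.h>

/* Hypothetical scalar model of an ordered "less than" scalar compare on two
   4-lane float vectors: only lane 0 takes part. NaN inputs are out of scope. */
int model_comilt_ss(const float a[4], const float b[4]) {
    assert(a && b);
    return a[0] < b[0];
}

/* Overwrite the three lanes that the compare never reads. */
void clobber_upper_lanes(float v[4], float junk) {
    assert(v);
    v[1] = v[2] = v[3] = junk;
}
```

  Calling `clobber_upper_lanes` on either operand before `model_comilt_ss` never changes the result, which is the invariant the sketch is meant to show.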