path: root/llvm/test/Transforms
Commit log for llvm/test/Transforms, newest first. Each entry: commit message (author, date, files changed, lines +added/-deleted).
* [BBVectorize] Don't vectorize selects with a scalar condition and vector operands. (Michael Kuperstein, 2016-05-26, 1 file, +33/-0)
  This fixes PR27879. Differential Revision: http://reviews.llvm.org/D20659 llvm-svn: 270888
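  A minimal sketch (my own illustration, not the committed test) of the shape being rejected: the condition is a single i1 while the operands are vectors, so two such selects cannot be fused into a wider select without also widening the condition:

  ```
  define <2 x double> @scalar_cond(i1 %c, <2 x double> %a, <2 x double> %b) {
    ; scalar i1 condition selecting between whole vectors
    %s = select i1 %c, <2 x double> %a, <2 x double> %b
    ret <2 x double> %s
  }
  ```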
* [CaptureTracking] Volatile operations capture their memory location (David Majnemer, 2016-05-26, 1 file, +8/-0)
  The memory location that corresponds to a volatile operation is very special: it is observed by the machine in ways which we cannot reason about. Differential Revision: http://reviews.llvm.org/D20555 llvm-svn: 270879
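  A hypothetical illustration (assumed shape, not from the commit): even though %p is only read, the volatile load makes the access machine-observable, so under this change %p must be treated as captured:

  ```
  define i32 @observes(i32* %p) {
    ; volatile: the machine observes this access, so %p counts as captured
    %v = load volatile i32, i32* %p
    ret i32 %v
  }
  ```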
* [InstCombine] Catch more bswap cases missed due to zext and truncs. (Chad Rosier, 2016-05-26, 1 file, +38/-0)
  Fixes PR27824. Differential Revision: http://reviews.llvm.org/D20591. llvm-svn: 270853
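  A sketch of the kind of pattern involved (my own example; whether InstCombine catches this exact form is an assumption): an i16 byte swap written with a zext, shifts, and a trunc, which is equivalent to llvm.bswap.i16:

  ```
  define i16 @swap16(i16 %x) {
    %z  = zext i16 %x to i32
    %hi = shl i32 %z, 8     ; low byte moves up
    %lo = lshr i32 %z, 8    ; high byte moves down
    %or = or i32 %hi, %lo
    %t  = trunc i32 %or to i16
    ret i16 %t              ; == bswap(%x)
  }
  ```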
* [MergedLoadStoreMotion] Don't transform across may-throw calls (David Majnemer, 2016-05-26, 2 files, +58/-1)
  It is unsafe to hoist a load before a function call which may throw; the throw might prevent a pointer dereference. Likewise, it is unsafe to sink a store after a call which may throw, since the caller might be able to observe the difference. This fixes PR27858. llvm-svn: 270828
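  A minimal sketch of the hazard (my own illustration): if @may_throw unwinds, %p is never dereferenced on that path, so hoisting the load above the call could introduce a fault the original program never exhibits:

  ```
  declare void @may_throw()

  define i32 @example(i32* %p) {
  entry:
    call void @may_throw()    ; may unwind before the load executes
    %v = load i32, i32* %p    ; must not be hoisted above the call
    ret i32 %v
  }
  ```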
* [ConstantFold] Fix incorrect index rewrites for GEPs (Adam Nemet, 2016-05-26, 1 file, +13/-0)
  Summary: If an index for a vector or array type is out-of-range, GEP constant folding tries to factor it into preceding dimensions. The code, however, does not consider addressing of structure field padding, which should not qualify as an out-of-range index. As demonstrated by the testcase, this can occur if the indexing is performed on a vector type and the preceding index is an array type. SROA generates such GEPs, for example involving padding bytes, as it slices an alloca. My fix disables this folding if the element type is a vector type. I believe that this is the only way we can end up with padding. (We have no access to DataLayout, so I am not sure there is an actual robust way of checking for the presence of padding.)
  Reviewers: majnemer
  Subscribers: llvm-commits, Gerolf
  Differential Revision: http://reviews.llvm.org/D20663 llvm-svn: 270826
* MemorySSA: Revert r269678 and r268068; replace with special casing in MemorySSA. (Peter Collingbourne, 2016-05-26, 3 files, +39/-0)
  It turns out that too many passes are relying on alias analysis results for control dependencies. Until we fix that by introducing a more accurate modelling of control dependencies, special case assume in MemorySSA instead. Also introduce tests to ensure we don't regress the FunctionAttrs or LICM passes. Differential Revision: http://reviews.llvm.org/D20658 llvm-svn: 270823
* [IRCE] Optimize conjunctions of range checks (Sanjoy Das, 2016-05-26, 1 file, +99/-0)
  After this change, we do the expected thing for cases like:

  ```
  Check0Passed = /* range check IRCE can optimize */
  Check1Passed = /* range check IRCE can optimize */
  if (!(Check0Passed && Check1Passed))
    throw_Exception();
  ```

  llvm-svn: 270804
* [PM] Port PartiallyInlineLibCalls to the new pass manager. (Davide Italiano, 2016-05-25, 1 file, +1/-0)
  llvm-svn: 270798
* Look for a loop's starting location in the llvm.loop metadata (Hal Finkel, 2016-05-25, 1 file, +74/-0)
  Getting accurate locations for loops is important, because those locations are used by the frontend to generate optimization remarks. Currently, optimization remarks for loops often appear on the wrong line, often the first line of the loop body instead of the loop itself. This is confusing because that line might itself be another loop, or might be somewhere else completely if the body was an inlined function call. This happens because of the way we find the loop's starting location. First, we look for a preheader, and if we find one and its terminator has a debug location, then we use that. Otherwise, we look for a location on an instruction in the loop header. The fallback heuristic is not bad, but will almost always find the beginning of the body, and not the loop statement itself. The preheader location search often fails because there's often not a preheader, and even when there is a preheader, depending on how it was formed, it sometimes carries the location of some preceding code. I don't see any good theoretical way to fix this problem. On the other hand, this seems like a straightforward solution: put the debug location in the loop's llvm.loop metadata. A companion Clang patch will cause Clang to insert llvm.loop metadata with appropriate locations when generating debugging information. With these changes, our loop remarks have much more accurate locations. Differential Revision: http://reviews.llvm.org/D19738 llvm-svn: 270771
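  A fragmentary sketch of the resulting IR shape (my own reading of the commit; the line/column values are made up and the !3 scope definition is elided): the loop's backedge branch carries !llvm.loop metadata whose operands include a DILocation for the loop statement itself:

  ```
    br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !1

  !1 = distinct !{!1, !2}                          ; first operand: required self-reference
  !2 = !DILocation(line: 5, column: 3, scope: !3)  ; location of the loop statement
  ```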
* [TLI] Also cover Linux 64 libfunc (stat64, ...) prototype checking. (Ahmed Bougacha, 2016-05-25, 2 files, +63/-1)
  My script missed those in r270750. llvm-svn: 270763
* [TLI] Fix NumParams==0 prototype checking typo. (Ahmed Bougacha, 2016-05-25, 2 files, +1651/-27)
  There was a typo in r267758. It caused invalid accesses when given something like "void @free(...)", as NumParams == 0, and we then try to look at the 0th parameter. Turns out, most of these were untested; add both attribute and missing-prototype checks for all libc libfuncs. Differential Revision: http://reviews.llvm.org/D20543 llvm-svn: 270750
* [IR] Copy comdats in GlobalObject::copyAttributesFrom (Reid Kleckner, 2016-05-25, 1 file, +14/-0)
  This is probably correct for all uses except cross-module IR linking, where we need to move the comdat from the source module to the destination module. Fixes PR27870.
  Reviewers: majnemer
  Differential Revision: http://reviews.llvm.org/D20631 llvm-svn: 270743
* [x86] avoid code explosion from LoopVectorizer for gather loop (PR27826) (Sanjay Patel, 2016-05-25, 1 file, +41/-0)
  By making pointer extraction from a vector more expensive in the cost model, we avoid the vectorization of a loop that is very likely to be memory-bound: https://llvm.org/bugs/show_bug.cgi?id=27826 There are still bugs related to this, so we may need a more general solution to avoid vectorizing obviously memory-bound loops when we don't have HW gather support. Differential Revision: http://reviews.llvm.org/D20601 llvm-svn: 270729
* [X86] Remove the llvm.x86.sse2.storel.dq intrinsic. It hasn't been used in a long time. (Craig Topper, 2016-05-25, 1 file, +0/-13)
  llvm-svn: 270677
* [FunctionAttrs] Volatile loads should disable readonly (David Majnemer, 2016-05-25, 1 file, +8/-0)
  A volatile load has side effects beyond what callers expect readonly to signify. For example, it is not safe to reorder two function calls which each perform a volatile load to the same memory location. llvm-svn: 270671
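  A minimal sketch (my own illustration): @f only reads memory, but the read is volatile, so marking @f readonly would wrongly license reordering two calls to it and change the machine-observable access order:

  ```
  define i32 @f(i32* %p) {
    ; volatile: a side effect beyond "readonly", so @f must not get the attribute
    %v = load volatile i32, i32* %p
    ret i32 %v
  }
  ```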
* [PM] Port BDCE to the new pass manager. (Davide Italiano, 2016-05-25, 1 file, +1/-0)
  llvm-svn: 270647
* Re-enable "[LoopUnroll] Enable advanced unrolling analysis by default" one more time. (Michael Zolotukhin, 2016-05-24, 1 file, +1/-1)
  This reverts commit r270577. llvm-svn: 270630
* [LoopUnrollAnalyzer] Fix a crash in UnrolledInstAnalyzer::visitCastInst. (Michael Zolotukhin, 2016-05-24, 1 file, +18/-0)
  This fixes PR27847. Now for real. llvm-svn: 270629
* [InstCombine] Clean up and FileCheckize test case. (Chad Rosier, 2016-05-24, 1 file, +74/-61)
  llvm-svn: 270586
* Revert r270518, which re-enabled "[LoopUnroll] Enable advanced unrolling analysis by default." (Hans Wennborg, 2016-05-24, 1 file, +1/-1)
  Chromium builds are still hitting the assert in PR27874. llvm-svn: 270577
* [ValueTracking, InstSimplify] extend isKnownNonZero() to handle vector constants (Sanjay Patel, 2016-05-24, 1 file, +5/-21)
  Similar in spirit to D20497: if all elements of a constant vector are known non-zero, then we can say that the whole vector is known non-zero. It seems like we could extend this to FP scalar/vector too, but isKnownNonZero() says it only works for integers and pointers for now. Differential Revision: http://reviews.llvm.org/D20544 llvm-svn: 270562
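  A sketch of a fold this kind of reasoning enables (my own example, not the committed test): every lane of the or'd constant is non-zero, so %o is known non-zero in every lane and the compare is expected to simplify to all-false:

  ```
  define <2 x i1> @nonzero(<2 x i32> %x) {
    %o = or <2 x i32> %x, <i32 1, i32 8>        ; each lane or'd with a non-zero constant
    %c = icmp eq <2 x i32> %o, zeroinitializer  ; expected to fold to zeroinitializer
    ret <2 x i1> %c
  }
  ```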
* [InstCombine][X86][SSE41] The SSE41 PMOVSX intrinsics are auto upgraded now and aren't handled by InstCombine any more (Simon Pilgrim, 2016-05-24, 1 file, +0/-67)
  llvm-svn: 270561
* Revert "Revert r270478 "[LoopUnroll] Enable advanced unrolling analysis by default."" (Michael Zolotukhin, 2016-05-24, 1 file, +1/-1)
  This reverts commit r270512 and reapplies r270478. Originally it caused PR27847, but it was fixed in r270517. llvm-svn: 270518
* [LoopUnrollAnalyzer] Fix a crash in UnrolledInstAnalyzer::visitCastInst. (Michael Zolotukhin, 2016-05-24, 1 file, +20/-1)
  This fixes PR27847. llvm-svn: 270517
* Revert r270478 "[LoopUnroll] Enable advanced unrolling analysis by default." (Hans Wennborg, 2016-05-23, 1 file, +1/-1)
  This caused PR27847. llvm-svn: 270512
* [IRCE] Optimize "uses" not branches; NFCI (Sanjoy Das, 2016-05-23, 2 files, +2/-2)
  This changes IRCE to optimize uses, and not branches. This change is NFCI since the uses we do inspect are, in practice, only ever going to be the condition use in conditional branches; but this flexibility will later allow us to analyze more complex expressions than just a direct branch on a range check. llvm-svn: 270500
* [InstSimplify] add vector tests for isKnownNonZero (Sanjay Patel, 2016-05-23, 1 file, +81/-0)
  llvm-svn: 270498
* [InstCombine] Fix assertion when bitcast is converted to gep (Gerolf Hoflehner, 2016-05-23, 1 file, +32/-0)
  When an aggregate contains an opaque type, its size cannot be determined. This triggers an "Invalid GetElementPtrInst indices for type" assert in function checkGEPType. The fix suppresses the conversion in this case. http://reviews.llvm.org/D20319 llvm-svn: 270479
* [LoopUnroll] Enable advanced unrolling analysis by default. (Michael Zolotukhin, 2016-05-23, 1 file, +1/-1)
  Summary: This patch turns on LoopUnrollAnalyzer by default. To mitigate compile time regressions, I chose very conservative thresholds for now. Later we can make them more aggressive, but it might require being smarter about which loops we're optimizing. E.g. currently the biggest issue is that with more aggressive thresholds we unroll many cold loops, which increases compile time for no performance benefit (performance of those loops is improved, but it doesn't matter since they are cold).

  Test results for compile time (using 4 samples to reduce noise):

  ```
  MultiSource/Benchmarks/VersaBench/ecbdes/ecbdes                  5.19%
  SingleSource/Benchmarks/Polybench/medley/reg_detect/reg_detect   4.19%
  MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow           3.39%
  MultiSource/Applications/JM/lencod/lencod                        1.47%
  MultiSource/Benchmarks/Fhourstones-3_1/fhourstones3_1           -6.06%
  ```

  I didn't see any performance changes in the testsuite, but it improves some internal tests.
  Reviewers: hfinkel, chandlerc
  Subscribers: llvm-commits, mzolotukhin
  Differential Revision: http://reviews.llvm.org/D20482 llvm-svn: 270478
* [ValueTracking, InstCombine] extend isKnownToBeAPowerOfTwo() to handle vector splat constants (Sanjay Patel, 2016-05-22, 3 files, +4/-10)
  We could try harder to handle non-splat vector constants too, but that seems much rarer to me. Note that the div test isn't resolved because there's a check for isIntegerTy() guarding that transform. Differential Revision: http://reviews.llvm.org/D20497 llvm-svn: 270369
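  A sketch of a fold this can enable (my own example, assuming the usual power-of-two urem handling applies to vectors here): the splat divisor is a known power of two, so the unsigned remainder can become a mask:

  ```
  define <2 x i32> @urem_pow2(<2 x i32> %x) {
    ; 16 is a power of two in every lane, so this is expected to
    ; simplify to: and <2 x i32> %x, <i32 15, i32 15>
    %r = urem <2 x i32> %x, <i32 16, i32 16>
    ret <2 x i32> %r
  }
  ```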
* [SimplifyCFG] Remove cleanuppads which are empty except for calls to lifetime.end (David Majnemer, 2016-05-21, 1 file, +31/-0)
  A cleanuppad is not cheap; it turns into many instructions and results in additional spills and fills. It is not worth keeping a cleanuppad around if all it does is hold a lifetime.end instruction. N.B. We first try to merge the cleanuppad with another cleanuppad to avoid dropping the lifetime and debug info markers. llvm-svn: 270314
* [GuardWidening] Fix incorrect use of remove_if (Sanjoy Das, 2016-05-21, 1 file, +38/-0)
  I had used `std::remove_if` under the assumption that it moves the predicate-matching elements to the end, but actually the elements remaining towards the end (after the iterator returned by `std::remove_if`) are indeterminate. Fix the bug (and make the code more straightforward) by using a temporary SmallVector, and add a test case demonstrating the issue. llvm-svn: 270306
* Fix constant folding of addrspacecast of null (Matt Arsenault, 2016-05-21, 1 file, +39/-0)
  This should not be making assumptions on the value of the casted pointer. llvm-svn: 270293
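  A sketch of the issue (my own illustration): the null pointer of one address space need not map to the zero bit-pattern of another, so a cast like this must not be constant-folded to a plain null of the destination space:

  ```
  ; must stay an addrspacecast, not fold to i8 addrspace(3)* null
  @p = global i8 addrspace(3)* addrspacecast (i8* null to i8 addrspace(3)*)
  ```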
* add test vector sdiv (Sanjay Patel, 2016-05-20, 1 file, +15/-0)
  llvm-svn: 270285
* add test for vector shift (Sanjay Patel, 2016-05-20, 1 file, +13/-0)
  llvm-svn: 270284
* add tests for vector urem (Sanjay Patel, 2016-05-20, 1 file, +23/-1)
  llvm-svn: 270271
* use FileCheck instead of grep for exact checking (Sanjay Patel, 2016-05-20, 1 file, +10/-5)
  llvm-svn: 270265
* Functions with differing phis should not be merged. (Mark Lacey, 2016-05-20, 1 file, +50/-0)
  Check that the incoming blocks of phi nodes are identical, and block function merging if they are not. rdar://problem/26255167 Differential Revision: http://reviews.llvm.org/D20462 llvm-svn: 270250
* [SimplifyCFG] eliminate switch cases based on known range of switch condition (Sanjay Patel, 2016-05-20, 1 file, +0/-4)
  This was noted in PR24766: https://llvm.org/bugs/show_bug.cgi?id=24766#c2 We may not know whether the sign bit(s) are zero or one, but we can still optimize based on knowing that the sign bit is repeated. Differential Revision: http://reviews.llvm.org/D20275 llvm-svn: 270222
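  A sketch of the idea (my own example, not the committed test): after the arithmetic shift, bits 4 through 31 of %s are all copies of the sign bit, so %s lies in [-8, 7] and any case outside that range is dead even though the sign bit itself is unknown:

  ```
  define i32 @f(i32 %x) {
  entry:
    %s = ashr i32 %x, 28
    switch i32 %s, label %def [
      i32 1, label %a     ; reachable
      i32 100, label %b   ; dead: 100 is outside [-8, 7]
    ]
  a:
    ret i32 1
  b:
    ret i32 2
  def:
    ret i32 0
  }
  ```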
* Allow -inline-threshold to override default threshold. (Easwaran Raman, 2016-05-19, 1 file, +89/-0)
  Before r257832, the threshold used by SimpleInliner was explicitly specified or generated from opt levels and passed to the base class Inliner's constructor. There, it was first overridden by an explicitly specified -inline-threshold. The refactoring in r257832 did not preserve this behavior for all opt levels. This change brings back the original behavior. Differential Revision: http://reviews.llvm.org/D20452 llvm-svn: 270153
* [GuardWidening] Introduce range check merging (Sanjoy Das, 2016-05-19, 1 file, +197/-0)
  Sequences of range checks expressed using guards, like

    guard((I - 2) u< L)
    guard((I - 1) u< L)
    guard((I + 0) u< L)
    guard((I + 1) u< L)
    guard((I + 2) u< L)

  can sometimes be combined into a smaller sequence:

    guard((I - 2) u< L AND (I + 2) u< L)

  if we can prove that (I - 2) u< L AND (I + 2) u< L implies all of the checks expressed in the previous sequence. This change teaches GuardWidening to do this kind of merging when feasible. llvm-svn: 270151
* [InstCombine] Avoid combining the bitcast of a var that is used as both address and result of load instructions (Guozhi Wei, 2016-05-19, 1 file, +20/-0)
  This patch fixes https://llvm.org/bugs/show_bug.cgi?id=27703. If there is a sequence of one or more load instructions in which each loaded value is used as the address of a later load, and a bitcast is needed to change the value type, don't optimize the bitcast away. llvm-svn: 270135
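  A sketch of the kind of sequence described (my own reading of the PR, with made-up names): the loaded value feeds the next load's address, and the bitcast exists only to re-type it so it can be loaded from:

  ```
  define i32* @chain(i32** %p) {
    %a = load i32*, i32** %p       ; first load
    %b = bitcast i32* %a to i32**  ; re-type the loaded value
    %c = load i32*, i32** %b       ; second load uses the first result as its address
    ret i32* %c
  }
  ```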
* Recommit r255691 since PR26509 has been fixed. (Wei Mi, 2016-05-19, 2 files, +72/-1)
  llvm-svn: 270113
* CodeGen: Make the global-merge pass independently testable, and add a test. (Peter Collingbourne, 2016-05-19, 1 file, +20/-0)
  llvm-svn: 270023
* [GuardWidening] Use getEquivalentICmp to fold constant compares (Sanjoy Das, 2016-05-19, 1 file, +45/-1)
  `ConstantRange::getEquivalentICmp` is more general, and better factored. llvm-svn: 270019
* New pass: guard widening (Sanjoy Das, 2016-05-18, 1 file, +337/-0)
  Summary: Implement guard widening in LLVM. Description from GuardWidening.cpp: The semantics of the `@llvm.experimental.guard` intrinsic lets LLVM transform it so that it fails more often than it did before the transform. This optimization is called "widening" and can be used to hoist and common runtime checks in situations like these:

  ```
  %cmp0 = 7 u< Length
  call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ]
  call @unknown_side_effects()
  %cmp1 = 9 u< Length
  call @llvm.experimental.guard(i1 %cmp1) [ "deopt"(...) ]
  ...
  ```

  to

  ```
  %cmp0 = 9 u< Length
  call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ]
  call @unknown_side_effects()
  ...
  ```

  If `%cmp0` is false, `@llvm.experimental.guard` will "deoptimize" back to a generic implementation of the same function, which will have the correct semantics from that point onward. It is always _legal_ to deoptimize (so replacing `%cmp0` with false is "correct"), though it may not always be profitable to do so. N.B. This pass is a work in progress. It hasn't been tuned to be "production ready" yet. It is known to have quadratic running time and will not scale to large numbers of guards.
  Reviewers: reames, atrick, bogner, apilipenko, nlewycky
  Subscribers: mcrosier, llvm-commits
  Differential Revision: http://reviews.llvm.org/D20143 llvm-svn: 269997
* Follow-up patch of http://reviews.llvm.org/D19948 to handle missing profiles when simplifying CFG. (Dehao Chen, 2016-05-18, 1 file, +38/-37)
  Summary: Set the default branch weight to 1:1 if one of the branches has its profile missing when simplifying CFG.
  Reviewers: spatel, davidxl
  Subscribers: danielcdh, llvm-commits
  Differential Revision: http://reviews.llvm.org/D20307 llvm-svn: 269995
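  A fragmentary sketch (my own illustration) of what a 1:1 default looks like when written out as profile metadata on a branch:

  ```
    br i1 %c, label %then, label %else, !prof !0

  !0 = !{!"branch_weights", i32 1, i32 1}  ; equal weight for both successors
  ```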
* [LoopUnrollAnalyzer] Take into account cost of instructions controlling branches, along with their operands. (Michael Zolotukhin, 2016-05-18, 1 file, +32/-0)
  Previously, we didn't add the cost of these instructions and their operands, which could have resulted in unrolling loops for no actual benefit. llvm-svn: 269985
* AMDGPU: Other sizes of popcnt are fast (Matt Arsenault, 2016-05-18, 1 file, +24/-1)
  We can chain bcnt instructions together, so any width popcnt is pretty fast. llvm-svn: 269950
* AMDGPU: Fix a few slightly broken tests (Matt Arsenault, 2016-05-18, 1 file, +23/-22)
  Fix minor bugs and uses of undef which break when pointer-related optimization passes are run. llvm-svn: 269944