summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms
Commit message (Collapse)AuthorAgeFilesLines
...
* Revert r228556: InstCombine: propagate nonNull through assumeChandler Carruth2015-02-101-37/+0
| | | | | | | | | This commit isn't using the correct context, and is transfoming calls that are operands to loads rather than calls that are operands to an icmp feeding into an assume. I've replied on the original review thread with a very reduced test case and some thoughts on how to rework this. llvm-svn: 228677
* PlaceSafepoints: modernize gc.result.* -> gc.resultRamkumar Ramachandra2015-02-091-1/+16
| | | | | | Differential Revision: http://reviews.llvm.org/D7516 llvm-svn: 228625
* Introduce more tests for PlaceSafepointsPhilip Reames2015-02-093-0/+157
| | | | | | These tests the two optimizations for backedge insertion currently implemented and the split backedge flag which is currently off by default. llvm-svn: 228617
* Minor test cleanupPhilip Reames2015-02-091-5/+4
| | | | | | | a) add gc attribute b) remove unused param llvm-svn: 228612
* Add basic tests for PlaceSafepointsPhilip Reames2015-02-091-0/+72
| | | | | | This is just adding really simple tests which should have been part of the original submission. When doing so, I discovered that I'd mistakenly removed required pieces when preparing the patch for upstream submission. I fixed two such bugs in this submission. llvm-svn: 228610
* DeadArgElim: fix mismatch in accounting of array return types.Tim Northover2015-02-091-1/+45
| | | | | | | | | | | | Some parts of DeadArgElim were only considering the individual fields of StructTypes separately, but others (where insertvalue & extractvalue instructions occur) also looked into ArrayTypes. This one is an actual bug; the mismatch can lead to an argument being considered used by a return sub-value that isn't being tracked (and hence is dead by default). It then gets incorrectly eliminated. llvm-svn: 228559
* DeadArgElim: assess uses of entire return value aggregate.Tim Northover2015-02-091-0/+71
| | | | | | | | | | | | | | | | | | | | | | | Previously, a non-extractvalue use of an aggregate return value meant the entire return was considered live (the algorithm gave up entirely). This was correct, but conservative. It's better to actually look at that Use, making the analysis results apply to all sub-values under consideration. E.g. %val = call { i32, i32 } @whatever() [...] ret { i32, i32 } %val The return is using the entire aggregate (sub-values 0 and 1). We can still simplify @whatever if we can prove that this return is itself unused. Also unifies the logic slightly between aggregate and non-aggregate cases.. llvm-svn: 228558
* InstCombine: propagate nonNull through assumeRamkumar Ramachandra2015-02-091-0/+37
| | | | | | | | | Make assume (load (call|invoke) != null) set nonNull return attribute for the call and invoke. Also include tests. Differential Revision: http://reviews.llvm.org/D7107 llvm-svn: 228556
* Correctly combine alias.scope metadata by a union instead of intersectingBjorn Steinbrink2015-02-081-0/+24
| | | | | | | | | | | | | | | | | Summary: The alias.scope metadata represents sets of things an instruction might alias with. When generically combining the metadata from two instructions the result must be the union of the original sets, because the new instruction might alias with anything any of the original instructions aliased with. Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7490 llvm-svn: 228525
* ValueTracking: Make isBytewiseValue simpler and more powerful at the same time.Benjamin Kramer2015-02-071-0/+15
| | | | | | | Turns out there is a simpler way of checking that all bytes in a word are equal than binary decomposition. llvm-svn: 228503
* Properly update AA metadata when performing call slot optimizationBjorn Steinbrink2015-02-071-0/+22
| | | | | | | | Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7482 llvm-svn: 228500
* InstCombine: Combine select sequences into a single selectMatthias Braun2015-02-061-0/+28
| | | | | | | | | | | | | | Normalize select(C0, select(C1, a, b), b) -> select((C0 & C1), a, b) select(C0, a, select(C1, a, b)) -> select((C0 | C1), a, b) This normal form may enable further combines on the And/Or and shortens paths for the values. Many targets prefer the other but can go back easily in CodeGen. Differential Revision: http://reviews.llvm.org/D7399 llvm-svn: 228409
* Teach isDereferenceablePointer() to look through bitcast constant expressions.Michael Kuperstein2015-02-051-0/+46
| | | | | | | | This fixes a LICM regression due to the new load+store pair canonicalization. Differential Revision: http://reviews.llvm.org/D7411 llvm-svn: 228284
* Value soft float calls as more expensive in the inliner.Cameron Esfahani2015-02-051-0/+136
| | | | | | | | | | | | | | Summary: When evaluating floating point instructions in the inliner, ask the TTI whether it is an expensive operation. By default, it's not an expensive operation. This keeps the default behavior the same as before. The ARM TTI has been updated to return back TCC_Expensive for targets which don't have hardware floating point. Reviewers: chandlerc, echristo Reviewed By: echristo Subscribers: t.p.northover, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D6936 llvm-svn: 228263
* StructurizeCFG: Use a reverse post-order traversalTom Stellard2015-02-043-9/+189
| | | | | | | | | | | | | | We were previously doing a post-order traversal and operating on the list in reverse, however this would occasionaly cause backedges for loops to be visited before some of the other blocks in the loop. We know use a reverse post-order traversal, which avoids this issue. The reverse post-order traversal is not completely ideal, so we need to manually fixup the list to ensure that inner loop backedges are visited before outer loop backedges. llvm-svn: 228186
* Reverting VLD1/VST1 base-updating/post-incrementing combiningRenato Golin2015-02-041-2/+2
| | | | | | | | | | | This reverts patches 223862, 224198, 224203, and 224754, which were all related to the vector load/store combining and were reverted/reaplied a few times due to the same alignment problems we're seeing now. Further tests, mainly self-hosting Clang, will be needed to reapply this patch in the future. llvm-svn: 228129
* Allow PRE to insert no-cost phi nodesDaniel Berlin2015-02-031-0/+31
| | | | llvm-svn: 228024
* Add straight-line strength reduction to LLVMJingyue Wu2015-02-031-0/+119
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Straight-line strength reduction (SLSR) is implemented in GCC but not yet in LLVM. It has proven to effectively simplify statements derived from an unrolled loop, and can potentially benefit many other cases too. For example, LLVM unrolls #pragma unroll foo (int i = 0; i < 3; ++i) { sum += foo((b + i) * s); } into sum += foo(b * s); sum += foo((b + 1) * s); sum += foo((b + 2) * s); However, no optimizations yet reduce the internal redundancy of the three expressions: b * s (b + 1) * s (b + 2) * s With SLSR, LLVM can optimize these three expressions into: t1 = b * s t2 = t1 + s t3 = t2 + s This commit is only an initial step towards implementing a series of such optimizations. I will implement more (see TODO in the file commentary) in the near future. This optimization is enabled for the NVPTX backend for now. However, I am more than happy to push it to the standard optimization pipeline after more thorough performance tests. Test Plan: test/StraightLineStrengthReduce/slsr.ll Reviewers: eliben, HaoLiu, meheff, hfinkel, jholewinski, atrick Reviewed By: jholewinski, atrick Subscribers: karthikthecool, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D7310 llvm-svn: 228016
* Fix: SLPVectorizer crashes with assertion when vectorizing a cmp instruction.Erik Eckstein2015-02-021-0/+56
| | | | | | | | | The commit r225977 uncovered this bug. The problem was that the vectorizer tried to read the second operand of an already deleted instruction. The bug didn't show up before r225977 because the freed memory still contained a non-null pointer. With r225977 deletion of instructions is delayed and the read operand pointer is always null. llvm-svn: 227800
* [PM] Port SimplifyCFG to the new pass manager.Chandler Carruth2015-02-011-0/+1
| | | | | | | | This should be sufficient to replace the initial (minor) function pass pipeline in Clang with the new pass manager. I'll probably add an (off by default) flag to do that just to ensure we can get extra testing. llvm-svn: 227726
* [PM] Port EarlyCSE to the new pass manager.Chandler Carruth2015-02-012-0/+2
| | | | | | | | I've added RUN lines both to the basic test for EarlyCSE and the target-specific test, as this serves as a nice test that the TTI layer in the new pass manager is in fact working well. llvm-svn: 227725
* Inliner: Use replaceDbgDeclareForAlloca() instead of splicing theAdrian Prantl2015-01-302-3/+3
| | | | | | | instruction and generalize it to optionally dereference the variable. Follow-up to r227544. llvm-svn: 227604
* Move the target specific test case arbitrary-induction-step.ll to ↵Hao Liu2015-01-301-0/+0
| | | | | | test/Transforms/LoopVectorize/AArch64 folder. llvm-svn: 227561
* [LoopVectorize] Induction variables: support arbitrary constant step.Hao Liu2015-01-303-4/+153
| | | | | | | | | | Previously, only -1 and +1 step values are supported for induction variables. This patch extends LV to support arbitrary constant steps. Initial patch by Alexey Volkov. Some bug fixes are added in the following version. Differential Revision: http://reviews.llvm.org/D6051 and http://reviews.llvm.org/D7193 llvm-svn: 227557
* Fix PR22386. The inliner moves static allocas to the entry basic blockAdrian Prantl2015-01-301-0/+141
| | | | | | so we need to move the dbg.declare intrinsics that describe them, too. llvm-svn: 227544
* [GVN] don't propagate equality comparisons of FP zero (PR22376)Sanjay Patel2015-01-291-6/+42
| | | | | | | | | | | | | | In http://reviews.llvm.org/D6911, we allowed GVN to propagate FP equalities to allow some simple value range optimizations. But that introduced a bug when comparing to -0.0 or 0.0: these compare equal even though they are not bitwise identical. This patch disallows propagating zero constants in equality comparisons. Fixes: http://llvm.org/bugs/show_bug.cgi?id=22376 Differential Revision: http://reviews.llvm.org/D7257 llvm-svn: 227491
* Teach SplitBlockPredecessors how to handle landingpad blocks.Philip Reames2015-01-281-1/+1
| | | | | | | | | | Patch by: Igor Laevsky <igor@azulsystems.com> "Currently SplitBlockPredecessors generates incorrect code in case if basic block we are going to split has a landingpad. Also seems like it is fairly common case among it's users to conditionally call either SplitBlockPredecessors or SplitLandingPadPredecessors. Because of this I think it is reasonable to add this condition directly into SplitBlockPredecessors." Differential Revision: http://reviews.llvm.org/D7157 llvm-svn: 227390
* [X86] Reduce some 32-bit imuls into lea + shlMichael Kuperstein2015-01-281-1/+3
| | | | | | | | Reduce integer multiplication by a constant of the form k*2^c, where k is in {3,5,9} into a lea + shl. Previously it was only done for imulq on 64-bit platforms, but it makes sense for imull and 32-bit as well. Differential Revision: http://reviews.llvm.org/D7196 llvm-svn: 227308
* Fold fcmp in cases where value is provably non-negative. By Arch Robison.Elena Demikhovsky2015-01-281-0/+60
| | | | | | | | | | | | This patch folds fcmp in some cases of interest in Julia. The patch adds a function CannotBeOrderedLessThanZero that returns true if a value is provably not less than zero. I.e. the function returns true if the value is provably -0, +0, positive, or a NaN. The patch extends InstructionSimplify.cpp to fold instances of fcmp where: - the predicate is olt or uge - the first operand is provably not less than zero - the second operand is zero The motivation for handling these cases optimizing away domain checks for sqrt in Julia for common idioms such as sqrt(x*x+y*y).. http://reviews.llvm.org/D6972 llvm-svn: 227298
* Move EH personality type classification to Analysis/LibCallSemantics.hReid Kleckner2015-01-281-0/+52
| | | | | | | | | | | | | | Summary: Also add enum types for __C_specific_handler and _CxxFrameHandler3 for which we know a few things. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7214 llvm-svn: 227284
* [SimplifyLibCalls] Don't confuse strcpy_chk for stpcpy_chk.Ahmed Bougacha2015-01-276-121/+155
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was introduced in a faulty refactoring (r225640, mea culpa): the tests weren't testing the return values, so, for both __strcpy_chk and __stpcpy_chk, we would return the end of the buffer (matching stpcpy) instead of the beginning (for strcpy). The root cause was the prefix "__" being ignored when comparing, which made us always pick LibFunc::stpcpy_chk. Pass the LibFunc::Func directly to avoid this kind of error. Also, make the testcases as explicit as possible to prevent this. The now-useful testcases expose another, entangled, stpcpy problem, with the further simplification. This was introduced in a refactoring (r225640) to match the original behavior. However, this leads to problems when successive simplifications generate several similar instructions, none of which are removed by the custom replaceAllUsesWith. For instance, InstCombine (the main user) doesn't erase the instruction in its custom RAUW. When trying to simplify say __stpcpy_chk: - first, an stpcpy is created (fortified simplifier), - second, a memcpy is created (normal simplifier), but the stpcpy call isn't removed. - third, InstCombine later revisits the instructions, and simplifies the first stpcpy to a memcpy. We now have two memcpys. llvm-svn: 227250
* Teach IRCE to look at branch weights when recognizing range checksSanjoy Das2015-01-275-14/+58
| | | | | | | | | | | Splitting a loop to make range checks redundant is profitable only if the range check "never" fails. Make this fact a part of recognizing a range check -- a branch is a range check only if it is expected to pass (via branch_weights metadata). Differential Revision: http://reviews.llvm.org/D7192 llvm-svn: 227249
* [InstCombine] Teach how to fold a select into a cttz/ctlz with the ↵Andrea Di Biagio2015-01-272-2/+304
| | | | | | | | | | | 'is_zero_undef' flag. This patch teaches the Instruction Combiner how to fold a cttz/ctlz followed by a icmp plus select into a single cttz/ctlz with flag 'is_zero_undef' cleared. Added test InstCombine/select-cmp-cttz-ctlz.ll. llvm-svn: 227197
* LoopRotate: Don't walk the uses of a ConstantDavid Majnemer2015-01-271-0/+24
| | | | | | | | | | LoopRotate wanted to avoid live range interference by looking at the uses of a Value in the loop latch and seeing if any lied outside of the loop. We would wrongly perform this operation on Constants. This fixes PR22337. llvm-svn: 227171
* Commoning of target specific load/store intrinsics in Early CSE.Chad Rosier2015-01-262-0/+236
| | | | | | | Phabricator revision: http://reviews.llvm.org/D7121 Patch by Sanjin Sijaric <ssijaric@codeaurora.org>! llvm-svn: 227149
* Add test cases for PRE w/volatile loadsPhilip Reames2015-01-261-0/+82
| | | | | | These tests check that the combination of 227110 (cross block query inst) and 227112 (volatile load semantics) work together properly to allow PRE in cases where a loop contains a volatile access. llvm-svn: 227146
* SimplifyCFG: Omit range checks for switch lookup tables when default is ↵Hans Wennborg2015-01-261-0/+29
| | | | | | | | | | | unreachable The range check would get optimized away later, but we might as well not emit them in the first place. http://reviews.llvm.org/D6471 llvm-svn: 227126
* SimplifyCFG: don't remove unreachable default switch destinationsHans Wennborg2015-01-264-68/+34
| | | | | | | | | | | | | | | An unreachable default destination can be exploited by other optimizations and allows for more efficient lowering. Both the SDag switch lowering and LowerSwitch can exploit unreachable defaults. Also make TurnSwitchRangeICmp handle switches with unreachable default. This is kind of separate change, but it cannot be tested without the change above, and I don't want to land the change above without this since that would regress other tests. Differential Revision: http://reviews.llvm.org/D6471 llvm-svn: 227125
* Refine memory dependence's notion of volatile semanticsPhilip Reames2015-01-261-0/+75
| | | | | | | | | | According to my reading of the LangRef, volatiles are only ordered with respect to other volatiles. It is entirely legal and profitable to forward unrelated loads over the volatile load. This patch implements this for GVN by refining the transition rules MemoryDependenceAnalysis uses when encountering a volatile. The added test cases show where the extra flexibility is profitable for local dependence optimizations. I have a related change (227110) which will extend this to non-local dependence (i.e. PRE), but that's essentially orthogonal to the semantic change in this patch. I have tested the two together and can confirm that PRE works over a volatile load with both changes. I will be submitting a PRE w/volatiles test case seperately in the near future. Differential Revision: http://reviews.llvm.org/D6901 llvm-svn: 227112
* Pass QueryInst down through non-local dependency calculationPhilip Reames2015-01-262-1/+76
| | | | | | | | | | | | | | This change is mostly motivated by exposing information about the original query instruction to the actual scanning work in getPointerDependencyFrom when used by GVN PRE. In a follow up change, I will use this to be more precise with regards to the semantics of volatile instructions encountered in the scan of a basic block. Worth noting, is that this change (despite appearing quite simple) is not semantically preserving. By providing more information to the helper routine, we allow some optimizations to kick in that weren't previously able to (when called from this code path.) In particular, we see that treatment of !invariant.load becomes more precise. In theory, we might see a difference with an ordered/atomic instruction as well, but I'm having a hard time actually finding a test case which shows that. Test wise, I've included new tests for !invariant.load which illustrate this difference. I've also included some updated TBAA tests which highlight that this change isn't needed for that optimization to kick in - it's handled inside alias analysis itself. Eventually, it would be nice to factor the !invariant.load handling inside alias analysis as well. Differential Revision: http://reviews.llvm.org/D6895 llvm-svn: 227110
* SLPVectorizer: fix wrong scheduling of atomic load/stores.Erik Eckstein2015-01-261-0/+31
| | | | | | This fixes PR22306. llvm-svn: 227077
* [PM] Port LowerExpectIntrinsic to the new pass manager.Chandler Carruth2015-01-241-0/+1
| | | | | | | | | | | | This just lifts the logic into a static helper function, sinks the legacy pass to be a trivial wrapper of that helper fuction, and adds a trivial wrapper for the new PM as well. Not much to see here. I switched a test case to run in both modes, but we have to strip the dead prototypes separately as that pass isn't in the new pass manager (yet). llvm-svn: 226999
* [PM] Port instcombine to the new pass manager!Chandler Carruth2015-01-241-0/+1
| | | | | | | | | | | | | | | | | | | | This is exciting as this is a much more involved port. This is a complex, existing transformation pass. All of the core logic is shared between both old and new pass managers. Only the access to the analyses is separate because the actual techniques are separate. This also uses a bunch of different and interesting analyses and is the first time where we need to use an analysis across an IR layer. This also paves the way to expose instcombine utility functions. I've got a static function that implements the core pass logic over a function which might be mildly interesting, but more interesting is likely exposing a routine which just uses instructions *already in* the worklist and combines until empty. I've switched one of my favorite instcombine tests to run with both as well to make sure this keeps working. llvm-svn: 226987
* LowerSwitch: replace unreachable default with popular case destinationHans Wennborg2015-01-232-2/+115
| | | | | | | | | | | | SimplifyCFG currently does this transformation, but I'm planning to remove that to allow other passes, such as this one, to exploit the unreachable default. This patch takes care to keep track of what case values are unreachable even after the transformation, allowing for more efficient lowering. Differential Revision: http://reviews.llvm.org/D6697 llvm-svn: 226934
* Revert "Don't remove a landing pad if the invoke requires a table entry."Reid Kleckner2015-01-221-77/+0
| | | | | | | | | | | | This reverts commit r176827. Björn Steinbrink pointed out that this didn't actually fix the bug (PR15555) it was attempting to fix. With this reverted, we can now remove landingpad cleanups that immediately resume unwinding, converting the invoke to a call. llvm-svn: 226850
* Fix crashes in IRCE caused by mismatched typesSanjoy Das2015-01-221-0/+66
| | | | | | | | | | | | | | | There are places where the inductive range check elimination pass depends on two llvm::Values or llvm::SCEVs to be of the same llvm::Type when they do not need to be. This patch relaxes those restrictions (by bailing out of the optimization if the types mismatch), and adds test cases to trigger those paths. These issues were found by bootstrapping clang with IRCE running in the -O3 pass ordering. Differential Revision: http://reviews.llvm.org/D7082 llvm-svn: 226793
* Fixed a bug in masked load/store in reversed loop.Elena Demikhovsky2015-01-221-0/+82
| | | | | | | | | Added a test. The bug was submitted to bugzilla: http://llvm.org/bugs/show_bug.cgi?id=22225 llvm-svn: 226791
* [canonicalize] Teach InstCombine to canonicalize loads which are onlyChandler Carruth2015-01-222-3/+53
| | | | | | | | | | | | | | | | | | | | | | | ever stored to always use a legal integer type if one is available. Regardless of whether this particular type is good or bad, it ensures we don't get weird differences in generated code (and resulting performance) from "equivalent" patterns that happen to end up using a slightly different type. After some discussion on llvmdev it seems everyone generally likes this canonicalization. However, there may be some parts of LLVM that handle it poorly and need to be fixed. I have at least verified that this doesn't impede GVN and instcombine's store-to-load forwarding powers in any obvious cases. Subtle cases are exactly what we need te flush out if they remain. Also note that this IR pattern should already be hitting LLVM from Clang at least because it is exactly the IR which would be produced if you used memcpy to copy a pointer or floating point between memory instead of a variable. llvm-svn: 226781
* DebugInfo: Use distinct inlinedAt MDLocations to avoid separate inlined ↵David Blaikie2015-01-213-3/+125
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | calls being coalesced When two calls from the same MDLocation are inlined they currently get treated as one inlined function call (creating difficulty debugging, duplicate variables, etc). Clang worked around this by including column information on inline calls which doesn't address LTO inlining or calls to the same function from the same line and column (such as through a macro). It also didn't address ctor and member function calls. By making the inlinedAt locations distinct, every call site has an explicitly distinct location that cannot be coalesced with any other call. This can produce linearly (2x in the worst case where every call is inlined and the call instruction has a non-call instruction at the same location) more debug locations. Any increase beyond that are in cases where the Clang workaround was insufficient and the new scheme is creating necessary distinct nodes that were being erroneously coalesced previously. After this change to LLVM the incomplete workarounds in Clang. That should reduce the number of debug locations (in a build without column info, the default on Darwin, not the default on Linux) by not creating pseudo-distinct locations for every call to an inline function. (oh, and I made the inlined-at chain rebuilding iterative instead of recursive because I was having trouble wrapping my head around it the way it was - open to discussion on the right design for that function (including going back to a recursive solution)) llvm-svn: 226736
* InstCombine: Don't strip bitcasts off of callsites marked 'thunk'David Majnemer2015-01-211-0/+11
| | | | | | | The return type of a thunk is meaningless, we just want the arguments and return value to be forwarded. llvm-svn: 226708
OpenPOWER on IntegriCloud