summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
...
* [MergeFuncs] Efficiently defer functions on mergeJF Bastien2015-09-021-28/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch introduces a side table in Merge Functions to efficiently remove functions from the function set when functions they refer to are merged. Previously these functions would need to be compared lg(N) times to find the appropriate FunctionNode in the tree to defer. With the recent determinism changes, this comparison is more expensive. In addition, the removal function would not always actually remove the function from the set (i.e. after remove(F), there would sometimes still be a node in the tree which contains F). With these changes, these functions are properly deferred, and so more functions can be merged. In addition, when there are many merged functions (and thus more deferred functions), there is a speedup: chromium: 48678 merged -> 49380 merged; 6.58s -> 5.49s libxul.so: 41004 merged -> 41030 merged; 8.02s -> 6.94s mysqld: 1607 merged -> 1607 merged (same); 0.215s -> 0.212s (probably noise) Author: jrkoenig Reviewers: jfb, dschuff Subscribers: llvm-commits, nlewycky Differential revision: http://reviews.llvm.org/D12537 llvm-svn: 246735
* [RewriteStatepointsForGC] Delete stale comment [NFC]Philip Reames2015-09-021-3/+0
| | | | llvm-svn: 246722
* [RewriteStatepointsForGC] Pull a function out of anon namespace [NFC]Philip Reames2015-09-021-1/+5
| | | | | | Thanks to David Blaikie for noticing in previous commit. llvm-svn: 246721
* [RewriteStatepointsForGC] Bugfix for change 246133Philip Reames2015-09-021-16/+16
| | | | | | | | Fix a bug in change 246133. I didn't handle the case where we had a cycle in the use graph and could add an instruction we were about to erase back on to the worklist. Oddly, I have not been able to write a small test case for this, even with the AssertingVH added. I have confirmed the basic theory for the fix on a large failing example, but all attempts to reduce that to something appropriate for a test case have failed. Differential Revision: http://reviews.llvm.org/D12575 llvm-svn: 246718
* Fix release build warning for unused functionPhilip Reames2015-09-021-1/+2
| | | | llvm-svn: 246717
* [RewriteStatepointsForGC] Improve debug output [NFC]Philip Reames2015-09-021-30/+36
| | | | llvm-svn: 246713
* assuem(X) handling in GVN bugfixPiotr Padlewski2015-09-021-1/+20
| | | | | | | | | | There was infinite loop because it was trying to change assume(true) into assume(true) Also added handling when assume(false) appear http://reviews.llvm.org/D12516 llvm-svn: 246697
* Constant propagation after hitting assume(cmp) bugfixPiotr Padlewski2015-09-022-9/+46
| | | | | | | | | Last time code run into assertion `BBE.isSingleEdge()` in lib/IR/Dominators.cpp:200. http://reviews.llvm.org/D12170 llvm-svn: 246696
* Constant propagation after hiting llvm.assumePiotr Padlewski2015-09-021-3/+68
| | | | | | | | | | | After hitting @llvm.assume(X) we can: - propagate equality that X == true - if X is icmp/fcmp (with eq operation), and one of operand is constant we can change all variables with constants in the same BasicBlock http://reviews.llvm.org/D11918 llvm-svn: 246695
* [RemoveDuplicatePHINodes] Start over after removing a PHI.Benjamin Kramer2015-09-021-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This makes RemoveDuplicatePHINodes more effective and fixes an assertion failure. Triggering the assertions requires a DenseSet reallocation so this change only contains a constructive test. I'll explain the issue with a small example. In the following function there's a duplicate PHI, %4 and %5 are identical. When this is found the DenseSet in RemoveDuplicatePHINodes contains %2, %3 and %4. define void @F() { br label %1 ; <label>:1 ; preds = %1, %0 %2 = phi i32 [ 42, %0 ], [ %4, %1 ] %3 = phi i32 [ 42, %0 ], [ %5, %1 ] %4 = phi i32 [ 42, %0 ], [ 23, %1 ] %5 = phi i32 [ 42, %0 ], [ 23, %1 ] br label %1 } after RemoveDuplicatePHINodes runs the function looks like this. %3 has changed and is now identical to %2, but RemoveDuplicatePHINodes never saw this. define void @F() { br label %1 ; <label>:1 ; preds = %1, %0 %2 = phi i32 [ 42, %0 ], [ %4, %1 ] %3 = phi i32 [ 42, %0 ], [ %4, %1 ] %4 = phi i32 [ 42, %0 ], [ 23, %1 ] br label %1 } If the DenseSet does a reallocation now it will reinsert all keys and stumble over %3 now having a different hash value than it had when inserted into the map for the first time. This change clears the set whenever a PHI is deleted and starts the progress from the beginning, allowing %3 to be deleted and avoiding inconsistent DenseSet state. This potentially has a negative performance impact because it rescans all PHIs, but I don't think that this ever makes a difference in practice. llvm-svn: 246694
* [LV] Don't bail to MiddleBlock if a runtime check fails, bail to ScalarPH ↵James Molloy2015-09-021-60/+27
| | | | | | | | | | instead We were bailing to two places if our runtime checks failed. If the initial overflow check failed, we'd go to ScalarPH. If any other check failed, we'd go to MiddleBlock. This caused us to have to have an extra PHI per induction and reduction as the vector loop's exit block was not dominated by its latch. There's no need to have this behavior - if we just always go to ScalarPH we can get rid of a bunch of complexity. llvm-svn: 246637
* [LV] Move some code around slightly to make the intent of the function more ↵James Molloy2015-09-021-3/+1
| | | | | | | | clear. NFC. llvm-svn: 246636
* [LV] Cleanup: Sink an IRBuilder closer to its uses.James Molloy2015-09-021-10/+5
| | | | | | NFC. llvm-svn: 246635
* [LV] Refactor all runtime check emissions into helper functions.James Molloy2015-09-021-100/+126
| | | | | | | | This reduces the complexity of createEmptyBlock() and will open the door to further refactoring. The test change is simply because we're now constant folding a trivial test. llvm-svn: 246634
* [LV] Pull creation of trip counts into a helper function.James Molloy2015-09-021-63/+101
| | | | | | ... and do a tad of tidyup while we're at it. Because StartIdx must now be zero, there's no difference between Count and EndIdx. llvm-svn: 246633
* [LV] Factor the creation of the loop induction variable out of createEmptyLoop()James Molloy2015-09-021-19/+43
| | | | | | It makes things easier to understand if this is in a helper method. This is part of my ongoing spaghetti-removal operation on createEmptyLoop. llvm-svn: 246632
* [LV] Never widen an induction variable.James Molloy2015-09-021-115/+49
| | | | | | | | | | There's no need to widen canonical induction variables. It's just as efficient to create a *new*, wide, induction variable. Consider, if we widen an indvar, then we'll have to truncate it before its uses anyway (1 trunc). If we create a new indvar instead, we'll have to truncate that instead (1 trunc) [besides which IndVars should go and clean up our mess after us anyway on principle]. This lets us remove a ton of special-casing code. llvm-svn: 246631
* [LV] Switch to using canonical induction variables.James Molloy2015-09-021-14/+8
| | | | | | | | | | Vectorized loops only ever have one induction variable. All induction PHIs from the scalar loop are rewritten to be in terms of this single indvar. We were trying very hard to pick an indvar that already existed, even if that indvar wasn't canonical (didn't start at zero). But trying so hard is really fruitless - creating a new, canonical, indvar only results in one extra add in the worst case and that add is trivially easy to push through the PHI out of the loop by instcombine. If we try and be less clever here and instead let instcombine clean up our mess (as we do in many other places in LV), we can remove unneeded complexity. llvm-svn: 246630
* Move createEliminateAvailableExternallyPass earlier in the pass pipelineYaron Keren2015-09-021-11/+13
| | | | | | | to save running many ModulePasses on available external functions that are thrown away anyhow. llvm-svn: 246619
* DeadArgElim: don't eliminate arguments from naked functionsHans Wennborg2015-09-011-0/+21
| | | | | | Differential Revision: http://reviews.llvm.org/D12534 llvm-svn: 246564
* Fix Windows build by including raw_ostream.hHans Wennborg2015-08-311-0/+1
| | | | llvm-svn: 246486
* [FunctionAttr] Infer nonnull attributes on returnsPhilip Reames2015-08-311-0/+148
| | | | | | | | Teach FunctionAttr to infer the nonnull attribute on return values of functions which never return a potentially null value. This is done both via a conservative local analysis for the function itself and a optimistic per-SCC analysis. If no function in the SCC returns anything which could be null (other than values from other functions in the SCC), we can conclude no function returned a null pointer. Even if some function within the SCC returns a null pointer, we may be able to locally conclude that some don't. Differential Revision: http://reviews.llvm.org/D9688 llvm-svn: 246476
* [JumpThreading] make jump threading respect convergent annotation.Jingyue Wu2015-08-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: JumpThreading shouldn't duplicate a convergent call, because that would move a convergent call into a control-inequivalent location. For example, if (cond) { ... } else { ... } convergent_call(); if (cond) { ... } else { ... } should not be optimized to if (cond) { ... convergent_call(); ... } else { ... convergent_call(); ... } Test Plan: test/Transforms/JumpThreading/basic.ll Patch by Xuetian Weng. Reviewers: resistor, arsenm, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12484 llvm-svn: 246415
* [InstCombine] Fix PR24605.Sanjoy Das2015-08-282-0/+11
| | | | | | | | | | | | | | PR24605 is caused due to an incorrect insert point in instcombine's IR builder. When simplifying %t = add X Y ... %m = icmp ... %t the replacement for %t should be placed before %t, not before %m, as there could be a use of %t between %t and %m. llvm-svn: 246315
* Optimize memcmp(x,y,n)==0 for small n and suitably aligned x/y.Chad Rosier2015-08-281-0/+22
| | | | | | | http://reviews.llvm.org/D6952 PR20673 llvm-svn: 246313
* Remove Merge Functions pointer comparisonsJF Bastien2015-08-281-17/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch removes two remaining places where pointer value comparisons are used to order functions: comparing range annotation metadata, and comparing block address constants. (These are both rare cases, and so no actual non-determinism was observed from either case). The fix for range metadata is simple: the annotation always consists of a pair of integers, so we just order by those integers. The fix for block addresses is more subtle. Two constants are the same if they are the same basic block in the same function, or if they refer to corresponding basic blocks in each respective function. Note that in the first case, merging is trivially correct. In the second, the correctness of merging relies on the fact that the the values of block addresses cannot be compared. This change is actually an enhancement, as these functions could not previously be merged (see merge-block-address.ll). There is still a problem with cross function block addresses, in that constants pointing to a basic block in a merged function is not updated. This also more robustly compares floating point constants by all fields of their semantics, and fixes a dyn_cast/cast mixup. Author: jrkoenig Reviewers: dschuff, nlewycky, jfb Subscribers llvm-commits Differential revision: http://reviews.llvm.org/D12376 llvm-svn: 246305
* [SROA] Fix PR24463, a crash I introduced in SROA by allowing it toChandler Carruth2015-08-281-3/+13
| | | | | | | | | | | | handle more allocas with loads past the end of the alloca. I suspect there are some related crashers with slightly different patterns, but I'll fix those and add test cases as I find them. Thanks to David Majnemer for the excellent test case reduction here. Made this super simple to debug and fix. llvm-svn: 246289
* Revert r246244 and r246243Steven Wu2015-08-282-113/+11
| | | | | | These two commits cause clang/llvm bootstrap to hang. llvm-svn: 246279
* Constant propagation after hitting assume(cmp) bugfixPiotr Padlewski2015-08-282-9/+46
| | | | | | | | | Last time code run into assertion `BBE.isSingleEdge()` in lib/IR/Dominators.cpp:200. http://reviews.llvm.org/D12170 llvm-svn: 246244
* Constant propagation after hiting llvm.assumePiotr Padlewski2015-08-281-3/+68
| | | | | | | | | | | After hitting @llvm.assume(X) we can: - propagate equality that X == true - if X is icmp/fcmp (with eq operation), and one of operand is constant we can change all variables with constants in the same BasicBlock http://reviews.llvm.org/D11918 llvm-svn: 246243
* Improve vectorization diagnostic messages and extend vectorize(enable) pragma.Tyler Nowicki2015-08-271-15/+25
| | | | | | | | | | | | | | | | This patch changes the analysis diagnostics produced when loops with floating-point recurrences or memory operations are identified. The new messages say "cannot prove it is safe to reorder * operations; allow reordering by specifying #pragma clang loop vectorize(enable)". Depending on the type of diagnostic the message will include additional options such as ffast-math or __restrict__. This patch also allows the vectorize(enable) pragma to override the low pointer memory check threshold. When the hint is given a higher threshold is used. See the clang patch for the options produced for each diagnostic. llvm-svn: 246187
* [LoopVectorize] Add Support for Small Size Reductions.Chad Rosier2015-08-272-32/+214
| | | | | | | | | | | | | | | | | | | | | | | Unlike scalar operations, we can perform vector operations on element types that are smaller than the native integer types. We type-promote scalar operations if they are smaller than a native type (e.g., i8 arithmetic is promoted to i32 arithmetic on Arm targets). This patch detects and removes type-promotions within the reduction detection framework, enabling the vectorization of small size reductions. In the legality phase, we look through the ANDs and extensions that InstCombine creates during promotion, keeping track of the smaller type. In the profitability phase, we use the smaller type and ignore the ANDs and extensions in the cost model. Finally, in the code generation phase, we truncate the result of the reduction to allow InstCombine to rewrite the entire expression in the smaller type. This fixes PR21369. http://reviews.llvm.org/D12202 Patch by Matt Simpson <mssimpso@codeaurora.org>! llvm-svn: 246149
* [LoopVectorize] Extract InductionInfo into a helper class...James Molloy2015-08-273-134/+87
| | | | | | | | ... and move it into LoopUtils where it can be used by other passes, just like ReductionDescriptor. The API is very similar to ReductionDescriptor - that is, not very nice at all. Sorting these both out will come in a followup. NFC llvm-svn: 246145
* Whoops, remove trailing whitespace.Alex Rosenberg2015-08-271-1/+1
| | | | llvm-svn: 246141
* Allow value forwarding past release fences in EarlyCSEPhilip Reames2015-08-271-0/+11
| | | | | | | | | | | | A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-store forwarding across a release fence. We do need to make sure that stores before the fence can't be eliminated even if there's another store to the same location after the fence. In theory, we could reorder the second store above the fence and *then* eliminate the former, but we can't do this if the stores are on opposite sides of the fence. Note: While more aggressive then what's there, this patch is still implementing a really conservative ordering. In particular, I'm not trying to exploit undefined behavior via races, or the fact that the LangRef says only 'atomic' accesses are ordered w.r.t. fences. Differential Revision: http://reviews.llvm.org/D11434 llvm-svn: 246134
* [RewriteStatepointsForGC] Reduce the number of new instructions for base ↵Philip Reames2015-08-271-3/+57
| | | | | | | | | | | | | | pointers When computing base pointers, we introduce new instructions to propagate the base of existing instructions which might not be bases. However, the algorithm doesn't make any effort to recognize when the new instruction to be inserted is the same as an existing one already in the IR. Since this is happening immediately before rewriting, we don't really have a chance to fix it after the pass runs without teaching loop passes about statepoints. I'm really not thrilled with this patch. I've rewritten it 4 different ways now, but this is the best I've come up with. The case where the new instruction is just the original base defining value could be merged into the existing algorithm with some complexity. The problem is that we might have something like an extractelement from a phi of two vectors. It may be trivially obvious that the base of the 0th element is an existing instruction, but I can't see how to make the algorithm itself figure that out. Thus, I resort to the call to SimplifyInstruction instead. Note that we can only adjust the instructions we've inserted ourselves. The live sets are still being tracked in side structures at this point in the code. We can't easily muck with instructions which might be in them. Long term, I'm really thinking we need to materialize the live pointer sets explicitly in the IR somehow rather than using side structures to track them. Differential Revision: http://reviews.llvm.org/D12004 llvm-svn: 246133
* Improved printing of analysis diagnostics in the loop vectorizer.Tyler Nowicki2015-08-271-18/+26
| | | | | | | | This patch ensures that every analysis diagnostic produced by the vectorizer will be printed if the loop has a vectorization hint on it. The condition has also been improved to prevent printing when a disabling hint is specified. llvm-svn: 246132
* [SimplifyCFG] Prune code from a provably unreachable switch defaultPhilip Reames2015-08-261-0/+17
| | | | | | | | | | As Sanjoy pointed out over in http://reviews.llvm.org/D11819, a switch on an icmp should always be able to become a branch instruction. This patch generalizes that notion slightly to prove that the default case of a switch is unreachable if the cases completely cover all possible bit patterns in the condition. Once that's done, the switch to branch conversion kicks in just fine. Note: Duplicate case values are disallowed by the LangRef and verifier. Differential Revision: http://reviews.llvm.org/D11995 llvm-svn: 246125
* Fix memory leak in sample profile pass.Diego Novillo2015-08-261-28/+23
| | | | | | | | | | | | | | | | The problem here were the function analyses invoked by the function pass manager from the new IPO pass. I looked at other IPO passes needing dominance information and the only one that requires it (partial inliner) does not use the standard dependency mechanism. This patch mimics what the partial inliner does to compute dominance, post-dominance and loop info. One thing I like about this approach is that I can delay the computation of all this until I actually need it. This should bring the ASAN buildbot back to green. If there's a better way to fix this, I'll do it in a follow-up patch. llvm-svn: 246066
* [SimplifyLibCalls] Fix a typoDavid Majnemer2015-08-261-1/+1
| | | | | | | cbrt(sqrt(x)) calculates the sixth root, not the ninth root. cbrt(cbrt(x)) calculates the ninth root. llvm-svn: 246046
* [SROA] Rip out all support for SSAUpdater in SROA.Chandler Carruth2015-08-261-198/+8
| | | | | | | | | | | | | This was only added to preserve the old ScalarRepl's use of SSAUpdater which was originally to avoid use of dominance frontiers. Now, we only need a domtree, and we'll need a domtree right after this pass as well and so it makes perfect sense to always and only use the dom-tree powered mem2reg. This was flag-flipper earlier and has stuck reasonably so I wanted to gut the now-dead code out of SROA before we waste more time with it. Among other things, this will make passmanager porting easier. llvm-svn: 246028
* Modernize with range-based for loops.Alex Rosenberg2015-08-261-21/+15
| | | | llvm-svn: 246018
* Reduce code duplication.Alex Rosenberg2015-08-261-12/+23
| | | | llvm-svn: 246017
* Trailing whitespaceAlex Rosenberg2015-08-261-1/+1
| | | | llvm-svn: 246016
* Comparing operands should not require the same ValueIDJF Bastien2015-08-261-5/+2
| | | | | | | | | | | | | Summary: When comparing basic blocks, there is an additional check that two Value*'s should have the same ID, which interferes with merging equivalent constants of different kinds (such as a ConstantInt and a ConstantPointerNull in the included testcase). The cmpValues function already ensures that the two values in each function are the same, so removing this check should not cause incorrect merging. Also, the type comparison is redundant, based on reviewing the code and testing on the test suite and several large LTO bitcodes. Author: jrkoenig Reviewers: nlewycky, jfb, dschuff Subscribers: llvm-commits Differential revision: http://reviews.llvm.org/D12302 llvm-svn: 246001
* Make variable argument intrinsics behave correctly in a Win64 CC function.Charles Davis2015-08-251-0/+4
| | | | | | | | | | | | | | | | Summary: This change makes the variable argument intrinsics, `llvm.va_start` and `llvm.va_copy`, and the `va_arg` instruction behave as they do on Windows inside a `CallingConv::X86_64_Win64` function. It's needed for a Clang patch I have to add support for GCC's `__builtin_ms_va_list` constructs. Reviewers: nadav, asl, eugenis CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1622 llvm-svn: 245990
* [msan] Precise instrumentation for icmp sgt %x, -1.Evgeniy Stepanov2015-08-251-15/+20
| | | | | | | | | | Extend signed relational comparison instrumentation with a special case for comparisons with -1. This fixes an MSan false positive when such comparison is used as a sign bit test. https://llvm.org/bugs/show_bug.cgi?id=24561 llvm-svn: 245980
* Update libdeps in LLVMipo and LLVMScalarOpts, corresponding to r245940.NAKAMURA Takumi2015-08-252-2/+2
| | | | llvm-svn: 245957
* Fix dependencies/shared library buildMatthias Braun2015-08-251-1/+1
| | | | llvm-svn: 245955
* The patch replace the overflow check in loop vectorization with the minimum ↵Wei Mi2015-08-251-22/+25
| | | | | | | | | | | loop iterations check. The loop minimum iterations check below ensures the loop has enough trip count so the generated vector loop will likely be executed, and it covers the overflow check. Differential Revision: http://reviews.llvm.org/D12107. llvm-svn: 245952
OpenPOWER on IntegriCloud