summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* Revert "[SCCP] Manually fold branches on undef."Davide Italiano2018-01-071-26/+3
| | | | | | | I thought this was responsible for PR35723, but I was wrong, the issue lies elsewhere. Revert while I debug. llvm-svn: 321975
* [SLPVectorizer] Reintroduce std::stable_sort(properlyDominates()).Davide Italiano2018-01-071-9/+23
| | | | | | | | The approach was never discussed, I wasn't able to reproduce this non-determinism, and the original author went AWOL. After a discussion on the ML, Philip suggested to revert this. llvm-svn: 321974
* [LV][VPlan] NFC patch to move LoopVectorizationPlanner class out of ↵Hal Finkel2018-01-074-271/+269
| | | | | | | | | | | | | | | | | | | | | | | | | LoopVectorize.cpp Another small step forward to move VPlan stuff outside of LoopVectorize.cpp. VPlanBuilder.h is renamed to LoopVectorizationPlanner.h LoopVectorizationPlanner class is moved from LoopVectorize.cpp to LoopVectorizationPlanner.h LoopVectorizationCostModel::VectorizationFactor class is moved to LoopVectorizationPlanner.h (used by the planner class) --- this needs further streamlining work in later patches and thus all I did was take it out of the CostModel class and moved to the header file. The callback function had to stay inside LoopVectorize.cpp since it calls an InnerLoopVectorizer member function declared in it. Next Steps: Make InnerLoopVectorizer, LoopVectorizationCostModel, and other classes more modular and more aligned with VPlan direction, in small increments. Previous step was: r320900 (https://reviews.llvm.org/D41045) Patch by Hideki Saito, thanks! Differential Revision: https://reviews.llvm.org/D41420 llvm-svn: 321962
* [CodeExtractor] Use subset of function attributes for extracted function.Florian Hahn2018-01-071-4/+74
| | | | | | | | | | | | | | | | | | | | | | | | In addition to target-dependent attributes, we can also preserve a white-listed subset of target independent function attributes. The white-list excludes problematic attributes, most prominently: * attributes related to memory accesses, as alloca instructions could be moved in/out of the extracted block * control-flow dependent attributes, like no_return or thunk, as the relerelevant instructions might or might not get extracted. Thanks @efriedma and @aemerson for providing a set of attributes that cannot be propagated. Reviewers: efriedma, davidxl, davide, silvas Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D41334 llvm-svn: 321961
* [InlineFunction] Preserve calling convention when forwarding VarArgs.Florian Hahn2018-01-061-0/+1
| | | | | | | | | | Reviewers: efriedma, rnk, davide Reviewed By: rnk, davide Differential Revision: https://reviews.llvm.org/D41556 llvm-svn: 321943
* [InlineFunction] Preserve attributes when forwarding VarArgs.Florian Hahn2018-01-061-4/+23
| | | | | | | | | | Reviewers: rnk, efriedma Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D41555 llvm-svn: 321942
* [InlineFunction] Inline vararg functions that do not access varargs.Florian Hahn2018-01-061-17/+20
| | | | | | | | | | | | | If the varargs are not accessed by a function, we can inline the function. Reviewers: dblaikie, chandlerc, davide, efriedma, rnk, hfinkel Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D41335 llvm-svn: 321940
* [InstCombine] relax use constraint for min/max (~a, ~b) --> ~min/max(a, b)Sanjay Patel2018-01-061-2/+2
| | | | | | | | | | | In the minimal case, this won't remove instructions, but it still improves uses of existing values. In the motivating example from PR35834, it does remove instructions, and sets that case up to be optimized by something like D41603: https://reviews.llvm.org/D41603 llvm-svn: 321936
* [Utils] Simplify salvageDebugInfo, NFCIVedant Kumar2018-01-051-34/+30
| | | | | | | | | Having a single call to findDbgUsers() allows salvageDebugInfo() to return earlier. Differential Revision: https://reviews.llvm.org/D41787 llvm-svn: 321915
* [InstCombine] add folds for min(~a, b) --> ~max(a, b)Sanjay Patel2018-01-051-22/+12
| | | | | | | | | | | | | | | | | Besides the bug of omitting the inverse transform of max(~a, ~b) --> ~min(a, b), the use checking and operand creation were off. We were potentially creating repeated identical instructions of existing values. This led to infinite looping after I added the extra folds. By using the simpler m_Not matcher and not creating new 'not' ops for a and b, we avoid that problem. It's possible that not using IsFreeToInvert() here is more limiting than the simpler matcher, but there are no tests for anything more exotic. It's also possible that we should relax the use checking further to handle a case like PR35834: https://bugs.llvm.org/show_bug.cgi?id=35834 ...but we can make that a follow-up if it is needed. llvm-svn: 321882
* WholeProgramDevirt: Simplify ORE getter mechanism for old PM. NFCI.Peter Collingbourne2018-01-051-34/+17
| | | | llvm-svn: 321841
* Revert "[JumpThreading] Preservation of DT and LVI across the pass"Reid Kleckner2018-01-044-310/+89
| | | | | | | This reverts r321825, it causes crashes in Chromium. Reproducer forthcoming. llvm-svn: 321832
* [JumpThreading] Preservation of DT and LVI across the passBrian M. Rzycki2018-01-044-89/+310
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: See D37528 for a previous (non-deferred) version of this patch and its description. Preserves dominance in a deferred manner using a new class DeferredDominance. This reduces the performance impact of updating the DominatorTree at every edge insertion and deletion. A user may call DDT->flush() within JumpThreading for an up-to-date DT. This patch currently has one flush() at the end of runImpl() to ensure DT is preserved across the pass. LVI is also preserved to help subsequent passes such as CorrelatedValuePropagation. LVI is simpler to maintain and is done immediately (not deferred). The code to perfom the preversation was minimally altered and was simply marked as preserved for the PassManager to be informed. This extends the analysis available to JumpThreading for future enhancements. One example is loop boundary threading. Reviewers: dberlin, kuhar, sebpop Reviewed By: kuhar, sebpop Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40146 llvm-svn: 321825
* Add assertion on DT availability during LI update in UpdateAnalysisInformationAnna Thomas2018-01-041-0/+1
| | | | | | | | | | | | | This came up during discussions in llvm-commits for rL321653: Check for unreachable preds before updating LI in UpdateAnalysisInformation The assert provides hints to passes to require both DT and LI if we plan on updating LI through this function. Tests run: make check llvm-svn: 321805
* [InstCombine] safely create a constant of the right type (PR35794)Sanjay Patel2018-01-041-4/+4
| | | | llvm-svn: 321801
* [GVNHoist] Fix: PR35222 gvn-hoist incorrectly erases load in case of a loopAditya Kumar2018-01-041-1/+1
| | | | | | | | | | | Reviewers: dberlin sebpop eli.friedman Differential Revision: https://reviews.llvm.org/D41453 llvm-svn: 321789
* StructurizeCFG: Fix broken backedge detectionMatt Arsenault2018-01-031-82/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The work order was changed in r228186 from SCC order to RPO with an arbitrary sorting function. The sorting function attempted to move inner loop nodes earlier. This was was apparently relying on an assumption that every block in a given loop / the same loop depth would be seen before visiting another loop. In the broken testcase, a block outside of the loop was encountered before moving onto another block in the same loop. The testcase would then structurize such that one blocks unconditional successor could never be reached. Revert to plain RPO for the analysis phase. This fixes detecting edges as backedges that aren't really. The processing phase does use another visited set, and I'm unclear on whether the order there is as important. An arbitrary order doesn't work, and triggers some infinite loops. The reversed RPO list seems to work and is closer to the order that was used before, minus the arbitary custom sorting. A few of the changed tests now produce smaller code, and a few are slightly worse looking. llvm-svn: 321751
* [InstCombine] Check for out of range shift values using APInt before calling ↵Simon Pilgrim2018-01-031-4/+4
| | | | | | | | getZExtValue Reduced from oss-fuzz #4871 test case llvm-svn: 321748
* [BasicBlockUtils] Check for unreachable preds before updating LI in ↵Anna Thomas2018-01-021-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | UpdateAnalysisInformation Summary: We are incorrectly updating the LI when loop-simplify generates dedicated exit blocks for a loop. The issue is that there's an implicit assumption that the Preds passed into UpdateAnalysisInformation are reachable. However, this is not true and breaks LI by incorrectly updating the header of a loop. One such case is when we generate dedicated exits when the exit block is a landing pad (through SplitLandingPadPredecessors). There maybe other cases as well, since we do not guarantee that Preds passed in are reachable basic blocks. The added test case shows how loop-simplify breaks LI for the outer loop (and DT in turn) after we try to generate the LoopSimplifyForm. Reviewers: davide, chandlerc, sanjoy Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41519 llvm-svn: 321653
* [InstCombine] Missed optimization in math expression: squashing sqrt functionsDmitry Venikov2018-01-021-0/+17
| | | | | | | | | | | | | | Summary: This patch enables folding under -ffast-math flag sqrt(a) * sqrt(b) -> sqrt(a*b) Reviewers: hfinkel, spatel, davide Reviewed By: spatel, davide Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D41322 llvm-svn: 321637
* [SimplifyCFG] Return to the pass manager the correct value.Davide Italiano2017-12-311-1/+1
| | | | | | | I wanted to commit this with r321603, but I failed to squash the two commits. llvm-svn: 321606
* [Utils/Local] Use `auto` when the type is obvious. NFCI.Davide Italiano2017-12-311-6/+6
| | | | llvm-svn: 321605
* [Utils] Remove commented debug message. NFCI.Davide Italiano2017-12-311-4/+0
| | | | llvm-svn: 321604
* [SimplifyCFG] Stop hoisting musttail calls incorrectly.Davide Italiano2017-12-311-0/+11
| | | | | | PR35774. llvm-svn: 321603
* Use phi ranges to simplify code. No functionality change intended.Benjamin Kramer2017-12-3021-341/+205
| | | | llvm-svn: 321585
* StructurizeCFG: Use phi iterator rangeMatt Arsenault2017-12-291-8/+2
| | | | llvm-svn: 321568
* Remove superfluous copies in sample profiling.Benjamin Kramer2017-12-281-5/+6
| | | | | | No functionliaty change intended. llvm-svn: 321530
* Revert r321377, it causes regression to https://reviews.llvm.org/P8055.Guozhi Wei2017-12-281-179/+0
| | | | llvm-svn: 321528
* Avoid int to string conversion in Twine or raw_ostream contexts.Benjamin Kramer2017-12-283-6/+6
| | | | | | Some output changes from uppercase hex to lowercase hex, no other functionality change intended. llvm-svn: 321526
* [RewriteStatepoints] Fix incorrect assertionMax Kazantsev2017-12-281-7/+2
| | | | | | | | | | | | | | | | | | | | `RewriteStatepointsForGC` iterates over function blocks and their predecessors in order of declaration. One of outcomes of this is that callsites are placed in arbitrary order which has nothing to do with travelsar order. On the other hand, function `recomputeLiveInValues` asserts that bases are added to `Info.PointerToBase` before their deried pointers are updated. But if call sites are processed in order different from RPOT, this is not necessarily true. We cannot guarantee that the base was placed there before every pointer derived from it. All we can guarantee is that this base was marked as known base by this point. This patch replaces the fact that we assert from checking that the base was added to the map with assert that the base was marked as known base. Differential Revision: https://reviews.llvm.org/D41593 llvm-svn: 321517
* [InstCombine] Check for isa<Instruction> before using cast<>Simon Pilgrim2017-12-281-1/+1
| | | | | | | | Protects against casts from constexpr etc. Reduced from oss-fuzz #4788 test case llvm-svn: 321515
* Revert "[memcpyopt] Teach memcpyopt to optimize across basic blocks"Reid Kleckner2017-12-281-46/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts r321138. It seems there are still underlying issues with memdep. PR35519 seems to still be present if debug info is enabled. We end up losing a memcpy. Somehow during store to memset merging, we insert the memset after the memcpy or fail to update the memdep analysis to account for the newly inserted memset of a pair. Reduced test case: #include <assert.h> #include <stdio.h> #include <string> #include <utility> #include <vector> void do_push_back( std::vector<std::pair<std::string, std::vector<std::string>>>* crls) { crls->push_back(std::make_pair(std::string(), std::vector<std::string>())); } int __attribute__((optnone)) main() { // Put some data in the vector and then remove it so we take the push_back // fast path. std::vector<std::pair<std::string, std::vector<std::string>>> crl_set; crl_set.push_back({"asdf", {}}); crl_set.pop_back(); printf("first word in vector storage: %p\n", *(void**)crl_set.data()); // Do the push_back which may fail to initialize the data. do_push_back(&crl_set); auto* first = &crl_set.back().first; printf("first word in vector storage (should be zero): %p\n", *(void**)crl_set.data()); assert(first->empty()); puts("ok"); } Compile with libc++, enable optimizations, and enable debug info: $ clang++ -stdlib=libc++ -g -O2 t.cpp -o t.exe -Wl,-rpath=llvm/build/lib This program will assert with this change. llvm-svn: 321510
* [InstCombine] Gracefully handle out of range extractelement indicesSimon Pilgrim2017-12-271-3/+5
| | | | | | | | InstSimplify is responsible for handling these, but we shouldn't just assert here. Reduced from oss-fuzz #4808 test case llvm-svn: 321489
* [instcombine] add powi(x, 2) -> x * xPhilip Reames2017-12-271-0/+4
| | | | llvm-svn: 321468
* Sink a couple of transforms from instcombine into instsimplify.Philip Reames2017-12-271-24/+2
| | | | llvm-svn: 321467
* [NFC] Extract out a helper function for SimplifyCall(CS, Q)Philip Reames2017-12-271-3/+1
| | | | | | This simplifies code, but the real motivation is that it lets me clean up some downstream code. llvm-svn: 321466
* [Unroll][DebugInfo] Propagate loop body's debug location to epilog preheaderZhaoshi Zheng2017-12-261-1/+6
| | | | | | | NewExit and epilog PreHeader should has the same debug loc as the original loop body, instead of original loop exit. llvm-svn: 321465
* [InstCombine] fix miscompile of frem with 0.0 operand (PR34870)Sanjay Patel2017-12-261-4/+0
| | | | | | | We might want to select NAN here or do this transform with fast-math, but this should at least fix the miscompile. llvm-svn: 321461
* Make helpers static. No functionality change.Benjamin Kramer2017-12-241-3/+4
| | | | llvm-svn: 321425
* [CallSiteSplitting] Remove isOrHeader restriction.Florian Hahn2017-12-231-27/+19
| | | | | | | | | | | By following the single predecessors of the predecessors of the call site, we do not need to restrict the control flow. Reviewed By: junbuml, davide Differential Revision: https://reviews.llvm.org/D40729 llvm-svn: 321413
* [SCCP] Manually fold branches on undef.Davide Italiano2017-12-231-3/+26
| | | | | | | | | | | | | | | This code was originally removed and replace with an assertion because believed unnecessary. It turns out there was simply no test coverage for this case, and the constant folder doesn't yet know about patterns like `br undef %label1, %label2`. Presumably at some point the constant folder might learn about these patterns, but it's a broader change. A testcase will be added to make sure this doesn't regress again in the future. Fixes PR35723. llvm-svn: 321402
* [SimplifyCFG] Don't do if-conversion if there is a long dependence chainGuozhi Wei2017-12-221-0/+179
| | | | | | | | | | If after if-conversion, most of the instructions in this new BB construct a long and slow dependence chain, it may be slower than cmp/branch, even if the branch has a high miss rate, because the control dependence is transformed into data dependence, and control dependence can be speculated, and thus, the second part can execute in parallel with the first part on modern OOO processor. This patch checks for the long dependence chain, and give up if-conversion if find one. Differential Revision: https://reviews.llvm.org/D39352 llvm-svn: 321377
* Add hasProfileData() to check if a function has profile data. NFC.Easwaran Raman2017-12-225-10/+9
| | | | | | | | | | | | | | | | | | | Summary: This replaces calls to getEntryCount().hasValue() with hasProfileData that does the same thing. This refactoring is useful to do before adding synthetic function entry counts but also a useful cleanup IMO even otherwise. I have used hasProfileData instead of hasRealProfileData as David had earlier suggested since I think profile implies "real" and I use the phrase "synthetic entry count" and not "synthetic profile count" but I am fine calling it hasRealProfileData if you prefer. Reviewers: davidxl, silvas Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41461 llvm-svn: 321331
* [SimplifyCFG] Avoid quadratic on a predecessors number behavior in ↵Michael Zolotukhin2017-12-211-14/+10
| | | | | | | | | | | | | | | | | | | | | | instruction sinking. If a block has N predecessors, then the current algorithm will try to sink common code to this block N times (whenever we visit a predecessor). Every attempt to sink the common code includes going through all predecessors, so the complexity of the algorithm becomes O(N^2). With this patch we try to sink common code only when we visit the block itself. With this, the complexity goes down to O(N). As a side effect, the moment the code is sunk is slightly different than before (the order of simplifications has been changed), that's why I had to adjust two tests (note that neither of the tests is supposed to test SimplifyCFG): * test/CodeGen/AArch64/arm64-jumptable.ll - changes in this test mimic the changes that previous implementation of SimplifyCFG would do. * test/CodeGen/ARM/avoid-cpsr-rmw.ll - in this test I disabled common code sinking by a command line flag. llvm-svn: 321236
* [ICP] Expose unconditional call promotion interfaceMatthew Simpson2017-12-201-77/+178
| | | | | | | | | | | | | | | | | | | | | | | This patch modifies the indirect call promotion utilities by exposing and using an unconditional call promotion interface. The unconditional promotion interface (i.e., call promotion without creating an if-then-else) can be used if it's known that an indirect call has only one possible callee. The existing conditional promotion interface uses this unconditional interface to promote an indirect call after it has been versioned and placed within the "then" block. A consequence of unconditional promotion is that the fix-up operations for phi nodes in the normal destination of invoke instructions are changed. This is necessary because the existing implementation assumed that an invoke had been versioned, creating a "merge" block where a return value bitcast could be placed. In the new implementation, the edge between a promoted invoke's parent block and its normal destination is split if needed to add a bitcast for the return value. If the invoke is also versioned, the phi node merging the return value of the promoted and original invoke instructions is placed in the "merge" block. Differential Revision: https://reviews.llvm.org/D40751 llvm-svn: 321210
* [hwasan] Implement -fsanitize-recover=hwaddress.Evgeniy Stepanov2017-12-201-7/+18
| | | | | | | | | | | | Summary: Very similar to AddressSanitizer, with the exception of the error type encoding. Reviewers: kcc, alekseyshl Subscribers: cfe-commits, kubamracek, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D41417 llvm-svn: 321203
* [InstCombine] Add debug location to new caller.Florian Hahn2017-12-201-0/+1
| | | | | | | | | | Reviewers: rnk, aprantl, majnemer Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D414 llvm-svn: 321191
* Revert r320548:[SLP] Vectorize jumbled memory loadsMohammad Shahid2017-12-201-195/+83
| | | | llvm-svn: 321181
* [LV] Remove unnecessary DoExtraAnalysis guard (silent bug)Florian Hahn2017-12-201-2/+2
| | | | | | | | | | | | | | canVectorize is only checking if the loop has a normalized pre-header if DoExtraAnalysis is true. This doesn't make sense to me because reporting analysis information shouldn't alter legality checks. This is probably the result of a last minute minor change before committing (?). Patch by Diego Caballero. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D40973 llvm-svn: 321172
* [memcpyopt] Teach memcpyopt to optimize across basic blocksDan Gohman2017-12-201-10/+46
| | | | | | | | | | | | | | | | | This teaches memcpyopt to make a non-local memdep query when a local query indicates that the dependency is non-local. This notably allows it to eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%. This is r319482 and r319483, along with fixes for PR35519: fix the optimization that merges stores into memsets to preserve cached memdep info, and fix memdep's non-local caching strategy to not assume that larger queries are always more conservative than smaller ones. Fixes PR28958 and PR35519. Differential Revision: https://reviews.llvm.org/D40802 llvm-svn: 321138
OpenPOWER on IntegriCloud