path: root/llvm/lib/Transforms/Scalar
Commit message | Author | Age | Files | Lines
* LICM: Don't crash when an instruction is used by an unreachable BB | David Majnemer | 2014-09-02 | 1 | -1/+6
    Summary:
    BBs might contain non-LCSSA'd values after the LCSSA pass is run if they are unreachable from the entry block. Normally, the users of the instruction would be PHIs, but the unreachable BBs have normal users; rewrite their uses to be undef values.

    An alternative fix could involve fixing this at LCSSA, but that would require this invariant to hold after subsequent transforms. If a BB created an unreachable block, they would be in violation of this.

    This fixes PR19798.

    Differential Revision: http://reviews.llvm.org/D5146
    llvm-svn: 216911
* SROA: Don't insert instructions before a PHI | David Majnemer | 2014-09-01 | 1 | -1/+4
    SROA may decide that it needs to insert a bitcast and would set its insertion point before a PHI. This will create an invalid module right quick.

    Instead, choose the first insertion point in the basic block that holds our PHI.

    This fixes PR20822.

    Differential Revision: http://reviews.llvm.org/D5141
    llvm-svn: 216891
* Fix a really bad miscompile introduced in r216865 - the else-if logic | Chandler Carruth | 2014-09-01 | 1 | -10/+14
    chain became completely broken here as *all* intrinsic users ended up being skipped, and the ones that seemed to be singled out were actually the exact wrong set.

    This is a great example of why long else-if chains can be easily confusing. Switch the entire code to use early exits and early continues to have simpler (and more importantly, correct) logic here, as well as fixing the reversed logic for detecting and continuing on lifetime intrinsics.

    I've also significantly cleaned up the test case and added another test case demonstrating an example where the optimization is not (trivially) safe to perform.

    llvm-svn: 216871
* Ignore lifetime intrinsics in use list for MemCpyOptimizer. Patch by Luqman Aden, review by Hal Finkel. | Nick Lewycky | 2014-09-01 | 1 | -0/+4
    llvm-svn: 216865
* Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created. | Craig Topper | 2014-08-27 | 1 | -1/+1
    llvm-svn: 216525
* Use range based for loops to avoid needing to re-mention SmallPtrSet size. | Craig Topper | 2014-08-24 | 3 | -25/+18
    llvm-svn: 216351
* [SROA] Fold a PHI node if all its incoming values are the same | Jingyue Wu | 2014-08-22 | 1 | -41/+41
    Summary:
    Fixes PR20425.

    During slice building, if all of the incoming values of a PHI node are the same, replace the PHI node with the common value. This simplification makes alloca's used by PHI nodes easier to promote.

    Test Plan: Added three more tests in phi-and-select.ll
    Reviewers: nlewycky, eliben, meheff, chandlerc
    Reviewed By: chandlerc
    Subscribers: zinovy.nis, hfinkel, baldrick, llvm-commits

    Differential Revision: http://reviews.llvm.org/D4659
    llvm-svn: 216299
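The folding rule in the message above can be sketched outside of LLVM. This is only a toy model: `ValueId` and `foldPhiIfAllSame` are hypothetical names, and real SROA operates on `llvm::PHINode` and `llvm::Value*`, not integers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an IR value (assumption: LLVM itself uses Value*).
using ValueId = int;

// If every incoming value of the PHI is identical, write that value to
// `common` and return true; otherwise return false and leave the PHI alone.
bool foldPhiIfAllSame(const std::vector<ValueId> &incoming, ValueId &common) {
  if (incoming.empty())
    return false;
  for (ValueId v : incoming)
    if (v != incoming.front())
      return false;
  common = incoming.front();
  return true;
}
```

A PHI with incoming values {7, 7, 7} folds to 7, while {7, 8} is left untouched.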
* SROA: Handle a case of store size being smaller than allocation size | Reid Kleckner | 2014-08-22 | 1 | -4/+6
    In this case, we are creating an x86_fp80 slice for a union from C where the padding bytes may contain real data. An x86_fp80 alloca is 16 bytes, and that's just fine. We can't, however, use regular loads and stores to access the slice, because the store size is only 10 bytes / 80 bits. Instead, use memcpy and memset.

    Fixes PR18726.

    Reviewed By: chandlerc
    Differential Revision: http://reviews.llvm.org/D5012
    llvm-svn: 216248
* [CLNUP] Remove return after llvm_unreachable. Thanks to Hal Finkel for pointing it out. | Zinovy Nis | 2014-08-21 | 1 | -1/+0
    llvm-svn: 216176
* Reassociate x + -0.1234 * y into x - 0.1234 * y | Erik Verbruggen | 2014-08-21 | 1 | -2/+49
    This does not require -ffast-math, and it gives CSE/GVN more options to eliminate duplicate expressions in, e.g.:

      return ((x + 0.1234 * y) * (x - 0.1234 * y));

    Differential Revision: http://reviews.llvm.org/D4904
    llvm-svn: 216169
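A small sketch of why this rewrite needs no -ffast-math: negating a floating-point value is exact, so adding a product with a negated constant and subtracting the product with the positive constant give the same result for finite inputs. The function names below are made up for illustration and are not part of the Reassociate pass.

```cpp
#include <cassert>

// Original form: add a product whose constant factor is negative.
double addNegatedProduct(double x, double y) { return x + (-0.1234 * y); }

// Rewritten form: subtract the product with the positive constant.
double subProduct(double x, double y) { return x - (0.1234 * y); }
```

With both expressions in the second form, an expression such as `(x + 0.1234 * y) * (x - 0.1234 * y)` exposes the common subexpression `0.1234 * y` to CSE/GVN.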
* [INDVARS] Extend using of widening of induction variables for the cases of "sub nsw" and "mul nsw" instructions. | Zinovy Nis | 2014-08-21 | 1 | -4/+23
    Currently only "add nsw" are widened. This patch eliminates tons of "sext" instructions for 64 bit code (and the corresponding target code) in cases like:

      int N = 100;
      float **A;

      void foo(int x0, int x1)
      {
        float * A_cur = &A[0][0];
        float * A_next = &A[1][0];
        for(int x = x0; x < x1; ++x)
        {
          // Currently only [x+N] case is widened. The other 2 cases lead to sext.
          // This patch fixes it, so all 3 cases do not need sext.
          const float div = A_cur[x + N] + A_cur[x - N] + A_cur[x * N];
          A_next[x] = div;
        }
      }
      ...
      > clang++ test.cpp -march=core-avx2 -Ofast -fno-unroll-loops -fno-tree-vectorize -S -o -

    Differential Revision: http://reviews.llvm.org/D4695
    llvm-svn: 216160
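The arithmetic fact behind widening can be sketched as follows, assuming the 32-bit operation does not overflow (which is what the "nsw" flag promises): sign-extending the narrow result equals performing the operation on sign-extended operands. The helper names are hypothetical and do not appear in IndVarSimplify.

```cpp
#include <cassert>
#include <cstdint>

// Narrow operation followed by a sign extension (one sext per use).
int64_t narrowSubThenExtend(int32_t x, int32_t n) {
  return static_cast<int64_t>(x - n);
}
int64_t narrowMulThenExtend(int32_t x, int32_t n) {
  return static_cast<int64_t>(x * n);
}

// Operation carried out directly on the widened (64-bit) values.
int64_t wideSub(int32_t x, int32_t n) {
  return static_cast<int64_t>(x) - static_cast<int64_t>(n);
}
int64_t wideMul(int32_t x, int32_t n) {
  return static_cast<int64_t>(x) * static_cast<int64_t>(n);
}
```

Once the induction variable itself is 64 bits wide, the second form needs no per-iteration sign extension.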
* Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. | Craig Topper | 2014-08-21 | 5 | -22/+22
    llvm-svn: 216158
* Revert "Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size." | Craig Topper | 2014-08-18 | 5 | -22/+22
    Getting a weird buildbot failure that I need to investigate.

    llvm-svn: 215870
* Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. | Craig Topper | 2014-08-17 | 5 | -22/+22
    llvm-svn: 215868
* Introduce a helper to combine instruction metadata. | Rafael Espindola | 2014-08-15 | 1 | -43/+19
    Replace the old code in GVN and BBVectorize with it. Update SimplifyCFG to use it.

    Patch by Björn Steinbrink!

    llvm-svn: 215723
* [Reassociation] Add support for reassociation with unsafe algebra. | Chad Rosier | 2014-08-14 | 1 | -81/+228
    Vector instructions are (still) not supported for either integer or floating point. Hopefully, that work will be landed shortly.

    llvm-svn: 215647
* Initialize FlattenCFG pass | Jan Vesely | 2014-08-13 | 1 | -0/+1
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 215573
* Fix for multi-line comment warning | Gerolf Hoflehner | 2014-08-07 | 1 | -12/+12
    llvm-svn: 215169
* Fix a case in SROA where lifetime intrinsics could inhibit alloca promotion. In | Owen Anderson | 2014-08-07 | 1 | -0/+4
    this case, the code path dealing with vector promotion was missing the explicit checks for lifetime intrinsics that were present on the corresponding integer promotion path.

    llvm-svn: 215148
* Revert "r214897 - Remove dead zero store to calloc initialized memory" | Rui Ueyama | 2014-08-06 | 1 | -35/+15
    It broke msan.

    llvm-svn: 214989
* Fix typos in comments and doc | JF Bastien | 2014-08-05 | 2 | -3/+3
    Committing http://reviews.llvm.org/D4798 for Robin Morisset (morisset@google.com)

    llvm-svn: 214934
* Remove dead zero store to calloc initialized memory | Philip Reames | 2014-08-05 | 1 | -15/+35
    Optimize the following IR:

      %1 = tail call noalias i8* @calloc(i64 1, i64 4)
      %2 = bitcast i8* %1 to i32*
      ; This store is dead and should be removed
      store i32 0, i32* %2, align 4

    Memory returned by calloc is guaranteed to be zero initialized. If the value being stored is the constant zero (and the store is not otherwise observable across threads), we can delete the store. If the store is to an out of bounds address, it is undefined and thus also removable.

    Reviewed By: nicholas
    Differential Revision: http://reviews.llvm.org/D3942
    llvm-svn: 214897
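The source-level situation the IR above corresponds to can be sketched like this. The two helpers are hypothetical and only show that a zero store into freshly calloc'ed memory changes nothing a single thread can observe. (As the log records, this commit was reverted in r214989 because it broke msan.)

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Allocates one zero-initialized int32_t, redundantly stores 0, then reads it.
int32_t readAfterZeroStore() {
  void *raw = std::calloc(1, sizeof(int32_t));
  if (!raw)
    return 0; // allocation failed; nothing to read
  const int32_t zero = 0;
  std::memcpy(raw, &zero, sizeof zero); // the store the commit calls dead
  int32_t value;
  std::memcpy(&value, raw, sizeof value);
  std::free(raw);
  return value;
}

// Same as above without the store; calloc already guarantees zeroed bytes.
int32_t readWithoutStore() {
  void *raw = std::calloc(1, sizeof(int32_t));
  if (!raw)
    return 0; // allocation failed; nothing to read
  int32_t value;
  std::memcpy(&value, raw, sizeof value);
  std::free(raw);
  return value;
}
```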
* PartiallyInlineLibCalls: Check sqrt result type before transforming it. | Peter Collingbourne | 2014-08-01 | 1 | -0/+4
    Some configure scripts declare this with the wrong prototype, which can lead to an assertion failure.

    llvm-svn: 214593
* Remove some calls to std::move. | Rafael Espindola | 2014-08-01 | 1 | -2/+2
    Instead of moving out the data in a ErrorOr<std::unique_ptr<Foo>>, get a reference to it.

    Thanks to David Blaikie for the suggestion.

    llvm-svn: 214516
* Fixing a few -Woverloaded-virtual warnings by exposing the hidden virtual function as well. No functional changes intended. | Aaron Ballman | 2014-07-30 | 1 | -0/+2
    llvm-svn: 214325
* After unrolling a loop with llvm.loop.unroll.count metadata (unroll factor | Mark Heffernan | 2014-07-24 | 1 | -1/+0
    hint) the loop unroller replaces the llvm.loop.unroll.count metadata with llvm.loop.unroll.disable metadata to prevent any subsequent unrolling passes from unrolling more than the hint indicates. This patch fixes an issue where loop unrolling could be disabled for other loops as well which share the same llvm.loop metadata.

    llvm-svn: 213900
* Add scoped-noalias metadata | Hal Finkel | 2014-07-24 | 3 | -0/+19
    This commit adds scoped noalias metadata. The primary motivations for this feature are:
      1. To preserve noalias function attribute information when inlining
      2. To provide the ability to model block-scope C99 restrict pointers

    Neither of these two abilities is added here, only the necessary infrastructure. In fact, there should be no change to existing functionality, only the addition of new features. The logic that converts noalias function parameters into this metadata during inlining will come in a follow-up commit.

    What is added here is the ability to generally specify noalias memory-access sets. Regarding the metadata, alias-analysis scopes are defined similar to TBAA nodes:

      !scope0 = metadata !{ metadata !"scope of foo()" }
      !scope1 = metadata !{ metadata !"scope 1", metadata !scope0 }
      !scope2 = metadata !{ metadata !"scope 2", metadata !scope0 }
      !scope3 = metadata !{ metadata !"scope 2.1", metadata !scope2 }
      !scope4 = metadata !{ metadata !"scope 2.2", metadata !scope2 }

    Loads and stores can be tagged with an alias-analysis scope, and also, with a noalias tag for a specific scope:

      ... = load %ptr1, !alias.scope !{ !scope1 }
      ... = load %ptr2, !alias.scope !{ !scope1, !scope2 }, !noalias !{ !scope1 }

    When evaluating an aliasing query, if one of the instructions is associated with an alias.scope id that is identical to the noalias scope associated with the other instruction, or is a descendant (in the scope hierarchy) of the noalias scope associated with the other instruction, then the two memory accesses are assumed not to alias.

    Note that if the first element of the scope metadata is a string, then it can be combined across functions and translation units. The string can be replaced by a self-reference to create globally unique scope identifiers.

    [Note: This overview is slightly stylized, since the metadata nodes really need to just be numbers (!0 instead of !scope0), and the scope lists are also global unnamed metadata.]

    Existing noalias metadata in a callee is "cloned" for use by the inlined code. This is necessary because the aliasing scopes are unique to each call site (because of possible control dependencies on the aliasing properties). For example, consider a function:

      foo(noalias a, noalias b) { *a = *b; }

    that gets inlined into

      bar() { ... if (...) foo(a1, b1); ... if (...) foo(a2, b2); }

    -- now just because we know that a1 does not alias with b1 at the first call site, and a2 does not alias with b2 at the second call site, we cannot let inlining these functions have the metadata imply that a1 does not alias with b2.

    llvm-svn: 213864
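The aliasing query described in this message can be modelled with a small tree of scopes. This is a toy sketch of the rule as stated in the commit text only; `ScopeTree`, `Access`, and `assumedNoAlias` are hypothetical names, scopes are plain indices, and LLVM's actual implementation works on metadata nodes and may differ in detail.

```cpp
#include <cassert>
#include <vector>

// Toy scope hierarchy: parent[i] is the parent of scope i, or -1 for a root.
struct ScopeTree {
  std::vector<int> parent;

  // True if `scope` is `ancestor` itself or one of its descendants.
  bool isSameOrDescendant(int scope, int ancestor) const {
    for (int s = scope; s != -1; s = parent[s])
      if (s == ancestor)
        return true;
    return false;
  }
};

// A memory access tagged with an !alias.scope list and a !noalias list.
struct Access {
  std::vector<int> aliasScope;
  std::vector<int> noalias;
};

// One direction of the rule: some alias.scope entry of `a` is identical to,
// or a descendant of, some noalias scope of `b`.
bool coveredBy(const ScopeTree &tree, const Access &a, const Access &b) {
  for (int scope : a.aliasScope)
    for (int na : b.noalias)
      if (tree.isSameOrDescendant(scope, na))
        return true;
  return false;
}

// The two accesses are assumed not to alias if the rule holds either way.
bool assumedNoAlias(const ScopeTree &tree, const Access &a, const Access &b) {
  return coveredBy(tree, a, b) || coveredBy(tree, b, a);
}
```

Using the scopes from the message (scope0 as root, scope1 and scope2 as its children, scope3 and scope4 as children of scope2), the two example loads are assumed not to alias because the first load's alias.scope contains scope1, which the second load lists under !noalias.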
* AA metadata refactoring (introduce AAMDNodes) | Hal Finkel | 2014-07-24 | 5 | -45/+61
    In order to enable the preservation of noalias function parameter information after inlining, and the representation of block-level __restrict__ pointer information (etc.), additional kinds of aliasing metadata will be introduced. This metadata needs to be carried around in AliasAnalysis::Location objects (and MMOs at the SDAG level), and so we need to generalize the current scheme (which is hard-coded to just one TBAA MDNode*).

    This commit introduces only the necessary refactoring to allow for the introduction of other aliasing metadata types, but does not actually introduce any (that will come in a follow-up commit). What it does introduce is a new AAMDNodes structure to hold all of the aliasing metadata nodes associated with a particular memory-accessing instruction, and uses that structure instead of the raw MDNode* in AliasAnalysis::Location, etc.

    No functionality change intended.

    llvm-svn: 213859
* Do not add unroll disable metadata after unrolling pass for loops with #pragma clang loop unroll(full). | Mark Heffernan | 2014-07-23 | 1 | -3/+4
    llvm-svn: 213789
* In unroll pragma syntax and loop hint metadata, change "enable" forms to a new form using the string "full". | Mark Heffernan | 2014-07-23 | 1 | -42/+34
    llvm-svn: 213772
* We may visit a call that uses an alloca multiple times in callUsesLocalStack, sometimes with IsNocapture true and sometimes with IsNocapture false. | Nick Lewycky | 2014-07-23 | 1 | -5/+3
    We accidentally skipped work we needed to do in the IsNocapture=false case if we were called with IsNocapture=true the first time. Fixes PR20405!

    llvm-svn: 213726
* Revert "[C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges." | Duncan P. N. Exon Smith | 2014-07-21 | 9 | -39/+66
    This reverts commit r213474 (and r213475), which causes a miscompile on a stage2 LTO build. I'll reply on the list in a moment.

    llvm-svn: 213562
* Fix for regression: [Bug 20369] wrong code at -O3 on x86_64-linux-gnu in 64-bit mode | Gerolf Hoflehner | 2014-07-21 | 1 | -1/+9
    Prevents hoisting of loads above stores and sinking of stores below loads in MergedLoadStoreMotion.cpp (rdar://15991737)

    llvm-svn: 213497
* [C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges. | Manuel Jacob | 2014-07-20 | 9 | -66/+39
    Summary:
    This patch introduces two new iterator ranges and updates existing code to use them. No functional change intended.

    Test Plan: All tests (make check-all) still pass.
    Reviewers: dblaikie
    Reviewed By: dblaikie
    Subscribers: llvm-commits

    Differential Revision: http://reviews.llvm.org/D4481
    llvm-svn: 213474
* Templatify RegionInfo so it works on MachineBasicBlocks | Matt Arsenault | 2014-07-19 | 1 | -3/+3
    llvm-svn: 213456
* MergedLoadStoreMotion.cpp: Fix msc17 build. Member initializer is unavailable. | NAKAMURA Takumi | 2014-07-19 | 1 | -2/+3
    llvm-svn: 213448
* Fix build breakage introduced with r213412. | Mark Heffernan | 2014-07-18 | 1 | -3/+4
    llvm-svn: 213414
* Remove unroll pragma metadata after it is used. | Mark Heffernan | 2014-07-18 | 1 | -0/+40
    llvm-svn: 213412
* MergedLoadStoreMotion pass | Gerolf Hoflehner | 2014-07-18 | 3 | -0/+629
    Merges equivalent loads on both sides of a hammock/diamond and hoists them into the header. Merges equivalent stores on both sides of a hammock/diamond and sinks them to the footer. This can enable if-conversion and better tolerate load misses and store operand latencies.

    llvm-svn: 213396
* Partially revert r210444 due to performance regression | Jingyue Wu | 2014-07-16 | 1 | -57/+1
    Summary:
    Converting outermost zext(a) to sext(a) causes worse code when the computation of zext(a) could be reused. For example, after converting

      ... = array[zext(a)]
      ... = array[zext(a) + 1]

    to

      ... = array[sext(a)]
      ... = array[zext(a) + 1],

    the program computes sext(a), which is actually unnecessary. I added one test in split-gep-and-gvn.ll to illustrate this scenario.

    Also, with r211281 and r211084, we annotate more "nuw" tags to computation involving CUDA intrinsics such as threadIdx.x. These annotations help with splitting GEP a lot, rendering the benefit we get from this reverted optimization only marginal.

    Test Plan: make check-all
    Reviewers: eliben, meheff
    Reviewed By: meheff
    Subscribers: jholewinski, llvm-commits

    Differential Revision: http://reviews.llvm.org/D4542
    llvm-svn: 213209
* Don't eliminate memcpy's when the address of the pointer may itself be relevant. Fixes PR18304. Patch by David Wiberg! | Nick Lewycky | 2014-07-14 | 1 | -0/+6
    llvm-svn: 212970
* Fix an issue with the MergeBasicBlockIntoOnlyPred() helper function where it did | Owen Anderson | 2014-07-12 | 1 | -5/+0
    not properly handle the case where the predecessor block was the entry block to the function. The only in-tree client of this is JumpThreading, which worked around the issue in its own code. This patch moves the solution into the helper so that JumpThreading (and other clients) do not have to replicate the same fix everywhere.

    llvm-svn: 212875
* Feeding isSafeToSpeculativelyExecute its DataLayout pointer (in Sink) | Hal Finkel | 2014-07-10 | 1 | -1/+5
    This is the one remaining place I see where passing isSafeToSpeculativelyExecute a DataLayout pointer might matter (at least for loads) -- I think I got the others in r212720. Most of the other remaining callers of isSafeToSpeculativelyExecute only use it for call sites (or otherwise exclude loads).

    llvm-svn: 212730
* Feeding isSafeToSpeculativelyExecute its DataLayout pointer | Hal Finkel | 2014-07-10 | 1 | -2/+4
    isSafeToSpeculativelyExecute can optionally take a DataLayout pointer. In the past, this was mainly used to make better decisions regarding divisions known not to trap, and so was not all that important for users concerned with "cheap" instructions. However, now it also helps look through bitcasts for dereferenceable loads, and will also be important if/when we add a dereferenceable pointer attribute.

    This is some initial work to feed a DataLayout pointer through to callers of isSafeToSpeculativelyExecute, generally where one was already available.

    llvm-svn: 212720
* Allow isDereferenceablePointer to look through some bitcasts | Hal Finkel | 2014-07-10 | 3 | -7/+7
    isDereferenceablePointer should not give up upon encountering any bitcast. If we're casting from a pointer to a larger type to a pointer to a smaller type, we can continue by examining the bitcast's operand. This missing capability was noted in a comment in the function.

    In order for this to work, isDereferenceablePointer now takes an optional DataLayout pointer (essentially all callers already had such a pointer available). Most code uses isDereferenceablePointer through isSafeToSpeculativelyExecute (which already took an optional DataLayout pointer), and to enable the LICM test case, LICM needs to actually provide its DL pointer to isSafeToSpeculativelyExecute (which it was not doing previously).

    llvm-svn: 212686
* Update the MemoryBuffer API to use ErrorOr. | Rafael Espindola | 2014-07-06 | 1 | -3/+4
    llvm-svn: 212405
* GVN: Preserve invariant.load metadata | Arnold Schwaighofer | 2014-06-26 | 1 | -0/+4
    If both instructions to be replaced are marked invariant the resulting instruction is invariant.

    rdar://13358910

    Fix by Erik Eckstein!

    llvm-svn: 211801
* Rename loop unrolling and loop vectorizer metadata to have a common prefix. | Eli Bendersky | 2014-06-25 | 1 | -6/+4
    [LLVM part]

    These patches rename the loop unrolling and loop vectorizer metadata such that they have a common 'llvm.loop.' prefix. Metadata name changes:

      llvm.vectorizer.* => llvm.loop.vectorizer.*
      llvm.loopunroll.* => llvm.loop.unroll.*

    This was a suggestion from an earlier review (http://reviews.llvm.org/D4090) which added the loop unrolling metadata.

    Patch by Mark Heffernan.

    llvm-svn: 211710
* Factor out part of LICM::sink into a helper function. | Evgeniy Stepanov | 2014-06-25 | 1 | -28/+41
    llvm-svn: 211678
* [LICM] Don't create more than one copy of an instruction per loop exit block when sinking. | Evgeniy Stepanov | 2014-06-25 | 1 | -24/+34
    Fixes exponential compilation complexity in PR19835, caused by LICM::sink not handling the following pattern well:

      f = op g
      e = op f, g
      d = op e
      c = op d, e
      b = op c
      a = op b, c

    When an instruction with N uses is sunk, each of its operands gets N new uses (all of them - phi nodes). In the example above, if a had 1 use, c would have 2, e would have 4, and g would have 8.

    llvm-svn: 211673
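The growth in use counts quoted in this message can be reproduced with a small counting model. This is a hypothetical sketch, not LICM code: it takes the six-instruction pattern from the message and, walking from `a` back to `g`, gives each operand as many new uses as the sunk instruction has.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Returns the use count of each value after naively sinking the chain,
// starting from `usesOfA` uses of the final instruction `a`.
std::map<std::string, long> usesAfterNaiveSinking(long usesOfA) {
  // Instructions in sinking order (last defined first) with their operands.
  const std::vector<std::pair<std::string, std::vector<std::string>>> chain = {
      {"a", {"b", "c"}}, {"b", {"c"}}, {"c", {"d", "e"}},
      {"d", {"e"}},      {"e", {"f", "g"}}, {"f", {"g"}}};

  std::map<std::string, long> uses;
  uses["a"] = usesOfA;
  for (const auto &inst : chain) {
    const long n = uses[inst.first]; // sinking an instruction with n uses...
    for (const auto &operand : inst.second)
      uses[operand] += n; // ...gives each operand n new uses
  }
  return uses;
}
```

The count doubles every two instructions, so longer chains of this shape grow exponentially, which matches the compile-time blowup the commit describes.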