summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [objc-arc] Fixed number of prefixing slashes in some comments in a function ↵Michael Gottesman2013-05-231-6/+6
| | | | | | from 3 to 2 to match the rest of ObjCARCOpts. llvm-svn: 182557
* SLPVectorizer: Change the order in which new instructions are added to the ↵Nadav Rotem2013-05-223-57/+132
| | | | | | | | | | | | function. We are not working on a DAG and I ran into a number of problems when I enabled the vectorizations of 'diamond-trees' (trees that share leafs). * Imroved the numbering API. * Changed the placement of new instructions to the last root. * Fixed a bug with external tree users with non-zero lane. * Fixed a bug in the placement of in-tree users. llvm-svn: 182508
* This is an update to a previous commit (r181216).Jean-Luc Duprat2013-05-222-29/+43
| | | | | | | | | | | | | | | | | | | | | | | The earlier change list introduced the following inst combines: B * (uitofp i1 C) —> select C, B, 0 A * (1 - uitofp i1 C) —> select C, 0, A select C, 0, B + select C, A, 0 —> select C, A, B Together these 3 changes would simplify : A * (1 - uitofp i1 C) + B * uitofp i1 C down to : select C, B, A In practice we found that the first two substitutions can have a negative effect on performance, because they reduce opportunities to use FMA contractions; between the two options FMAs are often the better choice. This change list amends the previous one to enable just these inst combines: select C, B, 0 + select C, 0, A —> select C, B, A A * (1 - uitofp i1 C) + B * uitofp i1 C —> select C, B, A llvm-svn: 182499
* LoopVectorize: Make Value pointers that could be RAUW'ed a VHArnold Schwaighofer2013-05-221-3/+4
| | | | | | | | | | The Value pointers we store in the induction variable list can be RAUW'ed by a call to SCEVExpander::expandCodeFor, use a TrackingVH instead. Do the same thing in some other places where we store pointers that could potentially be RAUW'ed. Fixes PR16073. llvm-svn: 182485
* [msan] A no-op implementation of VarArg handling.Evgeniy Stepanov2013-05-211-2/+23
| | | | | | | This stuff is used on platforms where MSan does not have a proper VarArg implementation (anything other than x86_64 at the moment). llvm-svn: 182375
* Remove unused #include.Bill Wendling2013-05-201-1/+0
| | | | llvm-svn: 182315
* Rename LoopSimplify.h to LoopUtils.hHal Finkel2013-05-201-1/+1
| | | | | | As discussed, LoopUtils.h is a better name. llvm-svn: 182314
* Expose InsertPreheaderForLoop from LoopSimplify to other passesHal Finkel2013-05-201-11/+12
| | | | | | | | | | | Other passes, PPC counter-loop formation for example, also need to add loop preheaders outside of the regular loop simplification pass. This makes InsertPreheaderForLoop a global function so that it can be used by other passes. No functionality change intended. llvm-svn: 182299
* LoopVectorize: Handle single edge PHIsArnold Schwaighofer2013-05-181-4/+4
| | | | | | | | We might encouter single edge PHIs - handle them with an identity select. Fixes PR15990. llvm-svn: 182199
* Add missing -*- C++ -*- to headersMatt Arsenault2013-05-171-1/+1
| | | | llvm-svn: 182164
* LoopVectorize: Simplify code. No functionality change.Benjamin Kramer2013-05-171-21/+5
| | | | llvm-svn: 182100
* [msan] Switch TLS globals to initial-exec model.Evgeniy Stepanov2013-05-161-7/+7
| | | | | | They are always defined in the main executable. llvm-svn: 181994
* LoopVectorize: Move call of canHoistAllLoads to canVectorizeWithIfConvertArnold Schwaighofer2013-05-151-4/+4
| | | | | | | | | We only want to check this once, not for every conditional block in the loop. No functionality change (except that we don't perform a check redudantly anymore). llvm-svn: 181942
* [objc-arc] Fixed a spelling error and made the statistic descriptions be ↵Michael Gottesman2013-05-151-5/+5
| | | | | | consistent about their usage of periods. llvm-svn: 181901
* LoopVectorize: Fix commentsArnold Schwaighofer2013-05-151-4/+4
| | | | | | No functionality change. llvm-svn: 181862
* LoopVectorize: Hoist conditional loads if possibleArnold Schwaighofer2013-05-151-3/+102
| | | | | | | | | | | | InstCombine can be uncooperative to vectorization and sink loads into conditional blocks. This prevents vectorization. Undo this optimization if there are unconditional memory accesses to the same addresses in the loop. radar://13815763 llvm-svn: 181860
* Fix two typoSylvestre Ledru2013-05-141-1/+1
| | | | llvm-svn: 181848
* GlobalOpt: fix an issue where CXAAtExitFn points to a deleted function.Manman Ren2013-05-141-3/+3
| | | | | | | | | | | CXAAtExitFn was set outside a loop and before optimizations where functions can be deleted. This patch will set CXAAtExitFn inside the loop and after optimizations. Seg fault when running LTO because of accesses to a deleted function. rdar://problem/13838828 llvm-svn: 181838
* Removed trailing whitespace.Michael Gottesman2013-05-141-4/+4
| | | | llvm-svn: 181760
* LoopVectorize: Handle loops with multiple forward inductionsArnold Schwaighofer2013-05-141-17/+40
| | | | | | | | | | | | We used to give up if we saw two integer inductions. After this patch, we base further induction variables on the chosen one like we do in the reverse induction and pointer induction case. Fixes PR15720. radar://13851975 llvm-svn: 181746
* [objc-arc-opts] Added debug statements when we set and unset whether a ↵Michael Gottesman2013-05-141-0/+2
| | | | | | pointer is known positive. llvm-svn: 181745
* [objc-arc-opts] In the presense of an alloca unconditionally remove RR pairs ↵Michael Gottesman2013-05-131-5/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | if and only if we are both KnownSafeBU/KnownSafeTD rather than just either or. In the presense of a block being initialized, the frontend will emit the objc_retain on the original pointer and the release on the pointer loaded from the alloca. The optimizer will through the provenance analysis realize that the two are related (albiet different), but since we only require KnownSafe in one direction, will match the inner retain on the original pointer with the guard release on the original pointer. This is fixed by ensuring that in the presense of allocas we only unconditionally remove pointers if both our retain and our release are KnownSafe (i.e. we are KnownSafe in both directions) since we must deal with the possibility that the frontend will emit what (to the optimizer) appears to be unbalanced retain/releases. An example of the miscompile is: %A = alloca retain(%x) retain(%x) <--- Inner Retain store %x, %A %y = load %A ... DO STUFF ... release(%y) call void @use(%x) release(%x) <--- Guarding Release getting optimized to: %A = alloca retain(%x) store %x, %A %y = load %A ... DO STUFF ... release(%y) call void @use(%x) rdar://13750319 llvm-svn: 181743
* Move a couple more statistics inside '#ifndef NDEBUG'.Matt Beaumont-Gay2013-05-131-1/+1
| | | | | | Suppresses an unused-variable warning in -Asserts builds. llvm-svn: 181733
* [objc-arc-opts] Add comment to BBState making it clear that ↵Michael Gottesman2013-05-131-0/+6
| | | | | | get{TopDown,BottomUp}PtrState will create a new PtrState object if it does not find a PtrState for Arg. llvm-svn: 181726
* [objc-arc] Move the before optimization statistics gathering phase out of ↵Michael Gottesman2013-05-131-8/+7
| | | | | | | | | | | OptimizeIndividualCalls. This makes the statistics gathering completely independent of the actual optimization occuring, preventing any sort of bleeding over from occuring. Additionally, it simplifies a switch statement in the non-statistic gathering case. llvm-svn: 181719
* Suppress GCC compiler warnings in release builds about variables that are onlyDuncan Sands2013-05-131-0/+1
| | | | | | read in asserts. llvm-svn: 181689
* SLPVectorizer: Swap LHS and RHS. No functionality change.Nadav Rotem2013-05-131-4/+4
| | | | llvm-svn: 181684
* SLPVectorizer: Fix a bug in the code that generates extracts for values with ↵Nadav Rotem2013-05-121-7/+27
| | | | | | | | multiple users. The external user does not have to be in lane #0. We have to save the lane for each scalar so that we know which vector lane to extract. llvm-svn: 181674
* SLPVectorizer: Clear the map that maps between scalars to vectors after each ↵Nadav Rotem2013-05-121-0/+1
| | | | | | | | round of vectorization. Testcase in the next commit. llvm-svn: 181673
* InstCombine: Flip the order of two urem transformsDavid Majnemer2013-05-121-6/+6
| | | | | | | | | | | | | | There are two transforms in visitUrem that conflict with each other. *) One, if a divisor is a power of two, subtracts one from the divisor and turns it into a bitwise-and. *) The other unwraps both operands if they are surrounded by zext instructions. Flipping the order allows the subtraction to go beneath the sign extension. llvm-svn: 181668
* LoopVectorize: Use the widest induction variable typeArnold Schwaighofer2013-05-111-21/+69
| | | | | | | | | | | | | | | | | | | | | | Use the widest induction type encountered for the cannonical induction variable. We used to turn the following loop into an empty loop because we used i8 as induction variable type and truncated 1024 to 0 as trip count. int a[1024]; void fail() { int reverse_induction = 1023; unsigned char forward_induction = 0; while ((reverse_induction) >= 0) { forward_induction++; a[reverse_induction] = forward_induction; --reverse_induction; } } radar://13862901 llvm-svn: 181667
* LoopVectorize: Use variable instead of repeated function callArnold Schwaighofer2013-05-111-3/+4
| | | | | | No functionality change intended. llvm-svn: 181666
* LoopVectorize: Use IRBuilder interface in more placesArnold Schwaighofer2013-05-111-25/+13
| | | | | | No functionality change intended. llvm-svn: 181665
* InstCombine: Turn urem to bitwise-and more oftenDavid Majnemer2013-05-111-20/+2
| | | | | | | Use isKnownToBeAPowerOfTwo in visitUrem so that we may more aggressively fold away urem instructions. llvm-svn: 181661
* SLPVectorizer: Add support for trees with external users.Nadav Rotem2013-05-102-9/+55
| | | | | | | | | | | | | For example: bar() { int a = A[i]; int b = A[i+1]; B[i] = a; B[i+1] = b; foo(a); <--- a is used outside the vectorized expression. } llvm-svn: 181648
* Add a debug printNadav Rotem2013-05-101-0/+2
| | | | llvm-svn: 181647
* InstCombine: Don't claim to be able to evaluate any shl in a zexted type.Benjamin Kramer2013-05-101-1/+11
| | | | | | | | | | The shift amount may be larger than the type leading to undefined behavior. Limit the transform to constant shift amounts. While there update the bits to clear in the result which may enable additional optimizations. PR15959. llvm-svn: 181604
* InstCombine: Verify the type before transforming uitofp into select.Benjamin Kramer2013-05-101-22/+23
| | | | | | PR15952. llvm-svn: 181586
* Fix a documentation warning: \bried -> \briefDmitri Gribenko2013-05-091-1/+1
| | | | llvm-svn: 181551
* [GVN] Split critical-edge on the fly, instead of postpone edge-splitting to nextShuxin Yang2013-05-091-13/+39
| | | | | | | | | | iteration. This on step toward non-iterative GVN. My local hack suggests that getting rid of iteration will speedup GVN by 30%+ on a medium sized input (2k LOC, C++). I cannot explain why not 2x or more at this moment. llvm-svn: 181532
* Don't replace an alias in llvm.used with its target.Rafael Espindola2013-05-091-2/+102
| | | | | | | When we replace an internal alias with its target, be careful not to replace the entry in llvm.used (and llvm.compiler_used). llvm-svn: 181524
* InstCombine: Don't just copy known bits from the first operand of an srem.Benjamin Kramer2013-05-091-1/+1
| | | | | | | That's obviously wrong. Conservatively restrict it to the sign bit, which matches the original intention of this analysis. Fixes PR15940. llvm-svn: 181518
* LoopVectorizer: Don't assert on the absence of induction variablesArnold Schwaighofer2013-05-091-1/+2
| | | | | | | | | A computable loop exit count does not imply the presence of an induction variable. Scalar evolution can return a value for an infinite loop. Fixes PR15926. llvm-svn: 181495
* Add DebugIR pass -- emits IR file and replace source lines with IR lines in MDDaniel Malea2013-05-082-0/+247
| | | | | | | | | | | - requires existing debug information to be present - fixes up file name and line number information in metadata - emits a "<orig_filename>-debug.ll" succinct IR file (without !dbg metadata or debug intrinsics) that can be read by a debugger - initialize pass in opt tool to enable the "-debug-ir" flag - lit tests to follow llvm-svn: 181467
* Fix a bug in codegenprep where it was losing track of values OptimizeMemoryInstNick Lewycky2013-05-081-5/+2
| | | | | | by switching to a ValueMap. Patch by Andrea DiBiagio! llvm-svn: 181397
* LoopVectorizer: Improve reduction variable identificationArnold Schwaighofer2013-05-071-84/+132
| | | | | | | The two nested loops were confusing and also conservative in identifying reduction variables. This patch replaces them by a worklist based approach. llvm-svn: 181369
* LoopVectorize: getConsecutiveVector must respect signed arithmeticArnold Schwaighofer2013-05-071-5/+6
| | | | | | | | | | We were passing an i32 to ConstantInt::get where an i64 was needed and we must also pass the sign if we pass negatives numbers. The start index passed to getConsecutiveVector must also be signed. Should fix PR15882. llvm-svn: 181286
* InstCombine: (X ^ signbit) + C -> X + (signbit ^ C)David Majnemer2013-05-061-0/+5
| | | | llvm-svn: 181249
* Rotate multi-exit loops even if the latch was simplified.Andrew Trick2013-05-061-14/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Test case by Michele Scandale! Fixes PR10293: Load not hoisted out of loop with multiple exits. There are few regressions with this patch, now tracked by rdar:13817079, and a roughly equal number of improvements. The regressions are almost certainly back luck because LoopRotate has very little idea of whether rotation is profitable. Doing better requires a more comprehensive solution. This checkin is a quick fix that lacks generality (PR10293 has a counter-example). But it trivially fixes the case in PR10293 without interfering with other cases, and it does satify the criteria that LoopRotate is a loop canonicalization pass that should avoid heuristics and special cases. I can think of two approaches that would probably be better in the long run. Ultimately they may both make sense. (1) LoopRotate should check that the current header would make a good loop guard, and that the loop does not already has a sufficient guard. The artifical SimplifiedLoopLatch check would be unnecessary, and the design would be more general and canonical. Two difficulties: - We need a strong guarantee that we won't endlessly rotate, so the analysis would need to be precise in order to avoid the SimplifiedLoopLatch precondition. - Analysis like this are usually based on SCEV, which we don't want to rely on. (2) Rotate on-demand in late loop passes. This could even be done by shoving the loop back on the queue after the optimization that needs it. This could work well when we find LICM opportunities in multi-branch loops. This requires some work, and it doesn't really solve the problem of SCEV wanting a loop guard before the analysis. llvm-svn: 181230
* Provide InstCombines for the following 3 cases:Jean-Luc Duprat2013-05-062-0/+53
| | | | | | | | | | | | | A * (1 - (uitofp i1 C)) -> select C, 0, A B * (uitofp i1 C) -> select C, B, 0 select C, 0, A + select C, B, 0 -> select C, B, A These come up in code that has been hand-optimized from a select to a linear blend, on platforms where that may have mattered. We want to undo such changes with the following transform: A*(1 - uitofp i1 C) + B*(uitofp i1 C) -> select C, A, B llvm-svn: 181216
OpenPOWER on IntegriCloud