summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
...
* Set debug locations for branch instructions created during inlining, evenAdrian Prantl2013-04-301-2/+10
| | | | | | | | the inlined function has multiple returns. rdar://problem/12415623 llvm-svn: 180793
* Fix a bug in foldSelectICmpAndOr.David Majnemer2013-04-301-1/+2
| | | | | | | Differences in bitwidth between X and Y could exist even if C1 and C2 have the same Log2 representation. llvm-svn: 180779
* Fix "Combine bit test + conditional or into simple math"David Majnemer2013-04-301-0/+64
| | | | | | | | | This fixes the optimization introduced in r179748 and reverted in r179750. While the optimization was sound, it did not properly respect differences in bit-width. llvm-svn: 180777
* SimplifyCFG: If convert single conditional storesArnold Schwaighofer2013-04-291-4/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This resurrects r179957, but adds code that makes sure we don't touch atomic/volatile stores: This transformation will transform a conditional store with a preceeding uncondtional store to the same location: a[i] = may-alias with a[i] load if (cond) a[i] = Y into an unconditional store. a[i] = X may-alias with a[i] load tmp = cond ? Y : X; a[i] = tmp We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outway the potential case where the branch would be correctly predicted and the cost of the executing the second store would be noticably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on a ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducable below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. llvm-svn: 180731
* Add in some conditional compilation in order to silence an unused variable ↵Michael Gottesman2013-04-291-0/+2
| | | | | | warning. llvm-svn: 180700
* [objc-arc] Apply the RV optimization to retains next to calls in ↵Michael Gottesman2013-04-292-53/+64
| | | | | | | | | | | | | | | ObjCARCContract instead of ObjCARCOpts. Turning retains into retainRV calls disrupts the data flow analysis in ObjCARCOpts. Thus we move it as late as we can by moving it into ObjCARCContract. We leave in the conversion from retainRV -> retain in ObjCARCOpt since it enables the dataflow analysis. rdar://10813093 llvm-svn: 180698
* Added statistics to count the number of retains/releases before/after ↵Michael Gottesman2013-04-291-0/+47
| | | | | | optimization. llvm-svn: 180697
* Removed trailing whitespace.Michael Gottesman2013-04-291-1/+1
| | | | llvm-svn: 180696
* Fix for r180693. = /.Michael Gottesman2013-04-291-1/+2
| | | | llvm-svn: 180694
* [objc-arc-annotations] Moved the disabling of call movement to ↵Michael Gottesman2013-04-291-6/+5
| | | | | | ConnectTDBUTraversals so that I can prevent Changed = true from being set. This prevents an infinite loop. llvm-svn: 180693
* Fix a XOR reassociation bug. Shuxin Yang2013-04-271-3/+6
| | | | | | | | | | When Reassociator optimize "(x | C1)" ^ "(X & C2)", it may swap the two subexpressions, however, it forgot to swap cached constants (of C1 and C2) accordingly. rdar://13739160 llvm-svn: 180676
* fix a typo that due to cu&paste quadrupled itselfAdrian Prantl2013-04-261-2/+2
| | | | | | rdar://problem/13056109 llvm-svn: 180618
* Bugfix for the debug intrinsic handling in InstCombiner:Adrian Prantl2013-04-261-2/+27
| | | | | | | | | | | Since we can't guarantee that the original dbg.declare instrinsic is removed by LowerDbgDeclare(), we need to make sure that we are not inserting the same dbg.value intrinsic over and over. This removes tons of redundant DIEs when compiling optimized code. rdar://problem/13056109 llvm-svn: 180615
* LoopVectorizer: Calculate the number of pointers to disambiguate at runtime ↵Nadav Rotem2013-04-261-4/+11
| | | | | | based on the numbers of reads and writes. llvm-svn: 180593
* Revert "[objc-arc] Added ImpreciseAutoreleaseSet to track autorelease calls ↵Michael Gottesman2013-04-261-38/+0
| | | | | | | | | | | | | | that were once autoreleaseRV instructions." This reverts commit r180222. I think this might tie in with a different problem which will require a different approach potentially. I am reverting this in the case I need to go down that second path. My apologies for the noise. = /. llvm-svn: 180590
* LoopVectorizer: No need to generate pointer disambiguation checks between ↵Nadav Rotem2013-04-251-4/+12
| | | | | | readonly pointers. llvm-svn: 180570
* [objc-arc] Added ImpreciseAutoreleaseSet to track autorelease calls that ↵Michael Gottesman2013-04-241-0/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | were once autoreleaseRV instructions. Due to the semantics of ARC, we must be extremely conservative with autorelease calls inserted by the frontend since ARC gaurantees that said object will be in the autorelease pool after that point, an optimization invariant that the optimizer must respect. On the other hand, we are allowed significantly more flexibility with autoreleaseRV instructions. Often times though this flexibility is disrupted by early transformations which transform objc_autoreleaseRV => objc_autorelease if said instruction is no longer being used as part of an RV pair (generally due to inlining). Since we can not tell the difference in between an autorelease put into place by the frontend and one created through said ``strength reduction'' we can not perform these optimizations. The addition of this set gets around said issues by allowing us to differentiate in between said two cases. rdar://problem/13697741. llvm-svn: 180222
* Fixed comment typo.Michael Gottesman2013-04-241-1/+1
| | | | llvm-svn: 180221
* LoopVectorizer: Change variable name Stride to ConsecutiveStrideArnold Schwaighofer2013-04-241-6/+6
| | | | | | | | This makes it easier to read the code. No functionality change. llvm-svn: 180197
* LoopVectorize: Scalarize padded typesArnold Schwaighofer2013-04-241-1/+9
| | | | | | | | | | | | | | | | | | This patch disables memory-instruction vectorization for types that need padding bytes, e.g., x86_fp80 has 10 bytes store size with 6 bytes padding in darwin on x86_64. Because the load/store vectorization is performed by the bit casting to a packed vector, which has incompatible memory layout due to the lack of padding bytes, the present vectorizer produces inconsistent result for memory instructions of those types. This patch checks an equality of the AllocSize of a scalar type and allocated size for each vector element, to ensure that there is no padding bytes and the array can be read/written using vector operations. Patch by Daisuke Takahashi! Fixes PR15758. llvm-svn: 180196
* LoopVectorizer: Bail out if we don't have datalayout we need itArnold Schwaighofer2013-04-241-0/+5
| | | | llvm-svn: 180195
* Make sure the instruction right after an inlined function has aAdrian Prantl2013-04-231-4/+10
| | | | | | | | | | debug location. This solves a problem where range of an inlined subroutine is emitted wrongly. Patch by Manman Ren. Fixes rdar://problem/12415623 llvm-svn: 180140
* LoopVectorizer: Fix 15830. When scalarizing and unrolling stores make sure ↵Nadav Rotem2013-04-231-4/+4
| | | | | | | | that the order in which the elements are scalarized is the same as the original order. This fixes a miscompilation in FreeBSD's regex library. llvm-svn: 180121
* Call the potentially costly isAnnotatedParallel() only once. Pekka Jaaskelainen2013-04-231-3/+5
| | | | | | Made the uniform write test's checks a bit stricter. llvm-svn: 180119
* Refuse to (even try to) vectorize loops which have uniform writes,Pekka Jaaskelainen2013-04-231-9/+9
| | | | | | | | | even if erroneously annotated with the parallel loop metadata. Fixes Bug 15794: "Loop Vectorizer: Crashes with the use of llvm.loop.parallel metadata" llvm-svn: 180081
* Move C++ code out of the C headers and into either C++ headersEric Christopher2013-04-228-0/+16
| | | | | | | or the C++ files themselves. This enables people to use just a C compiler to interoperate with LLVM. llvm-svn: 180063
* Changed back (relative to commit 179786) the operations executed when ↵Anat Shemer2013-04-221-3/+3
| | | | | | extract(cast) is transformed to cast(extract). It uses the Builder class as before. In addition the result node is added to the Worklist, so all the previous extract users will become the new scalar cast users. llvm-svn: 180045
* Clarify that llvm.used can contain aliases.Rafael Espindola2013-04-223-10/+7
| | | | | | | Also add a check for llvm.used in the verifier and simplify clients now that they can assume they have a ConstantArray. llvm-svn: 180019
* SROA: Don't crash on a select with two identical operands.Benjamin Kramer2013-04-211-8/+8
| | | | | | | This is an edge case that can happen if we modify a chain of multiple selects. Update all operands in that case and remove the assert. PR15805. llvm-svn: 179982
* Revert "SimplifyCFG: If convert single conditional stores"Arnold Schwaighofer2013-04-211-88/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is the temptation to make this tranform dependent on target information as it is not going to be beneficial on all (sub)targets. Therefore, we should probably do this in MI Early-Ifconversion. This reverts commit r179957. Original commit message: "SimplifyCFG: If convert single conditional stores This transformation will transform a conditional store with a preceeding uncondtional store to the same location: a[i] = may-alias with a[i] load if (cond) a[i] = Y into an unconditional store. a[i] = X may-alias with a[i] load tmp = cond ? Y : X; a[i] = tmp We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outway the potential case were the branch would be correctly predicted and the cost of the executing the second store would be noticably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on a ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducable below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. I am going to watch performance numbers across the builtbots and will revert this if anything unexpected comes up." llvm-svn: 179980
* SLPVectorize: Add support for vectorization of casts.Nadav Rotem2013-04-211-0/+69
| | | | llvm-svn: 179975
* SLPVectorizer: Fix a bug in the code that scans the tree in search of nodes ↵Nadav Rotem2013-04-211-0/+1
| | | | | | | | with multiple users. We did not terminate the switch case and we executed the search routine twice. llvm-svn: 179974
* When we strength reduce an objc_retainBlock call to objc_retain, increment ↵Michael Gottesman2013-04-211-1/+6
| | | | | | NumPeeps and make sure that Changed is set to true. llvm-svn: 179968
* Fixed comment typo.Michael Gottesman2013-04-211-1/+1
| | | | llvm-svn: 179967
* [objc-arc] Fixed typo in debug message.Michael Gottesman2013-04-211-1/+1
| | | | llvm-svn: 179966
* [objc-arc] Fixed comment typo.Michael Gottesman2013-04-211-1/+1
| | | | llvm-svn: 179965
* [objc-arc] Refactored OptimizeReturns so that it uses continue instead of a ↵Michael Gottesman2013-04-211-25/+30
| | | | | | large multi-level nested if statement. llvm-svn: 179964
* [objc-arc] Added debug statement saying when we are resetting a sequence's ↵Michael Gottesman2013-04-201-0/+1
| | | | | | | | | | progress. This will make it clearer when we are actually resetting a sequence's progress vs just changing state. This is an important distinction because the former case clears any pointers that we are tracking while the later does not. llvm-svn: 179963
* Fix PR15800. Do not try to vectorize vectors and structs.Nadav Rotem2013-04-201-1/+10
| | | | llvm-svn: 179960
* SimplifyCFG: If convert single conditional storesArnold Schwaighofer2013-04-201-4/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This transformation will transform a conditional store with a preceeding uncondtional store to the same location: a[i] = may-alias with a[i] load if (cond) a[i] = Y into an unconditional store. a[i] = X may-alias with a[i] load tmp = cond ? Y : X; a[i] = tmp We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outway the potential case were the branch would be correctly predicted and the cost of the executing the second store would be noticably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on a ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducable below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. I am going to watch performance numbers across the builtbots and will revert this if anything unexpected comes up. llvm-svn: 179957
* VecUtils: Clean up uses of dyn_cast.Benjamin Kramer2013-04-201-4/+4
| | | | llvm-svn: 179936
* SLPVectorizer: Strength reduce SmallVectors to ArrayRefs.Benjamin Kramer2013-04-203-30/+28
| | | | | | Avoids a couple of copies and allows more flexibility in the clients. llvm-svn: 179935
* SLPVectorizer: Reduce the compile time by eliminating the search for some of ↵Nadav Rotem2013-04-201-1/+1
| | | | | | the more expensive patterns. After this change will only check basic arithmetic trees that start at cmpinstr. llvm-svn: 179933
* refactor tryToVectorizePair to a new method that supports vectorization of ↵Nadav Rotem2013-04-201-0/+8
| | | | | | lists. llvm-svn: 179932
* Fix an unused variable warning.Nadav Rotem2013-04-201-0/+1
| | | | llvm-svn: 179931
* SLPVectorizer: Improve the cost model for loop invariant broadcast values.Nadav Rotem2013-04-203-11/+28
| | | | llvm-svn: 179930
* Report the number of stores that were found in the debug message.Nadav Rotem2013-04-201-6/+8
| | | | llvm-svn: 179929
* Fix the header comment.Nadav Rotem2013-04-202-2/+2
| | | | llvm-svn: 179928
* Use 64bit arithmetic for calculating distance between pointers.Nadav Rotem2013-04-201-2/+2
| | | | llvm-svn: 179927
* MergeFunc: Make pointer and integer types generate the same hash.Benjamin Kramer2013-04-191-2/+11
| | | | | | | | | The logic that actually compares the types considers pointers and integers the same if they are of the same size. This created a strange mismatch between hash and reality and made the test case for this fail on some platforms (yay, test cases). llvm-svn: 179905
OpenPOWER on IntegriCloud