summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* Remove dead code from LoopUnswitchHal Finkel2013-11-081-127/+0
| | | | | | | | | | | | LoopUnswitch's code simplification routine has logic to convert conditional branches into unconditional branches, after unswitching makes the condition constant, and then remove any blocks that renders dead. Unfortunately, this code is dead, currently broken, and furthermore, has never been alive (at least as far back at 2006). No functionality change intended. llvm-svn: 194277
* [objc-arc] Convert the one directional retain/release relation assert to a ↵Michael Gottesman2013-11-051-3/+18
| | | | | | | | | | | | | | | | | conditional check + fail. Due to the previously added overflow checks, we can have a retain/release relation that is one directional. This occurs specifically when we run into an additive overflow causing us to drop state in only one direction. If that occurs, we should bail and not optimize that retain/release instead of asserting. Apologies for the size of the testcase. It is necessary to cause the additive cfg overflow to trigger. rdar://15377890 llvm-svn: 194083
* Add a runtime unrolling parameter to the LoopUnroll pass constructorHal Finkel2013-11-051-6/+10
| | | | | | | | | | | As with the other loop unrolling parameters (the unrolling threshold, partial unrolling, etc.) runtime unrolling can now also be controlled via the constructor. This will be necessary for moving non-trivial unrolling late in the pass manager (after loop vectorization). No functionality change intended. llvm-svn: 194027
* Remove dead codeShuxin Yang2013-11-041-6/+0
| | | | llvm-svn: 194017
* SLPVectorizer: Use properlyDominates to satisfy the irreflexivity of a ↵Benjamin Kramer2013-11-041-1/+1
| | | | | | | | strict weak ordering. STL debug mode checks this. llvm-svn: 194015
* Scalarize select vector arguments when extracted.Matt Arsenault2013-11-041-0/+32
| | | | | | | | When the elements are extracted from a select on vectors or a vector select, do the select on the extracted scalars from the input if there is only one use. llvm-svn: 194013
* SLPVectorizer: Add a missing pair of parens. No functionality change.Benjamin Kramer2013-11-031-1/+1
| | | | llvm-svn: 193958
* SLPVectorizer: When CSEing generated gathers only scan blocks containing them.Benjamin Kramer2013-11-031-20/+37
| | | | | | | | | | | Instead of doing a RPO traversal of the whole function remember the blocks containing gathers (typically <= 2) and scan them in dominator-first order. The actual CSE is still quadratic, but I'm not confident that adding a scoped hash table here is worth it as we're only looking at the generated instructions and not arbitrary code. llvm-svn: 193956
* Revert "Inliner: Handle readonly attribute per argument when adding memcpy"David Majnemer2013-11-031-13/+10
| | | | | | | | This reverts commit r193356, it caused PR17781. A reduced test case covering this regression has been added to the test suite. llvm-svn: 193955
* Spell "Actual" correctlyDavid Majnemer2013-11-031-1/+1
| | | | llvm-svn: 193954
* Convert calls to __sinpi and __cospi into __sincospi_stretBob Wilson2013-11-031-0/+156
| | | | | | | | | | This adds an SimplifyLibCalls case which converts the special __sinpi and __cospi (float & double variants) into a __sincospi_stret where appropriate to remove duplicated work. Patch by Tim Northover llvm-svn: 193943
* SLPVectorizer: Remove duplicated function.Benjamin Kramer2013-11-021-10/+2
| | | | llvm-svn: 193927
* LoopVectorize: Remove quadratic behavior the local CSE.Benjamin Kramer2013-11-021-26/+40
| | | | | | | | Doing this with a hash map doesn't change behavior and avoids calling isIdenticalTo O(n^2) times. This should probably eventually move into a utility class shared with EarlyCSE and the limited CSE in the SLPVectorizer. llvm-svn: 193926
* LoopVectorizer: Move cse code into its own functionArnold Schwaighofer2013-11-011-32/+37
| | | | llvm-svn: 193895
* LoopVectorizer: Perform redundancy elimination on induction variablesArnold Schwaighofer2013-11-011-1/+34
| | | | | | | | | | | | | | | | | | | | When the loop vectorizer was part of the SCC inliner pass manager gvn would run after the loop vectorizer followed by instcombine. This way redundancy (multiple uses) were removed and instcombine could perform scalarization on the induction variables. Having moved the loop vectorizer to later we no longer run any form of redundancy elimination before we perform instcombine. This caused vectorized induction variables to survive that did not before. On a recent iMac this helps linpack back from 6000Mflops to 7000Mflops. This should also help lpbench and paq8p. I ran a Release (without Asserts) build over the test-suite and did not see any negative impact on compile time. radar://15339680 llvm-svn: 193891
* LoopVectorize: Look for consecutive acces in GEPs with trailing zero indicesBenjamin Kramer2013-11-011-11/+38
| | | | | | | If we have a pointer to a single-element struct we can still build wide loads and stores to it (if there is no padding). llvm-svn: 193860
* LoopVectorizer: If dependency checks fail try runtime checksArnold Schwaighofer2013-11-011-5/+47
| | | | | | | | | | | | When a dependence check fails we can still try to vectorize loops with runtime array bounds checks. This helps linpack to vectorize a loop in dgefa. And we are back to 2x of the scalar performance on a corei7-avx. radar://15339680 llvm-svn: 193853
* LoopVectorizer: Clear all member data structures in RuntimeCheck.reset()Arnold Schwaighofer2013-11-011-0/+2
| | | | | | | | Clear all data structures when resetting the RuntimeCheck data structure. No test case. This was exposed by an upcomming change. llvm-svn: 193852
* Do not convert "call asm" to "invoke asm" in Inliner.Manman Ren2013-10-311-1/+2
| | | | | | | | | | | | | | | | Given that backend does not handle "invoke asm" correctly ("invoke asm" will be handled by SelectionDAGBuilder::visitInlineAsm, which does not have the right setup for LPadToCallSiteMap) and we already made the assumption that inline asm does not throw in InstCombiner::visitCallSite, we are going to make the same assumption in Inliner to make sure we don't convert "call asm" to "invoke asm". If it becomes necessary to add support for "invoke asm" later on, we will need to modify the backend as well as remove the assumptions that inline asm does not throw. Fix rdar://15317907 llvm-svn: 193808
* Use LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN instead of the "dso list".Rafael Espindola2013-10-313-53/+15
| | | | | | | | | | | | | | | | | | | | | | There are two ways one could implement hiding of linkonce_odr symbols in LTO: * LLVM tells the linker which symbols can be hidden if not used from native files. * The linker tells LLVM which symbols are not used from other object files, but will be put in the dso symbol table if present. GOLD's API is the second option. It was implemented almost 1:1 in llvm by passing the list down to internalize. LLVM already had partial support for the first option. It is also very similar to how ld64 handles hiding these symbols when *not* doing LTO. This patch then * removes the APIs for the DSO list. * marks LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN all linkonce_odr unnamed_addr global values and other linkonce_odr whose address is not used. * makes the gold plugin responsible for handling the API mismatch. llvm-svn: 193800
* Merge CallGraph and BasicCallGraph.Rafael Espindola2013-10-315-5/+5
| | | | llvm-svn: 193734
* Teach scalarrepl about address spacesMatt Arsenault2013-10-301-1/+1
| | | | llvm-svn: 193720
* Fix GVN creating bitcast between address spacesMatt Arsenault2013-10-301-5/+7
| | | | llvm-svn: 193710
* ARM cost model: Account for zero cost scalar SROA instructionsArnold Schwaighofer2013-10-291-3/+18
| | | | | | | | | By vectorizing a series of srl, or, ... instructions we have obfuscated the intention so much that the backend does not know how to fold this code away. radar://15336950 llvm-svn: 193573
* SLPVectorizer: Use vector type for vectorized memory operationsArnold Schwaighofer2013-10-291-2/+2
| | | | | | | | | No test case, because with the current cost model we don't see a difference. An upcoming ARM memory cost model change will expose and test this bug. radar://15332579 llvm-svn: 193572
* Revert r193251 : Use address-taken to disambiguate global variable and ↵Shuxin Yang2013-10-271-1/+0
| | | | | | indirect memops. llvm-svn: 193489
* Quick look-up for block in loop.Wan Xiaofei2013-10-262-18/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements quick look-up for block in loop by maintaining a hash set for blocks. It improves the efficiency of loop analysis a lot, the biggest improvement could be 5-6%(458.sjeng). Below are the compilation time for our benchmark in llc before & after the patch. Benchmark llc - trunk llc - patched 401.bzip2 0.339081 100.00% 0.329657 102.86% 403.gcc 19.853966 100.00% 19.605466 101.27% 429.mcf 0.049823 100.00% 0.048451 102.83% 433.milc 0.514898 100.00% 0.510217 100.92% 444.namd 1.109328 100.00% 1.103481 100.53% 445.gobmk 4.988028 100.00% 4.929114 101.20% 456.hmmer 0.843871 100.00% 0.825865 102.18% 458.sjeng 0.754238 100.00% 0.714095 105.62% 464.h264ref 2.9668 100.00% 2.90612 102.09% 471.omnetpp 4.556533 100.00% 4.511886 100.99% bitmnp01 0.038168 100.00% 0.0357 106.91% idctrn01 0.037745 100.00% 0.037332 101.11% libquake2 3.78689 100.00% 3.76209 100.66% libquake_ 2.251525 100.00% 2.234104 100.78% linpack 0.033159 100.00% 0.032788 101.13% matrix01 0.045319 100.00% 0.043497 104.19% nbench 0.333161 100.00% 0.329799 101.02% tblook01 0.017863 100.00% 0.017666 101.12% ttsprk01 0.054337 100.00% 0.053057 102.41% Reviewer : Andrew Trick <atrick@apple.com>, Hal Finkel <hfinkel@anl.gov> Approver : Andrew Trick <atrick@apple.com> Test : Pass make check-all & llvm test-suite llvm-svn: 193460
* Fix SCEVExpander: don't try to expand quadratic recurrences outside a loop.Andrew Trick2013-10-252-3/+21
| | | | | | | | | | | | Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu (affecting trunk and 3.3) When SCEV expands a recurrence outside of a loop it attempts to scale by the stride of the recurrence. Chained recurrences don't work that way. We could compute binomial coefficients, but would hve to guarantee that the chained AddRec's are in a perfectly reduced form. llvm-svn: 193438
* Handle calls and invokes in GlobalStatus.Rafael Espindola2013-10-251-0/+5
| | | | | | | | | | | This patch teaches GlobalStatus to analyze a call that uses the global value as a callee, not as an argument. With this change internalize call handle the common use of linkonce_odr functions. This reduces the number of linkonce_odr functions in a LTO build of clang (checked with the emit-llvm gold plugin option) from 1730 to 60. llvm-svn: 193436
* LoopVectorizer: Don't attempt to vectorize extractelement instructionsHal Finkel2013-10-251-2/+3
| | | | | | | | | | | | | | | The loop vectorizer does not currently understand how to vectorize extractelement instructions. The existing check, which excluded all vector-valued instructions, did not catch extractelement instructions because it checked only the return value. As a result, vectorization would proceed, producing illegal instructions like this: %58 = extractelement <2 x i32> %15, i32 0 %59 = extractelement i32 %58, i32 0 where the second extractelement is illegal because its first operand is not a vector. llvm-svn: 193434
* Inliner: Handle readonly attribute per argument when adding memcpyTom Stellard2013-10-241-10/+13
| | | | | | Patch by: Vincent Lejeune llvm-svn: 193356
* Mark vector loops as already vectorizedRenato Golin2013-10-241-0/+4
| | | | | | | | Make sure we mark all loops (scalar and vector) when vectorizing, so that we don't try to vectorize them anymore. Also, set unroll to 1, since this is what we check for on early exit. llvm-svn: 193349
* fix PR17635: false positive with packed structuresNuno Lopes2013-10-241-1/+2
| | | | | | LLVM optimizers may widen accesses to packed structures that overflow the structure itself, but should be in bounds up to the alignment of the object llvm-svn: 193317
* Fix a bug in LinearFunctionTestReplace that created invalid loop exit checks.Juergen Ributzka2013-10-241-1/+7
| | | | | | Reviewed by Andy llvm-svn: 193303
* Clarify comments in genLoopLimit.Andrew Trick2013-10-241-3/+4
| | | | llvm-svn: 193292
* Fixed comment typo in GCOVProfiling.cppYuchen Wu2013-10-231-1/+1
| | | | llvm-svn: 193268
* Use address-taken to disambiguate global variable and indirect memops.Shuxin Yang2013-10-231-0/+1
| | | | | | | | | | Major steps include: 1). introduces a not-addr-taken bit-field in GlobalVariable 2). GlobalOpt pass sets "not-address-taken" if it proves a global varirable dosen't have its address taken. 3). AA use this info for disambiguation. llvm-svn: 193251
* Fix spelling, grammar, and match naming convention for test files.Eric Christopher2013-10-211-3/+3
| | | | llvm-svn: 193130
* SimplifyCFG: Don't duplicate calls to functions marked noduplicate v2Tom Stellard2013-10-211-0/+15
| | | | | | | v2: - Use CI->cannotDuplicate() llvm-svn: 193115
* Use more type helper functionsMatt Arsenault2013-10-214-23/+23
| | | | llvm-svn: 193109
* Teach SimplifyCFG about address spacesMatt Arsenault2013-10-211-5/+9
| | | | llvm-svn: 193104
* Optimize more linkonce_odr values during LTO.Rafael Espindola2013-10-214-210/+200
| | | | | | | | | | | When a linkonce_odr value that is on the dso list is not unnamed_addr we can still look to see if anything is actually using its address. If not, it is safe to hide it. This patch implements that by moving GlobalStatus to Transforms/Utils and using it in Internalize. llvm-svn: 193090
* Fix the predecessor removal logic in r193045.Michael Gottesman2013-10-211-11/+9
| | | | | | Additionally some small comment/stylistic fixes are included as well. llvm-svn: 193068
* Don't eliminate a partially redundant load if it's in a landing pad.Bill Wendling2013-10-212-15/+7
| | | | | | | | | | | | A landing pad can be jumped to only by the unwind edge of an invoke instruction. If we eliminate a partially redundant load in a landing pad, it will create a basic block that violates this constraint. It then leads to other problems down the line if it tries to merge that basic block with the landing pad. Avoid this by not eliminating the load in a landing pad. PR17621 llvm-svn: 193064
* Teach simplify-cfg how to correctly create covered lookup tables for ↵Michael Gottesman2013-10-201-6/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | switches on iN with N >= 3. One optimization simplify-cfg performs is the converting of switches to lookup tables if the switch has > 4 cases. This is done by: 1. Finding the max/min case value and calculating the switch case range. 2. Create a lookup table basic block. 3. Perform a check in the switch's BB to see if the input value is in the switch's case range. If the input value satisfies said predicate branch to the lookup table BB, otherwise branch to the switch's default destination BB using the default value as the result. The conditional check consists of subtracting the min case value of the table from any input iN value and then ensuring that said value is unsigned less than the size of the lookup table represented as an iN value. If the lookup table is a covered lookup table, the size of the table will be N which is 0 as an iN value. Thus the comparison will be an `icmp ult` of an iN value against 0 which is always false yielding the incorrect result. This patch fixes this problem by recognizing if we have a covered lookup table and if we do, unconditionally jumps to the lookup table BB since the covering property of the lookup table implies no input values could not be handled by said BB. rdar://15268442 llvm-svn: 193045
* Perform an intelligent splice of the predecessor with the single successor.Bill Wendling2013-10-191-1/+14
| | | | | | | | If the predecessor's being spliced into a landing pad, then we need the PHIs to come first and the rest of the predecessor's code to come *after* the landing pad instruction. llvm-svn: 193035
* Mark some command line flags as hiddenNadav Rotem2013-10-181-3/+3
| | | | llvm-svn: 193013
* Rename fields of GlobalStatus to match the coding style.Rafael Espindola2013-10-171-43/+41
| | | | llvm-svn: 192910
* rename SafeToDestroyConstant to isSafeToDestroyConstant and clang-format.Rafael Espindola2013-10-171-10/+12
| | | | llvm-svn: 192907
* Simplify the interface of AnalyzeGlobal a bit and rename to analyzeGlobal.Rafael Espindola2013-10-171-14/+22
| | | | | | No functionality change. llvm-svn: 192906
OpenPOWER on IntegriCloud