path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* [asan] Fix invalid debug info for promotable allocasKuba Brecka2015-07-171-1/+7
| | | | | | | | | | Since r230724 ("Skip promotable allocas to improve performance at -O0"), there is a regression in the generated debug info for those non-instrumented variables. When inspecting such a variable's value in LLDB, you often get garbage instead of the actual value. ASan instrumentation is inserted before the creation of the non-instrumented alloca. The only allocas that are considered standard stack variables are the ones declared in the first basic-block, but the initial instrumentation setup in the function breaks that invariant. This patch makes sure uninstrumented allocas stay in the first BB. Differential Revision: http://reviews.llvm.org/D11179 llvm-svn: 242510
* Internalize: internalize comdat members as a group, and drop comdat on such ↵Peter Collingbourne2015-07-161-26/+71
| | | | | | | | | | | | | | | | | | | | members. Internalizing an individual comdat group member without also internalizing the other members of the comdat can break comdat semantics. For example, if a module contains a reference to an internalized comdat member, and the linker chooses a comdat group from a different object file, this will break the reference to the internalized member. This change causes the internalizer to only internalize comdat members if all other members of the comdat are not externally visible. Once a comdat group has been fully internalized, there is no need to apply comdat rules to its members; later optimization passes (e.g. globaldce) can legally drop individual members of the comdat. So we drop the comdat attribute from all comdat members. Differential Revision: http://reviews.llvm.org/D10679 llvm-svn: 242423
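The group rule described above can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not the Internalize pass itself: `internalize`, `symbols`, and `preserved` are hypothetical names, and a module is reduced to a mapping from symbol name to comdat name.

```python
from collections import defaultdict


def internalize(symbols, preserved):
    """Toy model of group-wise internalization (hypothetical helper).

    symbols:   dict mapping symbol name -> comdat name, or None if not in a comdat
    preserved: set of names that must stay externally visible
    Returns (internalized, comdat_dropped) as two sets of names.
    """
    groups = defaultdict(list)
    for name, comdat in symbols.items():
        if comdat is not None:
            groups[comdat].append(name)

    # A comdat group is pinned if any member must remain externally visible.
    pinned = {c for c, members in groups.items()
              if any(m in preserved for m in members)}

    internalized, comdat_dropped = set(), set()
    for name, comdat in symbols.items():
        if name in preserved:
            continue
        if comdat is None:
            internalized.add(name)
        elif comdat not in pinned:
            # The whole group is internal, so comdat rules no longer apply.
            internalized.add(name)
            comdat_dropped.add(name)
    return internalized, comdat_dropped
```

With `symbols = {"f": "C1", "g": "C1", "h": "C2", "k": None, "main": None}` and `preserved = {"main", "f"}`, `g` stays external because `f` pins comdat `C1`, while `h` is internalized and loses its comdat.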
* Add PM extension point EP_VectorizerStartTobias Grosser2015-07-161-0/+2
| | | | | | | This extension point allows passes to be executed right before the vectorizer and other highly target specific optimizations are run. llvm-svn: 242389
* Create a wrapper pass for BranchProbabilityInfo.Cong Hou2015-07-151-2/+3
| | | | | | | | This new wrapper pass is useful when we want to do branch probability analysis conditionally (e.g. only in PGO mode) but don't want to add one more pass dependence. http://reviews.llvm.org/D11241 llvm-svn: 242349
* [LoopUnswitch] Add an else clause to IsTrivialUnswitchCondition() when ↵Chen Li2015-07-151-1/+2
  checking HeaderTerm instruction type

  Summary: This is a trivial code change with no functional effect. When LoopUnswitch determines the trivial unswitch condition, it checks whether the loop header's terminator instruction is a branch or a switch instruction, since a trivial unswitch condition can only apply to these two instruction types. The current code does not fail the check directly on other instruction types, but checks whether the LoopExitBB variable is null instead. The added else clause makes the check fail immediately on other instruction types and makes the code more obvious.

  Reviewers: reames
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D11239
  llvm-svn: 242345
* Fix mergefunc infinite loopJF Bastien2015-07-151-2/+6
| | | | | | | | | | | | | | | Self-referential constants containing references to a merged function no longer cause the MergeFunctions pass to infinite loop. Also adds a reproduction IR which would otherwise fail, which was isolated from a similar issue in Chromium. Author: jrkoenig Reviewers: nlewycky, jfb Subscribers: llvm-commits, nlewycky, jfb Differential Revision: http://reviews.llvm.org/D11208 llvm-svn: 242337
* [LoopUnrolling] Handle cast instructions.Michael Zolotukhin2015-07-151-0/+15
| | | | | | | | | During estimation of unrolling effect we should be able to propagate constants through casts. Differential Revision: http://reviews.llvm.org/D10207 llvm-svn: 242257
* Create a wrapper pass for BlockFrequencyInfo.Wei Mi2015-07-141-3/+3
| | | | | | | | | | | | This is useful when we want to do block frequency analysis conditionally (e.g. only in PGO mode) but don't want to add one more pass dependence. Patch by congh. Approved by dexonsmith. Differential Revision: http://reviews.llvm.org/D11196 llvm-svn: 242248
* [InstCombine] Generalize sub of selects optimization to all BinaryOperatorsDavid Majnemer2015-07-142-26/+27
| | | | | | | This exposes further optimization opportunities if the selects are correlated. llvm-svn: 242235
* [LAA] Introduce RuntimePointerChecking::PointerInfo, NFCAdam Nemet2015-07-141-2/+2
| | | | | | | Turn this structure-of-arrays (i.e. the various pointer attributes) into array-of-structures. llvm-svn: 242219
* [LAA] Lift RuntimePointerCheck out of LoopAccessInfo, NFCAdam Nemet2015-07-143-12/+12
  I am planning to add more nested classes inside RuntimePointerCheck, so all this triple nesting would be hard to follow. Also rename it to RuntimePointerChecking (i.e. append 'ing').

  llvm-svn: 242218
* GVN: use a static array instead of regenerating it each time. NFC.Tim Northover2015-07-141-1/+1
| | | | llvm-svn: 242202
* GVN: tolerate an instruction being replaced without existing in the leaderboardTim Northover2015-07-141-1/+4
| | | | | | | | | | | | | | Sometimes an incidentally created instruction can duplicate a Value used elsewhere. It then often doesn't end up in the leader table. If it's later removed, we attempt to remove it from the leader table and segfault. Instead we should just ignore the removal request, which won't cause any problems. The reverse situation, where the original instruction is replaced by the new one (which you might think could leave the leader table empty) cannot occur, because the incidental instruction will never be found in the first place. llvm-svn: 242199
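The "ignore the removal request" behaviour can be illustrated with a toy leader table in Python. This is a sketch under assumptions, not GVN's data structure: the class `LeaderTable` and its methods are hypothetical, and instructions are represented by plain strings.

```python
class LeaderTable:
    """Toy leader table: value number -> list of instruction names (hypothetical)."""

    def __init__(self):
        self._leaders = {}

    def add(self, value_number, inst):
        self._leaders.setdefault(value_number, []).append(inst)

    def remove(self, value_number, inst):
        """Remove inst if it was recorded; silently ignore it otherwise."""
        entries = self._leaders.get(value_number)
        if not entries or inst not in entries:
            return False  # never recorded: nothing to do, and no crash
        entries.remove(inst)
        if not entries:
            del self._leaders[value_number]
        return True
```

Removing an instruction that was never entered returns `False` instead of failing, which mirrors the tolerance the commit describes.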
* [SROA] Don't de-atomic volatile loads and storesDavid Majnemer2015-07-141-6/+15
| | | | | | | | | | | Volatile loads and stores are made visible in global state regardless of what memory is involved. It is not correct to disregard the ordering and synchronization scope because it is possible to synchronize with memory operations performed by hardware. This partially addresses PR23737. llvm-svn: 242126
* Update enforceKnownAlignment after the isWeakForLinker semantic changeReid Kleckner2015-07-141-7/+4
  Previously we would refrain from attempting to increase the linkage of available_externally globals because they were considered weak for the linker. Now they are treated more like a declaration instead of a weak definition.

  This was causing SSE alignment faults in Chromium, when some code assumed it could increase the alignment of a dllimported global that it didn't control. http://crbug.com/509256

  llvm-svn: 242091
* Loop idiom recognizer was replacing too many uses of popcount.Pete Cooper2015-07-131-1/+1
  When spotting that a loop can use ctpop, we were incorrectly replacing all uses of a value with a value derived from ctpop.

  The bug here was exposed because we were replacing a use prior to the ctpop with the ctpop value and so we have a use before def, i.e., we changed

      %tobool.5 = icmp ne i32 %num, 0
      store i1 %tobool.5, i1* %ptr
      br i1 %tobool.5, label %for.body.lr.ph, label %for.end

  to

      store i1 %1, i1* %ptr
      %0 = call i32 @llvm.ctpop.i32(i32 %num)
      %1 = icmp ne i32 %0, 0
      br i1 %1, label %for.body.lr.ph, label %for.end

  Even if we inserted the ctpop so that it dominates the store here, that would still be incorrect. The store doesn’t want the result of ctpop.

  The fix is very simple, and involves replacing only the branch condition with the ctpop instead of all uses.

  Reviewed by Hal Finkel.

  llvm-svn: 242068
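For context, a loop that can be replaced by a ctpop call is typically the clear-lowest-set-bit loop. The commit does not spell out the loop shape, so the Python sketch below assumes it; `popcount_loop` and `popcount_direct` are hypothetical names, and values are masked to 32 bits to model `i32`.

```python
def popcount_loop(num):
    """Count set bits by repeatedly clearing the lowest set bit."""
    num &= 0xFFFFFFFF  # model an i32 value
    count = 0
    while num != 0:
        num &= num - 1  # clears the lowest set bit
        count += 1
    return count


def popcount_direct(num):
    """Reference result, standing in for a single ctpop operation."""
    return bin(num & 0xFFFFFFFF).count("1")
```

Both functions agree on every 32-bit input, which is what lets the trip count be computed in one step; only the loop-entry branch needs the derived condition, not every use of the original comparison.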
* Enable runtime unrolling with unroll pragma metadataMark Heffernan2015-07-131-2/+4
  Enable runtime unrolling for loops with unroll count metadata ("#pragma unroll N") and a runtime trip count. Also, do not unroll loops with unroll full metadata if the loop has a runtime loop count. Previously, such loops would be unrolled with a very large threshold (pragma-unroll-threshold) if runtime unrolling happened to be enabled, resulting in a very large (and likely unwise) unroll factor.

  llvm-svn: 242047
* Avoid using Loop::getSubLoopsVector.Benjamin Kramer2015-07-132-7/+7
| | | | | | | Passes should never modify it, just use the const version. While there reduce copying in LoopInterchange. No functional change intended. llvm-svn: 242041
* Remove unused variable.Rafael Espindola2015-07-131-1/+0
| | | | | | Sorry I missed it in the previous commit. llvm-svn: 242032
* Aliases don't have available_externally linkage.Rafael Espindola2015-07-131-11/+0
| | | | | | | Allowing that is probably a good idea, but currently we don't, so this is dead code. llvm-svn: 242031
* Don't change the visibility when converting a definition to a declaration.Rafael Espindola2015-07-132-3/+1
| | | | llvm-svn: 242030
* [InstSimplify] Teach InstSimplify how to simplify extractelementDavid Majnemer2015-07-131-58/+9
| | | | llvm-svn: 242008
* [InstSimplify] Teach InstSimplify how to simplify extractvalueDavid Majnemer2015-07-131-10/+3
| | | | llvm-svn: 242007
* [LICM] Don't try to sink values out of loops without any exitsDavid Majnemer2015-07-121-1/+12
| | | | | | | | | | | | | There is no suitable basic block to sink instructions in loops without exits. The only way an instruction in a loop without exits can be used is as an incoming value to a PHI. In such cases, the incoming block for the corresponding value is unreachable. This fixes PR24013. Differential Revision: http://reviews.llvm.org/D10903 llvm-svn: 241987
* Move getStrideFromPointer and friends from LoopVectorize to VectorUtilsHal Finkel2015-07-111-137/+0
  The following functions are moved from the LoopVectorizer to VectorUtils:

  - getGEPInductionOperand
  - stripGetElementPtr
  - getUniqueCastUse
  - getStrideFromPointer

  These used to be static functions in LoopVectorize, but will also be used by the upcoming loop versioning LICM transformation.

  Patch by Ashutosh Nema!

  llvm-svn: 241980
* [PM/AA] Completely remove the AliasAnalysis::copyValue interface.Chandler Carruth2015-07-115-18/+1
| | | | | | | | | | | | | | | | | | | | | No in-tree alias analysis used this facility, and it was not called in any particularly rigorous way, so it seems unlikely to be correct. Note that one of the only stateful AA implementations in-tree, GlobalsModRef is completely broken currently (and any AA passes like it are equally broken) because Module AA passes are not effectively invalidated when a function pass that fails to update the AA stack runs. Ultimately, it doesn't seem like we know how we want to build stateful AA, and until then trying to support and maintain correctness for an untested API is essentially impossible. To that end, I'm planning to rip out all of the update API. It can return if and when we need it and know how to build it on top of the new pass manager and as part of *tested* stateful AA implementations in the tree. Differential Revision: http://reviews.llvm.org/D10889 llvm-svn: 241975
* Renamed some uses of unroll to interleave in the vectorizer.Tyler Nowicki2015-07-111-91/+98
| | | | llvm-svn: 241971
* [InstCombine] Actually combine AA metadata when replacing one load with anotherBjorn Steinbrink2015-07-101-2/+0
| | | | | | Fixes PR24083 llvm-svn: 241955
* [TTI] BasicTTIImpl assumes no vector registersJingyue Wu2015-07-101-3/+8
  Summary: Following the discussion on r241884, it's more reasonable to assume that a target has no vector registers by default instead of letting every such target override getNumberOfRegisters. Therefore, this patch modifies BasicTTIImpl::getNumberOfRegisters to return 0 when Vector is true, and partially reverts r241884, which modifies NVPTXTTIImpl::getNumberOfRegisters.

  It also fixes a performance bug in LoopVectorizer. Even if a target has no vector registers, vectorization may still help ILP. So, we need both checks to be false before disabling loop vectorization altogether.

  Reviewers: hfinkel
  Subscribers: llvm-commits, jholewinski
  Differential Revision: http://reviews.llvm.org/D11108
  llvm-svn: 241942
* [LoopDist/LoopVer] Move LoopVersioning to a new module, NFCAdam Nemet2015-07-103-115/+108
| | | | | | | | | | | | | | | | | | | Summary: The class will obviously need improvement down the road. For one, there is no reason that addPHINodes would have to be exposed like that. I will make this and other improvements in follow-up patches. The main goal is to be able to share this functionality. The LoopLoadElimination pass I am working on needs it too. Later we can move other clients as well (LV and Ashutosh's LICMVer). Reviewers: hfinkel, ashutosh.nema Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10577 llvm-svn: 241932
* [LoopDist] Move loop-versioning helper functions to Cloning, NFCAdam Nemet2015-07-102-66/+70
| | | | | | | | | | | | | | Summary: This makes them available to the LoopVersioning class as that is moved to its own module in the next patch. Reviewers: ashutosh.nema, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10576 llvm-svn: 241931
* [InstSimplify] Fold away ord/uno fcmps when nnan is present.Benjamin Kramer2015-07-101-2/+2
| | | | | | | This is important to fold away the slow case of complex multiplies emitted by clang. llvm-svn: 241911
* Disable loop re-rotation for -Oz (patch by Andrey Turetsky)Alexey Bataev2015-07-101-2/+2
| | | | | | | After changes in rL231820 loop re-rotation is performed even in -Oz mode. Since loop rotation is disabled for -Oz, it seems loop re-rotation should be disabled too. Differential Revision: http://reviews.llvm.org/D10961 llvm-svn: 241897
* Revert the new EH instructionsDavid Majnemer2015-07-107-51/+14
| | | | | | This reverts commits r241888-r241891, I didn't mean to commit them. llvm-svn: 241893
* Address Reid's review feedback.David Majnemer2015-07-101-8/+12
| | | | llvm-svn: 241889
* New EH representation for MSVC compatibilityDavid Majnemer2015-07-107-14/+47
  Summary: This introduces new instructions necessary to implement MSVC-compatible exception handling support. Most of the middle-end and all of the back-end have not yet been audited or updated to take them into account.

  Reviewers: rnk, JosephTremoulet, reames, nlewycky, rjmccall
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D11041
  llvm-svn: 241888
* [InstCombine] Employ AliasAnalysis in FindAvailableLoadedValueBjorn Steinbrink2015-07-103-8/+18
| | | | llvm-svn: 241887
* [InstCombine] Properly combine metadata when replacing a load with anotherBjorn Steinbrink2015-07-101-1/+18
| | | | | | | | Not doing this can lead to misoptimizations down the line, e.g. because of range metadata on the replacing load excluding values that are valid for the load that is being replaced. llvm-svn: 241886
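To illustrate why the metadata must be combined: if one load survives in place of another, its range annotation has to admit every value that was valid for either load. The Python sketch below is a toy model under assumptions, not LLVM's metadata merging; `merge_ranges` is a hypothetical helper, ranges are half-open `(lo, hi)` pairs, and wrapping ranges are ignored.

```python
def merge_ranges(a, b):
    """Union of two lists of half-open [lo, hi) intervals.

    Overlapping or adjacent intervals are fused, so the result admits
    every value that either input admitted.
    """
    merged = []
    for lo, hi in sorted(a + b):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged
```

For example, a load annotated with `[(0, 2)]` that replaces one annotated with `[(1, 5)]` should end up with `[(0, 5)]`; keeping only `[(0, 2)]` would wrongly exclude values 2 through 4.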
* [IndVars] Try to use existing values in RewriteLoopExitValues.Sanjoy Das2015-07-091-2/+54
  Summary: In RewriteLoopExitValues, before expanding out an SCEV expression using SCEVExpander, try to see if an existing LLVM IR expression already computes the value we're interested in. If so, use that existing expression.

  Apart from reducing IndVars' reliance on the rest of the compilation pipeline, this also prevents IndVars from concluding some expressions as "high cost" when they're not. For instance, `InductiveRangeCheckElimination` often emits code of the following form:

  ```
  len = umin(len_A, len_B)
  loop:
    ...
    if (i++ < len)
      goto loop

  outside_loop:
    use(i)
  ```

  `SCEVExpander` refuses to rewrite the use of `i` in `outside_loop`, since it thinks the value of `i` on loop exit, `len`, is a high cost expansion since it contains an `umax` in it. With this change, `IndVars` can see that it can re-use `len` instead of creating a new expression to compute `umin(len_A, len_B)`.

  I considered putting this cleverness in `SCEVExpander`, but I was worried that it may then have a detrimental effect on other passes that use it. So I decided it was better to just do this in the one place where it seems like an obviously good idea, with the intent of generalizing later if needed.

  Reviewers: atrick, reames
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D10782
  llvm-svn: 241838
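The "reuse before expanding" idea can be sketched in Python as a lookup that runs ahead of the cost check. Everything here is a hypothetical toy model rather than the IndVars code: expressions are nested tuples, `existing` maps an expression to the name of a value that already computes it, and the set of high-cost operators is an assumption.

```python
def rewrite_exit_value(expr, existing, high_cost_ops=("umin", "umax")):
    """Decide how to materialize a loop exit value (toy model).

    expr:     an expression such as ("umin", "len_A", "len_B")
    existing: dict mapping expression -> name of a value already computing it
    Returns ("reuse", name), ("skip", None) or ("expand", expr).
    """
    # First look for a value that already computes this expression.
    if expr in existing:
        return ("reuse", existing[expr])
    # Otherwise fall back to the cost check: refuse expensive expansions.
    if isinstance(expr, tuple) and expr[0] in high_cost_ops:
        return ("skip", None)
    return ("expand", expr)
```

With `existing = {("umin", "len_A", "len_B"): "len"}`, the exit value is rewritten to reuse `len`; with an empty table the same expression is skipped as too costly.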
* [SLPVectorizer] Try different vectorization factors for store chainsSanjay Patel2015-07-081-7/+37
| | | | | | | | | | | | | | | | ...and set max vector register size based on target This patch is based on discussion on the llvmdev mailing list: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/087405.html and also solves: https://llvm.org/bugs/show_bug.cgi?id=17170 Several FIXME/TODO items are noted in comments as potential improvements. Differential Revision: http://reviews.llvm.org/D10950 llvm-svn: 241760
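Trying several vectorization factors for one store chain can be sketched as a loop over vector widths, from the widest register down. The Python model below is only an illustration under assumptions: `plan_store_chain` and its parameters are hypothetical, the real pass also consults a cost model, and here any group that fits is simply accepted.

```python
def plan_store_chain(num_stores, elem_bits, max_vec_bits, min_vec_bits):
    """Greedily group consecutive stores, widest vector width first (toy model).

    Returns a sorted list of (start_index, vectorization_factor) groups;
    stores not covered by any group stay scalar.
    """
    covered = [False] * num_stores
    groups = []
    width = max_vec_bits
    while width >= min_vec_bits:
        vf = width // elem_bits  # elements per vector at this width
        if vf >= 2:
            start = 0
            while start + vf <= num_stores:
                if not any(covered[start:start + vf]):
                    groups.append((start, vf))
                    for i in range(start, start + vf):
                        covered[i] = True
                    start += vf
                else:
                    start += 1
        width //= 2  # retry the leftovers with a narrower vector
    return sorted(groups)
```

For six 32-bit stores with widths from 256 down to 64 bits, the 256-bit attempt finds nothing, the 128-bit attempt groups the first four stores, and the 64-bit attempt picks up the remaining two.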
* [LoopVectorizer] Rename BypassBlock to VectorPH, and CheckBlock to ↵Michael Zolotukhin2015-07-081-46/+46
| | | | | | NewVectorPH. NFCI. llvm-svn: 241742
* [LoopVectorizer] Restructurize code for emitting RT checks. NFCI.Michael Zolotukhin2015-07-081-18/+22
| | | | | | | | | | Place all code corresponding to a run-time check in one place. Previously we generated some code, then proceeded to a next check, then finished the code for the first check (like splitting blocks and generating branches). Now the code for generating a check is self-contained. llvm-svn: 241741
* [LoopVectorizer] Remove redundant variables PastOverflowCheck and ↵Michael Zolotukhin2015-07-081-11/+2
| | | | | | OverflowCheckAnchor. NFCI. llvm-svn: 241740
* [LoopVectorizer] Move some code around to ease further refactoring. NFCI.Michael Zolotukhin2015-07-081-16/+13
| | | | llvm-svn: 241739
* [LoopVectorizer] Remove redundant variable LastBypassBlock. NFC.Michael Zolotukhin2015-07-081-14/+12
| | | | llvm-svn: 241738
* Rename llvm.frameescape and llvm.framerecover to localescape and localrecoverReid Kleckner2015-07-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Initially, these intrinsics seemed like part of a family of "frame" related intrinsics, but now I think that's more confusing than helpful. Initially, the LangRef specified that this would create a new kind of allocation that would be allocated at a fixed offset from the frame pointer (EBP/RBP). We ended up dropping that design, and leaving the stack frame layout alone. These intrinsics are really about sharing local stack allocations, not frame pointers. I intend to go further and add an `llvm.localaddress()` intrinsic that returns whatever register (EBP, ESI, ESP, RBX) is being used to address locals, which should not be confused with the frame pointer. Naming suggestions at this point are welcome, I'm happy to re-run sed. Reviewers: majnemer, nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11011 llvm-svn: 241633
* Revert "Revert r241570, it caused PR24053"David Majnemer2015-07-071-2/+1
  This reverts commit r241602. We had a latent bug in SCCP where we would make a basic block empty and then proceed to ask questions about its terminator.

  llvm-svn: 241616
* [llvm-extract] Drop comdats from declarationsReid Kleckner2015-07-061-2/+8
| | | | | | The verifier rejects comdats on declarations. llvm-svn: 241483
* Resubmit "Add new EliminateAvailableExternally module pass" (r239480)Teresa Johnson2015-07-063-0/+112
| | | | | | | | | | | | | | | | | | | | | | This change includes a fix for https://code.google.com/p/chromium/issues/detail?id=499508#c3, which required updating the visibility for symbols with eliminated definitions. --Original Commit Message-- Add new EliminateAvailableExternally module pass, which is performed in O2 compiles just before GlobalDCE, unless we are preparing for LTO. This pass eliminates available externally globals (turning them into declarations), regardless of whether they are dead/unreferenced, since we are guaranteed to have a copy available elsewhere at link time. This enables additional opportunities for GlobalDCE. If we are preparing for LTO (e.g. a -flto -c compile), the pass is not included as we want to preserve available externally functions for possible link time inlining. The FE indicates whether we are doing an -flto compile via the new PrepareForLTO flag on the PassManagerBuilder. llvm-svn: 241466
* remove unnecessary temp variable; NFCISanjay Patel2015-07-051-5/+4
| | | | llvm-svn: 241415