summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* A better attempt to add a missing includeRafael Espindola2015-12-141-1/+0
| | | | llvm-svn: 255578
* Trying to fix the build in a bot.Rafael Espindola2015-12-141-0/+1
| | | | llvm-svn: 255577
* LoopRotate: Convert the methods of LoopRotate to utility functions. NFCJustin Bogner2015-12-141-79/+82
| | | | | | | | | | This moves the actual work to do loop rotation into standalone functions with the analysis results they need passed in as arguments, leaving the class itself as a relatively simple shim. This will make the functions easy to reuse when we're ready to port this transformation to the new pass manager. llvm-svn: 255574
* LoopRotate: Reorder some method implementations. NFCJustin Bogner2015-12-141-159/+159
| | | | | | | | | | This just moves some callers after their callees. My next patch will convert some of these methods to stand alone functions, and that diff is more obviously NFC if I move these first. That change, in turn, will make it much easier to port this pass to the new pass manager once the loop pass manager is in place. llvm-svn: 255573
* Use diagnostic handler in the LLVMContextRafael Espindola2015-12-141-2/+2
| | | | | | | | | | | | | | | | This patch converts code that has access to a LLVMContext to not take a diagnostic handler. This has a few advantages * It is easier to use a consistent diagnostic handler in a single program. * Less clutter since we are not passing a handler around. It does make it a bit awkward to implement some C APIs that return a diagnostic string. I will propose new versions of these APIs and deprecate the current ones. llvm-svn: 255571
* Revert "Don't create unnecessary PHIs"Reid Kleckner2015-12-141-5/+0
| | | | | | | | | This reverts commit r255489. It causes test failures in Chromium and does not appear to respect the AlternativeV parameter. llvm-svn: 255562
* [MergeFunctions] Use II instead of CI for InvokeInst; NFCSanjoy Das2015-12-141-5/+5
| | | | | | Using `CI` is slightly misleading. llvm-svn: 255529
* Teach MergeFunctions about operand bundlesSanjoy Das2015-12-141-0/+31
| | | | llvm-svn: 255528
* [IR] Remove terminatepadDavid Majnemer2015-12-144-26/+3
| | | | | | | | | | | | | It turns out that terminatepad gives little benefit over a cleanuppad which calls the termination function. This is not sufficient to implement fully generic filters but MSVC doesn't support them which makes terminatepad a little over-designed. Depends on D15478. Differential Revision: http://reviews.llvm.org/D15479 llvm-svn: 255522
* getParent() ^ 3 == getModule() ; NFCISanjay Patel2015-12-1414-37/+25
| | | | llvm-svn: 255511
* [InstCombine] fold trunc ([lshr] (bitcast vector) ) --> extractelement (PR25543)Sanjay Patel2015-12-141-55/+47
| | | | | | | | | | | | | | | | | | | | | | | This is a fix for PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543 The idea is to take the existing fold of: bitcast ( trunc ( lshr ( bitcast X))) --> extractelement (bitcast X) ( http://reviews.llvm.org/rL112232 ) And break it into less specific transforms so we'll catch more cases such as the example in the bug report: bitcast ( trunc ( lshr ( bitcast X))) --> bitcast ( extractelement (bitcast X)) --> extractelement (bitcast X) Enabling patches for this change: http://reviews.llvm.org/rL255399 (combine bitcasts) http://reviews.llvm.org/rL255433 (canonicalize extractelement(bitcast X)) Differential Revision: http://reviews.llvm.org/D15392 llvm-svn: 255504
* [sanitizer] [msan] VarArgHelper for AArch64Adhemerval Zanella2015-12-141-0/+239
| | | | | | | | This patch add support for variadic argument for AArch64. All the MSAN unit tests are not passing as well the signal_stress_test (currently set as XFAIl for aarch64). llvm-svn: 255495
* Don't create unnecessary PHIsJames Molloy2015-12-141-0/+5
| | | | | | | | | | | | In conditional store merging, we were creating PHIs when we didn't need to. If the value to be predicated isn't defined in the block we're predicating, then it doesn't need a PHI at all (because we only deal with triangles and diamonds, any value not in the predicated BB must dominate the predicated BB). This fixes a large code size increase in some benchmarks in a popular embedded benchmark suite. llvm-svn: 255489
* Revert r255460, which still causes test failures on some platforms.Cong Hou2015-12-131-106/+31
| | | | | | Further investigation on the failures is ongoing. llvm-svn: 255463
* [LoopVectorizer] Refine loop vectorizer's register usage calculator by ↵Cong Hou2015-12-131-31/+106
| | | | | | | | | | | | | | | | | | | | ignoring specific instructions. (This is the second attempt to check in this patch: REQUIRES: asserts is added to reg-usage.ll now.) LoopVectorizationCostModel::calculateRegisterUsage() is used to estimate the register usage for specific VFs. However, it takes into account many instructions that won't be vectorized, such as induction variables, GetElementPtr instruction, etc.. This makes the loop vectorizer too conservative when choosing VF. In this patch, the induction variables that won't be vectorized plus GetElementPtr instruction will be added to ValuesToIgnore set so that their register usage won't be considered any more. Differential revision: http://reviews.llvm.org/D15177 llvm-svn: 255460
* Revert r255454 as it leads to several test failers on buildbots.Cong Hou2015-12-131-106/+31
| | | | llvm-svn: 255456
* [LoopVectorizer] Refine loop vectorizer's register usage calculator by ↵Cong Hou2015-12-131-31/+106
| | | | | | | | | | | | | | | | | ignoring specific instructions. LoopVectorizationCostModel::calculateRegisterUsage() is used to estimate the register usage for specific VFs. However, it takes into account many instructions that won't be vectorized, such as induction variables, GetElementPtr instruction, etc.. This makes the loop vectorizer too conservative when choosing VF. In this patch, the induction variables that won't be vectorized plus GetElementPtr instruction will be added to ValuesToIgnore set so that their register usage won't be considered any more. Differential revision: http://reviews.llvm.org/D15177 llvm-svn: 255454
* [PGO] Stop using invalid char in instr variable names.Xinliang David Li2015-12-121-2/+2
| | | | | | | | | | | | | Before the patch, -fprofile-instr-generate compile will fail if no integrated-as is specified when the file contains any static functions (the -S output is also invalid). This is the second try. The fix in this patch is very localized. Only profile symbol names of profile symbols with internal linkage are fixed up while initializer of name syms are not changes. This means there is no format change nor version bump. llvm-svn: 255434
* [InstCombine] canonicalize (bitcast (extractelement X)) --> ↵Sanjay Patel2015-12-121-28/+17
| | | | | | | | | | | | | | | | | | | | | (extractelement(bitcast X)) This change was discussed in D15392. It allows us to remove the fold that was added in: http://reviews.llvm.org/r255261 ...and it will allow us to generalize this fold: http://reviews.llvm.org/rL112232 while preserving the order of bitcast + extract that it produces and testing shows is better handled by the backend. Note that the existing check for "isVectorTy()" wasn't strong enough in general and specifically because: x86_mmx. It's not a vector, but it's not vectorizable either. So here we check VectorType::isValidElementType() directly before proceeding with the transform. llvm-svn: 255433
* [IR] Reformulate LLVM's EH funclet IRDavid Majnemer2015-12-129-79/+148
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While we have successfully implemented a funclet-oriented EH scheme on top of LLVM IR, our scheme has some notable deficiencies: - catchendpad and cleanupendpad are necessary in the current design but they are difficult to explain to others, even to seasoned LLVM experts. - catchendpad and cleanupendpad are optimization barriers. They cannot be split and force all potentially throwing call-sites to be invokes. This has a noticable effect on the quality of our code generation. - catchpad, while similar in some aspects to invoke, is fairly awkward. It is unsplittable, starts a funclet, and has control flow to other funclets. - The nesting relationship between funclets is currently a property of control flow edges. Because of this, we are forced to carefully analyze the flow graph to see if there might potentially exist illegal nesting among funclets. While we have logic to clone funclets when they are illegally nested, it would be nicer if we had a representation which forbade them upfront. Let's clean this up a bit by doing the following: - Instead, make catchpad more like cleanuppad and landingpad: no control flow, just a bunch of simple operands; catchpad would be splittable. - Introduce catchswitch, a control flow instruction designed to model the constraints of funclet oriented EH. - Make funclet scoping explicit by having funclet instructions consume the token produced by the funclet which contains them. - Remove catchendpad and cleanupendpad. Their presence can be inferred implicitly using coloring information. N.B. The state numbering code for the CLR has been updated but the veracity of it's output cannot be spoken for. An expert should take a look to make sure the results are reasonable. Reviewers: rnk, JosephTremoulet, andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D15139 llvm-svn: 255422
* SamplePGO - Reduce memory utilization by 10x.Diego Novillo2015-12-111-1/+1
| | | | | | | | | | | | | | | DenseMap is the wrong data structure to use for sample records and call sites. The keys are too large, causing massive core memory growth when reading profiles. Before this patch, a 21Mb input profile was causing the compiler to grow to 3Gb in memory. By switching to std::map, the compiler now grows to 300Mb in memory. There still are some opportunities for memory footprint reduction. I'll be looking at those next. llvm-svn: 255389
* Revert r255247, r255265, and r255286 due to serious compile-time regressions.Chad Rosier2015-12-111-242/+93
| | | | | | | | Revert "[DSE] Disable non-local DSE to see if the bots go green." Revert "[DeadStoreElimination] Use range-based loops. NFC." Revert "[DeadStoreElimination] Add support for non-local DSE." llvm-svn: 255354
* AlignmentFromAssumptions and SLPVectorizer preserves AA and GlobalsAAHal Finkel2015-12-112-0/+7
| | | | | | | | | | | | GlobalsAA's assumptions that passes do not escape globals not previously escaped is not violated by AlignmentFromAssumptions and SLPVectorizer. Marking them as such allows GlobalsAA to be preserved until GVN in the LTO pipeline. http://lists.llvm.org/pipermail/llvm-dev/2015-December/092972.html Patch by Vaivaswatha Nagaraj! llvm-svn: 255348
* PruneEH pass incorrectly reports that a change was madeArtur Pilipenko2015-12-111-12/+7
| | | | | | | | Reviewed By: reames Differential Revision: http://reviews.llvm.org/D14097 llvm-svn: 255343
* [Mem2Reg] Respect optnoneJames Molloy2015-12-111-0/+3
| | | | | | | | Mem2Reg shouldn't be optimizing a function that is marked optnone. There is a test checking this that fails when mem2reg is explicitly added to the standard pass pipeline. llvm-svn: 255336
* [InstCombine] Make MatchBSwap also match bit reversalsJames Molloy2015-12-112-103/+136
| | | | | | MatchBSwap has most of the functionality to match bit reversals already. If we switch it from looking at bytes to individual bits and remove a few early exits, we can extend the main recursive function to match any sequence of ORs, ANDs and shifts that assemble a value from different parts of another, base value. Once we have this bit->bit mapping, we can very simply detect if it is appropriate for a bswap or bitreverse. llvm-svn: 255334
* [DSE] Disable non-local DSE to see if the bots go green.Chad Rosier2015-12-101-1/+1
| | | | | | I see a few bots timing out, so I'm speculatively disabling r255247. llvm-svn: 255286
* [DeadStoreElimination] Use range-based loops. NFC.Chad Rosier2015-12-101-9/+6
| | | | llvm-svn: 255265
* [InstCombine] fold bitcasts around an extractelement (3rd try)Sanjay Patel2015-12-101-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a redo of r255137 (reverted at r255227) which was a redo of r255124 (reverted at r255126) with a fixed check for a scalar source type and an added test for the failure that caused the revert. Original commit message: Example: bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float ---> extractelement <2 x float> %X, i32 1 This is part of fixing PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543 The next step will be to generalize this fold: trunc ( lshr ( bitcast X) ) -> extractelement (X) Ie, I'm hoping to replace the existing transform of: bitcast ( trunc ( lshr ( bitcast X))) added by: http://reviews.llvm.org/rL112232 with 2 less specific transforms to catch the case in the bug report. Differential Revision: http://reviews.llvm.org/D14879 llvm-svn: 255261
* [ThinLTO] Debug message cleanup (NFC)Teresa Johnson2015-12-101-8/+8
| | | | | | | | Added some missing spaces between the module identifier and the start of the debug message. Also added a ":" after the module identifier to make this look a little nicer. llvm-svn: 255259
* [DeadStoreElimination] Add support for non-local DSE.Chad Rosier2015-12-101-90/+242
| | | | | | | | | | | | We extend the search for redundant stores to predecessor blocks that unconditionally lead to the block BB with the current store instruction. That also includes single-block loops that unconditionally lead to BB, and if-then-else blocks where then- and else-blocks unconditionally lead to BB. http://reviews.llvm.org/D13363 Patch by Ivan Baev <ibaev@codeaurora.org>! llvm-svn: 255247
* [LLE] Use the PredicatedScalarEvolution interface to query SCEVs for dependencesSilviu Baranga2015-12-101-16/+15
| | | | | | | | | | | | | | | | | Summary: LAA uses the PredicatedScalarEvolution interface, so it can produce forward/backward dependences having SCEVs that are AddRecExprs only after being transformed by PredicatedScalarEvolution. Use PredicatedScalarEvolution to get the expected expressions. Reviewers: anemet Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D15382 llvm-svn: 255241
* Revert r255137.Akira Hatanaka2015-12-101-39/+0
| | | | | | This commit broke apple's internal bot. llvm-svn: 255227
* Add arg_begin() and arg_end() to CallInst and InvokeInst; NFCISanjoy Das2015-12-103-7/+4
| | | | | | | | | | - This simplifies the CallSite class, arg_begin / arg_end are now simple wrapper getters. - In several places, we were creating CallSite instances solely to call arg_begin and arg_end. With this change, that's no longer required. llvm-svn: 255226
* [Float2Int] Don't operate on vector instructionsReid Kleckner2015-12-091-0/+2
| | | | | | | This fixes a crash bug. It's also not clear if we'd want to do this transform for vectors. llvm-svn: 255155
* Don't assign a temporary string to a StringRef.Rafael Espindola2015-12-091-1/+1
| | | | | | Should fix the windows debug and asan bots. llvm-svn: 255149
* Use WeakVH to keep track of calls with operand bundles in CloneCodeInfoSanjoy Das2015-12-091-1/+3
| | | | | | | | `CloneAndPruneIntoFromInst` can DCE instructions after cloning them into the new function, and so an AssertingVH is too strong. This change switches CloneCodeInfo to use a std::vector<WeakVH>. llvm-svn: 255148
* Delete trailing whitespace; NFCSanjoy Das2015-12-091-1/+1
| | | | llvm-svn: 255147
* [ThinLTO] FunctionImport pass can take a const index pointer (NFC)Teresa Johnson2015-12-091-3/+3
| | | | llvm-svn: 255140
* [InstCombine] fold bitcasts around an extractelement (2nd try)Sanjay Patel2015-12-091-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a redo of r255124 (reverted at r255126) with an added check for a scalar destination type and an added test for the failure seen in Clang's test/CodeGen/vector.c. The extra test shows a different missing optimization. Original commit message: Example: bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float ---> extractelement <2 x float> %X, i32 1 This is part of fixing PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543 The next step will be to generalize this fold: trunc ( lshr ( bitcast X) ) -> extractelement (X) Ie, I'm hoping to replace the existing transform of: bitcast ( trunc ( lshr ( bitcast X))) added by: http://reviews.llvm.org/rL112232 with 2 less specific transforms to catch the case in the bug report. Differential Revision: http://reviews.llvm.org/D14879 llvm-svn: 255137
* Revert "Revert r253253 and r253126: "Don't recompute LCSSA after ↵Michael Zolotukhin2015-12-091-2/+12
| | | | | | | | | | | loop-unrolling when possible."" The bug in IndVarSimplify was fixed in r254976, r254977, so I'm reapplying the original patch for avoiding redundant LCSSA recomputation. This reverts commit ffe3b434e505e403146aff00be0c177bb6d13466. llvm-svn: 255133
* [PGO] Resubmit "MST based PGO instrumentation infrastructure" (r254021)Rong Xu2015-12-095-1/+939
| | | | | | | | | | | | | | | This new patch fixes a few bugs that exposed in last submit. It also improves the test cases. --Original Commit Message-- This patch implements a minimum spanning tree (MST) based instrumentation for PGO. The use of MST guarantees minimum number of CFG edges getting instrumented. An addition optimization is to instrument the less executed edges to further reduce the instrumentation overhead. The patch contains both the instrumentation and the use of the profile to set the branch weights. Differential Revision: http://reviews.llvm.org/D12781 llvm-svn: 255132
* Revert "[InstCombine] fold bitcasts around an extractelement"Mehdi Amini2015-12-091-37/+0
| | | | | | | | | This reverts commit r255124. Broke http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/4193/steps/test/logs/stdio From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255126
* [InstCombine] fold bitcasts around an extractelementSanjay Patel2015-12-091-0/+37
| | | | | | | | | | | | | | | | | | | | | | | | Example: bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float ---> extractelement <2 x float> %X, i32 1 This is part of fixing PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543 The next step will be to generalize this fold: trunc ( lshr ( bitcast X) ) -> extractelement (X) Ie, I'm hoping to replace the existing transform of: bitcast ( trunc ( lshr ( bitcast X))) added by: http://reviews.llvm.org/rL112232 with 2 less specific transforms to catch the case in the bug report. Differential Revision: http://reviews.llvm.org/D14879 llvm-svn: 255124
* Re-commit r255115, with the PredicatedScalarEvolution class moved toSilviu Baranga2015-12-094-92/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ScalarEvolution.h, in order to avoid cyclic dependencies between the Transform and Analysis modules: [LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV expressions Summary: This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is that both LAA and LV should use this interface everywhere. This also solves a problem involving the result of SCEV expression rewritting when the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates P1: {a,+,b} has nsw P2: b = 1. Applying P1 and then P2 gives us {a,+,1}, while applying P2 and the P1 gives us sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies). The SCEVPredicatedLayer maintains the order of transformations by feeding back the results of previous transformations into new transformations, and therefore avoiding this issue. The SCEVPredicatedLayer maintains a cache to remember the results of previous SCEV rewritting results. This also has the benefit of reducing the overall number of expression rewrites. Reviewers: mzolotukhin, anemet Subscribers: jmolloy, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D14296 llvm-svn: 255122
* Revert r255115 until we figure out how to fix the bot failures.Silviu Baranga2015-12-095-131/+92
| | | | llvm-svn: 255117
* [LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV ↵Silviu Baranga2015-12-095-92/+131
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | expressions Summary: This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is that both LAA and LV should use this interface everywhere. This also solves a problem involving the result of SCEV expression rewritting when the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates P1: {a,+,b} has nsw P2: b = 1. Applying P1 and then P2 gives us {a,+,1}, while applying P2 and the P1 gives us sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies). The SCEVPredicatedLayer maintains the order of transformations by feeding back the results of previous transformations into new transformations, and therefore avoiding this issue. The SCEVPredicatedLayer maintains a cache to remember the results of previous SCEV rewritting results. This also has the benefit of reducing the overall number of expression rewrites. Reviewers: mzolotukhin, anemet Subscribers: jmolloy, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D14296 llvm-svn: 255115
* EarlyCSE: fix typo from rL255054.JF Bastien2015-12-091-1/+1
| | | | llvm-svn: 255102
* The current importing scheme is processing one function at a time,Mehdi Amini2015-12-091-54/+144
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | loading the source Module, linking the function in the destination module, and destroying the source Module before repeating with the next function to import (potentially from the same Module). Ideally we would keep the source Module alive and import the next Function needed from this Module. Unfortunately this is not possible because the linker does not leave it in a usable state. However we can do better by first computing the list of all candidates per Module, and only then load the source Module and import all the function we need for it. The trick to process callees is to materialize function in the source module when building the list of function to import, and inspect them in their source module, collecting the list of callees for each callee. When we move the the actual import, we will import from each source module exactly once. Each source module is loaded exactly once. The only drawback it that it requires to have all the lazy-loaded source Module in memory at the same time. Currently this patch already improves considerably the link time, a multithreaded link of llvm-dis on my laptop was: real 1m12.175s user 6m32.430s sys 0m10.529s and is now: real 0m40.697s user 2m10.237s sys 0m4.375s Note: this is the full link time (linker+Import+Optimizer+CodeGen) Differential Revision: http://reviews.llvm.org/D15178 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255100
* Test commit access - Fix few missing '.' in comments of LoopInterchange code.Vikram TV2015-12-091-4/+4
| | | | llvm-svn: 255095
OpenPOWER on IntegriCloud