summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
...
* [EarlyCSE] Change key type of AvailableCalls to Instruction*. NFCI.Geoff Berry2016-05-131-3/+4
| | | | llvm-svn: 269445
* [InstCombine] handle zero constant vectors for LE/GE comparisons tooSanjay Patel2016-05-131-2/+3
| | | | | | | | | | | | Enhancement to: http://reviews.llvm.org/rL269426 With discussion in: http://reviews.llvm.org/D17859 This should complete the fixes for: PR26701, PR26819: https://llvm.org/bugs/show_bug.cgi?id=26701 https://llvm.org/bugs/show_bug.cgi?id=26819 llvm-svn: 269439
* [PGO] Add flags to control IRPGO warnings.Rong Xu2016-05-131-3/+18
| | | | | | | | | | | | | Currently there is no reasonable way to control the warnings in the 'use' phase of the IRPGO pass. This is problematic because the output can be somewhat spammy. This patch adds some flags which allow us to optionally disable these warnings. The current upstream behavior will remain the default. Patch by Jake VanAdrighem (jvanadrighem@gmail.com) Differential Revision: http://reviews.llvm.org/D20195 llvm-svn: 269437
* [MemCpyOpt] Use MaxIntSize in byte instead of bitJun Bum Lim2016-05-131-1/+1
| | | | | | | | | | | | Summary: This change fix the bug in isProfitableToUseMemset() where MaxIntSize shoule be in byte, not bit. Reviewers: arsenm, joker.eph, mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20176 llvm-svn: 269433
* [InstCombine] canonicalize* LE/GE vector integer comparisons to LT/GT ↵Sanjay Patel2016-05-131-2/+55
| | | | | | | | | | | | | | | (PR26701, PR26819) *We don't currently handle the edge case constants (min/max values), so it's not a complete canonicalization. To fully solve the motivating bugs, we need to enhance this to recognize a zero vector too because that's a ConstantAggregateZero which is a ConstantData, not a ConstantVector or a ConstantDataVector. Differential Revision: http://reviews.llvm.org/D17859 llvm-svn: 269426
* Revert "[Unroll] Implement a conservative and monotonically increasing cost ↵Michael Zolotukhin2016-05-131-178/+13
| | | | | | | | | | | tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..." This reverts commit r269388. It caused some bots to fail, I'm reverting it until I investigate the issue. llvm-svn: 269395
* [LoopDist] Only run LAA for loops with the pragmaAdam Nemet2016-05-131-17/+17
| | | | | | | This should fix some compile-time regressions after r267672. Thanks to Chris Matthews for bisecting it. llvm-svn: 269392
* [Unroll] Implement a conservative and monotonically increasing cost tracking ↵Michael Zolotukhin2016-05-131-13/+178
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the... Summary: ...loop after the last iteration. This is really hard to do correctly. The core problem is that we need to model liveness through the induction PHIs from iteration to iteration in order to get the correct results, and we need to correctly de-duplicate the common subgraphs of instructions feeding some subset of the induction PHIs. All of this can be driven either from a side effect at some iteration or from the loop values used after the loop finishes. This patch implements this by storing the forward-propagating analysis of each instruction in a cache to recall whether it was free and whether it has become live and thus counted toward the total unroll cost. Then, at each sink for a value in the loop, we recursively walk back through every value that feeds the sink, including looping back through the iterations as needed, until we have marked the entire input graph as live. Because we cache this, we never visit instructions more than twice -- once when we analyze them and put them into the cache, and once when we count their cost towards the unrolled loop. Also, because the cache is only two bits and because we are dealing with relatively small iteration counts, we can store all of this very densely in memory to avoid this from becoming an excessively slow analysis. The code here is still pretty gross. I would appreciate suggestions about better ways to factor or split this up, I've stared too long at the algorithmic side to really have a good sense of what the design should probably look at. Also, it might seem like we should do all of this bottom-up, but I think that is a red herring. Specifically, the simplification power is *much* greater working top-down. We can forward propagate very effectively, even across strange and interesting recurrances around the backedge. Because we use data to propagate, this doesn't cause a state space explosion. Doing this level of constant folding, etc, would be very expensive to do bottom-up because it wouldn't be until the last moment that you could collapse everything. The current solution is essentially a top-down simplification with a bottom-up cost accounting which seems to get the best of both worlds. It makes the simplification incremental and powerful while leaving everything dead until we *know* it is needed. Finally, a core property of this approach is its *monotonicity*. At all times, the current UnrolledCost is a conservatively low estimate. This ensures that we will never early-exit from the analysis due to exceeding a threshold when if we had continued, the cost would have gone back below the threshold. These kinds of bugs can cause incredibly hard to track down random changes to behavior. We could use a techinque similar (but much simpler) within the inliner as well to avoid considering speculated code in the inline cost. Reviewers: chandlerc Subscribers: sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D11758 llvm-svn: 269388
* [PM] Port of the DepndenceAnalysis to the new PM.Chandler Carruth2016-05-123-10/+10
| | | | | | | | | | | | | Ported DA to the new PM by splitting the former DependenceAnalysis Pass into a DependenceInfo result type and DependenceAnalysisWrapperPass type and adding a new PM-style DependenceAnalysis analysis pass returning the DependenceInfo. Patch by Philip Pfaffe, most of the review by Justin. Differential Revision: http://reviews.llvm.org/D18834 llvm-svn: 269370
* Tidied up switch cases. NFCI.Simon Pilgrim2016-05-121-52/+48
| | | | | | Split FCMP//ICMP/SEL from the basic arithmetic cost functions. They were not sharing any notable code path (just the return) and were repeatedly testing the opcode. llvm-svn: 269348
* [PM] Make LowerAtomic a FunctionPass.Davide Italiano2016-05-121-5/+16
| | | | | | Differential Revision: http://reviews.llvm.org/D20025 llvm-svn: 269322
* [LoopVectorizer] LoopVectorBody doesn't need to be a vector. NFC.Michael Kuperstein2016-05-121-40/+22
| | | | | | | | | | | LoopVectorBody was changed from a single pointer to a SmallVector when store predication was introduced in r200270. Since r247139, store predication no longer splits the vector loop body in-place, so we can go back to having a single LoopVectorBody block. This reverts the no-longer-needed changes from r200270. llvm-svn: 269321
* [SCCP] Resolve shifts beyond the bitwidth to undefDavid Majnemer2016-05-121-0/+16
| | | | | | | | | Shifts beyond the bitwidth are undef but SCCP resolved them to zero. Instead, DTRT and resolve them to undef. This reimplements the transform which caused PR27712. llvm-svn: 269269
* All llvm.deoptimize declarations must use the same calling conventionSanjoy Das2016-05-121-1/+7
| | | | | | | | | | | | | | | | | This new verifier rule lets us unambigously pick a calling convention when creating a new declaration for `@llvm.experimental.deoptimize.<ty>`. It is also congruent with our lowering strategy -- since all calls to `@llvm.experimental.deoptimize` are lowered to calls to `__llvm_deoptimize`, it is reasonable to enforce a unique calling convention. Some of the tests that were breaking this verifier rule have had to be split up into different .ll files. The inliner was violating this rule as well, and has been fixed to avoid producing invalid IR. llvm-svn: 269261
* Revert "[SCCP] Partially propagate informations when the input is not fully ↵Davide Italiano2016-05-111-3/+0
| | | | | | | | defined." This reverts commit r269105 as it caused PR27712. llvm-svn: 269252
* [ThinLTO] Don't re-analyze callee at same threshold unnecessarilyTeresa Johnson2016-05-111-1/+1
| | | | | | | | | This should just be a compile-time change. Correct the check for whether we have already analyzed the callee when making summary based decisions. There is no need to reprocess one at the same threshold as when it was last processed. llvm-svn: 269251
* Return a StringRef from getSection.Rafael Espindola2016-05-111-1/+1
| | | | | | This is similar to how getName is handled. llvm-svn: 269218
* Delete mayBeOverridden.Rafael Espindola2016-05-111-1/+1
| | | | | | It is the same as isInterposable which seems to be the preferred name. llvm-svn: 269150
* [PGO] Use WeakAny linkage for __llvm_profile_raw_versionRong Xu2016-05-111-1/+1
| | | | | | | Use WeakAny linkage instead of LinkOnceAny, as the symbol can be removed with LinkOnceAny in O2 (not referenced). llvm-svn: 269146
* Propagate branch metadata when some branch probability is missing.Dehao Chen2016-05-101-5/+13
| | | | | | | | | | | | Summary: In sample profile, some branches may have profile missing due to profile inaccuracy. We want existing branch probability still valid after propagation. Reviewers: hfinkel, davidxl, spatel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19948 llvm-svn: 269137
* [PM]: port IR based profUse pass to new pass managerXinliang David Li2016-05-101-10/+32
| | | | llvm-svn: 269129
* Revert "MemCpyOpt: combine local load/store sequences into memcpy."Tim Northover2016-05-101-222/+48
| | | | | | | This reverts commit r269125. It was in my tree when I ran "git svn dcommit". It's really still under review. llvm-svn: 269127
* MemCpyOpt: combine local load/store sequences into memcpy.Tim Northover2016-05-101-48/+222
| | | | | | | | Sort of the BB-local equivalent to idiom-recognizer: if we have a basic-block that really implements a memcpy operation, backends can benefit from seeing this. llvm-svn: 269125
* Loop unroller: set thresholds for optsize and minsize functions to zeroHans Wennborg2016-05-101-2/+2
| | | | | | | | | | | | | | | Before r268509, Clang would disable the loop unroll pass when optimizing for size. That commit enabled it to be able to support unroll pragmas in -Os builds. However, this regressed binary size in one of Chromium's DLLs with ~100 KB. This restores the original behaviour of no unrolling at -Os, but doing it in LLVM instead of Clang makes more sense, and also allows the pragmas to keep working. Differential revision: http://reviews.llvm.org/D20115 llvm-svn: 269124
* Enable loopreroll for sext of loop control only IVLawrence Hu2016-05-101-12/+33
| | | | | | | | | This patch extend loopreroll to allow the instruction chain of loop control only IV has sext. Differential Revision: http://reviews.llvm.org/D19820 llvm-svn: 269121
* Revert r26084: Enable loopreroll for sext of loop control only IVLawrence Hu2016-05-101-33/+12
| | | | llvm-svn: 269119
* Cloning: Clean up the interface to the CloneFunction function.Peter Collingbourne2016-05-102-31/+13
| | | | | | | | | | | | | | | | | | | | | Remove the ModuleLevelChanges argument, and the ability to create new subprograms for cloned functions. The latter was added without review in r203662, but it has no in-tree clients (all non-test callers pass false for ModuleLevelChanges [1], so it isn't reachable outside of tests). It also isn't clear that adding a duplicate subprogram to the compile unit is always the right thing to do when cloning a function within a module. If this functionality comes back it should be accompanied with a more concrete use case. Furthermore, all in-tree clients add the returned function to the module. Since that's pretty much the only sensible thing you can do with the function, just do that in CloneFunction. [1] http://llvm-cs.pcc.me.uk/lib/Transforms/Utils/CloneFunction.cpp/rCloneFunction Differential Revision: http://reviews.llvm.org/D18628 llvm-svn: 269110
* [InstCombine] Fold icmp ugt/ult (udiv i32 C2, X), C1.Chad Rosier2016-05-101-3/+21
| | | | | | | | | | This patch adds support for two optimizations: icmp ugt (udiv C2, X), C1 -> icmp ule X, C2/(C1+1) icmp ult (udiv C2, X), C1 -> icmp ugt X, C2/C1 Differential Revision: http://reviews.llvm.org/D20123 llvm-svn: 269109
* [SCCP] Partially propagate informations when the input is not fully defined.Davide Italiano2016-05-101-0/+3
| | | | | | | | | | With this patch: %r1 = lshr i64 -1, 4294967296 -> undef Before this patch: %r1 = lshr i64 -1, 4294967296 -> 0 llvm-svn: 269105
* Re-apply r269081 and r269082 with a fix for MSVC.Peter Collingbourne2016-05-101-51/+13
| | | | llvm-svn: 269094
* Revert r269081 and r269082 while I try to find the right incantation to fix ↵Peter Collingbourne2016-05-101-12/+51
| | | | | | MSVC build. llvm-svn: 269091
* [PGO] resubmit r268969Rong Xu2016-05-101-1/+1
| | | | | | Put the test into a target specific directory. llvm-svn: 269090
* Enable loopreroll for sext of loop control only IVLawrence Hu2016-05-101-12/+33
| | | | | | | This patch extend loopreroll to allow the instruction chain of loop control only IV has sext. llvm-svn: 269084
* WholeProgramDevirt: Move logic for finding devirtualizable call sites to ↵Peter Collingbourne2016-05-101-51/+12
| | | | | | | | | | | | | Analysis. The plan is to eventually make this logic simpler, however I expect it to be a little tricky for the foreseeable future (at least until we're rid of pointee types), so move it here so that it can be reused to build a summary index for devirtualization. Differential Revision: http://reviews.llvm.org/D20005 llvm-svn: 269081
* [ThinLTO] Add option to emit imports files for distributed backendsTeresa Johnson2016-05-101-0/+15
| | | | | | | | | | | | | | | | | | | | | Summary: Add support for emission of plaintext lists of the imported files for each distributed backend compilation. Used for distributed build file staging. Invoked with new gold-plugin thinlto-emit-imports-files option, which is only valid with thinlto-index-only (i.e. for distributed builds), or from llvm-lto with new -thinlto-action=emitimports value. Depends on D19556. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19636 llvm-svn: 269067
* Restore "[ThinLTO] Emit individual index files for distributed backends"Teresa Johnson2016-05-101-0/+27
| | | | | | | | | | | | | | | | | | | This restores commit r268627: Summary: When launching ThinLTO backends in a distributed build (currently supported in gold via the thinlto-index-only plugin option), emit an individual index file for each backend process as described here: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098272.html ... Differential Revision: http://reviews.llvm.org/D19556 Address msan failures by avoiding std::prev on map.end(), the theory is that this is causing issues due to some known UB problems in __tree. llvm-svn: 269059
* Update Debug Intrinsics in RewriteUsesOfClonedInstructions in LoopRotationChuang-Yu Cheng2016-05-101-0/+34
| | | | | | | | | | | | | | | | | | | | | | Loop rotation clones instruction from the old header into the preheader. If there were uses of values produced by these instructions that were outside the loop, we have to insert PHI nodes to merge the two values. If the values are used by DbgIntrinsics they will be used as a MetadataAsValue of a ValueAsMetadata of the original values, and iterating all of the uses of the original value will not update the DbgIntrinsics. The new code checks if the values are used by DbgIntrinsics and if so, updates them using essentially the same logic as the original code. The attached testcase demonstrates the issue. Without the fix, the DbgIntrinic outside the loop uses values computed inside the loop, even though these values do not dominate the DbgIntrinsic. Author: Thomas Jablin (tjablin) Reviewers: dblaikie aprantl kbarton hfinkel cycheng http://reviews.llvm.org/D19564 llvm-svn: 269034
* [InstCombine] Remove trivially empty va_start/va_end and va_copy/va_end ranges.Arnaud A. de Grandmaison2016-05-102-22/+59
| | | | | | | | | | | | When a va_start or va_copy is immediately followed by a va_end (ignoring debug information or other start/end in between), then it is safe to remove the pair. As this code shares some commonalities with the lifetime markers, this has been factored to helper functions. This InstCombine pattern kicks-in 3 times when running the LLVM test suite. llvm-svn: 269033
* Revert "[PGO] Fix __llvm_profile_raw_version linkage in MACHO IR ↵Renato Golin2016-05-101-1/+1
| | | | | | | | | | | | | instrumentation generates a COMDAT symbol __llvm_profile_raw_version to overwrite the same symbol in profile run-time to distinguish IR profiles from Clang generated profiles. In MACHO, LinkOnceODR linkage is used due to the lack of COMDAT support." This reverts commits r268969, r268979 and r268984. They had target specific test in generic directories without the correct specifiers and made it hard for us to come up with a good solution by rapidly committing untested changes. This test needs to be in a target specific directory or have the correct REQUIRED identifier. llvm-svn: 269027
* [LoopVectorize] Handling induction variable with non-constant step.Elena Demikhovsky2016-05-102-49/+119
| | | | | | | | | | | | | | | | | | | | | | | Allow vectorization when the step is a loop-invariant variable. This is the loop example that is getting vectorized after the patch: int int_inc; int bar(int init, int *restrict A, int N) { int x = init; for (int i=0;i<N;i++){ A[i] = x; x += int_inc; } return x; } "x" is an induction variable with *loop-invariant* step. But it is not a primary induction. Primary induction variable with non-constant step is not handled yet. Differential Revision: http://reviews.llvm.org/D19258 llvm-svn: 269023
* [LAA] Rename "isStridedPtr" with "getPtrStride". NFC.Denis Zobnin2016-05-102-3/+3
| | | | | | | Changing misleading function name was approved in http://reviews.llvm.org/D17268. Patch by Roman Shirokiy. llvm-svn: 269021
* Minor formatting fixes in LoopUnroll.cpp.Justin Lebar2016-05-101-4/+2
| | | | llvm-svn: 268995
* [IndirectCallPromotion] Remove duplicate comment. NFCAdam Nemet2016-05-091-1/+0
| | | | llvm-svn: 268986
* Typo. NFC.Chad Rosier2016-05-091-1/+1
| | | | llvm-svn: 268975
* Cleanup followup of r268710 - [PM] port IR based PGO prof-gen pass to new ↵Xinliang David Li2016-05-091-26/+25
| | | | | | pass manager llvm-svn: 268974
* [PGO] Fix __llvm_profile_raw_version linkage in MACHORong Xu2016-05-091-1/+1
| | | | | | | | | | | | | | | | | | | IR instrumentation generates a COMDAT symbol __llvm_profile_raw_version to overwrite the same symbol in profile run-time to distinguish IR profiles from Clang generated profiles. In MACHO, LinkOnceODR linkage is used due to the lack of COMDAT support. But LinkOnceODR linkage might have .weak_def_can_be_hidden assembly directive, while the weak variable in run-time has a .weak_definition directive. Linker will not merge these two symbols even they have the same name. The end result is IR profiles are not properly flagged in MACHO. This patch changes the linkage for __llvm_profile_raw_version in each module to LinkOnceAny so that it has same .weak_definition directive as in the run-time. Differential Revision: http://reviews.llvm.org/D20078 llvm-svn: 268969
* [MSan] [AArch64] Fix vararg helper for >1 or non-int fixed arguments.Marcin Koscielnicki2016-05-091-3/+13
| | | | | | | | | | | | | | | | | | This fixes http://llvm.org/PR27646 on AArch64. There are three issues here: - The GR save area is 7 words in size, instead of 8. This is not enough if none of the fixed arguments is passed in GRs (they're all floats or aggregates). - The first argument is ignored (which counteracts the above if it's passed in GR). - Like x86_64, fixed arguments landing in the overflow area are wrongly counted towards the overflow offset. Differential Revision: http://reviews.llvm.org/D20023 llvm-svn: 268967
* [InstCombine] Fold icmp eq/ne (udiv i32 A, B), 0 -> icmp ugt/ule B, A.Chad Rosier2016-05-091-0/+12
| | | | | | Differential Revision: http://reviews.llvm.org/D20036 llvm-svn: 268960
* Optimize a printf with a double procent to putchar.Joerg Sonnenberger2016-05-091-2/+2
| | | | llvm-svn: 268922
* Minor code cleanups. NFC.Junmo Park2016-05-081-3/+3
| | | | llvm-svn: 268888
OpenPOWER on IntegriCloud