path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* [GuardWidening] Remove WidenFrequentBranches transform | Philip Reames | 2019-11-19 | 1 | -68/+6
| | | | This code has never been enabled. While it is tested, it's complicating some refactoring. If we decide to re-implement this, doing it in SimplifyCFG would probably make more sense anyways.
* [NFC] Factor out utilities for manipulating widenable branches | Philip Reames | 2019-11-19 | 3 | -13/+30
| With the widenable condition construct, we have the ability to reason about branches which can be 'widened' (i.e. made to fail more often). We've got a couple of transforms which leverage this. This patch just cleans up the API a bit. This is prep work for generalizing our definition of a widenable branch slightly. At the moment "br i1 (and A, wc()), ..." is considered widenable, but oddly, neither "br i1 (and wc(), B), ..." nor "br i1 wc(), ..." is. That clearly needs to be addressed, so first, let's centralize the code in one place.
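The three branch-condition shapes named in this message can be illustrated with a small standalone sketch. It uses a toy expression tree, not LLVM's IR classes; `Expr`, `makeWC`, `makeAnd`, `matchesOnlyAndRhs`, and `isWidenableCond` are hypothetical names introduced only for illustration.

```cpp
#include <memory>

// Toy expression node; illustrative only, not an LLVM class.
struct Expr {
  enum Kind { WidenableCondition, And, Other };
  Kind kind;
  std::shared_ptr<Expr> lhs; // operands, set only when kind == And
  std::shared_ptr<Expr> rhs;
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr makeWC() {
  return std::make_shared<Expr>(Expr{Expr::WidenableCondition, nullptr, nullptr});
}
ExprPtr makeOther() {
  return std::make_shared<Expr>(Expr{Expr::Other, nullptr, nullptr});
}
ExprPtr makeAnd(ExprPtr a, ExprPtr b) {
  return std::make_shared<Expr>(Expr{Expr::And, a, b});
}

// Restrictive shape described in the message: only "and A, wc()".
bool matchesOnlyAndRhs(const Expr &cond) {
  return cond.kind == Expr::And &&
         cond.rhs->kind == Expr::WidenableCondition;
}

// Generalized shape (an assumption about where the refactoring is heading):
// a bare wc(), or an 'and' with wc() on either side.
bool isWidenableCond(const Expr &cond) {
  if (cond.kind == Expr::WidenableCondition)
    return true;
  if (cond.kind != Expr::And)
    return false;
  return cond.lhs->kind == Expr::WidenableCondition ||
         cond.rhs->kind == Expr::WidenableCondition;
}
```

With a single predicate in one place, loosening the definition becomes a one-function change, which is the motivation the message gives for centralizing the code first.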
* [LoopPred] Generalize profitability check to handle unswitch output | Philip Reames | 2019-11-19 | 1 | -1/+12
| | | | Unswitch (and other loop transforms) like to generate loop exit blocks with unconditional successors, and phi nodes (LCSSA, or simple multiple exiting blocks sharing an exit). Generalize the "likely very rare exit" check slightly to handle this form.
* llvm/ObjCARC: Eliminate inlined AutoreleaseRV calls | Duncan P. N. Exon Smith | 2019-11-19 | 1 | -71/+145
|
| Pair up inlined AutoreleaseRV calls with their matching RetainRV or ClaimRV.
|
| - RetainRV cancels out AutoreleaseRV. Delete both instructions.
| - ClaimRV is a peephole for RetainRV+Release. Delete AutoreleaseRV and replace ClaimRV with Release.
|
| This avoids problems where more aggressive inlining triggers memory regressions.
|
| This patch is happy to skip over non-callable instructions and non-ARC intrinsics looking for the pair. It is likely sound to also skip over opaque function calls, but that's harder to reason about, and it's not relevant to the goal here: if there's an opaque function call splitting up a pair, it's very unlikely that a handshake would have happened dynamically without inlining.
|
| Note that this patch also subsumes the previous logic that looked backwards from ReleaseRV.
|
| https://reviews.llvm.org/D70370
| rdar://problem/46509586
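The two pairing rules can be sketched as a peephole over a flat instruction list. This is a toy model under stated assumptions: the `Op` enum and `pairUpAutoreleaseRV` are hypothetical, the scan direction (forward from the AutoreleaseRV) is chosen for simplicity, and operand matching is ignored entirely.

```cpp
#include <cstddef>
#include <vector>

// Toy opcodes; Plain stands for anything the scan may skip over
// (non-callable instructions and non-ARC intrinsics in the message's terms).
enum class Op { AutoreleaseRV, RetainRV, ClaimRV, Release, Plain, OpaqueCall };

std::vector<Op> pairUpAutoreleaseRV(const std::vector<Op> &in) {
  std::vector<Op> out(in);
  std::vector<bool> dead(out.size(), false);
  for (std::size_t i = 0; i < out.size(); ++i) {
    if (out[i] != Op::AutoreleaseRV)
      continue;
    // Skip skippable instructions; anything else ends the search.
    std::size_t j = i + 1;
    while (j < out.size() && out[j] == Op::Plain)
      ++j;
    if (j == out.size())
      continue;
    if (out[j] == Op::RetainRV) {
      dead[i] = true; // RetainRV cancels AutoreleaseRV: delete both.
      dead[j] = true;
    } else if (out[j] == Op::ClaimRV) {
      dead[i] = true; // ClaimRV == RetainRV+Release: keep only the Release.
      out[j] = Op::Release;
    }
    // An OpaqueCall (or any other opcode) between the pair blocks the fold.
  }
  std::vector<Op> result;
  for (std::size_t i = 0; i < out.size(); ++i)
    if (!dead[i])
      result.push_back(out[i]);
  return result;
}
```

In this model an opaque call between the two halves leaves the sequence unchanged, mirroring the conservative choice the message describes.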
* [SLP] fix miscompile on min/max reductions with extra uses (PR43948) (2nd try) | Sanjay Patel | 2019-11-19 | 1 | -10/+25
| | | | | | | | | | | | | | | | | | | | | | The 1st attempt was reverted because it revealed an existing bug where we could produce invalid IR (use of value before definition). That should be fixed with: rG39de82ecc9c2 The bug manifests as replacing a reduction operand with an undef value. The problem appears to be limited to cases where a min/max reduction has extra uses of the compare operand to the select. In the general case, we are tracking "ExternallyUsedValues" and an "IgnoreList" of the reduction operations, but those may not apply to the final compare+select in a min/max reduction. For that, we use replaceAllUsesWith (RAUW) to ensure that the new vectorized reduction values are transferred to all subsequent users. Differential Revision: https://reviews.llvm.org/D70148
* [SLP] fix insertion point for min/max reduction | Sanjay Patel | 2019-11-19 | 1 | -2/+17
| As discussed in D70148 (and the cause of a revert of the original commit): if we insert at the select, then we can produce invalid IR because the replacement for the compare may have uses before the select.
* [ThinLTO] Make ValueInfo::operator bool() explicit | evgeny | 2019-11-19 | 1 | -8/+10
| | | | Differential revision: https://reviews.llvm.org/D70383
* [NFC] Test commit. Please ignore. | Evgeniy Brevnov | 2019-11-19 | 1 | -1/+1
| As a test commit I fixed a misspelling in one of the comments in the SLP vectorizer.
* Temporarily revert "[SLP] fix miscompile on min/max reductions with extra uses (PR43948)" | Eric Christopher | 2019-11-18 | 1 | -14/+1
| as it causes an ICE on valid. A testcase was followed up on the original thread.
|
| This reverts commit a3e61946c5bd7bdfab15af76b292e52d6ffa27f7.
* [ThinLTO] Avoid extra index lookup during promotion | Teresa Johnson | 2019-11-18 | 1 | -12/+11
| | | | | | | | | | | | | | | | | | | | | | | Summary: Pass down the already accessed ValueInfo to shouldPromoteLocalToGlobal, to avoid an unnecessary extra index lookup. Add some assertion checking to confirm we have a non-empty VI when expected. Also some misc cleanup, merging the two versions of doImportAsDefinition, since one was only called by the other, and unnecessarily passed in a member variable. Reviewers: steven_wu, pcc, evgeny777 Reviewed By: evgeny777 Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70337
* [ThinLTO] Promotion handling cleanup (NFC) | Teresa Johnson | 2019-11-18 | 1 | -21/+12
| | | | | | | | | | | | | | | | | | | | | Summary: Clean up the code that does GV promotion in the ThinLTO backends. Specifically, we don't need to check whether we are importing since that is already checked and handled correctly in shouldPromoteLocalToGlobal. Simply call shouldPromoteLocalToGlobal, and if it returns true we are guaranteed that we are promoting, whether or not we are importing (or in the exporting module). This also makes the handling in getName() consistent with that in getLinkage(), which checks the DoPromote parameter regardless of whether we are importing or exporting. Reviewers: steven_wu, pcc, evgeny777 Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70327
* [LoopPred/WC] Use a dominating widenable condition to remove analyze loop exits | Philip Reames | 2019-11-18 | 1 | -9/+206
| | | | | | | | | | | | This implements a version of the predicateLoopExits transform from IndVarSimplify extended to exploit widenable conditions - and thus be much wider in scope of legality. The code structure ends up being almost entirely different, so I chose to duplicate this into the LoopPredication pass instead of trying to reuse the code in the IndVars. The core notions of the transform are as follows: If we have a widenable condition which controls entry into the loop, we're allowed to widen it arbitrarily. Given that, it's simply a *profitability* question as to what conditions to fold into the widenable branch. To avoid pass ordering issues, we want to avoid widening cases that would otherwise be dischargeable. Or... widen in a form which can still be discharged. Thus, we phrase the transform as selecting one analyzeable exit from the set of analyzeable exits to keep. This avoids creating pass ordering complexities. Since none of the above proves that we actually exit through our analyzeable exits - we might exit through something else entirely - we limit ourselves to cases where a) the latch is analyzeable and b) the latch is predicted taken, and c) the exit being removed is statically cold. Differential Revision: https://reviews.llvm.org/D69830
* [ARM,MVE] Add InstCombine rules for pred_i2v / pred_v2i. | Simon Tatham | 2019-11-18 | 1 | -0/+28
|
| If you're writing C code using the ACLE MVE intrinsics that passes the result of a vcmp as input to a predicated intrinsic, e.g.
|
|     mve_pred16_t pred = vcmpeqq(v1, v2);
|     v_out = vaddq_m(v_inactive, v3, v4, pred);
|
| then clang's codegen for the compare intrinsic will create calls to `@llvm.arm.mve.pred.v2i` to convert the output of `icmp` into an `mve_pred16_t` integer representation, and then the next intrinsic will call `@llvm.arm.mve.pred.i2v` to convert it straight back again. This will be visible in the generated code as a `vmrs`/`vmsr` pair that move the predicate value pointlessly out of `p0` and back into it again.
|
| To prevent that, I've added InstCombine rules to remove round trips of the form `v2i(i2v(x))` and `i2v(v2i(x))`. Also I've taught InstCombine about the known and demanded bits of those intrinsics. As a result, you now get just the generated code you wanted:
|
|     vpt.u16 eq, q1, q2
|     vaddt.u16 q0, q3, q4
|
| Reviewers: ostannard, MarkMurrayARM, dmgreen
|
| Reviewed By: dmgreen
|
| Subscribers: kristof.beyls, hiraditya, llvm-commits
|
| Tags: #llvm
|
| Differential Revision: https://reviews.llvm.org/D70313
* llvm/ObjCARC: Split OptimizeIndividualCallImpl out of OptimizeIndividualCalls, NFC | Duncan P. N. Exon Smith | 2019-11-17 | 1 | -245/+264
|
| Split out a helper function for the individual call optimizations and skip useless calls to it (where the instruction is not an ARC intrinsic). Besides reducing indentation (and possibly speeding up compile time in some small way), an upcoming patch will add additional calls and expand out the `switch`.
* llvm/ObjCARC: Use continue to reduce some nesting, NFC | Duncan P. N. Exon Smith | 2019-11-17 | 1 | -66/+66
|
* [InstCombine] prevent crashing/assert on shift constant expression (PR44028) | Sanjay Patel | 2019-11-17 | 1 | -1/+2
| | | | | The binary operator cast implies an instruction, but the matcher for shift does not: https://bugs.llvm.org/show_bug.cgi?id=44028
* [Attributor] Use nofree argument attribute for heap-to-stack conversion | Stefan Stipanovic | 2019-11-17 | 1 | -10/+6
| | | | | | | | Reviewers: jdoerfert, uenoku Subscribers: Differential Revision: https://reviews.llvm.org/D70140
* [SimplifyCFG] propagate fast-math-flags (FMF) from phi to select | Sanjay Patel | 2019-11-17 | 1 | -0/+5
| | | | | | | | | | | | | | | | | | | | | | Similar to/extension of D70208 (rGee0882bdf866), but this one may finally allow closing motivating bugs. This is another step towards having FMF apply only to FP values rather than those + fcmp. See PR38086 for one of the original discussions/motivations: https://bugs.llvm.org/show_bug.cgi?id=38086 And the test here is derived from PR39535: https://bugs.llvm.org/show_bug.cgi?id=39535 Currently, we lose FMF when converting any phi to select in SimplifyCFG. There are a small number of similar changes needed to correct within SimplifyCFG, so it should be quick to patch this pass up. FMF was extended to select and phi with: D61917 D67564
* [InstCombine] Canonicalize ssub.with.overflow with clamp to ssub.sat | David Green | 2019-11-17 | 1 | -16/+51
| Working on top of D69252, this adds canonicalisation patterns for ssub.with.overflow to ssub.sat. Differential Revision: https://reviews.llvm.org/D69753
* [InstCombine] Canonicalize sadd.with.overflow with clamp to sadd.sat | David Green | 2019-11-17 | 1 | -1/+58
| | | | | | | | | | | | | This adds to D69245, adding extra signed patterns for folding from a sadd_with_overflow to a sadd_sat. These are more complex than the unsigned patterns, as the overflow can occur in either direction. For the add case, the positive overflow can only occur if both of the values are positive (same for both the values being negative). So there is an extra select on whether to use the positive or negative overflow limit. Differential Revision: https://reviews.llvm.org/D69252
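The pattern being canonicalized can be written out in plain C++ as a rough illustration. This is a sketch on 32-bit integers, not the InstCombine code; `saddWithOverflow`, `clampedAdd`, and `saddSat` are hypothetical names that model the intrinsic, the source pattern, and the target form respectively.

```cpp
#include <cstdint>
#include <limits>

// Models the {result, overflow bit} pair of an overflow-checked signed add.
struct AddResult {
  std::int32_t value;
  bool overflow;
};

AddResult saddWithOverflow(std::int32_t a, std::int32_t b) {
  const std::int64_t wide = std::int64_t{a} + std::int64_t{b};
  const bool ov = wide > std::numeric_limits<std::int32_t>::max() ||
                  wide < std::numeric_limits<std::int32_t>::min();
  // Wrapped low 32 bits; the narrowing is modular on mainstream compilers
  // and guaranteed to be so since C++20.
  return {static_cast<std::int32_t>(static_cast<std::uint32_t>(wide)), ov};
}

// Source pattern: overflow-checked add followed by a clamp. The extra
// select on the sign of 'a' picks the limit, because positive overflow
// needs both operands positive and negative overflow needs both negative.
std::int32_t clampedAdd(std::int32_t a, std::int32_t b) {
  const AddResult r = saddWithOverflow(a, b);
  if (!r.overflow)
    return r.value;
  return a > 0 ? std::numeric_limits<std::int32_t>::max()
               : std::numeric_limits<std::int32_t>::min();
}

// Target form: a single saturating add, computed here in a wider type.
std::int32_t saddSat(std::int32_t a, std::int32_t b) {
  std::int64_t wide = std::int64_t{a} + std::int64_t{b};
  if (wide > std::numeric_limits<std::int32_t>::max())
    wide = std::numeric_limits<std::int32_t>::max();
  if (wide < std::numeric_limits<std::int32_t>::min())
    wide = std::numeric_limits<std::int32_t>::min();
  return static_cast<std::int32_t>(wide);
}
```

Both functions agree on every input, which is what makes replacing the first form with the second a valid canonicalization.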
* Remove Support/Options.h, it is unused | Reid Kleckner | 2019-11-15 | 1 | -1/+1
| It was added in 2014 in 732e0aa9fb84f1 with one use in Scalarizer.cpp. That one use was then removed when porting to the new pass manager in 2018 in b6f76002d9158628e78. While the RFC and the desire to get off of static initializers for cl::opt all still stand, this code is now dead, and I think we should delete this code until someone is ready to do the migration. There were many clients of CommandLine.h that were including it transitively through LLVMContext.h, so I cleaned that up in 4c1a1d3cf97e1ede466. Reviewers: beanz Differential Revision: https://reviews.llvm.org/D70280
* [SimplifyCFG] propagate fast-math-flags (FMF) from phi to select | Sanjay Patel | 2019-11-15 | 1 | -1/+7
| | | | | | | | | | | | | | | | | | | | | This is another step towards having FMF apply only to FP values rather than those + fcmp. See PR38086 for one of the original discussions/motivations: https://bugs.llvm.org/show_bug.cgi?id=38086 And the test here is derived from PR39535: https://bugs.llvm.org/show_bug.cgi?id=39535 Currently, we lose FMF when converting any phi to select in SimplifyCFG. There are a small number of similar changes needed to correct within SimplifyCFG, so it should be quick to patch this pass up. FMF was extended to select and phi with: D61917 D67564 Differential Revision: https://reviews.llvm.org/D70208
* Revert "[LoadStoreVectorize] Use '||' instead of '|' between sides with function calls. NFCI." | Richard Smith | 2019-11-15 | 1 | -6/+6
|
| This broke two tests. Presumably the non-short-circuiting '|' was intentional here.
|
| This reverts commit f7efea0ded8e16c7751b378523407a491016edd6.
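Why swapping `|` for `||` is not behavior-preserving when both sides are function calls can be shown with a minimal standalone sketch. The names `check`, `callsWithBitwiseOr`, and `callsWithLogicalOr` are hypothetical and unrelated to the LoadStoreVectorizer code.

```cpp
// Counts how many times check() ran; a stand-in for any side effect.
int callCount = 0;

bool check(bool result) {
  ++callCount;
  return result;
}

// Bitwise '|' evaluates both operands, so both calls always happen.
int callsWithBitwiseOr(bool a, bool b) {
  callCount = 0;
  const bool r = static_cast<bool>(check(a) | check(b));
  (void)r;
  return callCount;
}

// Logical '||' short-circuits: the second call is skipped when the first
// already returned true.
int callsWithLogicalOr(bool a, bool b) {
  callCount = 0;
  const bool r = check(a) || check(b);
  (void)r;
  return callCount;
}
```

If the second call has a side effect that later code relies on, the `||` version silently drops it whenever the first call returns true.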
* [GCOV] Skip artificial functions from being emitted | Alexandre Ganea | 2019-11-15 | 1 | -1/+4
| | | | | | | | This is a patch to support D66328, which was reverted until this lands. Enable a compiler-rt test that used to fail previously with D66328. Differential Revision: https://reviews.llvm.org/D67283
* [SVFS] Inject TLI Mappings in VFABI attribute. | Francesco Petrogalli | 2019-11-15 | 3 | -0/+188
| This patch introduces a function pass to inject the scalar-to-vector mappings stored in the TargetLibraryInfo (TLI) into the Vector Function ABI (VFABI) variants attribute. The test is testing the injection for three vector libraries supported by the TLI (Accelerate, SVML, MASSV). The pass does not change any of the analyses associated with the function. Differential Revision: https://reviews.llvm.org/D70107
* [ThinLTO] Fix -Wunused-function in NDEBUG builds after llvmorg-10-init-9933-g3d708bf5c26 | Fangrui Song | 2019-11-15 | 1 | -0/+2
|
* [LoadStoreVectorize] Use '||' instead of '|' between sides with function calls. NFCI. | Dávid Bolvanský | 2019-11-15 | 1 | -6/+6
|
| Fixes warning from PVS Studio
* Recommit "[ThinLTO] Add correctness check for RO/WO variable import" | evgeny | 2019-11-15 | 3 | -17/+46
| | | | | | ValueInfo has user-defined 'operator bool' which allows incorrect implicit conversion to GlobalValue::GUID (which is unsigned long). This causes bugs which are hard to track and should be removed in future.
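The hazard described here, a non-explicit `operator bool` silently converting to an integer ID, can be reproduced in a few lines. The types below are hypothetical stand-ins, not the real `ValueInfo`; `GUID` is modeled as `std::uint64_t` and `lookup` is an invented function.

```cpp
#include <cstdint>
#include <type_traits>

using GUID = std::uint64_t; // stand-in for an integer ID type

// Non-explicit conversion: bool then promotes to any integer type.
struct LooseInfo {
  const void *ref = nullptr;
  operator bool() const { return ref != nullptr; }
};

// Explicit conversion: usable in 'if (info)' but not as an integer.
struct StrictInfo {
  const void *ref = nullptr;
  explicit operator bool() const { return ref != nullptr; }
};

// Hypothetical function expecting an ID, not a handle.
GUID lookup(GUID id) { return id; }

// lookup(LooseInfo{}) compiles and passes 0 or 1 as the "GUID";
// lookup(StrictInfo{}) is rejected at compile time.
static_assert(std::is_convertible<LooseInfo, GUID>::value,
              "implicit bool conversion leaks into integer contexts");
static_assert(!std::is_convertible<StrictInfo, GUID>::value,
              "explicit bool conversion does not");
```

Marking the operator `explicit` keeps `if (info)` working through contextual conversion while turning the accidental ID conversion into a compile error.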
* [Scalarizer] Treat values from unreachable blocks as undef | Mikael Holmen | 2019-11-15 | 1 | -5/+27
| | | | | | | | | | | | | | | | | | | | | | | | Summary: When scalarizing PHI nodes we might try to examine/rewrite InsertElement nodes in predecessors. If those predecessors are unreachable from entry, then the IR in those blocks could have unexpected properties resulting in infinite loops in Scatterer::operator[]. By simply treating values originating from instructions in unreachable blocks as undef we do not need to analyse them further. This fixes PR41723. Reviewers: bjope Reviewed By: bjope Subscribers: bjope, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70171
* [InstCombine] Don't use getFirstNonPHI in FoldIntegerTypedPHI | Francis Visoiu Mistrih | 2019-11-14 | 1 | -4/+5
| | | | | | | | | | | | | | | | | | | getFirstNonPHI iterates over all the instructions in a block until it finds a non-PHI. Then, the loop starts from the beginning of the block and goes through all the instructions until it reaches the instruction found by getFirstNonPHI. Instead of doing that, just stop when a non-PHI is found. This reduces the compile-time of a test case discussed in https://reviews.llvm.org/D47023 by 13x. Not entirely sure how to come up with a test case for this since it's a compile time issue that would significantly slow down running the tests. Differential Revision: https://reviews.llvm.org/D70016
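The difference between the two loop shapes can be sketched on a toy block. This is an illustration only: `Kind`, `visitPhisTwoPass`, and `visitPhisOnePass` are hypothetical names, and the step counter just makes the redundant scan visible.

```cpp
#include <cstddef>
#include <vector>

enum class Kind { Phi, Other };

// Old shape: first locate the first non-PHI, then walk from the start of
// the block up to that position. 'steps' counts iterations across both scans.
std::size_t visitPhisTwoPass(const std::vector<Kind> &block,
                             std::size_t &steps) {
  steps = 0;
  std::size_t firstNonPhi = 0;
  while (firstNonPhi < block.size() && block[firstNonPhi] == Kind::Phi) {
    ++firstNonPhi;
    ++steps;
  }
  std::size_t visited = 0;
  for (std::size_t i = 0; i != firstNonPhi; ++i) {
    ++visited;
    ++steps;
  }
  return visited;
}

// New shape: a single walk that stops at the first non-PHI.
std::size_t visitPhisOnePass(const std::vector<Kind> &block,
                             std::size_t &steps) {
  steps = 0;
  std::size_t visited = 0;
  for (Kind k : block) {
    if (k != Kind::Phi)
      break;
    ++visited;
    ++steps;
  }
  return visited;
}
```

Both functions visit the same PHIs; the second does so without the preliminary scan. This toy shows only a constant-factor saving and does not by itself account for the 13x figure quoted for the real test case.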
* Add missing includes needed to prune LLVMContext.h include, NFC | Reid Kleckner | 2019-11-14 | 28 | -5/+33
| | | | | These are a pre-requisite to removing #include "llvm/Support/Options.h" from LLVMContext.h: https://reviews.llvm.org/D70280
* Revert "Temporarily Revert:" | Alexey Bataev | 2019-11-14 | 1 | -98/+178
| | | | | | | | | | | | This reverts commit e511c4b0dff1692c267addf17dce3cebe8f97faa: Temporarily Revert: "[SLP] Generalization of stores vectorization." "[SLP] Fix -Wunused-variable. NFC" "[SLP] Vectorize jumbled stores." after fixing the problem with compile time.
* [InstCombine] remove duplicate code for simplifying a shuffle; NFCI | Sanjay Patel | 2019-11-14 | 1 | -7/+0
| | | | | The transform is already handled by InstSimplify or earlier in InstCombine, so trying to do it again is not necessary.
* Revert "[ThinLTO] Add correctness check for RO/WO variable import" | Benjamin Kramer | 2019-11-14 | 3 | -48/+21
| | | | | This reverts commit a2292cc537b561416c21e8d4017715d652c144cc. Breaks clang selfhost w/ThinLTO.
* GCOVProfiling - fix uninitialized variable warnings + make getFuncChecksum() const. NFCI. | Simon Pilgrim | 2019-11-14 | 1 | -3/+3
|
* WholeProgramDevirt - fix uninitialized variable warnings. NFCI. | Simon Pilgrim | 2019-11-14 | 1 | -2/+2
|
* Fix uninitialized variable warning. NFCI. | Simon Pilgrim | 2019-11-14 | 1 | -2/+2
|
* SROA - fix uninitialized variable warnings. NFCI. | Simon Pilgrim | 2019-11-14 | 1 | -5/+5
|
* [LV] PreferPredicateOverEpilog respecting predicate loop hint | Sjoerd Meijer | 2019-11-14 | 2 | -3/+5
| | | | | | | | | | | The vectoriser queries TTI->preferPredicateOverEpilogue to determine if tail-folding is preferred for a loop, but it was not respecting loop hint 'predicate' that can disable this, which has now been added. This showed that we were incorrectly initialising loop hint 'vectorize.predicate.enable' with 0 (i.e. FK_Disabled) but this should have been FK_Undefined, which has been fixed. Differential Revision: https://reviews.llvm.org/D70125
* Revert "[InstCombine] Fold PHIs with equal incoming pointers" | Daniil Suchkov | 2019-11-14 | 2 | -70/+0
| | | | | This reverts commit a2f6ae9abffcba260c22bb235879f0576bf3b783. It is reverted due to clang-cmake-armv7-selfhost buildbot failure.
* [InstCombine] Fold PHIs with equal incoming pointers | Daniil Suchkov | 2019-11-14 | 2 | -0/+70
| This is a resubmission of bbb29738b58aaf6f6518269abdcf8f64131665a9 that was reverted due to clang test failures. It includes the fix and additional IR tests for the missed case. Summary: When all incoming values of a PHI are equal pointers, this transformation inserts a definition of such a pointer right after the definition of the base pointer and replaces both the PHI and all its incoming pointers with this value. The primary goal of this transformation is canonicalization of this pattern in order to enable optimizations that can't handle PHIs. Non-inbounds pointers aren't currently supported. Reviewers: spatel, RKSimon, lebedev.ri, apilipenko Reviewed By: apilipenko Tags: #llvm Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D68128
* [ThinLTO] Add correctness check for RO/WO variable import | evgeny | 2019-11-14 | 3 | -21/+48
| | | | | | | | | | | This patch adds an assertion check for exported read/write-only variables to be also in import list for module. If they aren't we may face linker errors, because read/write-only variables are internalized in their source modules. The patch also changes export lists to store ValueInfo instead of GUID for performance considerations. Differential revision: https://reviews.llvm.org/D70128
* Check result of emitStrLen before passing it to CreateGEP | Dimitry Andric | 2019-11-14 | 1 | -2/+2
| | | | | | | | | | | | | | | | | | Summary: This fixes PR43081, where the transformation of `strchr(p, 0) -> p + strlen(p)` can cause a segfault, if `-fno-builtin-strlen` is used. In that case, `emitStrLen` returns nullptr, which CreateGEP is not designed to handle. Also add the minimized code from the PR as a test case. Reviewers: xbolva00, spatel, jdoerfert, efriedma Reviewed By: efriedma Subscribers: lebedev.ri, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D70143
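The shape of the fix, checking a helper's result before using it and otherwise leaving the original call alone, can be sketched without any LLVM types. `tryEmitStrLen` and `foldStrChrOfNul` are hypothetical stand-ins; the real code works on IR values rather than C strings.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a helper that may decline to produce a length,
// for example when the strlen builtin is disabled. Returns false on failure.
bool tryEmitStrLen(const char *s, bool strlenAllowed, std::size_t &lenOut) {
  if (!strlenAllowed)
    return false;
  lenOut = std::strlen(s);
  return true;
}

// strchr(p, 0) is equivalent to p + strlen(p), but the rewrite is only
// performed when the length helper succeeded; otherwise the original
// strchr call is kept.
const char *foldStrChrOfNul(const char *p, bool strlenAllowed) {
  std::size_t len = 0;
  if (tryEmitStrLen(p, strlenAllowed, len))
    return p + len;
  return std::strchr(p, '\0');
}
```

Either path yields a pointer to the terminating NUL; the guard only decides which code produces it.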
* Sink all InitializePasses.h includes | Reid Kleckner | 2019-11-13 | 147 | -41/+196
|
| This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it caused lots of recompilation.
|
| I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout:
|
|     recompiles  touches  affected_files  header
|         342380       95            3604  llvm/include/llvm/ADT/STLExtras.h
|         314730      234            1345  llvm/include/llvm/InitializePasses.h
|         307036      118            2602  llvm/include/llvm/ADT/APInt.h
|         213049       59            3611  llvm/include/llvm/Support/MathExtras.h
|         170422       47            3626  llvm/include/llvm/Support/Compiler.h
|         162225       45            3605  llvm/include/llvm/ADT/Optional.h
|         158319       63            2513  llvm/include/llvm/ADT/Triple.h
|         140322       39            3598  llvm/include/llvm/ADT/StringRef.h
|         137647       59            2333  llvm/include/llvm/Support/Error.h
|         131619       73            1803  llvm/include/llvm/Support/FileSystem.h
|
| Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild.
|
| Reviewers: bkramer, asbirlea, bollu, jdoerfert
|
| Differential Revision: https://reviews.llvm.org/D70211
* [PGO][PGSO] Temporarily disable the large working set size behavior. | Hiroshi Yamauchi | 2019-11-13 | 1 | -0/+5
| | | | | | | | | | | | | | Summary: This temporarily disables the large working set size behavior in profile guided size optimization due to internal benchmark regressions. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70207
* [SLP] fix miscompile on min/max reductions with extra uses (PR43948) | Sanjay Patel | 2019-11-13 | 1 | -1/+14
| | | | | | | | | | | | | | | | | The bug manifests as replacing a reduction operand with an undef value. The problem appears to be limited to cases where a min/max reduction has extra uses of the compare operand to the select. In the general case, we are tracking "ExternallyUsedValues" and an "IgnoreList" of the reduction operations, but those may not apply to the final compare+select in a min/max reduction. For that, we use replaceAllUsesWith (RAUW) to ensure that the new vectorized reduction values are transferred to all subsequent users. Differential Revision: https://reviews.llvm.org/D70148
* [SLP] reduce code duplication for min/max vs. other reductions; NFCI | Sanjay Patel | 2019-11-13 | 1 | -77/+31
|
* [InstCombine] propagate fast-math-flags (FMF) to select when inverting fcmp+select | Sanjay Patel | 2019-11-13 | 1 | -3/+3
|
| As noted by the FIXME comment, this is not correct based on our current FMF semantics. We should be propagating FMF from the final value in a sequence (in this case the 'select'). So the behavior even without this patch is wrong, but we did not allow FMF on 'select' until recently.
|
| But if we do the correct thing right now in this patch, we'll inevitably introduce regressions because we have not wired up FMF propagation for 'phi' and 'select' in other passes (like SimplifyCFG) or other places in InstCombine. I'm not seeing a better incremental way to make progress.
|
| That said, the potential extra damage over the existing wrong behavior from this patch is very limited. AFAIK, the only way to have different FMF on IR in the same function is if we have LTO inlined IR from 2 modules that were compiled using different fast-math settings.
|
| As seen in the tests, we may actually see some improvements with this patch because adding the FMF to the 'select' allows matching to min/max intrinsics that were previously missed (in the common case, the 'fcmp' and 'select' should have identical FMF to begin with).
|
| Next steps in the transition:
|
| - Make similar changes in instcombine as needed.
| - Enable phi-to-select FMF propagation in SimplifyCFG.
| - Remove dependencies on fcmp with FMF.
| - Deprecate FMF on fcmp.
|
| Differential Revision: https://reviews.llvm.org/D69720
* SLPVectorizer - make comparison operators + isInSchedulingRegion const | Simon Pilgrim | 2019-11-13 | 1 | -3/+3
| | | | Fixes cppcheck warnings.
* [InstCombine] Avoid moving ops that do restrict undef across shuffles. | Florian Hahn | 2019-11-13 | 1 | -1/+3
| I think we have to be a bit more careful when it comes to moving ops across shuffles, if the op does restrict undef. For example, without this patch, we would move 'and %v, <0, 0, -1, -1>' over a 'shufflevector %a, undef, <undef, undef, 1, 2>'. As a result, the first 2 lanes of the result are undef after the combine, but they really should be 0, unless I am missing something. For ops that do fold to undef on undef operands, the current behavior should be fine. I've added a conservative check, OpDoesRestrictUndef; maybe there's a better existing utility? Reviewers: spatel, RKSimon, lebedev.ri Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D70093
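The lane-by-lane effect described here can be simulated with a toy model of undef. Everything below is an illustrative assumption: `Lane`, `andLane`, and `shuffle` are hypothetical helpers, and the pre-shuffle constant `<undef, -1, -1, undef>` is a guess at what a naive rewrite would build, not something stated in the message.

```cpp
#include <array>
#include <cstddef>

// A lane is either a known integer or undef.
struct Lane {
  bool undef;
  int value;
};
using Vec4 = std::array<Lane, 4>;

// 'and' restricts undef: a known zero on either side forces the result to 0.
Lane andLane(Lane a, Lane b) {
  if (!a.undef && a.value == 0) return {false, 0};
  if (!b.undef && b.value == 0) return {false, 0};
  if (a.undef || b.undef) return {true, 0};
  return {false, a.value & b.value};
}

Vec4 andVec(const Vec4 &a, const Vec4 &b) {
  Vec4 r{};
  for (std::size_t i = 0; i < 4; ++i) r[i] = andLane(a[i], b[i]);
  return r;
}

// Single-source shuffle; a negative mask element produces an undef lane.
Vec4 shuffle(const Vec4 &a, const std::array<int, 4> &mask) {
  Vec4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = mask[i] < 0 ? Lane{true, 0} : a[static_cast<std::size_t>(mask[i])];
  return r;
}

// Original order: shuffle first, then 'and' with <0, 0, -1, -1>.
Vec4 andAfterShuffle(const Vec4 &a) {
  const std::array<int, 4> mask = {{-1, -1, 1, 2}};
  const Vec4 c = {{{false, 0}, {false, 0}, {false, -1}, {false, -1}}};
  return andVec(shuffle(a, mask), c);
}

// Moved order: 'and' first with the assumed constant, then shuffle.
Vec4 andBeforeShuffle(const Vec4 &a) {
  const std::array<int, 4> mask = {{-1, -1, 1, 2}};
  const Vec4 c = {{{true, 0}, {false, -1}, {false, -1}, {true, 0}}};
  return shuffle(andVec(a, c), mask);
}
```

In this model the original order yields known zeros in lanes 0 and 1, while the moved order yields undef there, which is the loss of information the patch is guarding against.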