path: root/llvm/lib/Transforms
Commit message (Author, Age; Files, Lines)
* [NFC][LoopFusion] Use isControlFlowEquivalent() from CodeMoverUtils. (Whitney Tsang, 2019-11-25; 2 files, -14/+14)
| | | | | | | | Reviewer: kbarton, jdoerfert, Meinersbur, bmahjour, etiotto Reviewed By: Meinersbur Subscribers: hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D70619
* [InstCombine] prevent infinite loop from conflicting shuffle mask transforms (Sanjay Patel, 2019-11-25; 1 file, -2/+4)
| | | | | | | | | The pattern in question is currently not possible because we aggressively (wrongly) transform mask elements to undef values if they choose from an undef operand. That, however, would change if we tighten our semantics for shuffles as discussed in D70641. Adding this check gives us the flexibility to make that change with minimal overhead for current definitions.
* [InstCombine] simplify code for shuffle mask canonicalization; NFC (Sanjay Patel, 2019-11-25; 1 file, -6/+4)
| | | | We never use the local 'Mask' before returning, so that was dead code.
* [InstCombine] remove dead code from shuffle mask canonicalization; NFC (Sanjay Patel, 2019-11-25; 1 file, -2/+2)
|
* [InstCombine] simplify loop for shuffle mask canonicalization; NFC (Sanjay Patel, 2019-11-25; 1 file, -4/+4)
|
* [DebugInfo@O2][Utils] Undef instead of delete dbg.values in helper func (OCHyams, 2019-11-25; 1 file, -15/+7)
| | | | | | | | | | | | | | | | | | | | | | | | Summary: Related bug: https://bugs.llvm.org/show_bug.cgi?id=40648 Static helper function rewriteDebugUsers in Local.cpp deletes dbg.value intrinsics when it cannot move or rewrite them, or salvage the deleted instruction's value. It should instead undef them in this case. This patch fixes that and I've added a test which covers the failing test case in bz40648. I've updated the unit test Local.ReplaceAllDbgUsesWith to check for this behaviour (and fixed a typo in the test which would cause the old test to always pass). Reviewers: aprantl, vsk, djtodoro, probinson Reviewed By: vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D70604
* Recommit f0c2a5a "[LV] Generalize conditions for sinking instrs for first order recurrences." (Florian Hahn, 2019-11-24; 1 file, -0/+12)
| This version contains 2 fixes for reported issues:
| 1. Make sure we do not try to sink terminator instructions.
| 2. Make sure we bail out, if we try to sink an instruction that needs to stay in place for another recurrence.
| Original message: If the recurrence PHI node has a single user, we can sink any instruction without side effects, given that all users are dominated by the instruction computing the incoming value of the next iteration ('Previous'). We can sink instructions that may cause traps, because that only causes the trap to occur later, but not on any new paths. With the relaxed check, we also have to make sure that we do not have a direct cycle (meaning PHI user == 'Previous'), which indicates a reduction relation, which potentially gets missed by ReductionDescriptor. As follow-ups, we can also sink stores, iff they do not alias with other instructions we move them across and we could also support sinking chains of instructions and multiple users of the PHI.
| Fixes PR43398.
| Reviewers: hsaito, dcaballe, Ayal, rengolin
| Reviewed By: Ayal
| Differential Revision: https://reviews.llvm.org/D69228
* [LoopInterchange] Adjust assertions when updating successors. (Florian Hahn, 2019-11-24; 1 file, -19/+35)
| | | | | | | | | | | Currently the assertion in updateSuccessor is overly strict in some cases and overly relaxed in other cases. For branches to the inner and outer loop preheader it is too strict, because they can either be unconditional branches or conditional branches with duplicate targets. Both cases are fine and we can allow updating multiple successors. On the other hand, we have to at least update one successor. This patch adds such an assertion.
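The adjusted assertion can be pictured with a small standalone model. This is only a sketch under stated assumptions: a branch is reduced to a list of integer block ids, and `updateSuccessors` and `MustUpdateOnce` are hypothetical names for illustration, not the actual LoopInterchange code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: a branch is just a list of successor block ids.
// Replaces every successor equal to OldBB with NewBB and returns the
// number of successors that were changed.
std::size_t updateSuccessors(std::vector<int> &Succs, int OldBB, int NewBB,
                             bool MustUpdateOnce) {
  std::size_t Changed = 0;
  for (int &S : Succs) {
    if (S != OldBB)
      continue;
    // Strict mode: only a single matching successor is expected.
    assert((!MustUpdateOnce || Changed == 0) &&
           "expected exactly one matching successor");
    S = NewBB;
    ++Changed;
  }
  // Relaxed mode may touch several duplicate targets, but never zero.
  assert(Changed > 0 && "expected at least one successor to be updated");
  return Changed;
}
```

A conditional branch with duplicate targets, modelled as `{7, 7}`, is accepted in relaxed mode and both entries are rewritten, while an update that matches nothing trips the second assertion.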
* [InstCombine] remove identity shuffle simplification for mask with undefs (Sanjay Patel, 2019-11-24; 2 files, -24/+24)
| And simultaneously enhance SimplifyDemandedVectorElts() to recognize that pattern. That preserves some of the old optimizations in IR.
| Given a shuffle that includes undef elements in an otherwise identity mask like:
|   define <4 x float> @shuffle(<4 x float> %arg) {
|     %shuf = shufflevector <4 x float> %arg, <4 x float> undef, <4 x i32> <i32 undef, i32 1, i32 2, i32 3>
|     ret <4 x float> %shuf
|   }
| We were simplifying that to the input operand. But as discussed in PR43958: https://bugs.llvm.org/show_bug.cgi?id=43958 ...that means that per-vector-element poison that would be stopped by the shuffle can now leak to the result.
| Also note that we still have (and there are tests for) the same transform with no undef elements in the mask (a fully-defined identity mask). I don't think there's any controversy about that case - it's a valid transform under any interpretation of shufflevector/undef/poison.
| Looking at a few of the diffs into codegen, I don't see any difference in final asm. So depending on your perspective, that's good (no real loss of optimization power) or bad (poison exists in the DAG, so we only partially fixed the bug).
| Differential Revision: https://reviews.llvm.org/D70246
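The distinction between a fully-defined identity mask and one with undef lanes can be sketched as a standalone check. This is an illustrative model only: the mask is a plain vector of lane indices, undef is encoded as -1 (an assumption of this sketch), and `isIdentityMask` is a hypothetical helper rather than the InstCombine code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption of this sketch: an undef mask element is encoded as -1.
constexpr int UndefElt = -1;

// Returns true if lane I selects element I of the first operand.
// With AllowUndef == false, any undef lane disqualifies the mask,
// which corresponds to the stricter behaviour described above.
bool isIdentityMask(const std::vector<int> &Mask, bool AllowUndef) {
  for (std::size_t I = 0; I < Mask.size(); ++I) {
    if (Mask[I] == UndefElt) {
      if (!AllowUndef)
        return false;
      continue;
    }
    if (Mask[I] != static_cast<int>(I))
      return false;
  }
  return true;
}
```

Under this model the mask `<undef, 1, 2, 3>` from the example is an identity only when undef lanes are tolerated, so a fold to the input operand guarded by the strict form no longer fires for it.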
* [InstCombine] Fix call guard difference with dbg (Davide Italiano, 2019-11-22; 1 file, -4/+4)
| | | | | | Patch by Chris Ye! Differential Revision: https://reviews.llvm.org/D68004
* [CodeMoverUtils] Added an API to check if an instruction can be safely moved before another instruction. (Tsang Whitney W.H, 2019-11-22; 2 files, -0/+169)
| Summary: Added an API to check if an instruction can be safely moved before another instruction. In future PRs, we would like to add support for moving instructions between blocks that are not control flow equivalent, and add other APIs to enhance usability, e.g. moving basic blocks, moving lists of instructions... Loop Fusion will be its first user. When there is intervening code in between two loops, fusion is currently unable to fuse them. Loop Fusion can use this utility to check if the intervening code can be safely moved before or after the two loops, move it, and then successfully fuse the loops.
| Reviewer: kbarton, jdoerfert, Meinersbur, bmahjour, etiotto
| Reviewed By: bmahjour
| Subscribers: mgorny, hiraditya, llvm-commits
| Tag: LLVM
| Differential Revision: https://reviews.llvm.org/D70049
* [SLP] Enhance SLPVectorizer to vectorize vector aggregate (Anton Afanasyev, 2019-11-22; 1 file, -6/+27)
| | | | | | | | | | | | | | | | | Summary: Vector aggregate is homogeneous aggregate of vectors like `{ <2 x float>, <2 x float> }`. This patch allows `findBuildAggregate()` to consider vector aggregates as well as scalar ones. For instance, `{ <2 x float>, <2 x float> }` maps to `<4 x float>`. Fixes vector part of llvm.org/PR42022 Reviewers: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70068
* [JumpThreading] NFC: Don't cache F.hasProfileData() (Kazu Hirata, 2019-11-22; 1 file, -3/+2)
| | | | | | | | | | | | | | | | | | | | | | Summary: With this patch, we no longer cache F.hasProfileData(). We simply call the function again. I'm doing this because: - JumpThreadingPass also has a member variable named HasProfileData, which is very confusing, - the function is very lightweight, and - this patch makes JumpThreading::runOnFunction more consistent with JumpThreadingPass::run. Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70602
* [JumpThreading] Use profile data even with the new pass manager (Kazu Hirata, 2019-11-22; 1 file, -1/+1)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Without this patch, the jump threading pass ignores profiling data whenever we invoke the pass with the new pass manager. Specifically, JumpThreadingPass::run calls runImpl with class variable HasProfileData always set to false. In turn, runImpl sets HasProfileData to false again: HasProfileData = HasProfileData_; In the end, we don't use profiling data at all with the new pass manager. This patch fixes the problem by passing F.hasProfileData() to runImpl. The bug appears to have been introduced at: https://reviews.llvm.org/D41461 which removed local variable HasProfileData in JumpThreadingPass::run even though there was one more use left in the same function. As a result, the remaining use ended referring to the class variable instead. Note that F.hasProfileData is an extremely lightweight function, so I don't see the need to cache its result. Once this patch is approved, I'm planning to stop caching the result of F.hasProfileData in runOnFunction. Reviewers: wmi, eli.friedman Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70509
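The shape of this bug can be reproduced in a few lines. The following is a minimal sketch, not the JumpThreading source: `FunctionModel`, `PassModel`, `runBuggy` and `runFixed` are hypothetical stand-ins, and only the `HasProfileData = HasProfileData_;` assignment is taken from the message above.

```cpp
#include <cassert>

// Hypothetical stand-in for a function that may carry profile data.
struct FunctionModel {
  bool HasProfile;
  bool hasProfileData() const { return HasProfile; }
};

// Hypothetical stand-in for the pass object.
struct PassModel {
  bool HasProfileData = false; // class variable, false until runImpl sets it

  bool runImpl(bool HasProfileData_) {
    HasProfileData = HasProfileData_; // assignment quoted in the message
    return HasProfileData;
  }

  // Buggy shape: forwards the still-false class variable, so the
  // function's real profile state never reaches runImpl.
  bool runBuggy(const FunctionModel &) { return runImpl(HasProfileData); }

  // Fixed shape: queries the function directly.
  bool runFixed(const FunctionModel &F) {
    return runImpl(F.hasProfileData());
  }
};
```

With a function that does have profile data, `runBuggy` still reports false while `runFixed` reports true, which matches the behaviour difference the patch describes.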
* [WIP][Attributor] AAReachability Attribute (Pankaj Gode, 2019-11-22; 1 file, -0/+30)
| | | | | | | | | | | | | | | | | Summary: Working towards Johannes's suggestion for fixme, in Attributor's Noalias attribute deduction. (ii) Check whether the value is captured in the scope using AANoCapture. FIXME: This is conservative though, it is better to look at CFG and // check only uses possibly executed before this call site. A Reachability abstract attribute answers the question "does execution at point A potentially reach point B". If this question is answered with false for all other uses of the value that might be captured, we know it is not *yet* captured and can continue with the noalias deduction. Currently, information AAReachability provides is completely pessimistic. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: uenoku, sstefan1, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D70233
* Test commit. (Pankaj Gode, 2019-11-22; 1 file, -1/+1)
|
* [LoopInstSimplify] Move MemorySSA verification under flag. (Alina Sbirlea, 2019-11-21; 1 file, -1/+2)
| The verification inside loop passes should be done under the VerifyMemorySSA flag (enabled by EXPENSIVE_CHECKS or explicitly with opt), in order to not add to compile time during regular builds.
* [LoopPred] Robustly handle partially unswitched loops (Philip Reames, 2019-11-21; 1 file, -0/+29)
| We may end up with a case where we have a widenable branch above the loop, but not all widenable branches within the loop have been removed. Since a widenable branch inhibits SCEV's ability to reason about exit counts (by design), we have a tradeoff between effectiveness of this optimization and allowing future widening of the branches within the loop. LoopPred is thought to be one of the most important optimizations for range check elimination, so let's pay the cost.
* Further cleanup manipulation of widenable branches [NFC] (Philip Reames, 2019-11-21; 1 file, -22/+18)
| This is a follow-on to aaea24802bf5. In post-commit discussion, Artur and I realized we could clean up the code using Uses; this patch does so.
* Clang-trunk Generates Wrong Debug values with -O1 (Vedant Kumar, 2019-11-21; 1 file, -1/+1)
| Bit-Tracking Dead Code Elimination (bdce) does not mark dbg.value as undef after deleting an instruction, which shows an invalid state of the variable in the debugger. This patch fixes this by marking as undef the dbg.value that depends on the dead instruction.
| This fixes https://bugs.llvm.org/show_bug.cgi?id=41925
| Patch by kamlesh kumar!
| Differential Revision: https://reviews.llvm.org/D70040
* [JumpThreading] Refactor ThreadEdge (Kazu Hirata, 2019-11-21; 1 file, -9/+20)
| Summary: This patch moves various checks from ThreadEdge to new function TryThreadEdge. The rationale behind this is that I'd like to use ThreadEdge without its checks in my upcoming patch. This patch preserves lightweight checks as assertions in ThreadEdge. ThreadEdge does not repeat the cost check, however.
| Reviewers: wmi
| Subscribers: hiraditya, jfb, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D70338
* [cmake] Explicitly mark libraries defined in lib/ as "Component Libraries" (Tom Stellard, 2019-11-21; 10 files, -10/+10)
| Summary: Most libraries are defined in the lib/ directory but there are also a few libraries defined in tools/ e.g. libLLVM, libLTO. I'm defining "Component Libraries" as libraries defined in lib/ that may be included in libLLVM.so. Explicitly marking the libraries in lib/ as component libraries allows us to remove some fragile checks that attempt to differentiate between lib/ libraries and tools/ libraries:
| 1. In tools/llvm-shlib, because llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of all libraries defined in the whole project, there was custom code needed to filter out libraries defined in tools/, none of which should be included in libLLVM.so. This code assumed that any library defined as static was from lib/ and everything else should be excluded. With this change, llvm_map_components_to_libnames(LIB_NAMES, "all") only returns libraries that have been added to the LLVM_COMPONENT_LIBS global cmake property, so this custom filtering logic can be removed. Doing this also fixes the build with BUILD_SHARED_LIBS=ON and LLVM_BUILD_LLVM_DYLIB=ON.
| 2. There was some code in llvm_add_library that assumed that libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or ARG_LINK_COMPONENTS set. This is only true because libraries defined in lib/ use LLVMBuild.txt and don't set these values. This code has been fixed now to check if the library has been explicitly marked as a component library, which should now make it easier to remove LLVMBuild at some point in the future.
I have tested this patch on Windows, MacOS and Linux with release builds and the following combinations of CMake options: - "" (No options) - -DLLVM_BUILD_LLVM_DYLIB=ON - -DLLVM_LINK_LLVM_DYLIB=ON - -DBUILD_SHARED_LIBS=ON - -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON - -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON Reviewers: beanz, smeenai, compnerd, phosek Reviewed By: beanz Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70179
* Broaden the definition of a "widenable branch" (Philip Reames, 2019-11-21; 2 files, -16/+45)
| As a reminder, a "widenable branch" is the pattern "br i1 (and i1 X, WC()), label %taken, label %untaken" where "WC" is the widenable condition intrinsic. The semantics of such a branch (derived from the semantics of WC) is that a new condition can be added into the condition arbitrarily without violating legality.
| Broaden the definition in two ways:
| - Allow swapped operands to the br (and X, WC()) form
| - Allow widenable branch w/trivial condition (i.e. true) which takes form of br i1 WC()
| The former is just general robustness (e.g. for X = non-instruction this is what instcombine produces). The latter is specifically important as partial unswitching of a widenable range check produces exactly this form above the loop.
| Differential Revision: https://reviews.llvm.org/D70502
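The old and broadened definitions can be compared on a toy condition tree. This is a sketch under stated assumptions: `Cond`, `Kind` and the helper names are hypothetical, and the real utilities match LLVM IR values rather than this model.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical model of a branch condition.
enum class Kind { Value, WidenableCondition, And };

struct Cond {
  Kind K;
  std::shared_ptr<Cond> LHS, RHS;
};
using CondPtr = std::shared_ptr<Cond>;

CondPtr value() {
  return std::make_shared<Cond>(Cond{Kind::Value, nullptr, nullptr});
}
CondPtr wc() {
  return std::make_shared<Cond>(Cond{Kind::WidenableCondition, nullptr, nullptr});
}
CondPtr andOf(CondPtr L, CondPtr R) {
  return std::make_shared<Cond>(Cond{Kind::And, std::move(L), std::move(R)});
}

bool isWC(const CondPtr &C) { return C && C->K == Kind::WidenableCondition; }

// Old definition: only "and X, WC()" with WC as the second operand.
bool isWidenableOld(const CondPtr &C) {
  return C && C->K == Kind::And && isWC(C->RHS);
}

// Broadened definition: WC may be either operand of the and,
// or the whole condition may be a bare WC().
bool isWidenableNew(const CondPtr &C) {
  if (isWC(C))
    return true;
  return C && C->K == Kind::And && (isWC(C->LHS) || isWC(C->RHS));
}
```

In this model `andOf(wc(), value())` and a bare `wc()` are rejected by the old predicate and accepted by the new one, which are the two cases listed in the message.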
* [InstCombine] add assert in SimplifyDemandedVectorElts and improve readability; NFC (Sanjay Patel, 2019-11-21; 1 file, -19/+22)
* [LV] PreferPredicateOverEpilog respecting option (Sjoerd Meijer, 2019-11-21; 1 file, -1/+5)
| | | | | | | | Follow-up of cb47b8783: don't query TTI->preferPredicateOverEpilogue when option -prefer-predicate-over-epilog is set to false, i.e. when we prefer not to predicate the loop. Differential Revision: https://reviews.llvm.org/D70382
* Reland 9f3fdb0d7fab: [Driver] Use VFS to check if sanitizer blacklists exist (Ilya Biryukov, 2019-11-21; 1 file, -1/+4)
| With updates to various LLVM tools that use SpecialCaseList. It was tempting to use RealFileSystem as the default, but that makes it too easy to accidentally forget passing VFS in clang code.
* Revert "[Driver] Use VFS to check if sanitizer blacklists exist" (Ilya Biryukov, 2019-11-21; 1 file, -4/+1)
| | | | | This reverts commit ba6f906854263375cff3257d22d241a8a259cf77. Commit caused compilation errors on llvm tests. Will fix and re-land.
* [Driver] Use VFS to check if sanitizer blacklists exist (Ilya Biryukov, 2019-11-21; 1 file, -1/+4)
| | | | | | | | | | | | | | | | | | | Summary: This is a follow-up to 590f279c456bbde632eca8ef89a85c478f15a249, which moved some of the callers to use VFS. It turned out more code in Driver calls into real filesystem APIs and also needs an update. Reviewers: gribozavr2, kadircet Reviewed By: kadircet Subscribers: ormris, mgorny, hiraditya, llvm-commits, jkorous, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D70440
* [DebugInfo] Refactor DIExpression [SZ]Ext creation into function [NFC] (David Stenberg, 2019-11-21; 1 file, -4/+2)
| | | | | | | | | | | | | | | Summary: Also, replace the SmallVector with a normal C array. Reviewers: vsk Reviewed By: vsk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70498
* D'oh. Fix assert after a84922916e6eddf701b39fbd7fe0222cb0fee1d6. (James Y Knight, 2019-11-20; 1 file, -2/+3)
| | | | (Which was attempting to fix unused variable warning in NDEBUG mode after 8ba56f322abf848cec78ff7f814f3ad84cd778be)
* Fix unused variable warning in NDEBUG mode after 8ba56f322abf848cec78ff7f814f3ad84cd778be (James Y Knight, 2019-11-20; 1 file, -2/+2)
* [MemorySSA] Moving at the end often means before terminator. (Alina Sbirlea, 2019-11-20; 3 files, -3/+8)
| Moving accesses in MemorySSA at InsertionPlace::End, when an instruction is moved into a block, almost always means insert at the end of the block, but before the block terminator. This matters when the block terminator is a MemoryAccess itself (an invoke), and the insertion must be done before the terminator for the update to be correct.
| Insert an additional position: InsertionPlace::BeforeTerminator and update current usages where this applies.
| Resolves PR44027.
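The difference between the two positions can be shown on a toy access list. This is a sketch only: the enumerator names follow the message above, but `insertAccess`, the string-based access list and the `TerminatorIsAccess` flag are hypothetical and do not reflect the MemorySSA updater API.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class InsertionPlace { Beginning, End, BeforeTerminator };

// Hypothetical model: Accesses lists a block's memory accesses in order.
// TerminatorIsAccess says whether the last entry is the block terminator
// (for example an invoke that touches memory).
void insertAccess(std::vector<std::string> &Accesses, const std::string &New,
                  InsertionPlace Where, bool TerminatorIsAccess) {
  switch (Where) {
  case InsertionPlace::Beginning:
    Accesses.insert(Accesses.begin(), New);
    break;
  case InsertionPlace::End:
    Accesses.push_back(New);
    break;
  case InsertionPlace::BeforeTerminator: {
    auto Pos = Accesses.end();
    // Step back over the terminator's access so the new one precedes it.
    if (TerminatorIsAccess && !Accesses.empty())
      --Pos;
    Accesses.insert(Pos, New);
    break;
  }
  }
}
```

For a block whose accesses are a store followed by an invoke terminator, `End` places a moved load after the invoke, while `BeforeTerminator` keeps it between the store and the invoke.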
* [MemorySSA] Update analysis when the terminator is a memory instruction. (Alina Sbirlea, 2019-11-20; 2 files, -1/+10)
| | | | | Update MemorySSA when moving the terminator instruction, as that may be a memory touching instruction. Resolves PR44029.
* Temporarily Revert "[SLP] allow forming 2-way reduction patterns" and update testcases. (Eric Christopher, 2019-11-20; 1 file, -29/+8)
| After speaking with Sanjay - seeing a number of miscompiles and working on tracking down a testcase. None of the follow on patches seem to have helped so far.
| This reverts commit 8a0aa5310bccbb42d16d11db090419fcefdd1376.
* Temporarily Revert "Temporarily Revert "[SLP] allow forming 2-way reduction patterns"" as there were testcase changes after that need to also be reverted. (Eric Christopher, 2019-11-20; 1 file, -8/+29)
| This reverts commit cd8748a15f2d18861b3548eb26ed2b52e5ee50b4.
* Temporarily Revert "[SLP] allow forming 2-way reduction patterns" (Eric Christopher, 2019-11-20; 1 file, -29/+8)
| | | | | | | | After speaking with Sanjay - seeing a number of miscompiles and working on tracking down a testcase. None of the follow on patches seem to have helped so far. This reverts commit 7ff57705ba196ce649d6034614b3b9df57e1f84f.
* Move widenable branch formation into makeGuardControlFlowExplicit helper (Philip Reames, 2019-11-20; 3 files, -17/+16)
| This is mostly NFC, but I removed the setting of the guard's calling convention onto the WC call. Why? Because it was untested, and was producing an ill-defined output as the declaration's convention wasn't being changed, leaving a mismatch which is UB.
* [GuardWidening] Remove WidenFrequentBranches transform (Philip Reames, 2019-11-19; 1 file, -68/+6)
| | | | This code has never been enabled. While it is tested, it's complicating some refactoring. If we decide to re-implement this, doing it in SimplifyCFG would probably make more sense anyways.
* [NFC] Factor out utilities for manipulating widenable branches (Philip Reames, 2019-11-19; 3 files, -13/+30)
| With the widenable condition construct, we have the ability to reason about branches which can be 'widened' (i.e. made to fail more often). We've got a couple of transforms which leverage this. This patch just cleans up the API a bit.
| This is prep work for generalizing our definition of a widenable branch slightly. At the moment "br i1 (and A, wc()), ..." is considered widenable, but oddly, neither "br i1 (and wc(), B), ..." nor "br i1 wc(), ..." is. That clearly needs to be addressed, so first, let's centralize the code in one place.
* [LoopPred] Generalize profitability check to handle unswitch output (Philip Reames, 2019-11-19; 1 file, -1/+12)
| | | | Unswitch (and other loop transforms) like to generate loop exit blocks with unconditional successors, and phi nodes (LCSSA, or simple multiple exiting blocks sharing an exit). Generalize the "likely very rare exit" check slightly to handle this form.
* llvm/ObjCARC: Eliminate inlined AutoreleaseRV calls (Duncan P. N. Exon Smith, 2019-11-19; 1 file, -71/+145)
| | | | | | | | | | | | | | | | | | | | | | | | | Pair up inlined AutoreleaseRV calls with their matching RetainRV or ClaimRV. - RetainRV cancels out AutoreleaseRV. Delete both instructions. - ClaimRV is a peephole for RetainRV+Release. Delete AutoreleaseRV and replace ClaimRV with Release. This avoids problems where more aggressive inlining triggers memory regressions. This patch is happy to skip over non-callable instructions and non-ARC intrinsics looking for the pair. It is likely sound to also skip over opaque function calls, but that's harder to reason about, and it's not relevant to the goal here: if there's an opaque function call splitting up a pair, it's very unlikely that a handshake would have happened dynamically without inlining. Note that this patch also subsumes the previous logic that looked backwards from ReleaseRV. https://reviews.llvm.org/D70370 rdar://problem/46509586
* [SLP] fix miscompile on min/max reductions with extra uses (PR43948) (2nd try) (Sanjay Patel, 2019-11-19; 1 file, -10/+25)
| | | | | | | | | | | | | | | | | | | | | | The 1st attempt was reverted because it revealed an existing bug where we could produce invalid IR (use of value before definition). That should be fixed with: rG39de82ecc9c2 The bug manifests as replacing a reduction operand with an undef value. The problem appears to be limited to cases where a min/max reduction has extra uses of the compare operand to the select. In the general case, we are tracking "ExternallyUsedValues" and an "IgnoreList" of the reduction operations, but those may not apply to the final compare+select in a min/max reduction. For that, we use replaceAllUsesWith (RAUW) to ensure that the new vectorized reduction values are transferred to all subsequent users. Differential Revision: https://reviews.llvm.org/D70148
* [SLP] fix insertion point for min/max reduction (Sanjay Patel, 2019-11-19; 1 file, -2/+17)
| | | | | | As discussed in D70148 (and caused a revert of the original commit): if we insert at the select, then we can produce invalid IR because the replacement for the compare may have uses before the select.
* [ThinLTO] Make ValueInfo::operator bool() explicit (evgeny, 2019-11-19; 1 file, -8/+10)
| | | | Differential revision: https://reviews.llvm.org/D70383
* [NFC] Test commit. Please ignore. (Evgeniy Brevnov, 2019-11-19; 1 file, -1/+1)
| | | | | As a test commit I fixed a misspelling in one of comments in SLP vectorizer.
* Temporarily revert "[SLP] fix miscompile on min/max reductions with extra uses (PR43948)" as it causes an ICE on valid. (Eric Christopher, 2019-11-18; 1 file, -14/+1)
| A testcase was followed up on the original thread.
| This reverts commit a3e61946c5bd7bdfab15af76b292e52d6ffa27f7.
* [ThinLTO] Avoid extra index lookup during promotion (Teresa Johnson, 2019-11-18; 1 file, -12/+11)
| | | | | | | | | | | | | | | | | | | | | | | Summary: Pass down the already accessed ValueInfo to shouldPromoteLocalToGlobal, to avoid an unnecessary extra index lookup. Add some assertion checking to confirm we have a non-empty VI when expected. Also some misc cleanup, merging the two versions of doImportAsDefinition, since one was only called by the other, and unnecessarily passed in a member variable. Reviewers: steven_wu, pcc, evgeny777 Reviewed By: evgeny777 Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70337
* [ThinLTO] Promotion handling cleanup (NFC) (Teresa Johnson, 2019-11-18; 1 file, -21/+12)
| | | | | | | | | | | | | | | | | | | | | Summary: Clean up the code that does GV promotion in the ThinLTO backends. Specifically, we don't need to check whether we are importing since that is already checked and handled correctly in shouldPromoteLocalToGlobal. Simply call shouldPromoteLocalToGlobal, and if it returns true we are guaranteed that we are promoting, whether or not we are importing (or in the exporting module). This also makes the handling in getName() consistent with that in getLinkage(), which checks the DoPromote parameter regardless of whether we are importing or exporting. Reviewers: steven_wu, pcc, evgeny777 Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70327
* [LoopPred/WC] Use a dominating widenable condition to remove analyzeable loop exits (Philip Reames, 2019-11-18; 1 file, -9/+206)
| | | | | | | | | | | | This implements a version of the predicateLoopExits transform from IndVarSimplify extended to exploit widenable conditions - and thus be much wider in scope of legality. The code structure ends up being almost entirely different, so I chose to duplicate this into the LoopPredication pass instead of trying to reuse the code in the IndVars. The core notions of the transform are as follows: If we have a widenable condition which controls entry into the loop, we're allowed to widen it arbitrarily. Given that, it's simply a *profitability* question as to what conditions to fold into the widenable branch. To avoid pass ordering issues, we want to avoid widening cases that would otherwise be dischargeable. Or... widen in a form which can still be discharged. Thus, we phrase the transform as selecting one analyzeable exit from the set of analyzeable exits to keep. This avoids creating pass ordering complexities. Since none of the above proves that we actually exit through our analyzeable exits - we might exit through something else entirely - we limit ourselves to cases where a) the latch is analyzeable and b) the latch is predicted taken, and c) the exit being removed is statically cold. Differential Revision: https://reviews.llvm.org/D69830
* [ARM,MVE] Add InstCombine rules for pred_i2v / pred_v2i. (Simon Tatham, 2019-11-18; 1 file, -0/+28)
| If you're writing C code using the ACLE MVE intrinsics that passes the result of a vcmp as input to a predicated intrinsic, e.g.
|   mve_pred16_t pred = vcmpeqq(v1, v2);
|   v_out = vaddq_m(v_inactive, v3, v4, pred);
| then clang's codegen for the compare intrinsic will create calls to `@llvm.arm.mve.pred.v2i` to convert the output of `icmp` into an `mve_pred16_t` integer representation, and then the next intrinsic will call `@llvm.arm.mve.pred.i2v` to convert it straight back again. This will be visible in the generated code as a `vmrs`/`vmsr` pair that move the predicate value pointlessly out of `p0` and back into it again.
| To prevent that, I've added InstCombine rules to remove round trips of the form `v2i(i2v(x))` and `i2v(v2i(x))`. Also I've taught InstCombine about the known and demanded bits of those intrinsics. As a result, you now get just the generated code you wanted:
|   vpt.u16 eq, q1, q2
|   vaddt.u16 q0, q3, q4
| Reviewers: ostannard, MarkMurrayARM, dmgreen
| Reviewed By: dmgreen
| Subscribers: kristof.beyls, hiraditya, llvm-commits
| Tags: #llvm
| Differential Revision: https://reviews.llvm.org/D70313