summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [EarlyCSE] avoid crashing when detecting min/max/abs patterns (PR41083)Sanjay Patel2020-03-191-7/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As discussed in PR41083: https://bugs.llvm.org/show_bug.cgi?id=41083 ...we can assert/crash in EarlyCSE using the current hashing scheme and instructions with flags. ValueTracking's matchSelectPattern() may rely on overflow (nsw, etc) or other flags when detecting patterns such as min/max/abs composed of compare+select. But the value numbering / hashing mechanism used by EarlyCSE intersects those flags to allow more CSE. Several alternatives to solve this are discussed in the bug report. This patch avoids the issue by doing simple matching of min/max/abs patterns that never requires instruction flags. We give up some CSE power because of that, but that is not expected to result in much actual performance difference because InstCombine will canonicalize these patterns when possible. It even has this comment for abs/nabs: /// Canonicalize all these variants to 1 pattern. /// This makes CSE more likely. (And this patch adds PhaseOrdering tests to verify that the expected transforms are still happening in the standard optimization pipelines. I left this code to use ValueTracking's "flavor" enum values, so we don't have to change the callers' code. If we decide to go back to using the ValueTracking call (by changing the hashing algorithm instead), it should be obvious how to replace this chunk. Differential Revision: https://reviews.llvm.org/D74285 (cherry picked from commit b8ebc11f032032c7ca449f020a1fe40346e707c8)
* [WinEH] Fix inttoptr+phi optimization in presence of catchswitchReid Kleckner2020-03-021-4/+14
| | | | | | | | | | | | | | getFirstInsertionPt's return value must be checked for validity before casting it to Instruction*. Don't attempt to insert casts after a phi in a catchswitch block. Fixes PR45033, introduced in D37832. Reviewed By: davidxl, hfinkel Differential Revision: https://reviews.llvm.org/D75381 (cherry picked from commit 1adbe86d87bd4ecffc73ab17c7da56f44816f424)
* SROA: Don't drop atomic load/store alignments (PR45010)Hans Wennborg2020-02-281-0/+4
| | | | | | | | | | | | | SROA will drop the explicit alignment on allocas when the ABI guarantees enough alignment. Because the alignment on new load/store instructions are set based on the alloca's alignment, that means SROA would end up dropping the alignment from atomic loads and stores, which is not allowed (see bug). For those, make sure to always carry over the alignment from the previous instruction. Differential revision: https://reviews.llvm.org/D75266 (cherry picked from commit d48c981697a49653efff9dd14fa692d99e6fa868)
* [InstCombine] foldShiftIntoShiftInAnotherHandOfAndInICmp(): fix miscompile ↵Roman Lebedev2020-02-271-1/+19
| | | | | | | | | | | | | | | | | | | | | (PR44802) Much like with reassociateShiftAmtsOfTwoSameDirectionShifts(), as input, we have the following pattern: icmp eq/ne (and ((x shift Q), (y oppositeshift K))), 0 We want to rewrite that as: icmp eq/ne (and (x shift (Q+K)), y), 0 iff (Q+K) u< bitwidth(x) While we know that originally (Q+K) would not overflow (because 2 * (N-1) u<= iN -1), we may have looked past extensions of shift amounts. so it may now overflow in smaller bitwidth. To ensure that does not happen, we need to ensure that the total maximal shift amount is still representable in that smaller bitwidth. If the overflow would happen, (Q+K) u< bitwidth(x) check would be bogus. https://bugs.llvm.org/show_bug.cgi?id=44802 (cherry picked from commit 2855c8fed9326ec44526767f1596a4fe4e55dc70)
* [InstCombine] reassociateShiftAmtsOfTwoSameDirectionShifts(): fix miscompile ↵Roman Lebedev2020-02-271-2/+22
| | | | | | | | | | | | | | | | | | | (PR44802) As input, we have the following pattern: Sh0 (Sh1 X, Q), K We want to rewrite that as: Sh x, (Q+K) iff (Q+K) u< bitwidth(x) While we know that originally (Q+K) would not overflow (because 2 * (N-1) u<= iN -1), we may have looked past extensions of shift amounts. so it may now overflow in smaller bitwidth. To ensure that does not happen, we need to ensure that the total maximal shift amount is still representable in that smaller bitwidth. If the overflow would happen, (Q+K) u< bitwidth(x) check would be bogus. https://bugs.llvm.org/show_bug.cgi?id=44802 (cherry picked from commit 781d077afb0ed9771c513d064c40170c1ccd21c9)
* Revert "[LICM] Support hosting of dynamic allocas out of loops"Philip Reames2020-02-261-45/+0
| | | | | | | | This reverts commit 8d22100f66c4170510c6ff028c60672acfe1cff9. There was a functional regression reported (https://bugs.llvm.org/show_bug.cgi?id=44996). I'm not actually sure the patch is wrong, but I don't have time to investigate currently, and this line of work isn't something I'm likely to get back to quickly. (cherry picked from commit 14845b2c459021e3dbf2ead52d707d4a7db40cbb)
* [LoopRotate] Get and update MSSA only if available in legacy pass manager.Alina Sbirlea2020-02-261-5/+6
| | | | | | | | | | | | | | | | | | | | | Summary: Potential fix for: https://bugs.llvm.org/show_bug.cgi?id=44889 and https://bugs.llvm.org/show_bug.cgi?id=44408 In the legacy pass manager, loop rotate need not compute MemorySSA when not being in the same loop pass manager with other loop passes. There isn't currently a way to differentiate between the two cases, so this attempts to limit the usage in LoopRotate to only update MemorySSA when the analysis is already available. The side-effect of this is that it will split the Loop pipeline. This issue does not apply to the new pass manager, where we have a flag specifying if all loop passes in that loop pass manager preserve MemorySSA. Reviewers: dmgreen, fedor.sergeev, nikic Subscribers: Prazek, hiraditya, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74574 (cherry picked from commit 1326a5a4cfe004181f2ec8231d84ecda2b93cb25)
* Filter callbr insts from critical edge splittingBill Wendling2020-02-212-2/+4
| | | | | | | | Similarly to how splitting predecessors with an indirectbr isn't handled in the generic way, we also shouldn't split callbrs, for similar reasons. (cherry picked from commit 2fe457690da0fc38bc7f9f1d0aee2ba6a6a16ada)
* [SLPVectorizer] Do not assume extracelement idx is a ConstantInt.Florian Hahn2020-02-191-6/+5
| | | | | | | | | | | | | | | | | | The index of an ExtractElementInst is not guaranteed to be a ConstantInt. It can be any integer value. Check explicitly for ConstantInts. The new test cases illustrate scenarios where we crash without this patch. I've also added another test case to check the matching of extractelement vector ops works. Reviewers: RKSimon, ABataev, dtemirbulatov, vporpo Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D74758 (cherry picked from commit e32522ca178acc42e26f21d64ef8fc180ad772bd)
* [InstCombine] Fix infinite min/max canonicalization loop (PR44541)Nikita Popov2020-02-101-0/+6
| | | | | | | | | | | | While D72944 also fixes https://bugs.llvm.org/show_bug.cgi?id=44541, it does so in a more roundabout manner and there might be other loopholes to trigger the same issue. This is a more direct fix, that prevents the transform if the min/max is based on a non-canonical sub X, 0 instruction. Differential Revision: https://reviews.llvm.org/D73849 (cherry picked from commit a148b9e9909db6a592609eb35b4de38c9e67cb8b)
* [InstCombine] Support disabling expensive combines in optNikita Popov2020-02-101-1/+2
| | | | | | | | | | | | | | | Currently, there is no way to disable ExpensiveCombines when doing a standalone opt -instcombine run, as that's the default, and the opt option can currently only be used to force enable, not to force disable. The only way to disable expensive combines is via -O1 or -O2, but that of course also runs the rest of the kitchen sink... This patch allows using opt -instcombine -expensive-combines=0 to run InstCombine without ExpensiveCombines. Differential Revision: https://reviews.llvm.org/D72861 (cherry picked from commit 2ca092f3209579fde7a38ade511c1bbcef213c36)
* [InstCombine] Fix infinite loop in min/max load/store bitcast combine (PR44835)Nikita Popov2020-02-101-0/+5
| | | | | | | | | | | | | | | | | | Fixes https://bugs.llvm.org/show_bug.cgi?id=44835. Skip the transform if it wouldn't actually do anything (apart from removing and reinserting the same instructions). Note that the test case doesn't loop on current master anymore, only on the LLVM 10 release branch. The issue is already mitigated on master due to worklist order fixes, but we should fix the root cause there as well. As a side note, we should probably assert in combineLoadToNewType() that it does not combine to the same type. Not doing this here, because this assertion would also be triggered in another place right now. Differential Revision: https://reviews.llvm.org/D74278 (cherry picked from commit 23db9724d0e5490fa5a2a726acf015f84e2c87cf)
* [LV] Fix predication for branches with matching true and false succs.Florian Hahn2020-02-061-1/+1
| | | | | | | | | | | | | | | | | | | | | Currently due to the edge caching, we create wrong predicates for branches with matching true and false successors. We will cache the condition for the edge from the true successor, and then lookup the same edge (src and dst are the same) for the edge to the false successor. If both successors match, the condition should always be true. At the moment, we cannot really create constant VPValues, but we can just create a true condition as X | !X. Later passes will clean that up. Fixes PR44488. Reviewers: rengolin, hsaito, fhahn, Ayal, dorit, gilr Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D73079 (cherry picked from commit f14f2a856802e086662d919e2ead641718b27555)
* [LV] Do not try to sink dead instructions.Florian Hahn2020-01-292-8/+13
| | | | | | | | | | | | | | | | | Dead instructions do not need to be sunk. Currently we try and record the recipies for them, but there are no recipes emitted for them and there's nothing to sink. They can be removed from SinkAfter while marking them for recording. Fixes PR44634. Reviewers: rengolin, hsaito, fhahn, Ayal, gilr Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D73423 (cherry picked from commit a911fef3dd79e0a04b241be7b476dde7e99744c4)
* [PassManagerBuilder] Remove global extension when a plugin is unloadedElia Geretto2020-01-291-8/+33
| | | | | | | | | | | | This commit fixes PR39321. GlobalExtensions is not guaranteed to be destroyed when optimizer plugins are unloaded. If it is indeed destroyed after a plugin is dlclose-d, the destructor of the corresponding ExtensionFn is not mapped anymore, causing a call to unmapped memory during destruction. This commit guarantees that extensions coming from external plugins are removed from GlobalExtensions when the plugin is unloaded if GlobalExtensions has not been destroyed yet. Differential Revision: https://reviews.llvm.org/D71959 (cherry picked from commit ab2300bc154f7bed43f85f74fd3fe31be71d90e0)
* [msan] Instrument x86.pclmulqdq* intrinsics.Evgenii Stepanov2020-01-271-0/+43
| | | | | | | | | | | | | | | | Summary: These instructions ignore parts of the input vectors which makes the default MSan handling too strict and causes false positive reports. Reviewers: vitalybuka, RKSimon, thakis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73374 (cherry picked from commit 1df8549b26892198ddf77dfd627eb9f979d45b7e)
* [SLP] Don't allow Div/Rem as alternate opcodesAndrei Elovikov2020-01-231-1/+17
| | | | | | | | | | | | | | | | | | Summary: We don't have control/verify what will be the RHS of the division, so it might happen to be zero, causing UB. Reviewers: Vasilis, RKSimon, ABataev Reviewed By: ABataev Subscribers: vporpo, ABataev, hiraditya, llvm-commits, vdmitrie Tags: #llvm Differential Revision: https://reviews.llvm.org/D72740 (cherry picked from commit e1d6d368529322edc658c893c01eaadaf8053ea6)
* [InstCombine] Fix worklist management in DSE (PR44552)Nikita Popov2020-01-231-2/+5
| | | | | | | | | | | | | | | | | | Fixes https://bugs.llvm.org/show_bug.cgi?id=44552. We need to make sure that the store is reprocessed, because performing DSE may expose more DSE opportunities. There is a slight caveat here though: We need to make sure that we add back the store the worklist first, because that means it will be processed after the operands of the removed store have been processed. This is a general bug in InstCombine worklist management that I hope to address at some point, but for now it means we need to do this manually rather than just returning the instruction as changed. Differential Revision: https://reviews.llvm.org/D72807 (cherry picked from commit 522c030aa9b1dd1881feb5a0d0fa2639b4a5feb7)
* [Attributor] AAValueConstantRange: Value range analysis using constant rangeHideto Ueno2020-01-151-7/+499
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch introduces `AAValueConstantRange`, which answers a possible range for integer value in a specific program point. One of the motivations is propagating existing `range` metadata. (I think we need to change the situation that `range` metadata cannot be put to Argument). The state is a tuple of `ConstantRange` and it is initialized to (known, assumed) = ([-∞, +∞], empty). Currently, AAValueConstantRange is created in `getAssumedConstant` method when `AAValueSimplify` returns `nullptr`(worst state). Supported - BinaryOperator(add, sub, ...) - CmpInst(icmp eq, ...) - !range metadata `AAValueConstantRange` is not intended to extend to polyhedral range value analysis. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: phosek, davezarzycki, baziotis, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71620
* [InstCombine] Fix worklist management when removing guard intrinsicNikita Popov2020-01-141-10/+10
| | | | | | | | | | | | | When multiple guard intrinsics are merged into one, currently the result of eraseInstFromFunction() is returned -- however, this should only be done if the current instruction is being removed. In this case we're removing a different instruction and should instead report that the current one has been modified by returning it. For this test case, this reduces the number of instcombine iterations from 5 to 2 (the minimum possible). Differential Revision: https://reviews.llvm.org/D72558
* [NewPM] Port MergeFunctions passNikita Popov2020-01-142-15/+34
| | | | | | | | | | This ports the MergeFunctions pass to the NewPM. This was rather straightforward, as no analyses are used. Additionally MergeFunctions needs to be conditionally enabled in the PassBuilder, but I left that part out of this patch. Differential Revision: https://reviews.llvm.org/D72537
* [InstCombine] Fix infinite loop due to bitcast <-> phi transformsNikita Popov2020-01-141-3/+8
| | | | | | | | | | | | | | | | | Fix for https://bugs.llvm.org/show_bug.cgi?id=44245. The optimizeBitCastFromPhi() and FoldPHIArgOpIntoPHI() end up fighting against each other, because optimizeBitCastFromPhi() assumes that bitcasts of loads will get folded. This doesn't happen here, because a dangling phi node prevents the one-use fold in https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp#L620-L628 from triggering. This patch fixes the issue by explicitly performing the load combine as part of the bitcast of phi transform. Other attempts to force the load to be combined first were ultimately too unreliable. Differential Revision: https://reviews.llvm.org/D71164
* [InstCombine] Make combineLoadToNewType a method; NFCNikita Popov2020-01-142-13/+15
| | | | | So it can be reused as part of other combines. In particular for D71164.
* [InstCombine] Fix user iterator invalidation in bitcast of phi transformNikita Popov2020-01-141-1/+4
| | | | | | | | | This fixes the issue encountered in D71164. Instead of using a range-based for, manually iterate over the users and advance the iterator beforehand, so we do not skip any users due to iterator invalidation. Differential Revision: https://reviews.llvm.org/D72657
* [ThinLTO/WPD] Remove an overly-aggressive assertTeresa Johnson2020-01-141-8/+3
| | | | | | | | | | | | | | | | | | | | Summary: An assert added to the index-based WPD was trying to verify that we only have multiple vtables for a given guid when they are all non-external linkage. This is too conservative because we may have multiple external vtable with the same guid when they are in comdat. Remove the assert, as we don't have comdat information in the index, the linker should issue an error in this case. See discussion on D71040 for more information. Reviewers: evgeny777, aganea Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72648
* [InstCombine] Let combineLoadToNewType preserve ABI alignment of the load ↵Juneyoung Lee2020-01-151-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | (PR44543) Summary: If aligment on `LoadInst` isn't specified, load is assumed to be ABI-aligned. And said aligment may be different for different types. So if we change load type, but don't pay extra attention to the aligment (i.e. keep it unspecified), we may either overpromise (if the default aligment of the new type is higher), or underpromise (if the default aligment of the new type is smaller). Thus, if no alignment is specified, we need to manually preserve the implied ABI alignment. This addresses https://bugs.llvm.org/show_bug.cgi?id=44543 by making combineLoadToNewType preserve ABI alignment of the load. Reviewers: spatel, lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72710
* Removed PointerUnion3 and PointerUnion4 aliases in favor of the variadic ↵Dmitri Gribenko2020-01-141-1/+1
| | | | template
* Revert "Recommit "[GlobalOpt] Pass DTU to removeUnreachableBlocks instead of ↵Florian Hahn2020-01-141-3/+7
| | | | | | | | | | | recomputing."" This reverts commit a03d7b0f24b65d69721dbbbc871df0629efcf774. As discussed in D68298, this causes a compile-time regression, in case the DTs requested are not used elsewhere in GlobalOpt. We should only get the DTs if they are available here, but this seems not possible with the legacy pass manager from a module pass.
* Make helper functions static or move them into anonymous namespaces. NFC.Benjamin Kramer2020-01-141-1/+1
|
* [PGO][CHR] Guard against 0-to-0 branch weight and avoid division by zero crash.Hiroshi Yamauchi2020-01-131-0/+4
| | | | | | | | | | | | Summary: This fixes a crash in internal builds under SamplePGO. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72653
* [ThinLTO/WPD] Fix index-based WPD for alias vtablesTeresa Johnson2020-01-131-1/+1
| | | | | | | | | | | | | | | | | Summary: A recent fix in D69452 fixed index based WPD in the presence of available_externally vtables. It added a cast of the vtable def summary to a GlobalVarSummary. However, in some cases one def may be an alias, in which case we need to get the base object before casting, otherwise we will crash. Reviewers: evgeny777, steven_wu, aganea Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71040
* Fix uninitialized value clang static analyzer warning. NFC.Simon Pilgrim2020-01-111-1/+1
|
* DSE: fix bug where we would only check libcalls for name rather than whole declNuno Lopes2020-01-111-9/+12
|
* [InstCombine] Preserve nuw on sub of geps (PR44419)Nikita Popov2020-01-112-4/+16
| | | | | | | | | Fix https://bugs.llvm.org/show_bug.cgi?id=44419 by preserving the nuw on sub of geps. We only do this if the offset has a multiplication as the final operation, as we can't be sure the operations is nuw in the other cases without more thorough analysis. Differential Revision: https://reviews.llvm.org/D72048
* Add support for __declspec(guard(nocf))Andrew Paverd2020-01-101-6/+4
| | | | | | | | | | | | | | | | | | Summary: Avoid using the `nocf_check` attribute with Control Flow Guard. Instead, use a new `"guard_nocf"` function attribute to indicate that checks should not be added on indirect calls within that function. Add support for `__declspec(guard(nocf))` following the same syntax as MSVC. Reviewers: rnk, dmajor, pcc, hans, aaron.ballman Reviewed By: aaron.ballman Subscribers: aaron.ballman, tomrittervg, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72167
* Don't use dyn_cast_or_null if we know the pointer is nonnull.Simon Pilgrim2020-01-101-4/+2
| | | | Fix clang static analyzer null dereference warning by using dyn_cast instead.
* [LV] Silence unused variable warning in Release builds. NFC.Benjamin Kramer2020-01-101-0/+1
|
* [LV] VPValues for memory operation pointers (NFCI)Gil Rapaport2020-01-105-104/+141
| | | | | | | | | | | | Memory instruction widening recipes use the pointer operand of their load/store ingredient for generating the needed GEPs, making it difficult to feed these recipes with pointers based on other ingredients or none at all. This patch modifies these recipes to use a VPValue for the pointer instead, in order to reduce ingredient def-use usage by ILV as a step towards full VPlan-based def-use relations. The recipes are constructed with VPValues bound to these ingredients, maintaining current behavior. Differential revision: https://reviews.llvm.org/D70865
* [NFCI][LoopUnrollAndJam] Changing LoopUnrollAndJamPass to a functionWhitney Tsang2020-01-091-42/+62
| | | | | | | | | | | | | | | | | | | pass. Summary: This patch changes LoopUnrollAndJamPass to a function pass, and keeps the loops traversal order same as defined in FunctionToLoopPassAdaptor LoopPassManager.h. The next patch will change the loop traversal to outer to inner order, so more loops can be transform. Discussion in llvm-dev mailing list: https://groups.google.com/forum/#!topic/llvm-dev/LF4rUjkVI2g Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto Reviewed By: dmgreen Subscribers: hiraditya, zzheng, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D72230
* [InstCombine] Z / (1.0 / Y) => (Y * Z)@raghesh (Raghesh Aloor)2020-01-091-0/+8
| | | | | | | | | This is a special case of Z / (X / Y) => (Y * Z) / X, with X = 1.0. The m_OneUse check is avoided because even in the case of the multiple uses for 1.0/Y, the number of instructions remain the same and a division is replaced by a multiplication. Differential Revision: https://reviews.llvm.org/D72319
* [Matrix] Update shape propagation to iterate until done.Florian Hahn2020-01-091-43/+62
| | | | | | | | | | | | | | | | | | This patch updates the shape propagation to iterate until no new shape information is discovered. As initial seed for the forward propagation, we use the matrix intrinsic instructions. Both propagateShapeForward and propagateShapeBackward return new work lists, with the instructions to be used for the next iteration. When propagating forward, we record all instructions we added new shape information for. When propagating backward, we record all users of instructions we added new shape information for. Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70901
* [Matrix] Propagate and use shape information for loads.Florian Hahn2020-01-091-13/+29
| | | | | | | | | | | This patch extends to shape propagation to also include load instructions and implements shape aware lowering for vector loads. Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70900
* [LoopUtils][NFC] Minor refactoring in getLoopEstimatedTripCount.Evgeniy Brevnov2020-01-091-7/+7
|
* [Matrix] Implement back-propagation of shape information.Florian Hahn2020-01-091-1/+63
| | | | | | | | | | | This patch extends the shape propagation for matrix operations to also propagate the shape of instructions to their operands. Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70899
* [LV] Still vectorise when tail-folding can't find a primary inducation variableSjoerd Meijer2020-01-091-26/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This addresses a vectorisation regression for tail-folded loops that are counting down, e.g. loops as simple as this: void foo(char *A, char *B, char *C, uint32_t N) { while (N > 0) { *C++ = *A++ + *B++; N--; } } These are loops that can be vectorised, but when tail-folding is requested, it can't find a primary induction variable which we do need for predicating the loop. As a result, the loop isn't vectorised at all, which it is able to do when tail-folding is not attempted. So, this adds a check for the primary induction variable where we decide how to lower the scalar epilogue. I.e., when there isn't a primary induction variable, a scalar epilogue loop is allowed (i.e. don't request tail-folding) so that vectorisation could still be triggered. Having this check for the primary induction variable make sense anyway, and in addition, in a follow-up of this I will look into discovering earlier the primary induction variable for counting down loops, so that this can also be tail-folded. Differential revision: https://reviews.llvm.org/D72324
* [Attributor][FIX] Carefully change invokes to calls (after manifest)Johannes Doerfert2020-01-081-84/+35
| | | | | | Before we manually inserted unreachable early but that could lead to broken PHI nodes. Now we use the existing late modification functionality.
* [Attributor][FIX] Avoid dangling value pointers during code modificationJohannes Doerfert2020-01-081-2/+3
| | | | | | | When we replace instructions with unreachable we delete instructions. We now avoid dangling pointers to those deleted instructions in the `ToBeChangedToUnreachableInsts` set. Other modification collections might need to be updated in the future as well.
* Revert "[JumpThreading] Thread jumps through two basic blocks"Kazu Hirata2020-01-081-228/+2
| | | | | | | | It looks like my patch breaks the sanitizer-windows build: http://lab.llvm.org:8011/builders/sanitizer-windows/builds/56324 This reverts commit ead815924e6ebeaf02c31c37ebf7a560b5fdf67b.
* [JumpThreading] Thread jumps through two basic blocksKazu Hirata2020-01-081-2/+228
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch teaches JumpThreading.cpp to thread through two basic blocks like: bb3: %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 by duplicating basic blocks like bb3 above. Once we duplicate bb3 as bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have: bb3: %var = phi i32* [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb3.dup: %var = phi i32* [ null, %bb1 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 Then the existing code in JumpThreading.cpp can thread edge bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5. Reviewers: wmi Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70247
* Revert "[InstCombine] fold zext of masked bit set/clear"Kadir Cetinkaya2020-01-081-17/+3
| | | | | | | | | | This reverts commit a041c4ec6f7aa659b235cb67e9231a05e0a33b7d. This looks like a non-trivial change and there has been no code reviews (at least there were no phabricator revisions attached to the commit description). It is also causing a regression in one of our downstream integration tests, we haven't been able to come up with a minimal reproducer yet.
OpenPOWER on IntegriCloud