path: root/llvm/lib/Transforms/Vectorize
Commit log (each entry: commit message, author, date, files changed, lines removed/added)
* [NFC][TTI] Add Alignment for isLegalMasked[Load/Store]
  Author: Sam Parker, 2019-10-14. 1 file changed, -8/+12 lines.
  Add an extra parameter so the backend can take the alignment into consideration.
  Differential Revision: https://reviews.llvm.org/D68400
  llvm-svn: 374763
* [LV] Merge LLVM_DEBUG blocks.
  Author: Benjamin Kramer, 2019-10-12. 1 file changed, -15/+17 lines.
  Avoids unused variable warnings about the range-based for loops in there. NFCI.
  llvm-svn: 374646
* recommit: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize
  Author: Zi Xuan Wu, 2019-10-12. 2 files changed, -48/+105 lines.
  In loop-vectorize, the interleave count and vector factor depend on the number of target registers. Currently it does not estimate register pressure separately for different register classes (in particular, scalar float types should not be counted together with int types), so the estimate is not accurate. Specifically, it causes too much interleaving/unrolling, resulting in too many register spills in the loop body and hurting performance.
  So we need to classify the register classes at the IR level. Importantly, these are abstract register classes, not the backend's target register classes from the .td files: they establish a mapping between the types of IR values and the number of simultaneous live ranges we would like to limit for those types.
  For example, register counts on the POWER target are special when VSX is enabled: there are 32 int scalar registers (GPR) and 64 float registers (VSR), while int and float vector registers both use the 64 VSRs. So there should be 2 kinds of register class when VSX is enabled, and 3 kinds when VSX is NOT enabled.
  On the POWER target this gives a big (+~30%) performance improvement in one specific SPEC2017 benchmark (503.bwaves_r) with no other obvious regressions.
  Differential revision: https://reviews.llvm.org/D67148
  llvm-svn: 374634
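  A minimal sketch of the idea described in this entry, using made-up names (AbstractRegClass, RegClassPressure); the real implementation derives the classes and register counts from TargetTransformInfo rather than hard-coding them:

    #include <algorithm>
    #include <map>

    // Abstract register classes tracked at the IR level (illustrative only).
    enum AbstractRegClass { ScalarInt, ScalarFloat, Vector };

    struct RegClassPressure {
      // Peak number of simultaneously live values observed per class.
      std::map<AbstractRegClass, unsigned> MaxLive;

      void record(AbstractRegClass RC, unsigned LiveValues) {
        MaxLive[RC] = std::max(MaxLive[RC], LiveValues);
      }

      // The interleave count is limited by the most-pressured class, instead
      // of by a single global register count. NumRegs is assumed to contain
      // an entry for every class seen.
      unsigned maxInterleaveCount(const std::map<AbstractRegClass, unsigned> &NumRegs) const {
        unsigned IC = 8; // arbitrary upper bound for the sketch
        for (const auto &Entry : MaxLive) {
          unsigned Avail = NumRegs.at(Entry.first);
          unsigned Used = std::max(1u, Entry.second);
          IC = std::min(IC, std::max(1u, Avail / Used));
        }
        return IC;
      }
    };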
* [VPlan] Add moveAfter to VPRecipeBase.
  Author: Florian Hahn, 2019-10-11. 2 files changed, -0/+10 lines.
  This patch adds a moveAfter method to VPRecipeBase, which can be used to move elements after other elements, across VPBasicBlocks, if necessary.
  Reviewers: dcaballe, hsaito, rengolin, hfinkel
  Reviewed By: dcaballe
  Differential Revision: https://reviews.llvm.org/D46825
  llvm-svn: 374565
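  A simplified illustration of what a moveAfter-style operation does, using std::list instead of the actual VPlan classes (the function and parameter names below are not the real API; recipes live in a list owned by their VPBasicBlock):

    #include <iterator>
    #include <list>

    // Unlink Elem from its current list and re-insert it right after
    // InsertPos, which may live in a different list (analogous to moving a
    // recipe into a different VPBasicBlock).
    template <typename T>
    void moveAfter(std::list<T> &FromBlock, typename std::list<T>::iterator Elem,
                   std::list<T> &ToBlock, typename std::list<T>::iterator InsertPos) {
      // splice transfers the node without copying or invalidating it.
      ToBlock.splice(std::next(InsertPos), FromBlock, Elem);
    }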
* [LV][NFC] Factor out calculation of "best" estimated trip count.
  Author: Florian Hahn, 2019-10-10. 1 file changed, -28/+26 lines.
  This is just a small refactoring to minimize changes in an upcoming patch. In the next patch I'm going to introduce changes to the heuristic for vectorization of "tiny trip count" loops.
  Patch by Evgeniy Brevnov <evgueni.brevnov@gmail.com>
  Reviewers: hsaito, Ayal, fhahn, reames
  Reviewed By: hsaito
  Differential Revision: https://reviews.llvm.org/D67690
  llvm-svn: 374338
* [Alignment][NFC] Make VectorUtils use llvm::Align
  Author: Guillaume Chatelet, 2019-10-10. 1 file changed, -2/+3 lines.
  Summary: This patch is part of a series to introduce an Alignment type.
  See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
  See this patch for the introduction of the type: https://reviews.llvm.org/D64790
  Reviewers: courbet
  Subscribers: hiraditya, rogfer01, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D68784
  llvm-svn: 374330
* [SLP] respect target register width for GEP vectorization (PR43578)
  Author: Sanjay Patel, 2019-10-09. 1 file changed, -4/+10 lines.
  We failed to account for the target register width (max vector factor) when vectorizing starting from GEPs. This causes vectorization to proceed to obviously illegal widths as in: https://bugs.llvm.org/show_bug.cgi?id=43578
  For x86, this also means that SLP can produce rogue AVX or AVX512 code even when the user specifies a narrower vector width.
  The AArch64 test in ext-trunc.ll appears to be better using the narrower width. I'm not exactly sure what getelementptr.ll is trying to do, but it's testing with "-slp-threshold=-18", so I'm not worried about those diffs.
  The x86 test is an over-reduction from SPEC h264; this patch appears to restore the perf loss caused by SLP when using -march=haswell.
  Differential Revision: https://reviews.llvm.org/D68667
  llvm-svn: 374183
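  A hedged sketch of the core idea (the function and parameter names are illustrative, not the actual SLPVectorizer code): the candidate vector factor derived from a GEP chain is clamped so the resulting vector fits in the widest legal vector register.

    #include <algorithm>

    // E.g. with 256-bit registers and 64-bit GEP elements, allow at most
    // VF = 4, rather than whatever width the chain of GEPs suggests.
    unsigned clampGEPVectorFactor(unsigned CandidateVF, unsigned ElemSizeInBits,
                                  unsigned MaxVecRegSizeInBits) {
      unsigned MaxVF = std::max(2u, MaxVecRegSizeInBits / ElemSizeInBits);
      return std::min(CandidateVF, MaxVF);
    }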
* [LV] Emitting SCEV checks with OptForSize
  Author: Sjoerd Meijer, 2019-10-09. 1 file changed, -1/+2 lines.
  When optimising for size and SCEV runtime checks need to be emitted to check overflow behaviour, the loop vectorizer can run into this assert:
    LoopVectorize.cpp:2699: void llvm::InnerLoopVectorizer::emitSCEVChecks(llvm::Loop *, llvm::BasicBlock *): Assertion `!BB->getParent()->hasOptSize() && "Cannot SCEV check stride or overflow when opt
  We should not generate predicates while optimising for size because code will be generated for predicates such as these SCEV overflow runtime checks.
  This should fix PR43371.
  Differential Revision: https://reviews.llvm.org/D68082
  llvm-svn: 374166
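  The gist of the fix, as an illustrative predicate (not the actual LoopVectorize logic): when the function is marked for size optimisation and vectorization would require SCEV runtime checks, bail out of vectorization instead of reaching the assertion.

    // Returns true if vectorization may proceed under these constraints.
    bool allowVectorizationWithChecks(bool HasOptSize, unsigned NumSCEVRuntimeChecks) {
      if (HasOptSize && NumSCEVRuntimeChecks > 0)
        return false; // emitting stride/overflow checks would grow code size
      return true;
    }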
* Revert "[LoopVectorize][PowerPC] Estimate int and float register pressure ↵Jinsong Ji2019-10-082-107/+48
| | | | | | | | | | | | | | separately in loop-vectorize" Also Revert "[LoopVectorize] Fix non-debug builds after rL374017" This reverts commit 9f41deccc0e648a006c9f38e11919f181b6c7e0a. This reverts commit 18b6fe07bcf44294f200bd2b526cb737ed275c04. The patch is breaking PowerPC internal build, checked with author, reverting on behalf of him for now due to timezone. llvm-svn: 374091
* [LoopVectorize] Fix non-debug builds after rL374017
  Author: Kadir Cetinkaya, 2019-10-08. 1 file changed, -6/+8 lines.
  llvm-svn: 374021
* [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize
  Author: Zi Xuan Wu, 2019-10-08. 2 files changed, -48/+105 lines.
  In loop-vectorize, the interleave count and vector factor depend on the number of target registers. Currently it does not estimate register pressure separately for different register classes (in particular, scalar float types should not be counted together with int types), so the estimate is not accurate. Specifically, it causes too much interleaving/unrolling, resulting in too many register spills in the loop body and hurting performance.
  So we need to classify the register classes at the IR level. Importantly, these are abstract register classes, not the backend's target register classes from the .td files: they establish a mapping between the types of IR values and the number of simultaneous live ranges we would like to limit for those types.
  For example, register counts on the POWER target are special when VSX is enabled: there are 32 int scalar registers (GPR) and 64 float registers (VSR), while int and float vector registers both use the 64 VSRs. So there should be 2 kinds of register class when VSX is enabled, and 3 kinds when VSX is NOT enabled.
  On the POWER target this gives a big (+~30%) performance improvement in one specific SPEC2017 benchmark (503.bwaves_r) with no other obvious regressions.
  Differential revision: https://reviews.llvm.org/D67148
  llvm-svn: 374017
* Revert "[SLP] avoid reduction transform on patterns that the backend can ↵Martin Storsjo2019-10-071-12/+3
| | | | | | | | | | load-combine" This reverts SVN r373833, as it caused a failed assert "Non-zero loop cost expected" on building numerous projects, see PR43582 for details and reproduction samples. llvm-svn: 373882
* [SLP] avoid reduction transform on patterns that the backend can load-combine
  Author: Sanjay Patel, 2019-10-05. 1 file changed, -3/+12 lines.
  I don't see an ideal solution to these 2 related, potentially large, perf regressions:
  https://bugs.llvm.org/show_bug.cgi?id=42708
  https://bugs.llvm.org/show_bug.cgi?id=43146
  We decided that load combining was unsuitable for IR because it could obscure other optimizations in IR. So we removed the LoadCombiner pass and deferred to the backend. Therefore, preventing SLP from destroying load combine opportunities requires that it recognizes patterns that could be combined later, but does not do the optimization itself (it's not a vector combine anyway, so it's probably out of scope for SLP).
  Here, we add a scalar cost model adjustment with a conservative pattern match and cost summation for a multi-instruction sequence that can probably be reduced later. This should prevent SLP from creating a vector reduction unless that sequence is extremely cheap.
  In the x86 tests shown (and discussed in more detail in the bug reports), SDAG combining will produce a single instruction on these tests like:
    movbe rax, qword ptr [rdi]
  or:
    mov rax, qword ptr [rdi]
  Not some (half) vector monstrosity as we currently do using SLP:
    vpmovzxbq ymm0, dword ptr [rdi + 1] # ymm0 = mem[0],zero,zero,..
    vpsllvq ymm0, ymm0, ymmword ptr [rip + .LCPI0_0]
    movzx eax, byte ptr [rdi]
    movzx ecx, byte ptr [rdi + 5]
    shl rcx, 40
    movzx edx, byte ptr [rdi + 6]
    shl rdx, 48
    or rdx, rcx
    movzx ecx, byte ptr [rdi + 7]
    shl rcx, 56
    or rcx, rdx
    or rcx, rax
    vextracti128 xmm1, ymm0, 1
    vpor xmm0, xmm0, xmm1
    vpshufd xmm1, xmm0, 78 # xmm1 = xmm0[2,3,0,1]
    vpor xmm0, xmm0, xmm1
    vmovq rax, xmm0
    or rax, rcx
    vzeroupper
    ret
  Differential Revision: https://reviews.llvm.org/D67841
  llvm-svn: 373833
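  For reference, the scalar shape the backend can load-combine looks roughly like the following (illustrative C++, not taken from the commit's tests); SDAG turns this byte-by-byte load/shift/or chain into a single 8-byte mov, or a movbe for the byte-swapped variant:

    #include <cstdint>

    // Assemble a little-endian 64-bit value from 8 consecutive bytes.
    uint64_t load_le64(const uint8_t *p) {
      return  (uint64_t)p[0]        | ((uint64_t)p[1] << 8)  |
             ((uint64_t)p[2] << 16) | ((uint64_t)p[3] << 24) |
             ((uint64_t)p[4] << 32) | ((uint64_t)p[5] << 40) |
             ((uint64_t)p[6] << 48) | ((uint64_t)p[7] << 56);
    }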
* [Alignment][NFC] Remove StoreInst::setAlignment(unsigned)
  Author: Guillaume Chatelet, 2019-10-03. 1 file changed, -1/+1 lines.
  Summary: This patch is part of a series to introduce an Alignment type.
  See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
  See this patch for the introduction of the type: https://reviews.llvm.org/D64790
  Reviewers: courbet, bollu, jdoerfert
  Subscribers: hiraditya, asbirlea, cfe-commits, llvm-commits
  Tags: #clang, #llvm
  Differential Revision: https://reviews.llvm.org/D68268
  llvm-svn: 373595
* [Alignment][NFC] Remove LoadInst::setAlignment(unsigned)
  Author: Guillaume Chatelet, 2019-09-30. 1 file changed, -4/+3 lines.
  Summary: This patch is part of a series to introduce an Alignment type.
  See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
  See this patch for the introduction of the type: https://reviews.llvm.org/D64790
  Reviewers: courbet, jdoerfert
  Subscribers: hiraditya, asbirlea, cfe-commits, llvm-commits
  Tags: #clang, #llvm
  Differential Revision: https://reviews.llvm.org/D68142
  llvm-svn: 373195
* [SLP] Fix for PR31847: Assertion failed: (isLoopInvariant(Operands[i], L) && "SCEVAddRecExpr operand is not loop-invariant!")
  Author: Alexey Bataev, 2019-09-29. 1 file changed, -67/+85 lines.
  Initially the SLP vectorizer replaced all going-to-be-vectorized instructions with Undef values. This may break ScalarEvolution and may cause a crash.
  Reworked the SLP vectorizer so that it does not replace vectorized instructions with UndefValue anymore. Instead, vectorized instructions are marked for deletion inside the BoUpSLP class and deleted upon class destruction.
  Reviewers: mzolotukhin, mkuper, hfinkel, RKSimon, davide, spatel
  Subscribers: RKSimon, Gerolf, anemet, hans, majnemer, llvm-commits, sanjoy
  Differential Revision: https://reviews.llvm.org/D29641
  llvm-svn: 373166
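  A conceptual sketch of the deferred-deletion scheme described above (the class and member names are made up; the real logic lives inside BoUpSLP): vectorized scalars are only recorded as dead during vectorization and are actually erased when the helper is destroyed, so ScalarEvolution never sees half-replaced IR mid-transform.

    #include <vector>
    #include "llvm/IR/Constants.h"
    #include "llvm/IR/Instruction.h"

    class DeferredInstDeleter {
      std::vector<llvm::Instruction *> DeadInsts;

    public:
      void markForDeletion(llvm::Instruction *I) { DeadInsts.push_back(I); }

      ~DeferredInstDeleter() {
        for (llvm::Instruction *I : DeadInsts) {
          // Redirect any stray uses to undef just before erasing.
          I->replaceAllUsesWith(llvm::UndefValue::get(I->getType()));
          I->eraseFromParent();
        }
      }
    };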
* [Alignment][NFC] Remove unneeded llvm:: scoping on Align types
  Author: Guillaume Chatelet, 2019-09-27. 1 file changed, -2/+2 lines.
  llvm-svn: 373081
* Revert [SLP] Fix for PR31847: Assertion failed: (isLoopInvariant(Operands[i], L) && "SCEVAddRecExpr operand is not loop-invariant!")
  Author: Jordan Rupprecht, 2019-09-26. 1 file changed, -75/+66 lines.
  This reverts r372626 (git commit 6a278d9073bdc158d31d4f4b15bbe34238f22c18).
  llvm-svn: 373019
* LoopVectorize - silence static analyzer dyn_cast<CmpInst> null dereference warning. NFCI.
  Author: Simon Pilgrim, 2019-09-24. 1 file changed, -1/+1 lines.
  The static analyzer is warning about a potential null dereference, but we should be able to use cast<CmpInst> directly, and if not the assert will fire for us.
  llvm-svn: 372732
* [LV] Forced vectorization with runtime checks and OptForSize
  Author: Sjoerd Meijer, 2019-09-24. 1 file changed, -2/+13 lines.
  When vectorisation is forced with a pragma while we optimise for minimum size and we need to emit runtime memory checks, allow this code growth instead of running into an assert like we currently do. This is the result of D65197 and D66803, and was a use-case not really considered before. When this now happens, we emit an optimisation remark warning about the code-size expansion, which can be avoided by not forcing vectorisation or possibly by source-code modifications.
  Differential Revision: https://reviews.llvm.org/D67764
  llvm-svn: 372694
* [SLP] Fix for PR31847: Assertion failed: (isLoopInvariant(Operands[i], L) && "SCEVAddRecExpr operand is not loop-invariant!")
  Author: Alexey Bataev, 2019-09-23. 1 file changed, -66/+75 lines.
  Summary: Initially the SLP vectorizer replaced all going-to-be-vectorized instructions with Undef values. This may break ScalarEvolution and may cause a crash.
  Reworked the SLP vectorizer so that it does not replace vectorized instructions with UndefValue anymore. Instead, vectorized instructions are marked for deletion inside the BoUpSLP class and deleted upon class destruction.
  Reviewers: mzolotukhin, mkuper, hfinkel, RKSimon, davide, spatel
  Subscribers: RKSimon, Gerolf, anemet, hans, majnemer, llvm-commits, sanjoy
  Differential Revision: https://reviews.llvm.org/D29641
  llvm-svn: 372626
* [VPlan] Silence static analyzer dyn_cast null dereference warning. NFCI.
  Author: Simon Pilgrim, 2019-09-22. 1 file changed, -1/+1 lines.
  llvm-svn: 372502
* [LoopVectorize] Don't dereference a dyn_cast result. NFCI.
  Author: Simon Pilgrim, 2019-09-17. 1 file changed, -1/+1 lines.
  The static analyzer is warning about potential null dereferences of dyn_cast<> results. We can use cast<> directly, as we know these cases should all be CastInst, which is why it's working at the moment; cast<> will assert if they aren't.
  llvm-svn: 372116
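  For illustration, the general shape of this kind of cleanup (simplified, not the exact code touched by the commit): dyn_cast<> returns null on a type mismatch, so the analyzer flags an unchecked dereference, while cast<> documents the known type and asserts in debug builds.

    #include "llvm/IR/InstrTypes.h"
    #include "llvm/IR/Value.h"
    using namespace llvm;

    Type *srcTypeBefore(Value *V) {
      auto *CI = dyn_cast<CastInst>(V); // may be null -> analyzer warning
      return CI->getSrcTy();
    }

    Type *srcTypeAfter(Value *V) {
      auto *CI = cast<CastInst>(V);     // asserts if V is not a CastInst
      return CI->getSrcTy();
    }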
* [VPlanSLP] Don't dereference a cast_or_null<VPInstruction> result. NFCI.
  Author: Simon Pilgrim, 2019-09-16. 1 file changed, -5/+8 lines.
  The static analyzer is warning about a potential null dereference of the cast_or_null result. I've split the cast_or_null check from the ->getUnderlyingInstr() call to avoid this, but it appears that we weren't seeing any null pointers in the dumped bundles in the first place.
  llvm-svn: 371975
* [SLPVectorizer] Assert that we find a LastInst to silence analyzer null dereference warning. NFCI.
  Author: Simon Pilgrim, 2019-09-16. 1 file changed, -0/+1 lines.
  llvm-svn: 371974
* [SLPVectorizer] Don't dereference a dyn_cast result. NFCI.
  Author: Simon Pilgrim, 2019-09-16. 1 file changed, -4/+4 lines.
  The static analyzer is warning about potential null dereferences of dyn_cast<> results. In these cases we can safely use cast<> directly, as we know these cases should all be the correct type, which is why it's working at the moment; cast<> will assert if they aren't.
  llvm-svn: 371973
* [LoadStoreVectorizer] vectorizeLoadChain - ensure we find a valid Type down the load chain. NFCI.
  Author: Simon Pilgrim, 2019-09-15. 1 file changed, -1/+2 lines.
  Silence a static analyzer uninitialized variable warning by setting LoadTy to null and then asserting that we find a real value.
  llvm-svn: 371936
* [SLP] limit vectorization of Constant subclasses (PR33958)
  Author: Sanjay Patel, 2019-09-15. 1 file changed, -2/+5 lines.
  This is a fix for: https://bugs.llvm.org/show_bug.cgi?id=33958
  It seems universally true that we would not want to transform this kind of sequence on any target, but if that's not correct, then we could view this as a target-specific cost model problem. We could also white-list ConstantInt, ConstantFP, etc. rather than blacklist Global and ConstantExpr.
  Differential Revision: https://reviews.llvm.org/D67362
  llvm-svn: 371931
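  A hedged sketch of the kind of gating check described above (illustrative names, not the exact predicate added by the patch): treat only simple constants as cheap vectorization candidates and give up on ConstantExpr or global-referencing operands whose materialization cost is hard to model.

    #include "llvm/IR/Constants.h"
    #include "llvm/IR/GlobalValue.h"
    using namespace llvm;

    bool isCheapConstantOperand(const Value *V) {
      if (isa<ConstantExpr>(V) || isa<GlobalValue>(V))
        return false;                       // hard to cost-model; bail out
      return isa<ConstantInt>(V) || isa<ConstantFP>(V) || isa<UndefValue>(V);
    }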
* [Loads] Move generic code out of vectorizer into a location where it might be reused [NFC]
  Author: Philip Reames, 2019-09-10. 1 file changed, -51/+0 lines.
  llvm-svn: 371558
* [ValueTracking] Factor out common speculation suppression logic [NFC]
  Author: Philip Reames, 2019-09-10. 1 file changed, -15/+0 lines.
  Expose a utility function so that all places which want to suppress speculation (when otherwise legal) due to ordering and/or sanitizer interaction can do so.
  llvm-svn: 371556
* [LoopVectorize] Leverage speculation safety to avoid masked.loads
  Author: Philip Reames, 2019-09-09. 1 file changed, -4/+85 lines.
  If we're vectorizing a load in a predicated block, check to see if the load can be speculated rather than predicated. This allows us to generate a normal vector load instead of a masked.load.
  To do so, we must prove that all bytes accessed on any iteration of the original loop are dereferenceable, and that all loads (across all iterations) are properly aligned. This is equivalent to proving that hoisting the load into the loop header in the original scalar loop is safe.
  Note: There are a couple of code motion todos in the code. My intention is to wait about a day - to be sure this sticks - and then perform the NFC motion without further review.
  Differential Revision: https://reviews.llvm.org/D66688
  llvm-svn: 371452
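  An illustrative source pattern for this optimisation (made up, not from the commit's tests): in the scalar loop the load of a[i] only happens under the condition, but if the vectorizer can prove that every a[i] the loop could touch is dereferenceable and aligned, it can emit a plain vector load and keep only the update under the mask instead of a masked.load.

    // Whether speculation is actually legal depends on what the compiler can
    // prove about the bounds and alignment of 'a'; this only shows the shape.
    void addIfFlagged(float *__restrict out, const float *__restrict a,
                      const unsigned char *__restrict flag, int n) {
      for (int i = 0; i < n; ++i)
        if (flag[i])
          out[i] += a[i]; // in the scalar loop these accesses are predicated
    }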
* Fix typo. NFCI
  Author: Simon Pilgrim, 2019-09-07. 1 file changed, -1/+1 lines.
  llvm-svn: 371317
* Change TargetLibraryInfo analysis passes to always require Function
  Author: Teresa Johnson, 2019-09-07. 2 files changed, -2/+2 lines.
  Summary: This is the first change to enable the TLI to be built per-function so that -fno-builtin* handling can be migrated to use function attributes. See discussion on D61634 for background. This is an enabler for fixing handling of these options for LTO, for example.
  This change should not affect behavior, as the provided function is not yet used to build a specifically per-function TLI, but rather enables that migration.
  Most of the changes were very mechanical, e.g. passing a Function to the legacy analysis pass's getTLI interface, or in Module level cases, adding a callback. This is similar to the way the per-function TTI analysis works.
  There was one place where we were looking for builtins but not in the context of a specific function. See FindCXAAtExit in lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround could provide the wrong behavior in some corner cases. Suggestions welcome.
  Reviewers: chandlerc, hfinkel
  Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66428
  llvm-svn: 371284
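  The mechanical shape of the change for legacy-PM users, sketched from the description above (assuming the post-change getTLI(const Function &) signature on the wrapper pass; the helper name is made up):

    #include "llvm/Analysis/TargetLibraryInfo.h"
    #include "llvm/IR/Function.h"
    using namespace llvm;

    // The analysis is now queried for a concrete function, which is what
    // later allows per-function -fno-builtin handling.
    const TargetLibraryInfo &getTLIFor(TargetLibraryInfoWrapperPass &TLIWP,
                                       const Function &F) {
      return TLIWP.getTLI(F); // previously: TLIWP.getTLI() with no argument
    }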
* [LLVM][Alignment] Convert isLegalNTStore/isLegalNTLoad to llvm::Align
  Author: Guillaume Chatelet, 2019-09-05. 1 file changed, -2/+4 lines.
  Summary: This patch is part of a series to introduce an Alignment type.
  See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
  See this patch for the introduction of the type: https://reviews.llvm.org/D64790
  Reviewers: courbet
  Subscribers: hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D67223
  llvm-svn: 371063
* [NFC] Switch last couple of invariant_load checks to use hasMetadata
  Author: Philip Reames, 2019-09-04. 1 file changed, -1/+1 lines.
  llvm-svn: 370948
* [LV] Fix miscompiles by adding non-header PHI nodes to AllowedExit
  Author: Bjorn Pettersson, 2019-09-03. 1 file changed, -0/+1 lines.
  Summary: Fold-tail currently supports reduction last-vector-value live-outs, but has yet to support last-scalar-value live-outs, including non-header phis. As it relies on AllowedExit in order to detect them and bail out, we need to add the non-header PHI nodes to AllowedExit, otherwise we end up with miscompiles.
  Solves https://bugs.llvm.org/show_bug.cgi?id=43166
  Reviewers: fhahn, Ayal
  Reviewed By: fhahn, Ayal
  Subscribers: anna, hiraditya, rkruppe, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D67074
  llvm-svn: 370721
* [LV] Tail-folding, runtime SCEV checks
  Author: Sjoerd Meijer, 2019-09-03. 1 file changed, -2/+2 lines.
  Now that we allow tail-folding not only when we optimise for size, make sure we do not run into this assert.
  Differential revision: https://reviews.llvm.org/D66932
  llvm-svn: 370711
* [LV] Tail-folding with runtime memory checks
  Author: Sjoerd Meijer, 2019-09-03. 1 file changed, -1/+4 lines.
  The loop vectorizer was running into an assert when it tried to fold the tail and had to emit runtime memory disambiguation checks.
  Differential revision: https://reviews.llvm.org/D66803
  llvm-svn: 370707
* Fix cppcheck shadow variable and variable scope warnings. NFCI.
  Author: Simon Pilgrim, 2019-08-31. 1 file changed, -6/+5 lines.
  llvm-svn: 370580
* [LV] Fold tail by masking - handle reductions
  Author: Ayal Zaks, 2019-08-28. 3 files changed, -11/+57 lines.
  Allow vectorizing loops that have reductions when the tail is folded by masking. A select is introduced in VPlan, choosing between the last value carried by the loop-exit/live-out instruction of the reduction, and the penultimate value carried by the reduction phi, according to the "i < n" mask of fold-tail. This select replaces the last value as the live-out value of the loop.
  Differential Revision: https://reviews.llvm.org/D66720
  llvm-svn: 370173
* Add a clarifying comment for the meaning of SafePointes [NFC]
  Author: Philip Reames, 2019-08-26. 1 file changed, -1/+5 lines.
  Extracted from D66688 as requested.
  llvm-svn: 369962
* [SLP] use range-for loops, fix formatting; NFC
  Author: Sanjay Patel, 2019-08-23. 1 file changed, -32/+32 lines.
  These are part of D57059, but that patch doesn't apply cleanly to trunk at this point, so we might as well remove some of the noise.
  llvm-svn: 369776
* [SLP] fix formatting; NFC
  Author: Sanjay Patel, 2019-08-23. 1 file changed, -4/+3 lines.
  These are part of D57059, but that patch doesn't apply cleanly to trunk at this point, so we might as well remove some of the noise.
  llvm-svn: 369769
* [SLP][NFC] Avoid repetitive calls to getSameOpcode()
  Author: Dinar Temirbulatov, 2019-08-20. 1 file changed, -120/+176 lines.
  We can avoid repetitive calls to getSameOpcode() for already known tree elements by keeping MainOp and AltOp in TreeEntry.
  Differential Revision: https://reviews.llvm.org/D64700
  llvm-svn: 369315
* [SLP] reduce duplicated code; NFC
  Author: Sanjay Patel, 2019-08-19. 1 file changed, -2/+4 lines.
  llvm-svn: 369250
* [SLPVectorizer] Make the scheduler aware of the TreeEntry operands.
  Author: Vasileios Porpodas, 2019-08-16. 1 file changed, -79/+171 lines.
  Summary: The scheduler's dependence graph gets the use-def dependencies by accessing the operands of the instructions in a bundle. However, buildTree_rec() may change the order of the operands in TreeEntry, and the scheduler is currently not aware of this. This is not causing any functional issues currently, because reordering is restricted to the operands of a single instruction. Once we support operand reordering across multiple TreeEntries, as shown here: http://www.llvm.org/devmtg/2019-04/slides/Poster-Porpodas-Supernode_SLP.pdf, the scheduler will need to get the correct operands from TreeEntry and not from the individual instructions.
  In short, this patch:
  - Connects the scheduler's bundle with the corresponding TreeEntry. It introduces new TE and Lane fields in ScheduleData.
  - Moves the location where the operands of the TreeEntry are initialized. This used to take place in newTreeEntry(), setting one operand at a time, but is now moved pre-order just before the recursion of buildTree_rec(). This is required because the scheduler needs to access both operands of the TreeEntry in tryScheduleBundle().
  - Updates the scheduler to access the instruction operands through the TreeEntry operands instead of accessing the instruction operands directly.
  Reviewers: ABataev, RKSimon, dtemirbulatov, Ayal, dorit, hfinkel
  Reviewed By: ABataev
  Subscribers: hiraditya, llvm-commits, lebedev.ri, rcorcs
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D62432
  llvm-svn: 369131
* [SLPVectorizer] Silence null dereference warning. NFCI.
  Author: Simon Pilgrim, 2019-08-16. 1 file changed, -0/+1 lines.
  cppcheck and the MSVC analyzer both overzealously warn that we might dereference a null Bundle pointer. Add an assertion to check for null to silence the warning; plus it's a good idea to check that we succeeded in finding a schedule bundle anyway.
  llvm-svn: 369094
* [llvm] Migrate llvm::make_unique to std::make_unique
  Author: Jonas Devlieghere, 2019-08-15. 2 files changed, -6/+6 lines.
  Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo.
  llvm-svn: 369013
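  The replacement performed throughout, shown on a trivial standalone example (the function and its contents are made up for illustration):

    #include <memory>
    #include <string>

    std::unique_ptr<std::string> makeLabel() {
      // Previously: return llvm::make_unique<std::string>("vector.body");
      return std::make_unique<std::string>("vector.body");
    }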
* [LV] fold-tail predication should be respected even with assume_safety
  Author: Dorit Nuzman, 2019-08-15. 2 files changed, -5/+5 lines.
  assume_safety implies that loads under "if's" can be safely executed speculatively (unguarded, unmasked). However, this assumption holds only for the original user "if's", not those introduced by the compiler, such as the fold-tail "if" that guards us from loading beyond the original loop trip-count.
  Currently the combination of fold-tail and assume-safety pragmas results in ignoring the fold-tail predicate that guards the loads, generating unmasked loads. This patch fixes that behavior.
  Differential Revision: https://reviews.llvm.org/D66106
  Reviewers: Ayal, hsaito, fhahn
  llvm-svn: 368973
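  An illustrative example of the interaction (not from the commit's tests; the pragma spelling is clang's loop hint): the user's assume_safety promise covers the load under the user-written "if", but the compiler's own fold-tail predicate must still mask the loads, or the vectorized loop would read past the original trip count.

    void scaleFlagged(float *__restrict out, const float *__restrict a,
                      const unsigned char *__restrict flag, int n) {
    #pragma clang loop vectorize(assume_safety)
      for (int i = 0; i < n; ++i)
        if (flag[i])            // user "if": loads may be speculated per the pragma
          out[i] = a[i] * 2.0f; // the compiler's fold-tail "if (i < n)" still applies
    }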
* [SLP][NFC] Use pointers to address ScalarToTreeEntry elements, instead of indexes.
  Author: Dinar Temirbulatov, 2019-08-14. 1 file changed, -4/+4 lines.
  llvm-svn: 368906