bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	Force vector width for scev-expander-debug.ll test	Florian Hahn	2018-06-25	1	-1/+1
	llvm-svn: 335520
*	[SCEVExp] Advance found insertion point until we find a non-dbg instruction.	Florian Hahn	2018-06-25	1	-0/+61
	This avoids creating unnecessary casts if the IP used to be a dbg info intrinsic. Fixes PR37727.
	Reviewers: vsk, aprantl, sanjoy, efriedma
	Reviewed By: vsk, efriedma
	Differential Revision: https://reviews.llvm.org/D47874
	llvm-svn: 335513
*	[LoopVectorize] regenerate full checks; NFC	Sanjay Patel	2018-06-21	1	-7/+61
	llvm-svn: 335257
*	Move redundant-vf2-cost.ll test to X86 directory	Diego Caballero	2018-06-15	1	-0/+0
	redundant-vf2-cost.ll is X86 specific. Moved from test/Transforms/LoopVectorize/redundant-vf2-cost.ll to test/Transforms/LoopVectorize/X86/redundant-vf2-cost.ll
	llvm-svn: 334854
*	[LV] Prevent LV to run cost model twice for VF=2	Diego Caballero	2018-06-15	1	-0/+34
	This is a minor fix for the LV cost model, where the cost for VF=2 was computed twice when the vectorization of the loop was forced without specifying a VF.
	Reviewers: xusx595, hsaito, fhahn, mkuper
	Reviewed By: hsaito, xusx595
	Differential Revision: https://reviews.llvm.org/D48048
	llvm-svn: 334840
*	[LV] Fix PR36983. For a given recurrence, fix all phis in exit block	Roman Shirokiy	2018-06-08	1	-0/+24
	There could be more than one PHI in the exit block using the same loop recurrence. Don't assume there is only one; fix each user.
	Differential Revision: https://reviews.llvm.org/D47788
	llvm-svn: 334271
*	[TargetLibraryInfo] add mappings from LLVM sin/cos intrinsics to SVML calls	Sanjay Patel	2018-06-07	1	-4/+4
	These weren't included in D19544 - probably just an oversight. D40044 made it more likely that we'll have LLVM math intrinsics rather than libcalls, so this bug was more easily exposed.
	As the tests/code show, we already have the complete mappings for pow/exp/log. I don't have any experience with SVML, so I don't know if anything else is missing. It's also not clear to me that we should be doing this transform in IR rather than DAG/isel, but that's a separate issue.
	Differential Revision: https://reviews.llvm.org/D47610
	llvm-svn: 334211
*	[ConstantFold] Disallow folding vector geps into bitcasts	Karl-Johan Karlsson	2018-06-01	1	-3/+3
	Summary: Getelementptr returns a vector of pointers, instead of a single address, when one or more of its arguments is a vector. In such a case it is not possible to simplify the expression by inserting a bitcast of operand(0) into the destination type, as it will create a bitcast between different sizes.
	Reviewers: majnemer, mkuper, mssimpso, spatel
	Reviewed By: spatel
	Subscribers: lebedev.ri, llvm-commits
	Differential Revision: https://reviews.llvm.org/D46379
	llvm-svn: 333783
*	[LoopVectorize, x86] add tests to show missing SVML transforms; NFC	Sanjay Patel	2018-05-31	1	-344/+356
	llvm-svn: 333707
*	[LoopVectorize, x86] regenerate checks; NFC	Sanjay Patel	2018-05-31	1	-99/+403
	I removed the 'fast' flag from the calls because that's not required.
	llvm-svn: 333695
*	[VPlan] Reland r332654 and silence unused func warning	Diego Caballero	2018-05-21	1	-0/+51
	r332654 was reverted due to an unused function warning in release build. This commit includes the same code with the warning silenced.
	Differential Revision: https://reviews.llvm.org/D44338
	llvm-svn: 332860
*	Delete a test that was missed in the revert r332747.	Amara Emerson	2018-05-18	1	-51/+0
	r332747 originally reverted r332654 which added this test.
	llvm-svn: 332755
*	[X86][CET] Changing -fcf-protection behavior to comply with gcc (LLVM part)	Alexander Ivchenko	2018-05-18	1	-1/+1
	This patch aims to match the changes introduced in gcc by https://gcc.gnu.org/ml/gcc-cvs/2018-04/msg00534.html. The IBT feature definition is removed, with the IBT instructions being freely available on all X86 targets. The shadow stack instructions are also being made freely available, and the use of all these CET instructions is controlled by the module flags derived from the -fcf-protection clang option. The hasSHSTK option remains since clang uses it to determine availability of shadow stack instruction intrinsics, but it is no longer directly used.
	Comes with a clang patch (D46881).
	Patch by mike.dvoretsky
	Differential Revision: https://reviews.llvm.org/D46882
	llvm-svn: 332705
*	[LV][VPlan] Build plain CFG with simple VPInstructions for outer loops.	Diego Caballero	2018-05-17	1	-0/+51
	Patch #3 from VPlan Outer Loop Vectorization Patch Series #1 (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html).
	Expected to be NFC for the current inner loop vectorization path. It introduces the basic algorithm to build the VPlan plain CFG (single-level CFG, no hierarchical CFG (H-CFG), yet) in the VPlan-native vectorization path using VPInstructions. It includes:
	- VPlanHCFGBuilder: Main class to build the VPlan H-CFG (plain CFG without nested regions, for now).
	- VPlanVerifier: Main class with utilities to check the consistency of a H-CFG.
	- VPlanBlockUtils: Main class with utilities to manipulate VPBlockBases in VPlan.
	Reviewers: rengolin, fhahn, mkuper, mssimpso, a.elovikov, hfinkel, aprantl.
	Differential Revision: https://reviews.llvm.org/D44338
	llvm-svn: 332654
*	[LV] Add lit testcase for bitcast problem. NFC	Karl-Johan Karlsson	2018-05-09	1	-0/+54
	llvm-svn: 331878
*	[DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label.	Shiva Chen	2018-05-09	18	-28/+28
	In order to set breakpoints on labels and list source code around labels, we need to collect debug information for labels, i.e., the label name, the function the label belongs to, the line number in the file, and the address where the label is located. To keep this information in LLVM IR and to allow the backend to generate debug information correctly, we create a new kind of metadata for labels, DILabel. The format of DILabel is
	  !DILabel(scope: !1, name: "foo", file: !2, line: 3)
	We hope to keep as much debug information as possible even when the code is optimized. So, we create a new kind of intrinsic for label metadata to prevent the metadata from being eliminated along with its basic block. The intrinsic survives as long as it is not itself optimized out. The format of the intrinsic is
	  llvm.dbg.label(metadata !1)
	It has only one argument, which is the DILabel metadata. The intrinsic will follow the label immediately. The backend can get the label metadata through the intrinsic's parameter.
	We also create a DIBuilder API for labels to be used by the frontend. The frontend can use createLabel() to allocate DILabel objects, and use insertLabel() to insert the llvm.dbg.label intrinsic in LLVM IR.
	Differential Revision: https://reviews.llvm.org/D45024
	Patch by Hsiangkai Wang.
	llvm-svn: 331841
*	[LV] Fix for PR37248, Broadcast codegen incorrectly assumed vector loop body is single basic block	Hideki Saito	2018-05-08	1	-0/+42
	Summary: Broadcast code generation emitted instructions in the pre-header, while the instructions they depend on are in the vector loop body. This resulted in an IL verification error: value used before defined.
	Reviewers: rengolin, fhahn, hfinkel
	Reviewed By: rengolin, fhahn
	Subscribers: dcaballe, Ka-Ka, llvm-commits
	Differential Revision: https://reviews.llvm.org/D46302
	llvm-svn: 331799
*	[LV] Move test/Transforms/LoopVectorize/pr23997.ll	Daniel Neilson	2018-05-01	1	-0/+0
	Summary: This fixes a build break with r331269. test/Transforms/LoopVectorize/pr23997.ll should be in: test/Transforms/LoopVectorize/X86/pr23997.ll
	llvm-svn: 331281
*	[LV] Preserve inbounds on created GEPs	Daniel Neilson	2018-05-01	11	-251/+360
	Summary: This is a fix for PR23997. The loop vectorizer is not preserving the inbounds property of GEPs that it creates. This is inhibiting some optimizations. This patch preserves the inbounds property in the case where a load/store is being fed by an inbounds GEP.
	Reviewers: mkuper, javed.absar, hsaito
	Reviewed By: hsaito
	Subscribers: dcaballe, hsaito, llvm-commits
	Differential Revision: https://reviews.llvm.org/D46191
	llvm-svn: 331269
*	[LV][VPlan] Detect outer loops for explicit vectorization.	Diego Caballero	2018-04-24	3	-0/+553
	Patch #2 from VPlan Outer Loop Vectorization Patch Series #1 (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html).
	This patch introduces the basic infrastructure to detect, legality check and process outer loops annotated with hints for explicit vectorization. All these changes are protected under the feature flag -enable-vplan-native-path. This should make this patch NFC for the existing inner loop vectorizer.
	Reviewers: hfinkel, mkuper, rengolin, fhahn, aemerson, mssimpso.
	Differential Revision: https://reviews.llvm.org/D42447
	llvm-svn: 330739
*	[LV] Introduce TTI::getMinimumVF	Krzysztof Parzyszek	2018-04-13	2	-0/+175
	The function getMinimumVF(ElemWidth) will return the minimum VF for a vector with elements of size ElemWidth bits. This value will only apply to targets for which TTI::shouldMaximizeVectorBandwidth returns true. The value of 0 indicates that there is no minimum VF.
	Differential Revision: https://reviews.llvm.org/D45271
	llvm-svn: 330062
*	[InstCombine] reassociate loop invariant GEP chains to enable LICM	Sebastian Pop	2018-03-26	1	-2/+2
	This change brings performance of zlib up by 10%. The example below is from a hot loop in longest_match() from zlib.
	  do.body:
	    %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
	    %idx.ext = zext i32 %cur_match.addr.0 to i64
	    %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext
	    %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1
	    %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1
	In this example %idx.ext1 is a loop invariant. It will be moved above the use of the loop induction variable %idx.ext such that it can be hoisted out of the loop by LICM. The operands that have dependences carried by the loop will be sunk down in the GEP chain. This patch will produce the following output:
	  do.body:
	    %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
	    %idx.ext = zext i32 %cur_match.addr.0 to i64
	    %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1
	    %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1
	    %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext
	llvm-svn: 328539
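The reassociation described in r328539 can be illustrated outside of LLVM with ordinary pointer arithmetic. The sketch below is only an analogy, not InstCombine's code: the function names `sumOriginal` and `sumReassociated`, the byte-summing loop, and the use of `std::vector` are all assumptions made for illustration. It shows that moving the loop-invariant offset ahead of the loop-varying index lets the invariant part be computed once before the loop while reading the same bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Address formed as ((win + iv) + inv) - 1: the loop-invariant offset `inv`
// is applied after the loop-varying index `iv`, so it is redone every iteration.
std::uint32_t sumOriginal(const std::vector<std::uint8_t> &win,
                          const std::vector<std::size_t> &ivs,
                          std::size_t inv) {
  std::uint32_t sum = 0;
  for (std::size_t iv : ivs) {
    assert(inv >= 1 && iv + inv - 1 < win.size());
    const std::uint8_t *p = win.data() + iv; // loop-varying part first
    p = p + inv;                             // invariant part, stuck in the loop
    p = p - 1;
    sum += *p;
  }
  return sum;
}

// Address formed as (win + (inv - 1)) + iv: the invariant part is computed
// once before the loop, which is the hoisting that reassociation enables.
std::uint32_t sumReassociated(const std::vector<std::uint8_t> &win,
                              const std::vector<std::size_t> &ivs,
                              std::size_t inv) {
  assert(inv >= 1 && inv - 1 <= win.size());
  const std::uint8_t *base = win.data() + (inv - 1); // hoisted out of the loop
  std::uint32_t sum = 0;
  for (std::size_t iv : ivs) {
    assert(iv + inv - 1 < win.size());
    sum += *(base + iv); // only the loop-varying index remains
  }
  return sum;
}
```

Both functions read exactly the same elements; only the order in which the offsets are applied differs.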
*	Revert r325687 (workaround for PR36032).	Evgeny Stupachenko	2018-03-22	4	-5/+5
	Summary: Revert r325687 workaround for PR36032 since a fix was committed in r326154.
	Reviewers: sbaranga
	Differential Revision: http://reviews.llvm.org/D44768
	From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com>
	llvm-svn: 328257
*	[LV] Let recordVectorLoopValueForInductionCast to check if IV was created from the cast.	Andrei Elovikov	2018-03-20	1	-0/+39
	Summary: It turned out to be error-prone to expect the callers to handle that - better to leave the decision to this routine and make the required data to be explicitly passed to the function. This handles the case that was missed in the r322473 and fixes the assert mentioned in PR36524.
	Reviewers: dorit, mssimpso, Ayal, dcaballe
	Reviewed By: dcaballe
	Subscribers: Ka-Ka, hiraditya, dneilson, hsaito, llvm-commits
	Differential Revision: https://reviews.llvm.org/D43812
	llvm-svn: 327960
*	[LV] Adding test for r327109	Renato Golin	2018-03-09	1	-0/+49
	llvm-svn: 327155
*	[SCEV] Smart range calculation for SCEVUnknown Phis	Max Kazantsev	2018-03-01	1	-3/+43
	The range of SCEVUnknown Phi which merges values `X1, X2, ..., XN` can be evaluated as `U(Range(X1), Range(X2), ..., Range(XN))`.
	Reviewed By: sanjoy
	Differential Revision: https://reviews.llvm.org/D43810
	llvm-svn: 326418
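As a rough illustration of the `U(Range(X1), ..., Range(XN))` rule above, the sketch below merges the ranges of a phi's incoming values into one enclosing range. It is a simplification under stated assumptions: `Range`, `unionWith`, and `phiRange` are hypothetical names, and ranges are modelled as closed, non-wrapping 64-bit intervals, whereas ScalarEvolution works with ranges that can wrap around.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Closed interval [lo, hi] with lo <= hi (assumption: no wrap-around).
struct Range {
  std::int64_t lo;
  std::int64_t hi;
};

// Smallest single interval that contains both inputs.
Range unionWith(Range a, Range b) {
  return Range{std::min(a.lo, b.lo), std::max(a.hi, b.hi)};
}

// Range of a phi: the union of the ranges of all incoming values.
Range phiRange(const std::vector<Range> &incoming) {
  assert(!incoming.empty());
  Range result = incoming.front();
  for (const Range &r : incoming)
    result = unionWith(result, r);
  return result;
}
```

Because the result is a single interval, it may cover values that no incoming range contains; that over-approximation is safe for a conservative analysis.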
*	[ARM] add loop vectorizer test based on 482.sphinx3 from SPEC2006; NFC	Sanjay Patel	2018-02-27	1	-0/+165
	This is a slight reduction of one of the benchmarks that suffered with D43079. Cost model changes should not cause this test to remain scalarized.
	llvm-svn: 326221
*	Make test agnostic to cost model	Adam Nemet	2018-02-27	1	-1/+1
	This was causing bot failures on greendragon.
	llvm-svn: 326169
*	Fix r326154 buildbots test fail	Evgeny Stupachenko	2018-02-27	2	-3/+2
	Summary: Add specific mtriples to tests added in r326154.
	From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com>
	llvm-svn: 326158
*	Fix PR36032, PR35432	Evgeny Stupachenko	2018-02-27	2	-0/+367
	Summary: This change fixes an assertion failure in ScalarEvolutionExpander.cpp:
	  assert(ExitCount != SE.getCouldNotCompute() && "Invalid loop count");
	Reviewers: sbaranga
	Differential Revision: http://reviews.llvm.org/D42604
	From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com>
	llvm-svn: 326154
*	[LV] Move isLegalMasked* functions from Legality to CostModel	Renato Golin	2018-02-26	2	-2/+3
	All SIMD architectures can emulate masked load/store/gather/scatter through element-wise condition check, scalar load/store, and insert/extract. Therefore, bailing out of vectorization as legality failure, when they return false, is incorrect. We should proceed to cost model and determine profitability.
	This patch is to address the vectorizer's architectural limitation described above. As such, I tried to keep the cost model and vectorize/don't-vectorize behavior nearly unchanged. Cost model tuning should be done separately.
	Please see http://lists.llvm.org/pipermail/llvm-dev/2018-January/120164.html for RFC and the discussions.
	Closes D43208.
	Patch by: Hideki Saito <hideki.saito@intel.com>
	llvm-svn: 326079
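The emulation argument in r326079 (element-wise condition check, scalar load, insert) can be sketched in scalar C++ as follows. This is an illustration only, not the code the vectorizer emits: `VF`, `Vec`, `Mask`, and `emulatedMaskedLoad` are hypothetical names, and a fixed width of four `int` lanes is assumed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t VF = 4; // assumed vector width in lanes
using Vec = std::array<int, VF>;
using Mask = std::array<bool, VF>;

// Emulates a masked vector load one lane at a time. Lanes whose mask bit is
// false keep the pass-through value and their memory is never touched.
Vec emulatedMaskedLoad(const int *ptr, const Mask &mask, const Vec &passthru) {
  assert(ptr != nullptr);
  Vec result = passthru;
  for (std::size_t lane = 0; lane < VF; ++lane) {
    if (mask[lane])             // element-wise condition check
      result[lane] = ptr[lane]; // scalar load, then insert into the lane
  }
  return result;
}
```

Since this form is always expressible, the absence of native masked instructions affects only the cost of the loop, which is why the commit moves the check from legality to the cost model.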
*	[LV] Fix test checks, NFC	Alexey Bataev	2018-02-21	1	-76/+2363
	llvm-svn: 325699
*	[SCEV] Temporarily disable loop versioning for the purpose of turning SCEVUnknowns of PHIs into AddRecExprs.	Silviu Baranga	2018-02-21	3	-4/+4
	This feature is now hidden behind the -scev-version-unknown flag. Fixes PR36032 and PR35432.
	llvm-svn: 325687
*	revert r325515: [TTI CostModel] change default cost of FP ops to 1 (PR36280)	Sanjay Patel	2018-02-21	1	-9/+36
	There are too many perf regressions resulting from this, so we need to investigate (and add tests for) targets like ARM and AArch64 before trying to reinstate.
	llvm-svn: 325658
*	[LV] Fix test checks, NFC.	Alexey Bataev	2018-02-20	2	-140/+3506
	llvm-svn: 325617
*	[TTI CostModel] change default cost of FP ops to 1 (PR36280)	Sanjay Patel	2018-02-19	1	-36/+9
	This change was mentioned at least as far back as: https://bugs.llvm.org/show_bug.cgi?id=26837#c26
	...and I found a real program that is harmed by this: Himeno running on AMD Jaguar gets 6% slower with SLP vectorization: https://bugs.llvm.org/show_bug.cgi?id=36280
	...but the change here appears to solve that bug only accidentally.
	The div/rem costs for x86 look very wrong in some cases, but that's already true, so we can fix those in follow-up patches. There's also evidence that more cost model changes are needed to solve SLP problems as shown in D42981, but that's an independent problem (though the solution may be adjusted after this change is made).
	Differential Revision: https://reviews.llvm.org/D43079
	llvm-svn: 325515
*	[LV] Fix analyzeInterleaving when -pass-remarks enabled	Mircea Trofin	2018-02-10	1	-0/+43
	Summary: If -pass-remarks=loop-vectorize, atomic ops will be seen by analyzeInterleaving(), even though canVectorizeMemory() == false. This is because we are requesting extra analysis instead of bailing out.
	In such a case, we end up with a Group in both Load- and StoreGroups, and then we'll try to access freed memory when traversing LoadGroups after having released the Group when iterating over StoreGroups.
	The fix is to include mayWriteToMemory() when validating that two instructions are the same kind of memory operation.
	Reviewers: mssimpso, davidxl
	Reviewed By: davidxl
	Subscribers: hsaito, fhahn, llvm-commits
	Differential Revision: https://reviews.llvm.org/D43064
	llvm-svn: 324786
*	[LoopVectorize] auto-generate complete checks; NFC	Sanjay Patel	2018-02-08	1	-5/+80
	llvm-svn: 324611
*	Verify profile data confirms large loop trip counts.	Mircea Trofin	2018-02-07	1	-1/+121
	Summary: Loops with inequality comparers, such as:
	  // unsigned bound
	  for (unsigned i = 1; i < bound; ++i) {...}
	have getSmallConstantMaxTripCount report a large maximum static trip count - in this case, 0xfffffffe. However, profiling info may show that the trip count is much smaller, and thus counter-recommend vectorization.
	This change:
	- flips loop-vectorize-with-block-frequency on by default.
	- validates profiled loop frequency data supports vectorization, when static info appears to not counter-recommend it. Absence of profile data means we rely on static data, just as we've done so far.
	Reviewers: twoh, mkuper, davidxl, tejohnson, Ayal
	Reviewed By: davidxl
	Subscribers: bkramer, llvm-commits
	Differential Revision: https://reviews.llvm.org/D42946
	llvm-svn: 324543
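The figure 0xfffffffe quoted in r324543 follows directly from the loop shape, and the decision order the summary describes can be sketched alongside it. The code below is a hedged illustration: `tripCount` and `tripCountSupportsVectorization` are hypothetical names, 32-bit `unsigned` is assumed, and the `tinyThreshold` parameter is an invented stand-in for whatever cutoff the vectorizer actually uses.

```cpp
#include <cassert>
#include <cstdint>

// Iterations of: for (unsigned i = 1; i < bound; ++i) {...}
std::uint32_t tripCount(std::uint32_t bound) {
  return bound > 1 ? bound - 1 : 0;
}

// Decision order described in the summary: static information is consulted
// first; profile data, when present, can still counter-recommend; without
// profile data the static answer stands.
bool tripCountSupportsVectorization(std::uint32_t staticMaxTripCount,
                                    bool hasProfile,
                                    std::uint32_t profiledTripCount,
                                    std::uint32_t tinyThreshold) {
  if (staticMaxTripCount < tinyThreshold)
    return false; // static info already counter-recommends
  if (hasProfile)
    return profiledTripCount >= tinyThreshold;
  return true; // no profile data: rely on static data
}
```

With `bound` at its largest 32-bit value the static maximum is 0xfffffffe, so only profile data can reveal that the loop typically runs a handful of iterations.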
*	[NFC] Add tests for PR35743	Max Kazantsev	2018-02-05	1	-0/+102
	llvm-svn: 324209
*	[LV] Use Demanded Bits and ValueTracking for reduction type-shrinking	Chad Rosier	2018-02-04	1	-2/+35
	The type-shrinking logic in reduction detection, although narrow in scope, is also rather ad-hoc, which has led to bugs (e.g., PR35734). This patch modifies the approach to rely on the demanded bits and value tracking analyses, if available. We currently perform type-shrinking separately for reductions and other instructions in the loop. Long-term, we should probably think about computing minimal bit widths in a more complete way for the loops we want to vectorize.
	PR35734
	Differential Revision: https://reviews.llvm.org/D42309
	llvm-svn: 324195
*	[X86] Add support for passing 'prefer-vector-width' function attribute into X86Subtarget and exposing via X86's getRegisterWidth TTI interface.	Craig Topper	2018-01-20	1	-0/+77
	This will cause the vectorizers to do some limiting of the vector widths they create. This is not a strict limit. There are cases I know of where the loop vectorizer will still generate larger vectors.
	I've written this in such a way that the interface will only return a properly supported width (0/128/256/512) even if the attribute says something funny like 384 or 10.
	This has been split from D41895 with the remainder in a follow up commit.
	llvm-svn: 323015
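One way to read the "only return a properly supported width" remark is as a clamp of the requested value onto the set {0, 128, 256, 512}. The sketch below assumes rounding down to the nearest supported width; the commit message does not state the rounding direction, so that choice, like the name `clampPreferredVectorWidth`, is an assumption. It also ignores which vector extensions the subtarget actually has.

```cpp
#include <cassert>

// Maps an arbitrary 'prefer-vector-width' request onto 0/128/256/512.
// Assumption: odd values round down to the nearest supported width.
unsigned clampPreferredVectorWidth(unsigned requested) {
  if (requested >= 512)
    return 512;
  if (requested >= 256)
    return 256;
  if (requested >= 128)
    return 128;
  return 0; // nothing supported fits the request
}
```

Under this assumption a request of 384 yields 256 and a request of 10 yields 0.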
*	[X86] Use vmovdqu64/vmovdqa64 for unmasked integer vector stores for consistency with loads.	Craig Topper	2018-01-18	1	-1/+1
	Previously we used 64 for vXi64 stores and 32 for everything else. This change uses 64 for everything, just like we do for loads.
	llvm-svn: 322820
*	[LV] Don't call recordVectorLoopValueForInductionCast for newly-created IV from a trunc.	Andrei Elovikov	2018-01-15	1	-0/+53
	Summary: This method is supposed to be called for IVs that have casts in their use-def chains that are completely ignored after vectorization under PSE. However, for truncates of such IVs the same InductionDescriptor is used during creation/widening of both the original IV based on PHINode and the new IV based on TruncInst.
	This leads to an unintended second call to recordVectorLoopValueForInductionCast with a VectorLoopVal set to the newly created IV for a trunc and causes an assert due to an attempt to store new information for an already existing entry in the map. This is wrong and should not be done.
	Fixes PR35773.
	Reviewers: dorit, Ayal, mssimpso
	Reviewed By: dorit
	Subscribers: RKSimon, dim, dcaballe, hsaito, llvm-commits, hiraditya
	Differential Revision: https://reviews.llvm.org/D41913
	llvm-svn: 322473
*	[LV] Remove unnecessary DoExtraAnalysis guard (silent bug)	Florian Hahn	2017-12-20	1	-0/+27
	canVectorize is only checking if the loop has a normalized pre-header if DoExtraAnalysis is true. This doesn't make sense to me because reporting analysis information shouldn't alter legality checks. This is probably the result of a last minute minor change before committing (?).
	Patch by Diego Caballero.
	Reviewed By: fhahn
	Differential Revision: https://reviews.llvm.org/D40973
	llvm-svn: 321172
*	Move Transforms/LoopVectorize/consecutive-ptr-cg-bug.ll into the X86 subdirectory	Hal Finkel	2017-12-16	1	-0/+0
	This test depends on X86's TTI; move into the X86 subdirectory.
	llvm-svn: 320914
*	[LV] Extend InstWidening with CM_Widen_Recursive	Hal Finkel	2017-12-16	1	-0/+68
	Changes to the original scalar loop during LV code gen cause the return value of Legal->isConsecutivePtr() to be inconsistent with the return value during legal/cost phases (further analysis and information of the bug is in D39346). This patch is an alternative fix to PR34965 following the CM_Widen approach proposed by Ayal and Gil in D39346. It extends InstWidening enum with CM_Widen_Reverse to properly record the widening decision for consecutive reverse memory accesses and, consequently, get rid of the Legal->isConsecutivePtr() call in LV code gen. I think this is a simpler/cleaner solution to PR34965 than the one in D39346.
	Fixes PR34965.
	Patch by Diego Caballero, thanks!
	Differential Revision: https://reviews.llvm.org/D40742
	llvm-svn: 320913
*	[LV] Support efficient vectorization of an induction with redundant casts	Dorit Nuzman	2017-12-14	1	-0/+211
	D30041 extended SCEVPredicateRewriter to improve handling of Phi nodes whose update chain involves casts; PSCEV can now build an AddRecurrence for some forms of such phi nodes, under the proper runtime overflow test. This means that we can identify such phi nodes as an induction, and the loop-vectorizer can now vectorize such inductions, however inefficiently. The vectorizer doesn't know that it can ignore the casts, and so it vectorizes them.
	This patch records the casts in the InductionDescriptor, so that they could be marked to be ignored for cost calculation (we use VecValuesToIgnore for that) and ignored for vectorization/widening/scalarization (i.e. treated as TriviallyDead).
	In addition to marking all these casts to be ignored, we also need to make sure that each cast is mapped to the right vector value in the vector loop body (be it a widened, vectorized, or scalarized induction). So whenever an induction phi is mapped to a vector value (during vectorization/widening/scalarization), we also map the respective cast instruction (if exists) to that vector value. (If the phi-update sequence of an induction involves more than one cast, then the above mapping to vector value is relevant only for the last cast of the sequence as we allow only the "last cast" to be used outside the induction update chain itself).
	This is the last step in addressing PR30654.
	llvm-svn: 320672
*	[LV] Ignore the cost of values that will not appear in the vectorized loop	Dorit Nuzman	2017-12-12	1	-0/+80
	VecValuesToIgnore holds values that will not appear in the vectorized loop. We should therefore ignore their cost when VF > 1.
	Differential Revision: https://reviews.llvm.org/D40883
	llvm-svn: 320463
*	[SCEV] Fix wrong Equal predicate created in getAddRecForPhiWithCasts	Dorit Nuzman	2017-12-10	1	-2/+3
	CreateAddRecFromPHIWithCastsImpl() adds an IncrementNUSW overflow predicate which allows the PSCEV rewriter to rewrite this scev expression:
	  (zext i8 {0, + , (trunc i32 step to i8)} to i32)
	into
	  {0, +, (sext i8 (trunc i32 step to i8) to i32)}
	But then it adds the wrong Equal predicate:
	  %step == (zext i8 (trunc i32 %step to i8) to i32)
	instead of:
	  %step == (sext i8 (trunc i32 %step to i8) to i32)
	This is fixed here.
	Differential Revision: https://reviews.llvm.org/D40641
	llvm-svn: 320298
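The difference between the two Equal predicates in r320298 shows up for any step whose low byte has its top bit set. The sketch below evaluates both right-hand sides with plain integer arithmetic; `zextTrunc8` and `sextTrunc8` are hypothetical helper names, and the i32 value `%step` is modelled as a `std::uint32_t` bit pattern.

```cpp
#include <cassert>
#include <cstdint>

// (zext i8 (trunc i32 %step to i8) to i32): keep the low byte, fill with zeros.
std::uint32_t zextTrunc8(std::uint32_t step) {
  return step & 0xffu;
}

// (sext i8 (trunc i32 %step to i8) to i32): keep the low byte, replicate bit 7.
std::uint32_t sextTrunc8(std::uint32_t step) {
  std::int32_t low = static_cast<std::int32_t>(step & 0xffu);
  if (low >= 128)
    low -= 256; // interpret the byte as a signed 8-bit value
  return static_cast<std::uint32_t>(low);
}
```

For a step of -1 (bit pattern 0xffffffff) the sext form reproduces the step while the zext form gives 255, so the zext-based predicate rejects that step even though it matches the sext-based AddRec the rewriter built. For a step of 200 the opposite holds, so the zext-based predicate accepts a step the rewritten expression does not model.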