bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[LoopVectorize] Replace manual VPlan memory management with unique_ptr.	Benjamin Kramer	2017-10-31	1	-26/+10
	No functionality change intended. llvm-svn: 317003
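	The message above says only that manual memory management was replaced with unique_ptr. As a rough illustration of that general pattern (not the actual LoopVectorize code), the sketch below uses a hypothetical `Plan` type and `PlanOwner` class; both names are assumptions made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a plan object; this is NOT the real VPlan class.
struct Plan {
  static int LiveCount; // number of Plan objects currently alive
  unsigned VF;
  explicit Plan(unsigned VF) : VF(VF) { ++LiveCount; }
  ~Plan() { --LiveCount; }
};
int Plan::LiveCount = 0;

// Hypothetical owner. Storing std::unique_ptr means no hand-written
// destructor loop calling `delete` is needed: destroying the vector
// releases every plan it holds.
class PlanOwner {
  std::vector<std::unique_ptr<Plan>> Plans;

public:
  Plan &addPlan(unsigned VF) {
    Plans.push_back(std::make_unique<Plan>(VF));
    return *Plans.back();
  }
  std::size_t size() const { return Plans.size(); }
};

// Creates two plans inside a scope and reports how many are still
// alive once the owner has gone out of scope.
int liveAfterScope() {
  {
    PlanOwner Owner;
    Owner.addPlan(4);
    Owner.addPlan(8);
    assert(Plan::LiveCount == 2);
  }
  return Plan::LiveCount;
}
```

	Calling `liveAfterScope()` exercises the ownership transfer: both plans are destroyed when `Owner` leaves scope, without any explicit `delete`.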
*	Do not add discriminator encoding for debug intrinsics.	Dehao Chen	2017-10-26	1	-1/+2
	Summary: There are certain requirements for debug location of debug intrinsics, e.g. the scope of the DILocalVariable should be the same as the scope of its debug location. As a result, we should not add discriminator encoding for debug intrinsics. Reviewers: dblaikie, aprantl Reviewed By: aprantl Subscribers: JDevlieghere, aprantl, bjope, sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D39343 llvm-svn: 316703
*	[LSV] Avoid adding vectors of pointers as candidates	Bjorn Pettersson	2017-10-26	1	-3/+17
	Summary: We no longer add vectors of pointers as candidates for load/store vectorization. It does not seem to work anyway, but without this patch we can end up in asserts when trying to create casts between an integer type and the pointer of vectors type. The test case I've added used to assert like this when trying to cast between i64 and <2 x i16>: opt: ../lib/IR/Instructions.cpp:2565: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed. #0 PrintStackTraceSignalHandler(void) #1 SignalHandler(int) #2 __restore_rt #3 __GI_raise #4 __GI_abort #5 __GI___assert_fail #6 llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value, llvm::Type, llvm::Twine const&, llvm::Instruction) #7 llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>::CreateBitOrPointerCast(llvm::Value, llvm::Type, llvm::Twine const&) #8 Vectorizer::vectorizeStoreChain(llvm::ArrayRef<llvm::Instruction>, llvm::SmallPtrSet<llvm::Instruction, 16u>) Reviewers: arsenm Reviewed By: arsenm Subscribers: nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D39296 llvm-svn: 316665
*	[LSV] Skip all non-byte sizes, not only less than eight bits	Bjorn Pettersson	2017-10-26	1	-2/+4
	Summary: The code comments indicate that no effort has been spent on handling load/stores when the size isn't a multiple of the byte size correctly. However, the code only avoided types smaller than 8 bits. So for example a load of an i28 could still be considered as a candidate for vectorization. This patch adjusts the code to behave according to the code comment. The test case used to hit the following assert when trying to use "cast" an i32 to i28 using CreateBitOrPointerCast: opt: ../lib/IR/Instructions.cpp:2565: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed. #0 PrintStackTraceSignalHandler(void) #1 SignalHandler(int) #2 __restore_rt #3 __GI_raise #4 __GI_abort #5 __GI___assert_fail #6 llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value, llvm::Type, llvm::Twine const&, llvm::Instruction) #7 llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>::CreateBitOrPointerCast(llvm::Value, llvm::Type, llvm::Twine const&) #8 (anonymous namespace)::Vectorizer::vectorizeLoadChain(llvm::ArrayRef<llvm::Instruction>, llvm::SmallPtrSet<llvm::Instruction, 16u>*) Reviewers: arsenm Reviewed By: arsenm Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39295 llvm-svn: 316663
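	The difference between the two filters described above can be illustrated with two small predicates. This is a sketch under assumptions: the function names are invented here, and the real pass works on LLVM types rather than plain bit counts.

```cpp
#include <cassert>

// Filter as the message describes the old behaviour: only types
// smaller than 8 bits were skipped.
bool skippedByOldCheck(unsigned SizeInBits) { return SizeInBits < 8; }

// Filter as the message describes the new behaviour: any size that is
// not a whole number of bytes is skipped.
bool skippedByNewCheck(unsigned SizeInBits) { return (SizeInBits % 8) != 0; }
```

	With these predicates an i28 (28 bits) passes the old filter but is rejected by the new one, while an i32 is accepted by both.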
*	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	Eugene Zelenko	2017-10-17	4	-94/+146
	llvm-svn: 316034
*	Revert rL315894, "SLPVectorizer.cpp: Try to appease stage2-3 difference. (D38586)"	NAKAMURA Takumi	2017-10-16	1	-9/+23
	llvm-svn: 315896
*	SLPVectorizer.cpp: Try to appease stage2-3 difference. (D38586)	NAKAMURA Takumi	2017-10-16	1	-23/+9
	llvm-svn: 315894
*	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	Eugene Zelenko	2017-10-12	1	-163/+236
	llvm-svn: 315640
*	[NFC] Convert OptimizationRemarkEmitter old emit() calls to new closure	Vivek Pandya	2017-10-11	1	-48/+69
	parameterized emit() calls Summary: This is a non-functional change to adopt the new emit() API added in r313691. Reviewed By: anemet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38285 llvm-svn: 315476
*	Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.*	Adam Nemet	2017-10-09	1	-1/+1
	Sync it up with the name of the class actually defined here. This has been bothering me for a while... llvm-svn: 315249
*	[LV] Fix PR34743 - handle casts that sink after interleaved loads	Ayal Zaks	2017-10-05	1	-1/+4
	When ignoring a load that participates in an interleaved group, make sure to move a cast that needs to sink after it. Testcase derived from reproducer of PR34743. Differential Revision: https://reviews.llvm.org/D38338 llvm-svn: 314986
*	[LV] Fix PR34711 - widen instruction ranges when sinking casts	Ayal Zaks	2017-10-05	1	-23/+22
	Instead of trying to keep LastWidenRecipe updated after creating each recipe, have tryToWiden() retrieve the last recipe of the current VPBasicBlock and check if it's a VPWidenRecipe when attempting to extend its range. This ensures that such extensions, optimized to maintain the original instruction order, do so only when the instructions are to maintain their relative order. The latter does not always hold, e.g., when a cast needs to sink to unravel first order recurrence (r306884). Testcase derived from reproducer of PR34711. Differential Revision: https://reviews.llvm.org/D38339 llvm-svn: 314981
*	Revert r314806 "[SLP] Vectorize jumbled memory loads."	Hans Wennborg	2017-10-03	1	-185/+84
	All the buildbots are red, e.g. http://lab.llvm.org:8011/builders/clang-cmake-aarch64-lld/builds/2436/ > Summary: > This patch tries to vectorize loads of consecutive memory accesses, accessed > in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 > which was reverted back due to some basic issue with representing the 'use mask' of > jumbled accesses. > > This patch fixes the mask representation by recording the 'use mask' in the usertree entry. > > Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df > > Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh > > Reviewed By: Ayal > > Subscribers: hans, mzolotukhin > > Differential Revision: https://reviews.llvm.org/D36130 llvm-svn: 314824
*	[SLP] Vectorize jumbled memory loads.	Mohammad Shahid	2017-10-03	1	-84/+185
	Summary: This patch tries to vectorize loads of consecutive memory accesses, accessed in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 which was reverted back due to some basic issue with representing the 'use mask' of jumbled accesses. This patch fixes the mask representation by recording the 'use mask' in the usertree entry. Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh Reviewed By: Ayal Subscribers: hans, mzolotukhin Differential Revision: https://reviews.llvm.org/D36130 llvm-svn: 314806
*	[LV] Use correct insertion point when type shrinking reductions	Matthew Simpson	2017-09-29	1	-1/+2
	When type shrinking reductions, we should insert the truncations and extends at the end of the loop latch block. Previously, these instructions were inserted at the end of the loop header block. The difference is only a problem for loops with predicated instructions (e.g., conditional stores and instructions that may divide by zero). For these instructions, we create new basic blocks inside the vectorized loop, which cause the loop header and latch to no longer be the same block. This should fix PR34687. Reference: https://bugs.llvm.org/show_bug.cgi?id=34687 llvm-svn: 314542
*	Use a BumpPtrAllocator for Loop objects	Sanjoy Das	2017-09-28	1	-1/+1
	Summary: And now that we no longer have to explicitly free() the Loop instances, we can (with more ease) use the destructor of LoopBase to do what LoopBase::clear() was doing. Reviewers: chandlerc Subscribers: mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38201 llvm-svn: 314375
*	[SLP] Fix crash on propagate IR flags for undef operands of min/max	Alexey Bataev	2017-09-27	1	-3/+6
	reductions. If both operands of the newly created SelectInst are undef, the resulting operation is also undef, not a SelectInst. This may cause crashes when trying to propagate IR flags, because the function expects exactly a SelectInst instruction and nothing else. llvm-svn: 314323
*	[SLP] fix typos/formatting; NFC	Sanjay Patel	2017-09-27	1	-14/+13
	llvm-svn: 314315
*	[SLP] Support for horizontal min/max reduction.	Alexey Bataev	2017-09-25	1	-68/+382
	Summary: SLP vectorizer supports horizontal reductions for Add/FAdd binary operations. Patch adds support for horizontal min/max reductions. Function getReductionCost() is split to getArithmeticReductionCost() for binary operation reductions and getMinMaxReductionCost() for min/max reductions. Patch fixes PR26956. Reviewers: spatel, mkuper, hfinkel, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27846 llvm-svn: 314101
*	Revert r313771 "[SLP] Vectorize jumbled memory loads."	Hans Wennborg	2017-09-20	1	-182/+83
	This broke the buildbots, e.g. http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/391 > Summary: > This patch tries to vectorize loads of consecutive memory accesses, accessed > in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 > which was reverted back due to some basic issue with representing the 'use mask' of > jumbled accesses. > > This patch fixes the mask representation by recording the 'use mask' in the usertree entry. > > Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df > > Subscribers: mzolotukhin > > Reviewed By: ayal > > Differential Revision: https://reviews.llvm.org/D36130 > > Review comments updated accordingly > > Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0 > > Added a TODO for sortLoadAccesses API > > Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58 > > Modified the TODO for sortLoadAccesses API > > Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565 > > Review comment update for using OpdNum to insert the mask in respective location > > Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce > > Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase > > Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b llvm-svn: 313781
*	[SLP] Vectorize jumbled memory loads.	Mohammad Shahid	2017-09-20	1	-83/+182
	Summary: This patch tries to vectorize loads of consecutive memory accesses, accessed in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 which was reverted back due to some basic issue with representing the 'use mask' of jumbled accesses. This patch fixes the mask representation by recording the 'use mask' in the usertree entry. Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df Subscribers: mzolotukhin Reviewed By: ayal Differential Revision: https://reviews.llvm.org/D36130 Review comments updated accordingly Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0 Added a TODO for sortLoadAccesses API Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58 Modified the TODO for sortLoadAccesses API Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565 Review comment update for using OpdNum to insert the mask in respective location Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b llvm-svn: 313771
*	Revert r313736: "[SLP] Vectorize jumbled memory loads."	Alexander Kornienko	2017-09-20	1	-182/+83
	The revision breaks buildbots: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/6694/steps/test/logs/stdio llvm-svn: 313758
*	[SLP] Vectorize jumbled memory loads.	Mohammad Shahid	2017-09-20	1	-83/+182
	Summary: This patch tries to vectorize loads of consecutive memory accesses, accessed in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 which was reverted back due to some basic issue with representing the 'use mask' of jumbled accesses. This patch fixes the mask representation by recording the 'use mask' in the usertree entry. Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh Reviewed By: Ayal Subscribers: mzolotukhin Differential Revision: https://reviews.llvm.org/D36130 Commit after rebase for patch D36130 Change-Id: I8add1c265455669ef288d880f870a9522c8c08ab llvm-svn: 313736
*	Allow ORE.emit to take a closure to delay building the remark object	Adam Nemet	2017-09-19	1	-2/+4
	In the lambda we are now returning the remark by value so we need to preserve its type in the insertion operator. This requires making the insertion operator generic. I've also converted a few cases to use the new API. It seems to work pretty well. See the LoopUnroller for a slightly more interesting case. llvm-svn: 313691
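	The idea of passing a closure so that the remark object is only built when needed can be sketched in isolation. The `Emitter` and `Remark` types below are hypothetical stand-ins written for this example; they are not the OptimizationRemarkEmitter API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical remark type, for illustration only.
struct Remark {
  std::string Msg;
};

// Hypothetical emitter. emit() takes a callable that builds the remark
// and invokes it only when remarks are enabled, so a disabled emitter
// never pays for constructing the remark.
class Emitter {
  bool Enabled;
  std::vector<Remark> Emitted;

public:
  explicit Emitter(bool Enabled) : Enabled(Enabled) {}

  template <typename BuilderT> void emit(BuilderT Build) {
    if (!Enabled)
      return;
    Emitted.push_back(Build()); // the remark is returned by value
  }

  std::size_t count() const { return Emitted.size(); }
};

// Returns how many times the remark-building closure actually ran.
int buildsPerformed(bool Enabled) {
  Emitter E(Enabled);
  int Builds = 0;
  E.emit([&] {
    ++Builds;
    return Remark{"vectorized loop"};
  });
  return Builds;
}
```

	In this sketch `buildsPerformed(false)` never runs the closure, which is the cost saving that the delayed construction is meant to provide.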
*	[SLP] clean up for vector store case; NFCI	Sanjay Patel	2017-09-18	1	-12/+11
	llvm-svn: 313541
*	[SLP] Revert r312791 and other necessary commits, except for TTI and	Chandler Carruth	2017-09-15	1	-245/+49
	CostModel. The original patch added support for horizontal min/max reductions to the SLP vectorizer. This patch causes LLVM to miscompile fairly simple signed min reductions. I have attached a test program to http://llvm.org/PR34635 that shows the behavior change after this patch. We found this in a test for the open source Eigen library, but also in other code. Unfortunately, the revert is moderately challenging. It required reverting: r313042: [SLP] Test with multiple uses of conditional op and wrong parent. r312853: [SLP] Fix buildbots, NFC. r312793: [SLP] Fix the warning about paths not returning the value, NFC. r312791: [SLP] Support for horizontal min/max reduction. And even then, I had to completely skip reverting the changes to TTI and CostModel because r312832 rewrote so much of this code. Plus, the cost modeling changes aren't implicated in the miscompile, so they should be fine and will just not be used until this gets re-introduced. llvm-svn: 313409
*	This patch fixes https://bugs.llvm.org/show_bug.cgi?id=32352	Vivek Pandya	2017-09-15	1	-9/+12
	It enables OptimizationRemarkEmitter::allowExtraAnalysis and MachineOptimizationRemarkEmitter::allowExtraAnalysis to return true not only for -fsave-optimization-record but also when specific remarks are requested with command-line options. The diagnostic handler used to be a callback; this patch adds a class DiagnosticHandler. It has a virtual method to provide a custom diagnostic handler and methods to control which particular remarks are enabled. However, LLVM-C API users can still provide a callback function for the diagnostic handler. llvm-svn: 313390
*	This reverts r313381	Vivek Pandya	2017-09-15	1	-12/+9
	llvm-svn: 313387
*	This patch fixes https://bugs.llvm.org/show_bug.cgi?id=32352	Vivek Pandya	2017-09-15	1	-9/+12
	It enables OptimizationRemarkEmitter::allowExtraAnalysis and MachineOptimizationRemarkEmitter::allowExtraAnalysis to return true not only for -fsave-optimization-record but also when specific remarks are requested with command-line options. The diagnostic handler used to be a callback; this patch adds a class DiagnosticHandler. It has a virtual method to provide a custom diagnostic handler and methods to control which particular remarks are enabled. However, LLVM-C API users can still provide a callback function for the diagnostic handler. llvm-svn: 313382
*	Revert "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops."	Ilya Biryukov	2017-09-15	1	-317/+142
	This reverts commit r313348. Reason: it caused buildbot failures. llvm-svn: 313352
*	[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.	Dinar Temirbulatov	2017-09-15	1	-142/+317
	Patch tries to improve vectorization of the following code: void add1(int * __restrict dst, const int * __restrict src) { *dst++ = *src++; *dst++ = *src++ + 1; *dst++ = *src++ + 2; *dst++ = *src++ + 3; } This allows vectorization even if the very first operation is not a binary add, but just a load. Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev, davide Subscribers: llvm-commits, RKSimon Differential Revision: https://reviews.llvm.org/D28907 llvm-svn: 313348
*	[SLPVectorizer] Remove duplicated functionality code in initScheduleData function, NFCI.	Dinar Temirbulatov	2017-09-15	1	-6/+0
	llvm-svn: 313341
*	[LV] Fix maximum legal VF calculation	Alon Kom	2017-09-14	1	-28/+18
	This patch fixes pr34283, which exposed that the computation of maximum legal width for vectorization was wrong, because it relied on MaxInterleaveFactor to obtain the maximum stride used in the loop, however not all strided accesses in the loop have an interleave-group associated with them. Instead of recording the maximum stride in the loop, which can be over conservative (e.g. if the access with the maximum stride is not involved in the dependence limitation), this patch tracks the actual maximum legal width imposed by accesses that are involved in dependencies. Differential Revision: https://reviews.llvm.org/D37507 llvm-svn: 313237
*	[SLPVectorizer] Prefer auto over explicit type for VL0, NFCI.	Dinar Temirbulatov	2017-09-14	1	-1/+1
	llvm-svn: 313228
*	[LV] Avoid computing the register usage for default VF. NFC	Anna Thomas	2017-09-13	1	-2/+4
	These are changes to reduce redundant computations when calculating a feasible vectorization factor: 1. early return when target has no vector registers 2. don't compute register usage for the default VF. Suggested during review for D37702. llvm-svn: 313176
*	[LV] Fix PR34523 - avoid generating redundant selects	Ayal Zaks	2017-09-13	1	-3/+3
	When converting a PHI into a series of 'select' instructions to combine the incoming values together according to their edge masks, initialize the first value to the incoming value In0 of the first predecessor, instead of generating a redundant assignment 'select(Cond[0], In0, In0)'. The latter fails when the Cond[0] mask is null, representing a full mask, which can happen only when there's a single incoming value. No functional changes intended nor expected other than surviving null Cond[0]'s. This fix follows D35725, which introduced using null to represent full masks. Differential Revision: https://reviews.llvm.org/D37619 llvm-svn: 313119
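	The select chain described above can be modelled on scalars. This is a simplified sketch, not the vectorizer's code: `blendIncoming` and `blendTwo` are names invented here, masks are plain `bool` pointers, and a null pointer stands for a full mask as in the message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar model of turning a PHI into a chain of selects. Cond[I] is the
// edge mask of incoming value In[I]; nullptr models a "full mask".
int blendIncoming(const std::vector<int> &In,
                  const std::vector<const bool *> &Cond) {
  assert(!In.empty() && In.size() == Cond.size());
  // Start from In[0] directly instead of select(Cond[0], In0, In0),
  // so Cond[0] is never dereferenced and may be null.
  int Result = In[0];
  for (std::size_t I = 1; I < In.size(); ++I) {
    assert(Cond[I] && "a null mask is only expected for a single incoming value");
    Result = *Cond[I] ? In[I] : Result; // select(Cond[I], In[I], Result)
  }
  return Result;
}

// Convenience wrapper for two incoming values with explicit masks.
int blendTwo(int A, bool CondA, int B, bool CondB) {
  return blendIncoming({A, B}, {&CondA, &CondB});
}
```

	A single incoming value with a null mask, `blendIncoming({7}, {nullptr})`, simply yields that value, which is the case the redundant select could not handle.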
*	[LV] Clamp the VF to the trip count	Anna Thomas	2017-09-12	1	-7/+12
	Summary: When the MaxVectorSize > ConstantTripCount, we should just clamp the vectorization factor to be the ConstantTripCount. This vectorizes loops where the TinyTripCountThreshold >= TripCount < MaxVF. Earlier we were finding the maximum vector width, which could be greater than the trip count itself. The Loop vectorizer does all the work for generating a vectorizable loop, but in the end we would always choose the scalar loop (since the VF > trip count). This allows us to choose the VF keeping in mind the trip count if available. This is a fix on top of rL312472. Reviewers: Ayal, zvi, hfinkel, dneilson Reviewed by: Ayal Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37702 llvm-svn: 313046
*	[SLP] Fix for PHINode during horizontal reduction scanning, NFC.	Alexey Bataev	2017-09-12	1	-1/+1
	Reduces number of loops during instructions analysis. llvm-svn: 313035
*	[SLP] Fix buildbots, NFC.	Alexey Bataev	2017-09-09	1	-2/+2
	llvm-svn: 312853
*	[SLPVectorizer] Add struct InstructionsState that holds information about analysis of vector to be vectorized.	Dinar Temirbulatov	2017-09-08	1	-88/+120
	Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev, davide Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D37212 llvm-svn: 312802
*	[SLP] Fix the warning about paths not returning the value, NFC.	Alexey Bataev	2017-09-08	1	-2/+4
	llvm-svn: 312793
*	[SLP] Support for horizontal min/max reduction.	Alexey Bataev	2017-09-08	1	-49/+243
	SLP vectorizer supports horizontal reductions for Add/FAdd binary operations. Patch adds support for horizontal min/max reductions. Function getReductionCost() is split to getArithmeticReductionCost() for binary operation reductions and getMinMaxReductionCost() for min/max reductions. Patch fixes PR26956. Differential revision: https://reviews.llvm.org/D27846 llvm-svn: 312791
*	LoopVectorize: MaxVF should not be larger than the loop trip count	Zvi Rackover	2017-09-04	1	-4/+8
	Summary: Improve how MaxVF is computed while taking into account that MaxVF should not be larger than the loop's trip count. Other than saving on compile-time by pruning the possible MaxVF candidates, this patch fixes pr34438 which exposed the following flow: 1. Short trip count identified -> Don't bail out, set OptForSize:=True to avoid tail-loop and runtime checks. 2. Compute MaxVF returned 16 on a target supporting AVX512. 3. OptForSize -> choose VF:=MaxVF. 4. Bail out because TripCount = 8, VF = 16, TripCount % VF !=0 means we need a tail loop. With this patch step 2. will choose MaxVF=8 based on TripCount. Reviewers: Ayal, dorit, mkuper, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D37425 llvm-svn: 312472
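	The numbers in the flow above (trip count 8, target maximum 16) can be reproduced with a small clamping helper. This is only a sketch: `clampMaxVF` and `powerOfTwoFloor` are names invented here, a trip count of 0 is used to model "unknown", and rounding a non-power-of-two trip count down to a power of two is an assumption of this example rather than something the message states.

```cpp
#include <cassert>

// Largest power of two that is <= N. N must be nonzero.
unsigned powerOfTwoFloor(unsigned N) {
  assert(N != 0);
  unsigned P = 1;
  while (P <= N / 2)
    P *= 2;
  return P;
}

// Clamp the target's maximum vectorization factor to a known constant
// trip count. TripCount == 0 models an unknown trip count.
unsigned clampMaxVF(unsigned TargetMaxVF, unsigned TripCount) {
  if (TripCount == 0 || TripCount >= TargetMaxVF)
    return TargetMaxVF;
  return powerOfTwoFloor(TripCount);
}
```

	With a target maximum of 16 and a trip count of 8, `clampMaxVF(16, 8)` gives 8, so `TripCount % VF` is 0 and no tail loop is required in that scenario.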
*	[LoopVectorize] Turn static DenseSet into switch.	Benjamin Kramer	2017-09-02	1	-16/+47
	LLVM transforms this into a bit test which is a lot faster and smaller. llvm-svn: 312417
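	A membership test written as a switch, rather than as a lookup in a statically initialized set, might look like the following. The `Op` enum and `isMathOp` are hypothetical and exist only for this sketch; the actual set of values in the commit is not shown in this log.

```cpp
#include <cassert>

// Hypothetical opcode enum, for illustration only.
enum class Op { Sqrt, Sin, Cos, Load, Store, Call };

// Membership test expressed as a switch over the enum. No set object
// has to be constructed or hashed at run time, and the compiler is
// free to lower the cases to a compact comparison or bit test.
bool isMathOp(Op O) {
  switch (O) {
  case Op::Sqrt:
  case Op::Sin:
  case Op::Cos:
    return true;
  default:
    return false;
  }
}
```

	Adding a new member means adding one more `case` label; there is no global object whose initialization order or lifetime needs to be considered.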
*	[Analysis, Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	Eugene Zelenko	2017-09-01	1	-90/+143
	llvm-svn: 312383
*	[LoopVectorizer] Use two step casting for float to pointer types.	Manoj Gupta	2017-09-01	1	-3/+40
	Summary: LoopVectorizer is creating casts between vec<ptr> and vec<float> types on ARM when compiling OpenCV. Since it is illegal to directly cast a floating point type to a pointer type even if the types have the same size, this causes a crash. Fix the crash using a two-step cast: bitcast to integer, then integer to pointer/float. Fixes PR33804. Reviewers: mkuper, Ayal, dlj, rengolin, srhines Reviewed By: rengolin Subscribers: aemerson, kristof.beyls, mkazantsev, Meinersbur, rengolin, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35498 llvm-svn: 312331
*	[SLPVectorizer] Move out Entry->NeedToGather check and assert of inner loop as invariant, NFCI.	Dinar Temirbulatov	2017-08-31	1	-5/+6
	llvm-svn: 312242
*	[Instruction] add moveAfter() convenience function; NFCI	Sanjay Patel	2017-08-29	1	-2/+1
	As suggested in D37121, here's a wrapper for removeFromParent() + insertAfter(), but implemented using moveBefore() for symmetry/efficiency. Differential Revision: https://reviews.llvm.org/D37239 llvm-svn: 312001
*	[LV] Fix PR34248 - recommit D32871 after revert r311304	Ayal Zaks	2017-08-27	4	-523/+2247
	Original commit r311077 of D32871 was reverted in r311304 due to failures reported in PR34248. This recommit fixes PR34248 by restricting the packing of predicated scalars into vectors only when vectorizing, avoiding doing so when unrolling w/o vectorizing. Added a test derived from the reproducer of PR34248. llvm-svn: 311849
*	Revert r311077: [LV] Using VPlan ...	Chandler Carruth	2017-08-20	4	-2251/+523
	This causes LLVM to assert fail on PPC64 and crash / infloop in other cases. Filed http://llvm.org/PR34248 with reproducer attached. llvm-svn: 311304