summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Vectorize
Commit message (Collapse)AuthorAgeFilesLines
...
* [Loads] Move generic code out of vectorizer into a location it might be ↵Philip Reames2019-09-101-51/+0
| | | | | | reused [NFC] llvm-svn: 371558
* [ValueTracking] Factor our common speculation suppression logic [NFC]Philip Reames2019-09-101-15/+0
| | | | | | Expose a utility function so that all places which want to suppress speculation (when otherwise legal) due to ordering and/or sanitizer interaction can do so. llvm-svn: 371556
* [LoopVectorize] Leverage speculation safety to avoid masked.loadsPhilip Reames2019-09-091-4/+85
| | | | | | | | | | | | If we're vectorizing a load in a predicated block, check to see if the load can be speculated rather than predicated. This allows us to generate a normal vector load instead of a masked.load. To do so, we must prove that all bytes accessed on any iteration of the original loop are dereferenceable, and that all loads (across all iterations) are properly aligned. This is equivelent to proving that hoisting the load into the loop header in the original scalar loop is safe. Note: There are a couple of code motion todos in the code. My intention is to wait about a day - to be sure this sticks - and then perform the NFC motion without furthe review. Differential Revision: https://reviews.llvm.org/D66688 llvm-svn: 371452
* Fix typo. NFCISimon Pilgrim2019-09-071-1/+1
| | | | llvm-svn: 371317
* Change TargetLibraryInfo analysis passes to always require FunctionTeresa Johnson2019-09-072-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is the first change to enable the TLI to be built per-function so that -fno-builtin* handling can be migrated to use function attributes. See discussion on D61634 for background. This is an enabler for fixing handling of these options for LTO, for example. This change should not affect behavior, as the provided function is not yet used to build a specifically per-function TLI, but rather enables that migration. Most of the changes were very mechanical, e.g. passing a Function to the legacy analysis pass's getTLI interface, or in Module level cases, adding a callback. This is similar to the way the per-function TTI analysis works. There was one place where we were looking for builtins but not in the context of a specific function. See FindCXAAtExit in lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround could provide the wrong behavior in some corner cases. Suggestions welcome. Reviewers: chandlerc, hfinkel Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66428 llvm-svn: 371284
* [LLVM][Alignment] Convert isLegalNTStore/isLegalNTLoad to llvm::AlignGuillaume Chatelet2019-09-051-2/+4
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a serie to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67223 llvm-svn: 371063
* [NFC] Switch last couple of invariant_load checks to use hasMetadataPhilip Reames2019-09-041-1/+1
| | | | llvm-svn: 370948
* [LV] Fix miscompiles by adding non-header PHI nodes to AllowedExitBjorn Pettersson2019-09-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | Summary: Fold-tail currently supports reduction last-vector-value live-out's, but has yet to support last-scalar-value live-outs, including non-header phi's. As it relies on AllowedExit in order to detect them and bail out we need to add the non-header PHI nodes to AllowedExit, otherwise we end up with miscompiles. Solves https://bugs.llvm.org/show_bug.cgi?id=43166 Reviewers: fhahn, Ayal Reviewed By: fhahn, Ayal Subscribers: anna, hiraditya, rkruppe, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67074 llvm-svn: 370721
* [LV] Tail-folding, runtime scev checksSjoerd Meijer2019-09-031-2/+2
| | | | | | | | | Now that we allow tail-folding, not only when we optimise for size, make sure we do not run in this assert. Differential revision: https://reviews.llvm.org/D66932 llvm-svn: 370711
* [LV] Tail-folding with runtime memory checksSjoerd Meijer2019-09-031-1/+4
| | | | | | | | | The loop vectorizer was running in an assert when it tried to fold the tail and had to emit runtime memory disambiguation checks. Differential revision: https://reviews.llvm.org/D66803 llvm-svn: 370707
* Fix cppcheck shadow variable and variable scope warnings. NFCI.Simon Pilgrim2019-08-311-6/+5
| | | | llvm-svn: 370580
* [LV] Fold tail by masking - handle reductionsAyal Zaks2019-08-283-11/+57
| | | | | | | | | | | | Allow vectorizing loops that have reductions when tail is folded by masking. A select is introduced in VPlan, choosing between the last value carried by the loop-exit/live-out instruction of the reduction, and the penultimate value carried by the reduction phi, according to the "i < n" mask of fold-tail. This select replaces the last value as the live-out value of the loop. Differential Revision: https://reviews.llvm.org/D66720 llvm-svn: 370173
* Add a clarify comment for meaning of SafePointes [NFC]Philip Reames2019-08-261-1/+5
| | | | | | Extracted from D66688 as requested. llvm-svn: 369962
* [SLP] use range-for loops, fix formatting; NFCSanjay Patel2019-08-231-32/+32
| | | | | | | These are part of D57059, but that patch doesn't apply cleanly to trunk at this point, so we might as well remove some of the noise. llvm-svn: 369776
* [SLP] fix formatting; NFCSanjay Patel2019-08-231-4/+3
| | | | | | | These are part of D57059, but that patch doesn't apply cleanly to trunk at this point, so we might as well remove some of the noise. llvm-svn: 369769
* [SLP][NFC] Avoid repetitive calls to getSameOpcode()Dinar Temirbulatov2019-08-201-120/+176
| | | | | | | | We can avoid repetitive calls getSameOpcode() for already known tree elements by keeping MainOp and AltOp in TreeEntry. Differential Revision: https://reviews.llvm.org/D64700 llvm-svn: 369315
* [SLP] reduce duplicated code; NFCSanjay Patel2019-08-191-2/+4
| | | | llvm-svn: 369250
* [SLPVectorizer] Make the scheduler aware of the TreeEntry operands.Vasileios Porpodas2019-08-161-79/+171
| | | | | | | | | | | | | | | | | | | | | | Summary: The scheduler's dependence graph gets the use-def dependencies by accessing the operands of the instructions in a bundle. However, buildTree_rec() may change the order of the operands in TreeEntry, and the scheduler is currently not aware of this. This is not causing any functional issues currently, because reordering is restricted to the operands of a single instruction. Once we support operand reordering across multiple TreeEntries, as shown here: http://www.llvm.org/devmtg/2019-04/slides/Poster-Porpodas-Supernode_SLP.pdf , the scheduler will need to get the correct operands from TreeEntry and not from the individual instructions. In short, this patch: - Connects the scheduler's bundle with the corresponding TreeEntry. It introduces new TE and Lane fields in ScheduleData. - Moves the location where the operands of the TreeEntry are initialized. This used to take place in newTreeEntry() setting one operand at a time, but is now moved pre-order just before the recursion of buildTree_rec(). This is required because the scheduler needs to access both operands of the TreeEntry in tryScheduleBundle(). - Updates the scheduler to access the instruction operands through the TreeEntry operands instead of accessing the instruction operands directly. Reviewers: ABataev, RKSimon, dtemirbulatov, Ayal, dorit, hfinkel Reviewed By: ABataev Subscribers: hiraditya, llvm-commits, lebedev.ri, rcorcs Tags: #llvm Differential Revision: https://reviews.llvm.org/D62432 llvm-svn: 369131
* [SLPVectorizer] Silence null dereference warning. NFCI.Simon Pilgrim2019-08-161-0/+1
| | | | | | cppcheck + MSVC analyzer both over zealously warn that we might dereference a null Bundle pointer - add an assertion to check for null to silence the warning, plus its a good idea to check that we succeeded in finding a schedule bundle anyway.... llvm-svn: 369094
* [llvm] Migrate llvm::make_unique to std::make_uniqueJonas Devlieghere2019-08-152-6/+6
| | | | | | | | Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013
* [LV] fold-tail predication should be respected even with assume_safetyDorit Nuzman2019-08-152-5/+5
| | | | | | | | | | | | | | | assume_safety implies that loads under "if's" can be safely executed speculatively (unguarded, unmasked). However this assumption holds only for the original user "if's", not those introduced by the compiler, such as the fold-tail "if" that guards us from loading beyond the original loop trip-count. Currently the combination of fold-tail and assume-safety pragmas results in ignoring the fold-tail predicate that guards the loads, generating unmasked loads. This patch fixes this behavior. Differential Revision: https://reviews.llvm.org/D66106 Reviewers: Ayal, hsaito, fhahn llvm-svn: 368973
* [SLP][NFC] Use pointers to address to ScalarToTreeEntry elements, instead of ↵Dinar Temirbulatov2019-08-141-4/+4
| | | | | | indexes. llvm-svn: 368906
* [LV] Fold-tail flagDorit Nuzman2019-08-141-5/+13
| | | | | | | | | | | This is the compiler-flag equivalent of the Predicate pragma (https://reviews.llvm.org/D65197), to direct the vectorizer to fold the remainder-loop into the main-loop using predication. Differential Revision: https://reviews.llvm.org/D66108 Reviewers: Ayal, hsaito, fhahn, SjoerdMeije llvm-svn: 368801
* [LoopVectorize][X86] Clamp interleave factor if we have a known constant ↵Craig Topper2019-08-071-1/+9
| | | | | | | | | | | | trip count that is less than VF*interleave If we know the trip count, we should make sure the interleave factor won't cause the vectorized loop to exceed it. Improves one of the cases from PR42674 Differential Revision: https://reviews.llvm.org/D65896 llvm-svn: 368215
* Revert "[X86] Add more extract subvector cost model tests for smaller ↵Mitch Phillips2019-08-061-9/+0
| | | | | | | | | | | element sizes and smaller than 128-bit vectors." This reverts commit fc33e33776b7a7ce22e539f0ec2e3bfdb09ad361. This commit depends on the rolled back commit rL367901, and thus needs to be rolled back. llvm-svn: 368109
* [X86] Add more extract subvector cost model tests for smaller element sizes ↵Craig Topper2019-08-061-0/+9
| | | | | | | | | and smaller than 128-bit vectors. With the switch to widening legalization, we need to a better job of costing extractions of less than 128-bits. llvm-svn: 368081
* [LV][NFC] Share the LV illegality reporting with LoopVectorize.Hideki Saito2019-08-062-133/+135
| | | | | | | | | | | | Reviewers: hsaito, fhahn, rengolin Reviewed By: rengolin Patch by psamolysov, thanks! Differential Revision: https://reviews.llvm.org/D62997 llvm-svn: 367980
* Handle casts changing pointer size in the vectorizerStanislav Mekhanoshin2019-08-021-5/+16
| | | | | | | | | Added code to truncate or shrink offsets so that we can continue base pointer search if size has changed along the way. Differential Revision: https://reviews.llvm.org/D65612 llvm-svn: 367646
* Relax load store vectorizer pointer strip checksStanislav Mekhanoshin2019-08-011-3/+2
| | | | | | | | | | | The previous change to fix crash in the vectorizer introduced performance regressions. The condition to preserve pointer address space during the search is too tight, we only need to match the size. Differential Revision: https://reviews.llvm.org/D65600 llvm-svn: 367624
* Follow up of rL367592, fix the buildSjoerd Meijer2019-08-011-2/+0
| | | | | | | Some buildbots complained about: error: default label in switch which covers all enumeration values llvm-svn: 367603
* [LV] Tail-Loop FoldingSjoerd Meijer2019-08-012-54/+99
| | | | | | | | | | | This allows folding of the scalar epilogue loop (the tail) into the main vectorised loop body when the loop is annotated with a "vector predicate" metadata hint. To fold the tail, instructions need to be predicated (masked), enabling/disabling lanes for the remainder iterations. Differential Revision: https://reviews.llvm.org/D65197 llvm-svn: 367592
* [AMDGPU] Fix for vectorizer crash with pointers of different sizeStanislav Mekhanoshin2019-07-311-0/+5
| | | | | | | | | When vectorizer strips pointers it can eventually end up with pointers of two different sizes, then SCEV will crash. Differential Revision: https://reviews.llvm.org/D65480 llvm-svn: 367443
* [LV] Scalar Epilogue Lowering. NFC.Sjoerd Meijer2019-07-252-57/+66
| | | | | | | | | | | | | | | | This refactors boolean 'OptForSize' that was passed around in a lot of places. It controlled folding of the tail loop, the scalar epilogue, into the main loop but code-size reasons may not be the only reason to do this. Thus, this is a first step to generalise the concept of tail-loop folding, and hence OptForSize has been renamed and is using an enum ScalarEpilogueStatus that holds the status how the epilogue should be lowered. This will be followed up by D65197, that picks up the predicate loop hint and performs the tail-loop folding. Differential Revision: https://reviews.llvm.org/D64916 llvm-svn: 366993
* [SLPVectorizer] Revert local change that got accidently got committed in ↵Simon Pilgrim2019-07-231-1/+0
| | | | | | | | rL366799 This wasn't part of D63281 llvm-svn: 366807
* [TargetLowering] Add SimplifyMultipleUseDemandedBitsSimon Pilgrim2019-07-231-0/+1
| | | | | | | | | | | | | | | | | | This patch introduces the DAG version of SimplifyMultipleUseDemandedBits, which attempts to peek through ops (mainly and/or/xor so far) that don't contribute to the demandedbits/elts of a node - which means we can do this even in cases where we have multiple uses of an op, which normally requires us to demanded all bits/elts. The intention is to remove a similar instruction - SelectionDAG::GetDemandedBits - once SimplifyMultipleUseDemandedBits has matured. The InstCombine version of SimplifyMultipleUseDemandedBits can constant fold which I haven't added here yet, and so far I've only wired this up to some basic binops (and/or/xor/add/sub/mul) to demonstrate its use. We do see a couple of regressions that need to be addressed: AMDGPU unsigned dot product codegen retains an AND mask (for ZERO_EXTEND) that it previously removed (but otherwise the dotproduct codegen is a lot better). X86/AVX2 has poor handling of vector ANY_EXTEND/ANY_EXTEND_VECTOR_INREG - it prematurely gets converted to ZERO_EXTEND_VECTOR_INREG. The code owners have confirmed its ok for these cases to fixed up in future patches. Differential Revision: https://reviews.llvm.org/D63281 llvm-svn: 366799
* [SLPVectorizer] Remove null-pointer test. NFCI.Simon Pilgrim2019-07-231-4/+4
| | | | | | cast<CallInst> shouldn't return null and we dereference the pointer in a lot of other places, causing both MSVC + cppcheck to warn about dereferenced null pointers llvm-svn: 366793
* [SLPVectorizer] Fix some MSVC/cppcheck uninitialized variable warnings. NFCI.Simon Pilgrim2019-07-221-3/+3
| | | | llvm-svn: 366712
* Temporarily Revert "[SLP] Recommit: Look-ahead operand reordering heuristic."Eric Christopher2019-07-151-248/+46
| | | | | | | | | As there are some reported miscompiles with AVX512 and performance regressions in Eigen. Verified with the original committer and testcases will be forthcoming. This reverts commit r364964. llvm-svn: 366154
* [LoopVectorize] Pass unfiltered list of arguments to getIntrinsicInstCost.Florian Hahn2019-07-151-5/+2
| | | | | | | We do not compute the scalarization overhead in getVectorIntrinsicCost and TTI::getIntrinsicInstrCost requires the full arguments list. llvm-svn: 366049
* [LV] Exclude loop-invariant inputs from scalar cost computation.Florian Hahn2019-07-141-22/+42
| | | | | | | | | | | | | | | | Loop invariant operands do not need to be scalarized, as we are using the values outside the loop. We should ignore them when computing the scalarization overhead. Fixes PR41294 Reviewers: hsaito, rengolin, dcaballe, Ayal Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D59995 llvm-svn: 366030
* Delete dead storesFangrui Song2019-07-121-7/+2
| | | | llvm-svn: 365903
* [SLP] Optimize getSpillCost(); NFCINikita Popov2019-07-091-6/+10
| | | | | | | | | | | | For a given set of live values, the spill cost will always be the same for each call. Compute the cost once and multiply it by the number of calls. (I'm not sure this spill cost modeling makes sense if there are multiple calls, as the spill cost will likely be shared across calls in that case. But that's how it currently works.) llvm-svn: 365552
* [SLP] Recommit: Look-ahead operand reordering heuristic.Vasileios Porpodas2019-07-021-46/+248
| | | | | | | | | | | | | | | | Summary: This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example). Reviewers: RKSimon, ABataev, dtemirbulatov, Ayal, hfinkel, rnk Reviewed By: RKSimon, dtemirbulatov Subscribers: hiraditya, phosek, rnk, rcorcs, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60897 llvm-svn: 364964
* Revert [SLP] Look-ahead operand reordering heuristic.Jordan Rupprecht2019-07-011-236/+46
| | | | | | | | This reverts r364478 (git commit 574cb0eb3a7ac95e62d223a60bef891171dfe321) The patch is causing compilation timeouts. llvm-svn: 364846
* [SLP] Look-ahead operand reordering heuristic.Vasileios Porpodas2019-06-261-46/+236
| | | | | | | | | | | | | | | | Summary: This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example). Reviewers: RKSimon, ABataev, dtemirbulatov, Ayal, hfinkel, rnk Reviewed By: RKSimon, dtemirbulatov Subscribers: rnk, rcorcs, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60897 llvm-svn: 364478
* [SLP] NFC: Fixed typo in commentVasileios Porpodas2019-06-241-1/+1
| | | | llvm-svn: 364237
* [SLP] Support unary FNeg vectorizationCameron McInally2019-06-241-2/+30
| | | | | | Differential Revision: https://reviews.llvm.org/D63609 llvm-svn: 364219
* Revert [SLP] Look-ahead operand reordering heuristic.Reid Kleckner2019-06-211-232/+46
| | | | | | | | | This reverts r364084 (git commit 5698921be2d567f6abf925479ac9f5a376d6d74f) It caused crashes while compiling a file in Chrome. Reduction forthcoming. llvm-svn: 364111
* [SLP] Look-ahead operand reordering heuristic.Simon Pilgrim2019-06-211-46/+232
| | | | | | | | | | This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example). Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D60897 llvm-svn: 364084
* [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through ↵Orlando Cazalet-Hyams2019-06-191-6/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | loop even after completion Summary: Bug: https://bugs.llvm.org/show_bug.cgi?id=39024 The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here: A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins. B) Instructions in the middle block have different line numbers which give the impression of another iteration. In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks. I have set up a separate review D61933 for a fix which is required for this patch. Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse Reviewed By: hfinkel, jmorse Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D60831 > llvm-svn: 363046 llvm-svn: 363786
OpenPOWER on IntegriCloud