summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Vectorize
Commit message (Collapse)AuthorAgeFilesLines
...
* [LV] Suppress vectorization in some nontemporal casesWarren Ristow2019-06-172-1/+33
| | | | | | | | | | | | | | | | | | | | | When considering a loop containing nontemporal stores or loads for vectorization, suppress the vectorization if the corresponding vectorized store or load with the aligment of the original scaler memory op is not supported with the nontemporal hint on the target. This adds two new functions: bool isLegalNTStore(Type *DataType, unsigned Alignment) const; bool isLegalNTLoad(Type *DataType, unsigned Alignment) const; to TTI, leaving the target independent default implementation as returning true, but with overriding implementations for X86 that check the legality based on available Subtarget features. This fixes https://llvm.org/PR40759 Differential Revision: https://reviews.llvm.org/D61764 llvm-svn: 363581
* PHINode: introduce setIncomingValueForBlock() function, and use it.Whitney Tsang2019-06-171-4/+2
| | | | | | | | | | | | | | | | Summary: There is PHINode::getBasicBlockIndex() and PHINode::setIncomingValue() but no function to replace incoming value for a specified BasicBlock* predecessor. Clearly, there are a lot of places that could use that functionality. Reviewer: craig.topper, lebedev.ri, Meinersbur, kbarton, fhahn Reviewed By: Meinersbur, fhahn Subscribers: fhahn, hiraditya, zzheng, jsji, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D63338 llvm-svn: 363566
* [LV] Deny irregular types in interleavedAccessCanBeWidenedBjorn Pettersson2019-06-171-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Avoid that loop vectorizer creates loads/stores of vectors with "irregular" types when interleaving. An example of an irregular type is x86_fp80 that is 80 bits, but that may have an allocation size that is 96 bits. So an array of x86_fp80 is not bitcast compatible with a vector of the same type. Not sure if interleavedAccessCanBeWidened is the best place for this check, but it solves the problem seen in the added test case. And it is the same kind of check that already exists in memoryInstructionCanBeWidened. Reviewers: fhahn, Ayal, craig.topper Reviewed By: fhahn Subscribers: hiraditya, rkruppe, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63386 llvm-svn: 363547
* Revert "[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step ↵Orlando Cazalet-Hyams2019-06-121-14/+6
| | | | | | | | | through loop even after completion" This reverts commit 1a0f7a2077b70c9864faa476e15b048686cf1ca7. See phabricator thread for D60831. llvm-svn: 363132
* [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through ↵Orlando Cazalet-Hyams2019-06-111-6/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | loop even after completion Summary: Bug: https://bugs.llvm.org/show_bug.cgi?id=39024 The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here: A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins. B) Instructions in the middle block have different line numbers which give the impression of another iteration. In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks. I have set up a separate review D61933 for a fix which is required for this patch. Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse Reviewed By: hfinkel, jmorse Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D60831 llvm-svn: 363046
* [LV] Fix -Wunused-function after r362736Fangrui Song2019-06-071-0/+2
| | | | llvm-svn: 362762
* [LV] Wrap LV illegality reporting in a function. NFC.Renato Golin2019-06-061-100/+120
| | | | | | | | | | | | | | | | | | | | | | | A function for loop vectorization illegality reporting has been introduced: void LoopVectorizationLegality::reportVectorizationFailure( const StringRef DebugMsg, const StringRef OREMsg, const StringRef ORETag, Instruction * const I) const; The function prints a debug message when the debug for the compilation unit is enabled as well as invokes the optimization report emitter to generate a message with a specified tag. The function doesn't cover any complicated logic when a custom lambda should be passed to the emitter, only generating a message with a tag is supported. The function always prints the instruction `I` after the debug message whenever the instruction is specified, otherwise the debug message ends with a dot: 'LV: Not vectorizing: Disabled/already vectorized.' Patch by Pavel Samolysov <samolisov@gmail.com> llvm-svn: 362736
* [SLP] Fix regression in broadcasts caused by operand reordering patch D59973.Dinar Temirbulatov2019-06-051-5/+35
| | | | | | | | | | | | This patch fixes a regression caused by the operand reordering refactoring patch https://reviews.llvm.org/D59973 . The fix changes the strategy to Splat instead of Opcode, if broadcast opportunities are found. Please see the lit test for some examples. Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D62427 llvm-svn: 362613
* [LoopUtils][SLPVectorizer] clean up management of fast-math-flagsSanjay Patel2019-06-051-3/+9
| | | | | | | | | | | | | | | | Instead of passing around fast-math-flags as a parameter, we can set those using an IRBuilder guard object. This is no-functional-change-intended. The motivation is to eventually fix the vectorizers to use and set the correct fast-math-flags for reductions. Examples of that not behaving as expected are: https://bugs.llvm.org/show_bug.cgi?id=23116 (should be able to reduce with less than 'fast') https://bugs.llvm.org/show_bug.cgi?id=35538 (possible miscompile for -0.0) D61802 (should be able to reduce with IR-level FMF) Differential Revision: https://reviews.llvm.org/D62272 llvm-svn: 362612
* [LV] Remove the redundant using LoopVectorizationPlanner:VPlanPtrFlorian Hahn2019-05-302-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | VPlan.h already contains the declaration of VPlanPtr type alias: using VPlanPtr = std::unique_ptr<VPlan>; The LoopVectorizationPlanner class also contains the same declaration of VPlanPtr and therefore LoopVectorize requires a long wording when its methods return VPlanPtr: LoopVectorizationPlanner::VPlanPtr LoopVectorizationPlanner::buildVPlanWithVPRecipes(...) but LoopVectorize.cpp includes VPlan.h (via LoopVectorizationPlanner.h) and can use VPlanPtr from that header. Patch by Pavel Samolysov. Reviewers: hsaito, rengolin, fhahn Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D62576 llvm-svn: 362126
* [LoopVectorize] Add FNeg instruction supportCraig Topper2019-05-301-9/+20
| | | | | | Differential Revision: https://reviews.llvm.org/D62510 llvm-svn: 362124
* [LV] Inform about exactly reason of loop illegalityFlorian Hahn2019-05-301-2/+10
| | | | | | | | | | | | | | | | | | | | | | | Currently, only the following information is provided by LoopVectorizer in the case when the CF of the loop is not legal for vectorization: LV: Can't vectorize the instructions or CFG LV: Not vectorizing: Cannot prove legality. But this information is not enough for the root cause analysis; what is exactly wrong with the loop should also be printed: LV: Not vectorizing: The exiting block is not the loop latch. Patch by Pavel Samolysov. Reviewers: mkuper, hsaito, rengolin, fhahn Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D62311 llvm-svn: 362056
* [SLPVectorizer] Set flag to previous default.Alina Sbirlea2019-05-231-1/+1
| | | | | | | | | | | | | | | | | Summary: The refactoring in r360276 moved the `RunSLPVectorization` flag and added the default explicitly. The default should have been `false`, as before. The new pass manager used to have SLPVectorization on by default, now it's off in opt, and needs D61617 checked in to enable it in clang. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, eraman, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61955 llvm-svn: 361537
* LoopVectorizationCostModel::selectInterleaveCount - assert we have a ↵Simon Pilgrim2019-05-221-0/+2
| | | | | | | | non-zero loop cost. NFCI. The input LoopCost value can be zero, but if so it should be recalculated with the current VF. After that it should always be non-zero. llvm-svn: 361387
* [SLP] Refactoring of EdgeInfo and UserTreeIdx in buildTree_rec().Dinar Temirbulatov2019-05-191-58/+48
| | | | | | | | | | | This is a follow-up refactoring patch after the introduction of usable TreeEntry pointers in D61706. The EdgeInfo struct can now use a TreeEntry pointer instead of an index in VectorizableTree. Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D61795 llvm-svn: 361110
* [LV] Move getScalarizationOverhead and vector call cost computations to CM. ↵Florian Hahn2019-05-152-62/+61
| | | | | | | | | | | | | | | | | | | | | (NFC) This reduces the number of parameters we need to pass in and they seem a natural fit in LoopVectorizationCostModel. Also simplifies things for D59995. As a follow up refactoring, we could only expose a expose a shouldUseVectorIntrinsic() helper in LoopVectorizationCostModel, instead of calling getVectorCallCost/getVectorIntrinsicCost in InnerLoopVectorizer/VPRecipeBuilder. Reviewers: Ayal, hsaito, dcaballe, rengolin Reviewed By: rengolin Differential Revision: https://reviews.llvm.org/D61638 llvm-svn: 360758
* [SLP] Refactor VectorizableTree to use unique_ptr.Simon Pilgrim2019-05-101-48/+67
| | | | | | | | | | This patch fixes the TreeEntry dangling pointer issue caused by reallocations of VectorizableTree. Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D61706 llvm-svn: 360456
* [NewPassManager] Add tuning option: SLPVectorization [NFC].Alina Sbirlea2019-05-081-0/+4
| | | | | | | | | | | | | | Summary: Mirror tuning option from old pass manager in new pass manager. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61616 llvm-svn: 360276
* [MemorySSA] Teach LoopSimplify to preserve MemorySSA.Alina Sbirlea2019-05-081-1/+2
| | | | | | | | | | | | | | | Summary: Preserve MemorySSA in LoopSimplify, in the old pass manager, if the analysis is available. Do not preserve it in the new pass manager. Update tests. Subscribers: nemanjai, jlebar, javed.absar, Prazek, kbarton, zzheng, jsji, llvm-commits, george.burgess.iv, chandlerc Tags: #llvm Differential Revision: https://reviews.llvm.org/D60833 llvm-svn: 360270
* [VPlan] Fix "value never used" static analyzer warning. NFCI.Simon Pilgrim2019-05-081-2/+1
| | | | llvm-svn: 360241
* revert r360162 as it breaks most of the buildbotsKostya Serebryany2019-05-071-14/+6
| | | | llvm-svn: 360190
* [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through ↵Orlando Cazalet-Hyams2019-05-071-6/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | loop even after completion Summary: Bug: https://bugs.llvm.org/show_bug.cgi?id=39024 The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here: A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins. B) Instructions in the middle block have different line numbers which give the impression of another iteration. In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks. Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel Reviewed By: hfinkel Subscribers: bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D60831 llvm-svn: 360162
* [LoadStoreVectorizer] vectorizeStoreChain - ensure we find a store type.Simon Pilgrim2019-05-061-1/+2
| | | | | | | | Properly initialize store type to null then ensure we find a real store type in the chain. Fixes scan-build null dereference warning and makes the code clearer. llvm-svn: 360031
* [SLPVectorizer] Prefer pre-increments. NFCI.Simon Pilgrim2019-05-051-3/+3
| | | | llvm-svn: 359989
* [SLPVectorizer] Make getSpillCost() const. NFCI.Simon Pilgrim2019-05-051-2/+9
| | | | | | Ideally getTreeCost() should be const as well but non-const Type creation would need to be addressed first. llvm-svn: 359975
* Enable LoopVectorization by default.Alina Sbirlea2019-04-251-1/+1
| | | | | | | | | | | | | | | | | Summary: When refactoring vectorization flags, vectorization was disabled by default in the new pass manager. This patch re-enables is for both managers, and changes the assumptions opt makes, based on the new defaults. Comments in opt.cpp should clarify the intended use of all flags to enable/disable vectorization. Reviewers: chandlerc, jgorbe Subscribers: jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61091 llvm-svn: 359167
* [SLP] Fix crash after r358519, by V. Porpodas.Alexey Bataev2019-04-241-1/+2
| | | | | | | | | | | | | | | | Summary: The code did not check if operand was undef before casting it to Instruction. Reviewers: RKSimon, ABataev, dtemirbulatov Reviewed By: ABataev Subscribers: uabelho Tags: #llvm Differential Revision: https://reviews.llvm.org/D61024 llvm-svn: 359136
* Use llvm::stable_sortFangrui Song2019-04-231-5/+5
| | | | | | While touching the code, simplify if feasible. llvm-svn: 358996
* [NewPassManager] Adding pass tuning options: loop vectorize.Alina Sbirlea2019-04-191-0/+9
| | | | | | | | | | | | | | | | Summary: Trying to add the plumbing necessary to add tuning options to the new pass manager. Testing with the flags for loop vectorize. Reviewers: chandlerc Subscribers: sanjoy, mehdi_amini, jlebar, steven_wu, dexonsmith, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59723 llvm-svn: 358763
* Fix a typo in comments. [NFC]Ali Tamur2019-04-161-1/+1
| | | | llvm-svn: 358531
* [SLP] Refactoring of the operand reordering code.Simon Pilgrim2019-04-161-171/+463
| | | | | | | | | | | | | | This is a refactoring patch which should have all the functionality of the current code. Its goal is twofold: i. Cleanup and simplify the reordering code, and ii. Generalize reordering so that it will work for an arbitrary number of operands, not just 2. This is the second patch in a series of patches that will enable operand reordering across chains of operations. An example of this was presented in EuroLLVM'18 https://www.youtube.com/watch?v=gIEn34LvyNo . Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D59973 llvm-svn: 358519
* [PGO] Profile guided code size optimization.Hiroshi Yamauchi2019-04-151-11/+26
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Enable some of the existing size optimizations for cold code under PGO. A ~5% code size saving in big internal app under PGO. The way it gets BFI/PSI is discussed in the RFC thread http://lists.llvm.org/pipermail/llvm-dev/2019-March/130894.html Note it doesn't currently touch loop passes. Reviewers: davidxl, eraman Reviewed By: eraman Subscribers: mgorny, javed.absar, smeenai, mehdi_amini, eraman, zzheng, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59514 llvm-svn: 358422
* [VPLAN] Minor improvement to testing and debug messages.Florian Hahn2019-04-101-7/+10
| | | | | | | | | | | | 1. Use computed VF for stress testing. 2. If the computed VF does not produce vector code (VF smaller than 2), force VF to be 4. 3. Test vectorization of i64 data on AArch64 to make sure we generate VF != 4 (on X86 that was already tested on AVX). Patch by Francesco Petrogalli <francesco.petrogalli@arm.com> Differential Revision: https://reviews.llvm.org/D59952 llvm-svn: 358056
* [IR] Refactor attribute methods in Function class (NFC)Evandro Menezes2019-04-041-2/+2
| | | | | | | | Rename the functions that query the optimization kind attributes. Differential revision: https://reviews.llvm.org/D60287 llvm-svn: 357731
* [DebugInfo] Fix pr41180 : Loop Vectorization Debugify FailureVedant Kumar2019-04-021-3/+21
| | | | | | | | | | | | | | | | | | | | | Bug: https://bugs.llvm.org/show_bug.cgi?id=41180 In the bug test case the debug location was missing for the cmp instruction in the "middle block" BB. This patch fixes the bug by copying the debug location from the cmp of the scalar loop's terminator branch, if it exists. The patch also fixes the debug location on the subsequent branch instruction. It was previously using the location of the of the original loop's pre-header block terminator. Both of these instructions will now map to the source line of the conditional branch in the original loop. A regression test has been added that covers these issues. Patch by Orlando Cazalet-Hyams! Differential Revision: https://reviews.llvm.org/D59944 llvm-svn: 357499
* [SLP] reorderInputsAccordingToOpcode is const method. NFCI.Simon Pilgrim2019-04-021-5/+4
| | | | llvm-svn: 357490
* [SLP] getVectorElementSize and isTreeTinyAndNotFullyVectorizable are const ↵Simon Pilgrim2019-04-011-4/+4
| | | | | | methods. NFCI. llvm-svn: 357416
* [SLP] getGatherCost and isFullyVectorizableTinyTree are const methods. NFCI.Simon Pilgrim2019-04-011-7/+7
| | | | llvm-svn: 357414
* [SLP] Add support for commutative icmp/fcmp predicatesSimon Pilgrim2019-03-291-14/+28
| | | | | | | | | | For the cases where the icmp/fcmp predicate is commutative, use reorderInputsAccordingToOpcode to collect and commute the operands. This requires a helper to recognise commutativity in both general Instruction and CmpInstr types - the CmpInst::isCommutative doesn't overload the Instruction::isCommutative method for reasons I'm not clear on (maybe because its based on predicate not opcode?!?). Differential Revision: https://reviews.llvm.org/D59992 llvm-svn: 357266
* [SLP] Add support for swapping icmp/fcmp predicates to permit vectorizationSimon Pilgrim2019-03-291-9/+17
| | | | | | | | We should be able to match elements with the swapped predicate as well - as long as we commute the source operands. Differential Revision: https://reviews.llvm.org/D59956 llvm-svn: 357243
* Make helper functions static. NFC.Benjamin Kramer2019-03-281-2/+2
| | | | llvm-svn: 357187
* [VPlan] Determine Vector Width programmatically.Florian Hahn2019-03-282-20/+37
| | | | | | | | | | | | | | With this change, the VPlan native path is triggered with the directive: #pragma clang loop vectorize(enable) There is no need to specify the vectorize_width(N) clause. Patch by Francesco Petrogalli <francesco.petrogalli@arm.com> Differential Revision: https://reviews.llvm.org/D57598 llvm-svn: 357156
* [SLPVectorizer] Merge reorderAltShuffleOperands into ↵Simon Pilgrim2019-03-251-85/+35
| | | | | | | | | | reorderInputsAccordingToOpcode As discussed on D59738, this generalizes reorderInputsAccordingToOpcode to handle multiple + non-commutative instructions so we can get rid of reorderAltShuffleOperands and make use of the extra canonicalizations that reorderInputsAccordingToOpcode brings. Differential Revision: https://reviews.llvm.org/D59784 llvm-svn: 356939
* [SLPVectorizer] reorderInputsAccordingToOpcode - remove non-Instruction ↵Simon Pilgrim2019-03-251-7/+2
| | | | | | | | | | | | | | canonicalization Remove attempts to commute non-Instructions to the LHS - the codegen changes appear to rely on chance more than anything else and also have a tendency to fight existing instcombine canonicalization which moves constants to the RHS of commutable binary ops. This is prep work towards: (a) reusing reorderInputsAccordingToOpcode for alt-shuffles and removing the similar reorderAltShuffleOperands (b) improving reordering to optimized cases with commutable and non-commutable instructions to still find splat/consecutive ops. Differential Revision: https://reviews.llvm.org/D59738 llvm-svn: 356913
* [SLPVectorizer] shouldReorderOperands - just check for reordering. NFCI.Simon Pilgrim2019-03-241-28/+24
| | | | | | Remove the I.getOperand() calls from inside shouldReorderOperands - reorderInputsAccordingToOpcode should handle the creation of the operand lists and shouldReorderOperands should just check to see whether the i'th element should be commuted. llvm-svn: 356854
* Fix unused variable warning on non-asserts builds. NFCI.Simon Pilgrim2019-03-231-4/+3
| | | | llvm-svn: 356841
* Remove unused function argument. NFCI.Simon Pilgrim2019-03-231-5/+7
| | | | llvm-svn: 356840
* [SLPVectorizer] reorderInputsAccordingToOpcode - use InstructionState ↵Simon Pilgrim2019-03-231-3/+6
| | | | | | directly. NFCI. llvm-svn: 356832
* [SLPVectorizer] Don't repeat VL.size() call. NFCI.Simon Pilgrim2019-03-231-1/+1
| | | | llvm-svn: 356830
* [SLP] Remove redundancy of performing operand reordering twice: once in ↵Simon Pilgrim2019-03-221-82/+171
| | | | | | | | | | | | | | | buildTree() and later in vectorizeTree(). This is a refactoring patch that removes the redundancy of performing operand reordering twice, once in buildTree() and later in vectorizeTree(). To achieve this we need to keep track of the operands within the TreeEntry struct while building the tree, and later in vectorizeTree() we are just accessing them from the TreeEntry in the right order. This patch is the first in a series of patches that will allow for better operand reordering across chains of instructions (e.g., a chain of ADDs), as presented here: https://www.youtube.com/watch?v=gIEn34LvyNo Patch by: @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D59059 llvm-svn: 356814
OpenPOWER on IntegriCloud