path: root/llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
Commit message | Author | Age | Files | Lines
...
* [InstCombine] scalarizePHI should not assume the code it sees has been CSE'd | Michael Kuperstein | 2016-06-06 | 1 | -12/+26
  scalarizePHI only looked for phis that have exactly two uses: the "latch" use and an extract. Unfortunately, we cannot assume all equivalent extracts are CSE'd, since InstCombine itself may create an extract that duplicates an existing one. This extends it to handle several distinct extracts from the same index.
  This should fix at least some of the performance regressions from PR27988.
  Differential Revision: http://reviews.llvm.org/D20983
  llvm-svn: 271961
* Fix an issue where fast math flags were dropped during scalarization. | Owen Anderson | 2016-03-01 | 1 | -2/+4
  Most portions of InstCombine properly propagate fast math flags, but apparently the vector scalarization section was overlooked.
  llvm-svn: 262376
* function names start with a lowercase letter; NFC | Sanjay Patel | 2016-02-01 | 1 | -21/+21
  llvm-svn: 259425
* [InstCombine] avoid an insertelement transformation that induces the opposite extractelement fold (PR26354) | Sanjay Patel | 2016-01-29 | 1 | -1/+17
  We would infinite loop because we created a shufflevector that was wider than needed and then failed to combine that with the insertelement. When subsequently visiting the extractelement from that shuffle, we see that it's unnecessary, delete it, and trigger another visit to the insertelement.
  llvm-svn: 259236
* [InstCombine] insert a new shuffle in a safe place (PR25999) | Sanjay Patel | 2016-01-08 | 1 | -10/+7
  Limit this transform to a basic block and guard against PHIs.
  Hopefully, this fixes the remaining failures in PR25999: https://llvm.org/bugs/show_bug.cgi?id=25999
  llvm-svn: 257133
* [InstCombine] insert a new shuffle before its uses (PR26015) | Sanjay Patel | 2016-01-05 | 1 | -8/+21
  Although this solves the test case in PR26015: https://llvm.org/bugs/show_bug.cgi?id=26015
  And may solve PR25999: https://llvm.org/bugs/show_bug.cgi?id=25999
  ...I suspect this is not the best solution. I think we want to insert the new shuffle just ahead of the earliest ExtractElementInst that we're replacing, but I don't know how that should be implemented.
  Differential Revision: http://reviews.llvm.org/D15878
  llvm-svn: 256857
* [InstCombine] transform more extract/insert pairs into shuffles (PR2109) | Sanjay Patel | 2015-12-24 | 1 | -3/+50
  This is an extension of the shuffle combining from r203229: http://reviews.llvm.org/rL203229
  The idea is to widen a short input vector with undef elements so the existing shuffle transform for extract/insert can kick in.
  The motivation is to finally solve PR2109: https://llvm.org/bugs/show_bug.cgi?id=2109
  For that example, the IR becomes:
      %1 = bitcast <2 x i32>* %P to <2 x float>*
      %ld1 = load <2 x float>, <2 x float>* %1, align 8
      %2 = shufflevector <2 x float> %ld1, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
      %i2 = shufflevector <4 x float> %A, <4 x float> %2, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
      ret <4 x float> %i2
  And x86 SSE output improves from:
      movq (%rdi), %xmm1           ## xmm1 = mem[0],zero
      movdqa %xmm1, %xmm2
      shufps $229, %xmm2, %xmm2    ## xmm2 = xmm2[1,1,2,3]
      shufps $48, %xmm0, %xmm1     ## xmm1 = xmm1[0,0],xmm0[3,0]
      shufps $132, %xmm1, %xmm0    ## xmm0 = xmm0[0,1],xmm1[0,2]
      shufps $32, %xmm0, %xmm2     ## xmm2 = xmm2[0,0],xmm0[2,0]
      shufps $36, %xmm2, %xmm0     ## xmm0 = xmm0[0,1],xmm2[2,0]
      retq
  To the almost optimal:
      movhpd (%rdi), %xmm0
  Note: There's a tension in the existing transform related to generating arbitrary shufflevector masks. We avoid that in other places in InstCombine because we're scared that codegen can't handle strange masks, but it looks like we're ok with producing those here. I purposely chose weird insert/extract indexes for the regression tests to see the effect in these cases. For PowerPC+Altivec, AArch64, and X86+SSE/AVX, I think the codegen is equal or better for these examples.
  Differential Revision: http://reviews.llvm.org/D15096
  llvm-svn: 256394
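The widening step described in the entry above can be sketched outside LLVM. The following is a minimal model, not LLVM API: `Vec`, `kUndef`, `shuffleVector`, and `widenAndBlend` are hypothetical names, lanes are `int` rather than `float` to keep comparisons exact, and a negative mask entry stands for an undef lane. It reuses the two masks from the IR in the commit message.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical model, not LLVM API: a vector is a list of int lanes and
// kUndef marks an undef lane (the commit's example uses float lanes).
const int kUndef = INT_MIN;
using Vec = std::vector<int>;

// Models shufflevector: a mask entry m < a.size() selects a[m], a larger
// entry selects b[m - a.size()], and a negative entry yields an undef lane.
Vec shuffleVector(const Vec &a, const Vec &b, const std::vector<int> &mask) {
  assert(a.size() == b.size() && "both shuffle operands have the same type");
  Vec out;
  out.reserve(mask.size());
  for (int m : mask) {
    if (m < 0) {
      out.push_back(kUndef);
      continue;
    }
    std::size_t idx = static_cast<std::size_t>(m);
    assert(idx < a.size() + b.size() && "mask index out of range");
    out.push_back(idx < a.size() ? a[idx] : b[idx - a.size()]);
  }
  return out;
}

// Step 1 widens the 2-lane vector with undef lanes; step 2 blends it into
// the 4-lane vector, using the masks from the IR in the commit message.
Vec widenAndBlend(const Vec &wide, const Vec &narrow) {
  assert(wide.size() == 4 && narrow.size() == 2);
  Vec undefNarrow(narrow.size(), kUndef);
  Vec widened = shuffleVector(narrow, undefNarrow, {0, 1, -1, -1});
  return shuffleVector(wide, widened, {0, 1, 4, 5});
}
```

In this model the widened vector only exists so that both operands of the second shuffle have the same lane count, which is the precondition the existing extract/insert transform relies on according to the message.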
* fix typos in comments; NFC | Sanjay Patel | 2015-11-29 | 1 | -6/+8
  llvm-svn: 254266
* function names start with a lower case letter; NFC | Sanjay Patel | 2015-11-17 | 1 | -20/+20
  llvm-svn: 253348
* use range-based for loop; NFCI | Sanjay Patel | 2015-11-16 | 1 | -2/+2
  llvm-svn: 253256
* InstCombine: Remove ilist iterator implicit conversions, NFC | Duncan P. N. Exon Smith | 2015-10-13 | 1 | -2/+1
  Stop relying on implicit conversions of ilist iterators in LLVMInstCombine. No functionality change intended.
  llvm-svn: 250183
* don't repeat function names in comments; NFC | Sanjay Patel | 2015-09-09 | 1 | -6/+5
  llvm-svn: 247154
* [InstSimplify] Teach InstSimplify how to simplify extractelement | David Majnemer | 2015-07-13 | 1 | -58/+9
  llvm-svn: 242008
* [InstCombine] Use DataLayout to determine vector element width | David Majnemer | 2015-04-03 | 1 | -3/+2
  InstCombine didn't realize that it needs to use DataLayout to determine how wide pointers are. This led to assertion failures.
  This fixes PR23113.
  llvm-svn: 234046
* [opaque pointer type] more gep API migrations | David Blaikie | 2015-03-14 | 1 | -1/+2
  Adding nullptr to all the IRBuilder stuff because it's the first thing that fails to build when testing without the back-compat functions, so I'll keep having to re-add these locally for each chunk of migration I do. Might as well check them in to save me the churn. Eventually I'll have to migrate these too, but I'm going breadth-first.
  llvm-svn: 232270
* DataLayout is mandatory, update the API to reflect it with references. | Mehdi Amini | 2015-03-10 | 1 | -2/+2
  Summary: Now that the DataLayout is a mandatory part of the module, let's start cleaning the codebase. This patch is a first attempt at doing that.
  This patch is not exactly NFC as for instance some places were passing a nullptr instead of the DataLayout, possibly just because there was a default value on the DataLayout argument to many functions in the API. Even though it is not purely NFC, there is no change in the validation.
  I turned as many pointers to DataLayout into references as I could; this helped figure out all the places where a nullptr could come up.
  I initially had a local version of this patch broken into over 30 independent commits, but some later commits were cleaning the API and touching parts of the code modified in the previous commits, so it seemed cleaner without the intermediate state.
  Test Plan:
  Reviewers: echristo
  Subscribers: llvm-commits
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 231740
* InstCombine: extract instead of shuffle when performing vector/array type punning | JF Bastien | 2015-02-25 | 1 | -5/+116
  Summary: SROA generates code that isn't quite as easy to optimize and contains unusual-sized shuffles, but that code is generally correct. As discussed in D7487 the right place to clean things up is InstCombine, which will pick up the type-punning pattern and transform it into a more obvious bitcast+extractelement, while leaving the other patterns SROA encounters as-is.
  Test Plan: make check
  Reviewers: jvoung, chandlerc
  Subscribers: llvm-commits
  llvm-svn: 230560
* [PM] Rename InstCombine.h to InstCombineInternal.h in preparation for | Chandler Carruth | 2015-01-22 | 1 | -1/+1
  creating a non-internal header file for the InstCombine pass.
  I thought about calling this InstCombiner.h or in some way more clearly associating it with the InstCombiner class that it is primarily defining, but there are several other utility interfaces defined within this for InstCombine. If, in the course of refactoring, those end up moving elsewhere or going away, it might make more sense to make this the combiner's header alone.
  Naturally, this is a bikeshed to a certain degree, so feel free to lobby for a different shade of paint if this name just doesn't suit you.
  llvm-svn: 226783
* fixed some typos | Sanjay Patel | 2014-07-07 | 1 | -4/+4
  llvm-svn: 212495
* Fix type of shuffle resulting from shuffle merge. | Serge Pavlov | 2014-05-13 | 1 | -6/+4
  This fix resolves PR19730.
  llvm-svn: 208666
* Reorder shuffle and binary operation. | Serge Pavlov | 2014-05-11 | 1 | -10/+25
  This patch enables transformations:
      BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2))
      BinOp(shuffle(v1), const1) -> shuffle(BinOp(v1, const2))
  They allow extra shuffles to be eliminated in some cases.
  Differential Revision: http://reviews.llvm.org/D3525
  llvm-svn: 208488
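The first rewrite in the entry above can be checked lane by lane with a small model. This is a sketch, not the InstCombine code: `shuffleLanes`, `addLanes`, `opOfShuffles`, and `shuffleOfOp` are hypothetical helpers, addition stands in for an arbitrary lane-wise binary operator, and the sketch assumes both shuffles use the same mask, which the commit message does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;
using Mask = std::vector<std::size_t>;

// Single-source shuffle: lane i of the result is v[mask[i]].
Vec shuffleLanes(const Vec &v, const Mask &mask) {
  Vec out;
  out.reserve(mask.size());
  for (std::size_t m : mask) {
    assert(m < v.size() && "mask index out of range");
    out.push_back(v[m]);
  }
  return out;
}

// Lane-wise addition, standing in for any lane-wise BinOp.
Vec addLanes(const Vec &a, const Vec &b) {
  assert(a.size() == b.size() && "operands must have the same lane count");
  Vec out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    out[i] = a[i] + b[i];
  return out;
}

// Before the transform: BinOp(shuffle(v1), shuffle(v2)), two shuffles.
Vec opOfShuffles(const Vec &v1, const Vec &v2, const Mask &mask) {
  return addLanes(shuffleLanes(v1, mask), shuffleLanes(v2, mask));
}

// After the transform: shuffle(BinOp(v1, v2)), a single shuffle.
Vec shuffleOfOp(const Vec &v1, const Vec &v2, const Mask &mask) {
  return shuffleLanes(addLanes(v1, v2), mask);
}
```

Because a lane-wise operation treats every lane independently, selecting lanes before or after the operation yields the same values; the rewritten form simply needs one shuffle instead of two.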
* [InstCombine] Some cleanup in optimization of redundant insertvalue instructions. | Michael Zolotukhin | 2014-05-08 | 1 | -4/+3
  And one more test added.
  llvm-svn: 208355
* [InstCombine] Add optimization of redundant insertvalue instructions. | Michael Zolotukhin | 2014-05-07 | 1 | -0/+36
  rdar://problem/11861387
  llvm-svn: 208214
* [C++] Use 'nullptr'. | Craig Topper | 2014-04-28 | 1 | -1/+1
  llvm-svn: 207394
* [C++] Use 'nullptr'. Transforms edition. | Craig Topper | 2014-04-25 | 1 | -21/+22
  llvm-svn: 207196
* [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE | Chandler Carruth | 2014-04-22 | 1 | -1/+2
  definition below all of the header #include lines, lib/Transforms/... edition.
  This one is tricky for two reasons. We again have a couple of passes that define something else before the includes as well. I've sunk their name macros with the DEBUG_TYPE.
  Also, InstCombine contains headers that need DEBUG_TYPE, so now those headers #define and #undef DEBUG_TYPE around their code, leaving them well formed modular headers. Fixing these headers was a large motivation for all of these changes, as "leaky" macros of this form are hard on the modules implementation.
  llvm-svn: 206844
* [Modules] Sink all the DEBUG_TYPE defines for InstCombine out of the | Chandler Carruth | 2014-04-21 | 1 | -0/+1
  header files and into the cpp files.
  These files will require more touches as the header files actually use DEBUG(). Eventually, I'll have to introduce a matched #define and #undef of DEBUG_TYPE for the header files, but that comes as step N of many to clean all of this up.
  llvm-svn: 206777
* [C++11] Add range based accessors for the Use-Def chain of a Value. | Chandler Carruth | 2014-03-09 | 1 | -3/+3
  This requires a number of steps.
  1) Move value_use_iterator into the Value class as an implementation detail
  2) Change it to actually be a *Use* iterator rather than a *User* iterator.
  3) Add an adaptor which is a User iterator that always looks through the Use to the User.
  4) Wrap these in Value::use_iterator and Value::user_iterator typedefs.
  5) Add the range adaptors as Value::uses() and Value::users().
  6) Update *all* of the callers to correctly distinguish between whether they wanted a use_iterator (and to explicitly dig out the User when needed), or a user_iterator which makes the Use itself totally opaque.
  Because #6 requires churning essentially everything that walked the Use-Def chains, I went ahead and added all of the range adaptors and switched them to range-based loops where appropriate. Also because the renaming requires at least churning every line of code, it didn't make any sense to split these up into multiple commits -- all of which would touch all of the same lines of code.
  The result is still not quite optimal. The Value::use_iterator is a nice regular iterator, but Value::user_iterator is an iterator over User*s rather than over the User objects themselves. As a consequence, it fits a bit awkwardly into the range-based world and it has the weird extra-dereferencing 'operator->' that so many of our iterators have. I think this could be fixed by providing something which transforms a range of T&s into a range of T*s, but that *can* be separated into another patch, and it isn't yet 100% clear whether this is the right move.
  However, this change gets us most of the benefit and cleans up a substantial amount of code around Use and User. =]
  llvm-svn: 203364
* InstCombine: form shuffles from wider range of insert/extractelements | Tim Northover | 2014-03-07 | 1 | -49/+70
  Sequences of insertelement/extractelements are sometimes used to build vectors; this code tries to put them back together into shuffles, but could only produce completely uniform shuffle types (<N x T> from two <N x T> sources).
  This should allow shuffles with different numbers of elements on the input and output sides as well.
  llvm-svn: 203229
* [Modules] Move the LLVM IR pattern match header into the IR library, it | Chandler Carruth | 2014-03-04 | 1 | -1/+1
  obviously is coupled to the IR.
  llvm-svn: 202818
* InstCombine: Don't try to use aggregate elements of ConstantExprs. | Benjamin Kramer | 2014-01-24 | 1 | -5/+7
  PR18600.
  llvm-svn: 200028
* Fix known typos | Alp Toker | 2014-01-24 | 1 | -1/+1
  Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt.
  llvm-svn: 200018
* Fix more instances of dropped fast math flags when optimizing FADD instructions. | Owen Anderson | 2014-01-18 | 1 | -0/+2
  All found by inspection (aka grep).
  llvm-svn: 199528
* Fix a bug about generating undef operand when optimising shuffle vector and insert element in instruction combine. | Hao Liu | 2014-01-08 | 1 | -2/+3
  llvm-svn: 198730
* Scalarize select vector arguments when extracted. | Matt Arsenault | 2013-11-04 | 1 | -0/+32
  When the elements are extracted from a select on vectors or a vector select, do the select on the extracted scalars from the input if there is only one use.
  llvm-svn: 194013
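The equivalence behind the entry above can be illustrated with a small model of a vector select whose condition is itself a vector. This is a sketch rather than the InstCombine implementation; `vectorSelect`, `extractOfSelect`, and `selectOfExtracts` are hypothetical names, and the single-use restriction from the commit message is a profitability condition that the model does not represent.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;
using BoolVec = std::vector<bool>;

// Vector select with a per-lane condition: lane i is a[i] if cond[i],
// otherwise b[i].
Vec vectorSelect(const BoolVec &cond, const Vec &a, const Vec &b) {
  assert(cond.size() == a.size() && a.size() == b.size());
  Vec out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    out[i] = cond[i] ? a[i] : b[i];
  return out;
}

// Before: build the whole selected vector, then extract one lane.
int extractOfSelect(const BoolVec &cond, const Vec &a, const Vec &b,
                    std::size_t index) {
  assert(index < a.size() && "extract index out of range");
  return vectorSelect(cond, a, b)[index];
}

// After: extract the three scalars first, then do a scalar select.
int selectOfExtracts(const BoolVec &cond, const Vec &a, const Vec &b,
                     std::size_t index) {
  assert(index < cond.size() && index < a.size() && index < b.size());
  return cond[index] ? a[index] : b[index];
}
```

The scalar form only touches the one lane that is actually used, which is why the rewrite is attractive when the vector select has no other users.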
* Use type helper functions. | Matt Arsenault | 2013-09-06 | 1 | -1/+1
  llvm-svn: 190113
* Fix typo. | Matt Arsenault | 2013-08-28 | 1 | -2/+2
  llvm-svn: 189524
* Fix a crash in EvaluateInDifferentElementOrder where it would generate an | Joey Gouly | 2013-07-12 | 1 | -1/+3
  undef vector of the wrong type.
  LGTM'd by Nick Lewycky on IRC.
  llvm-svn: 186224
* Delete dead safety check. | Nick Lewycky | 2013-06-03 | 1 | -6/+1
  llvm-svn: 183167
* When determining the new index for an insertelement, we may not assume that an | Nick Lewycky | 2013-06-01 | 1 | -7/+9
  index greater than the size of the vector is invalid. The shuffle may be shrinking the size of the vector. Fixes a crash!
  Also drop the maximum recursion depth of the safety check for this optimization to five.
  llvm-svn: 183080
* Reapply r182909 with a fix to the calculation of the new indices for | Nick Lewycky | 2013-05-31 | 1 | -2/+256
  insertelement instructions.
  llvm-svn: 182976
* Revert r182909. | Evgeniy Stepanov | 2013-05-30 | 1 | -245/+0
  PR/16177
  llvm-svn: 182919
* Swizzle vector inputs if it helps us eliminate shuffles. | Nick Lewycky | 2013-05-30 | 1 | -0/+245
  llvm-svn: 182909
* Run clang-format over the scalarizePHI function. | Joey Gouly | 2013-05-24 | 1 | -12/+8
  llvm-svn: 182640
* scalarizePHI needs to insert the next ExtractElement in the same block | Joey Gouly | 2013-05-24 | 1 | -2/+4
  as the BinaryOperator, *not* in the block where the IRBuilder is currently inserting into.
  Fixes a bug where scalarizePHI would create instructions that would not dominate all uses.
  llvm-svn: 182639
* Tabs to spaces. No functionality change. | Nick Lewycky | 2013-05-04 | 1 | -3/+3
  llvm-svn: 181082
* Revert "InstCombine: Fold more shuffles of shuffles." | Jim Grosbach | 2013-05-01 | 1 | -12/+5
  This reverts commit r180802.
  There's ongoing discussion about whether this is the right place to make this transformation. Reverting for now while we figure it out.
  llvm-svn: 180834
* InstCombine: Fold more shuffles of shuffles. | Jim Grosbach | 2013-04-30 | 1 | -5/+12
  Always fold a shuffle-of-shuffle into a single shuffle when there's only one input vector in the first place. Continue to be more conservative when there are multiple inputs.
  rdar://13402653
  PR15866
  llvm-svn: 180802
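For the single-input case named in the entry above, folding two shuffles into one amounts to composing their masks. The sketch below models that idea only; it is not the code from r180802 (which the log shows was reverted the next day), and `shuffleLanes`, `composeMasks`, and the `kUndef` sentinel are hypothetical names. A negative mask entry stands for an undef lane.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical model, not LLVM API.
const int kUndef = INT_MIN;       // value stored in an undef lane
using Vec = std::vector<int>;
using Mask = std::vector<int>;    // negative entry = undef lane

// Single-source shuffle: lane i of the result is v[mask[i]], or undef.
Vec shuffleLanes(const Vec &v, const Mask &mask) {
  Vec out;
  out.reserve(mask.size());
  for (int m : mask) {
    if (m < 0) {
      out.push_back(kUndef);
      continue;
    }
    std::size_t idx = static_cast<std::size_t>(m);
    assert(idx < v.size() && "mask index out of range");
    out.push_back(v[idx]);
  }
  return out;
}

// Lane i of the outer shuffle reads lane outer[i] of the inner result,
// which in turn reads lane inner[outer[i]] of the original vector.
Mask composeMasks(const Mask &inner, const Mask &outer) {
  Mask out;
  out.reserve(outer.size());
  for (int m : outer) {
    if (m < 0) {
      out.push_back(-1);
      continue;
    }
    std::size_t idx = static_cast<std::size_t>(m);
    assert(idx < inner.size() && "outer mask indexes past inner result");
    out.push_back(inner[idx]);  // an undef inner lane stays undef
  }
  return out;
}
```

With two distinct inputs the composed mask can become an arbitrary blend, which may be one reason the message asks for more caution in that case.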
* Changed back (relative to commit 179786) the operations executed when extract(cast) is transformed to cast(extract). | Anat Shemer | 2013-04-22 | 1 | -3/+3
  It uses the Builder class as before. In addition the result node is added to the Worklist, so all the previous extract users will become the new scalar cast users.
  llvm-svn: 180045
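The extract(cast) to cast(extract) rewrite mentioned in the entry above rests on the cast being applied lane by lane. A minimal sketch under that assumption follows; sign extension from 8 to 32 bits is chosen here only as an example cast, and `sextLanes`, `extractOfCast`, and `castOfExtract` are hypothetical names rather than anything from InstCombine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Lane-wise sign extension, standing in for any element-wise cast.
std::vector<std::int32_t> sextLanes(const std::vector<std::int8_t> &v) {
  std::vector<std::int32_t> out;
  out.reserve(v.size());
  for (std::int8_t lane : v)
    out.push_back(static_cast<std::int32_t>(lane));
  return out;
}

// Before: cast the whole vector, then extract one lane.
std::int32_t extractOfCast(const std::vector<std::int8_t> &v,
                           std::size_t index) {
  assert(index < v.size() && "extract index out of range");
  return sextLanes(v)[index];
}

// After: extract the narrow lane first, then cast just that scalar.
std::int32_t castOfExtract(const std::vector<std::int8_t> &v,
                           std::size_t index) {
  assert(index < v.size() && "extract index out of range");
  return static_cast<std::int32_t>(v[index]);
}
```

Both forms produce the same scalar; the second avoids materializing the widened vector when only one lane is needed.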
* In the function InstCombiner::visitExtractElementInst() removed the limitation that extract is promoted over a cast only if the cast has only one use. | Anat Shemer | 2013-04-18 | 1 | -4/+4
  llvm-svn: 179786