summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/LoopVectorize/reduction.ll
Commit message (Collapse)AuthorAgeFilesLines
* [LV] Strip wrap flags from vectorized reductionsAyal Zaks2019-12-201-1/+1
| | | | | | | | | | | | A sequence of additions or multiplications that is known not to wrap, may wrap if it's order is changed (i.e., reassociated). Therefore when vectorizing integer sum or product reductions, their no-wrap flags need to be removed. Fixes PR43828 Patch by Denis Antrushin Differential Revision: https://reviews.llvm.org/D69563
* Revert "Temporarily Revert "Add basic loop fusion pass.""Eric Christopher2019-04-171-0/+580
| | | | | | | | The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552
* Temporarily Revert "Add basic loop fusion pass."Eric Christopher2019-04-171-580/+0
| | | | | | | | As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546
* add a missed case for binary op FMF propagation under select foldsMichael Berg2018-08-161-1/+1
| | | | llvm-svn: 339938
* NFC: Add a test to LV showing that reduction is not possible when reduction ↵Anna Thomas2018-08-131-0/+39
| | | | | | | | | | | | var is reset in the loop Added a test case to reduction showing where it's illegal to identify vectorize a loop. Resetting the reduction var during loop iterations disallows us from widening the dependency cycle to VF, thereby making it illegal to vectorize the loop. llvm-svn: 339605
* [LV] Allow reductions that have several uses outside the loopMichael Kuperstein2017-01-181-0/+46
| | | | | | | | | | | We currently check whether a reduction has a single outside user. We don't really need to require that - we just need to make sure a single value is used externally. The number of external users of that value shouldn't actually matter. Differential Revision: https://reviews.llvm.org/D28830 llvm-svn: 292424
* [LV] Remove triples from target-independent vectorizer tests. NFC.Michael Kuperstein2016-10-061-1/+0
| | | | | | | | | Vectorizer tests in the target-independent directory should not have a target triple. If a test really needs to query a specific backend, it belongs in the right target subdirectory (which "REQUIRES" the right backend). Otherwise, it should not specify a triple. llvm-svn: 283512
* [LV] Don't bail to MiddleBlock if a runtime check fails, bail to ScalarPH ↵James Molloy2015-09-021-1/+1
| | | | | | | | | | instead We were bailing to two places if our runtime checks failed. If the initial overflow check failed, we'd go to ScalarPH. If any other check failed, we'd go to MiddleBlock. This caused us to have to have an extra PHI per induction and reduction as the vector loop's exit block was not dominated by its latch. There's no need to have this behavior - if we just always go to ScalarPH we can get rid of a bunch of complexity. llvm-svn: 246637
* [opaque pointer type] Add textual IR support for explicit type parameter to ↵David Blaikie2015-02-271-23/+23
| | | | | | | | | | | | | | | | | | | | | | | | load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794
* [opaque pointer type] Add textual IR support for explicit type parameter to ↵David Blaikie2015-02-271-23/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | getelementptr instruction One of several parallel first steps to remove the target type of pointers, replacing them with a single opaque pointer type. This adds an explicit type parameter to the gep instruction so that when the first parameter becomes an opaque pointer type, the type to gep through is still available to the instructions. * This doesn't modify gep operators, only instructions (operators will be handled separately) * Textual IR changes only. Bitcode (including upgrade) and changing the in-memory representation will be in separate changes. * geps of vectors are transformed as: getelementptr <4 x float*> %x, ... ->getelementptr float, <4 x float*> %x, ... Then, once the opaque pointer type is introduced, this will ultimately look like: getelementptr float, <4 x ptr> %x with the unambiguous interpretation that it is a vector of pointers to float. * address spaces remain on the pointer, not the type: getelementptr float addrspace(1)* %x ->getelementptr float, float addrspace(1)* %x Then, eventually: getelementptr float, ptr addrspace(1) %x Importantly, the massive amount of test case churn has been automated by same crappy python code. I had to manually update a few test cases that wouldn't fit the script's model (r228970,r229196,r229197,r229198). The python script just massages stdin and writes the result to stdout, I then wrapped that in a shell script to handle replacing files, then using the usual find+xargs to migrate all the files. update.py: import fileinput import sys import re ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))") normrep = re.compile( r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))") def conv(match, line): if not match: return line line = match.groups()[0] if len(match.groups()[5]) == 0: line += match.groups()[2] line += match.groups()[3] line += ", " line += match.groups()[1] line += "\n" return line for line in sys.stdin: if line.find("getelementptr ") == line.find("getelementptr inbounds"): if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("): line = conv(re.match(ibrep, line), line) elif line.find("getelementptr ") != line.find("getelementptr ("): line = conv(re.match(normrep, line), line) sys.stdout.write(line) apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done The actual commands: From llvm/src: find test/ -name *.ll | xargs ./apply.sh From llvm/src/tools/clang: find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name *.ll | xargs ./apply.sh After that, check-all (with llvm, clang, clang-tools-extra, lld, compiler-rt, and polly all checked out). The extra 'rm' in the apply.sh script is due to a few files in clang's test suite using interesting unicode stuff that my python script was throwing exceptions on. None of those files needed to be migrated, so it seemed sufficient to ignore those cases. Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7636 llvm-svn: 230786
* Rename getMaximumUnrollFactor -> getMaxInterleaveFactor; also rename option ↵Sanjay Patel2014-09-101-1/+1
| | | | | | | | | | | names controlling this variable. "Unroll" is not the appropriate name for this variable. Clang already uses the term "interleave" in pragmas and metadata for this. Differential Revision: http://reviews.llvm.org/D5066 llvm-svn: 217528
* LoopVectorize: External uses must use the last value in a reduction cycleArnold Schwaighofer2013-10-071-0/+27
| | | | | | | | | Otherwise, we don't perform operations that would have been performed on the scalar version. Fixes PR17498. llvm-svn: 192133
* Update Transforms tests to use CHECK-LABEL for easier debugging. No ↵Stephen Lin2013-07-141-11/+11
| | | | | | | | | | | | | | | | | | | | | | functionality change. This update was done with the following bash script: find test/Transforms -name "*.ll" | \ while read NAME; do echo "$NAME" if ! grep -q "^; *RUN: *llc" $NAME; then TEMP=`mktemp -t temp` cp $NAME $TEMP sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \ while read FUNC; do sed -i '' "s/;\(.*\)\([A-Za-z0-9_]*\):\( *\)@$FUNC\([( ]*\)\$/;\1\2-LABEL:\3@$FUNC(/g" $TEMP done mv $TEMP $NAME fi done llvm-svn: 186268
* LoopVectorizer: Disallow reductions whose header phi is used outside the loopArnold Schwaighofer2013-07-131-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If an outside loop user of the reduction value uses the header phi node we cannot just reduce the vectorized phi value in the vector code epilog because we would loose VF-1 reductions. lp: p = phi (0, lv) lv = lv + 1 ... brcond , lp, outside outside: usr = add 0, p (Say the loop iterates two times, the value of p coming out of the loop is one). We cannot just transform this to: vlp: p = phi (<0,0>, lv) lv = lv + <1,1> .. brcond , lp, outside outside: p_reduced = p[0] + [1]; usr = add 0, p_reduced (Because the original loop iterated two times the vectorized loop would iterate one time, but p_reduced ends up being zero instead of one). We would have to execute VF-1 iterations in the scalar remainder loop in such cases. For now, just disable vectorization. PR16522 llvm-svn: 186256
* LoopVectorizer: Improve reduction variable identificationArnold Schwaighofer2013-05-071-0/+119
| | | | | | | The two nested loops were confusing and also conservative in identifying reduction variables. This patch replaces them by a worklist based approach. llvm-svn: 181369
* Remove the -licm pass from the loop vectorizer test because the loop ↵Nadav Rotem2013-01-091-1/+1
| | | | | | vectorizer does it now. llvm-svn: 171930
* Fix a typo. Remove the duplicated test.Nadav Rotem2013-01-051-25/+0
| | | | llvm-svn: 171584
* iLoopVectorize: Non commutative operators can be used as reduction variables ↵Nadav Rotem2013-01-051-2/+30
| | | | | | | | as long as the reduction chain is used in the LHS. PR14803. llvm-svn: 171583
* Force a fixed unroll count on the target independent tests.Nadav Rotem2013-01-051-1/+1
| | | | | | This should fix clang-native-arm-cortex-a9. Thanks Renato. llvm-svn: 171582
* Do not vectorize loops with subtraction reductionsPaul Redmond2013-01-041-0/+50
| | | | | | | | | Since subtraction does not commute the loop vectorizer incorrectly vectorizes reductions such as x = A[i] - x. Disabling for now. llvm-svn: 171537
* LoopVectorize: Emit reductions as log2(vectorsize) shuffles + vector ops ↵Benjamin Kramer2012-12-181-0/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | instead of scalar operations. For example on x86 with SSE4.2 a <8 x i8> add reduction becomes movdqa %xmm0, %xmm1 movhlps %xmm1, %xmm1 ## xmm1 = xmm1[1,1] paddw %xmm0, %xmm1 pshufd $1, %xmm1, %xmm0 ## xmm0 = xmm1[1,0,0,0] paddw %xmm1, %xmm0 phaddw %xmm0, %xmm0 pextrb $0, %xmm0, %edx instead of pextrb $2, %xmm0, %esi pextrb $0, %xmm0, %edx addb %sil, %dl pextrb $4, %xmm0, %esi addb %dl, %sil pextrb $6, %xmm0, %edx addb %sil, %dl pextrb $8, %xmm0, %esi addb %dl, %sil pextrb $10, %xmm0, %edi pextrb $14, %xmm0, %edx addb %sil, %dil pextrb $12, %xmm0, %esi addb %dil, %sil addb %sil, %dl llvm-svn: 170439
* LoopVectorize: Preserve NSW, NUW and IsExact flags.Nadav Rotem2012-10-311-1/+1
| | | | llvm-svn: 167174
* LoopVectorize: Fix a bug in the initialization of reduction variables. AND ↵Nadav Rotem2012-10-301-0/+1
| | | | | | | | needs to start at all-one while XOR, and OR need to start at zero. llvm-svn: 167032
* Add support for additional reduction variables: AND, OR, XOR.Nadav Rotem2012-10-251-0/+79
| | | | | | Patch by Paul Redmond <paul.redmond@intel.com>. llvm-svn: 166649
* LoopVectorizer: Add a basic cost model which uses the VTTI interface.Nadav Rotem2012-10-241-1/+1
| | | | llvm-svn: 166620
* Add support for reduction variables that do not start at zero.Nadav Rotem2012-10-211-4/+34
| | | | | | | | | | | | This is important for nested-loop reductions such as : In the innermost loop, the induction variable does not start with zero: for (i = 0 .. n) for (j = 0 .. m) sum += ... llvm-svn: 166387
* Vectorizer: Add support for loop reductions.Nadav Rotem2012-10-191-0/+122
For example: for (i=0; i<n; i++) sum += A[i] + B[i] + i; llvm-svn: 166351
OpenPOWER on IntegriCloud