summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/LoopVectorize/gcc-examples.ll
Commit message (Collapse)AuthorAgeFilesLines
* Revert "Temporarily Revert "Add basic loop fusion pass.""Eric Christopher2019-04-171-0/+685
| | | | | | | | The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552
* Temporarily Revert "Add basic loop fusion pass."Eric Christopher2019-04-171-685/+0
| | | | | | | | As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546
* [LV] Remove triples from target-independent vectorizer tests. NFC.Michael Kuperstein2016-10-061-1/+0
| | | | | | | | | Vectorizer tests in the target-independent directory should not have a target triple. If a test really needs to query a specific backend, it belongs in the right target subdirectory (which "REQUIRES" the right backend). Otherwise, it should not specify a triple. llvm-svn: 283512
* [LV] For some IVs, use vector phis instead of widening in the loop bodyMichael Kuperstein2016-06-011-1/+1
| | | | | | | | | | | | | Previously, whenever we needed a vector IV, we would create it on the fly, by splatting the scalar IV and adding a step vector. Instead, we can create a real vector IV. This tends to save a couple of instructions per iteration. This only changes the behavior for the most basic case - integer primary IVs with a constant step. Differential Revision: http://reviews.llvm.org/D20315 llvm-svn: 271410
* [opaque pointer type] Add textual IR support for explicit type parameter to ↵David Blaikie2015-02-271-48/+48
| | | | | | | | | | | | | | | | | | | | | | | | load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794
* [opaque pointer type] Add textual IR support for explicit type parameter to ↵David Blaikie2015-02-271-69/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | getelementptr instruction One of several parallel first steps to remove the target type of pointers, replacing them with a single opaque pointer type. This adds an explicit type parameter to the gep instruction so that when the first parameter becomes an opaque pointer type, the type to gep through is still available to the instructions. * This doesn't modify gep operators, only instructions (operators will be handled separately) * Textual IR changes only. Bitcode (including upgrade) and changing the in-memory representation will be in separate changes. * geps of vectors are transformed as: getelementptr <4 x float*> %x, ... ->getelementptr float, <4 x float*> %x, ... Then, once the opaque pointer type is introduced, this will ultimately look like: getelementptr float, <4 x ptr> %x with the unambiguous interpretation that it is a vector of pointers to float. * address spaces remain on the pointer, not the type: getelementptr float addrspace(1)* %x ->getelementptr float, float addrspace(1)* %x Then, eventually: getelementptr float, ptr addrspace(1) %x Importantly, the massive amount of test case churn has been automated by same crappy python code. I had to manually update a few test cases that wouldn't fit the script's model (r228970,r229196,r229197,r229198). The python script just massages stdin and writes the result to stdout, I then wrapped that in a shell script to handle replacing files, then using the usual find+xargs to migrate all the files. update.py: import fileinput import sys import re ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))") normrep = re.compile( r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))") def conv(match, line): if not match: return line line = match.groups()[0] if len(match.groups()[5]) == 0: line += match.groups()[2] line += match.groups()[3] line += ", " line += match.groups()[1] line += "\n" return line for line in sys.stdin: if line.find("getelementptr ") == line.find("getelementptr inbounds"): if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("): line = conv(re.match(ibrep, line), line) elif line.find("getelementptr ") != line.find("getelementptr ("): line = conv(re.match(normrep, line), line) sys.stdout.write(line) apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done The actual commands: From llvm/src: find test/ -name *.ll | xargs ./apply.sh From llvm/src/tools/clang: find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name *.ll | xargs ./apply.sh After that, check-all (with llvm, clang, clang-tools-extra, lld, compiler-rt, and polly all checked out). The extra 'rm' in the apply.sh script is due to a few files in clang's test suite using interesting unicode stuff that my python script was throwing exceptions on. None of those files needed to be migrated, so it seemed sufficient to ignore those cases. Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7636 llvm-svn: 230786
* [LoopVectorize] Induction variables: support arbitrary constant step.Hao Liu2015-01-301-2/+1
| | | | | | | | | | Previously, only -1 and +1 step values are supported for induction variables. This patch extends LV to support arbitrary constant steps. Initial patch by Alexey Volkov. Some bug fixes are added in the following version. Differential Revision: http://reviews.llvm.org/D6051 and http://reviews.llvm.org/D7193 llvm-svn: 227557
* Rename getMaximumUnrollFactor -> getMaxInterleaveFactor; also rename option ↵Sanjay Patel2014-09-101-2/+2
| | | | | | | | | | | names controlling this variable. "Unroll" is not the appropriate name for this variable. Clang already uses the term "interleave" in pragmas and metadata for this. Differential Revision: http://reviews.llvm.org/D5066 llvm-svn: 217528
* [LoopVectorize] Use AA to partition potential dependency checksHal Finkel2014-07-201-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prior to this change, the loop vectorizer did not make use of the alias analysis infrastructure. Instead, it performed memory dependence analysis using ScalarEvolution-based linear dependence checks within equivalence classes derived from the results of ValueTracking's GetUnderlyingObjects. Unfortunately, this meant that: 1. The loop vectorizer had logic that essentially duplicated that in BasicAA for aliasing based on identified objects. 2. The loop vectorizer could not partition the space of dependency checks based on information only easily available from within AA (TBAA metadata is currently the prime example). This means, for example, regardless of whether -fno-strict-aliasing was provided, the vectorizer would only vectorize this loop with a runtime memory-overlap check: void foo(int *a, float *b) { for (int i = 0; i < 1600; ++i) a[i] = b[i]; } This is suboptimal because the TBAA metadata already provides the information necessary to show that this check unnecessary. Of course, the vectorizer has a limit on the number of such checks it will insert, so in practice, ignoring TBAA means not vectorizing more-complicated loops that we should. This change causes the vectorizer to use an AliasSetTracker to keep track of the pointers in the loop. The resulting alias sets are then used to partition the space of dependency checks, and potential runtime checks; this results in more-efficient vectorizations. When pointer locations are added to the AliasSetTracker, two things are done: 1. The location size is set to UnknownSize (otherwise you'd not catch inter-iteration dependencies) 2. For instructions in blocks that would need to be predicated, TBAA is removed (because the metadata might have a control dependency on the condition being speculated). For non-predicated blocks, you can leave the TBAA metadata. This is safe because you can't have an iteration dependency on the TBAA metadata (if you did, and you unrolled sufficiently, you'd end up with the same pointer value used by two accesses that TBAA says should not alias, and that would yield undefined behavior). llvm-svn: 213486
* Update Transforms tests to use CHECK-LABEL for easier debugging. No ↵Stephen Lin2013-07-141-21/+21
| | | | | | | | | | | | | | | | | | | | | | functionality change. This update was done with the following bash script: find test/Transforms -name "*.ll" | \ while read NAME; do echo "$NAME" if ! grep -q "^; *RUN: *llc" $NAME; then TEMP=`mktemp -t temp` cp $NAME $TEMP sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \ while read FUNC; do sed -i '' "s/;\(.*\)\([A-Za-z0-9_]*\):\( *\)@$FUNC\([( ]*\)\$/;\1\2-LABEL:\3@$FUNC(/g" $TEMP done mv $TEMP $NAME fi done llvm-svn: 186268
* Remove the -licm pass from the loop vectorizer test because the loop ↵Nadav Rotem2013-01-091-2/+2
| | | | | | vectorizer does it now. llvm-svn: 171930
* iLoopVectorize: Non commutative operators can be used as reduction variables ↵Nadav Rotem2013-01-051-1/+1
| | | | | | | | as long as the reduction chain is used in the LHS. PR14803. llvm-svn: 171583
* Force a fixed unroll count on the target independent tests.Nadav Rotem2013-01-051-1/+1
| | | | | | This should fix clang-native-arm-cortex-a9. Thanks Renato. llvm-svn: 171582
* Do not vectorize loops with subtraction reductionsPaul Redmond2013-01-041-1/+1
| | | | | | | | | Since subtraction does not commute the loop vectorizer incorrectly vectorizes reductions such as x = A[i] - x. Disabling for now. llvm-svn: 171537
* LoopVectorizer:Nadav Rotem2013-01-041-0/+39
| | | | | | | | 1. Add code to estimate register pressure. 2. Add code to select the unroll factor based on register pressure. 3. Add bits to TargetTransformInfo to provide the number of registers. llvm-svn: 171469
* LoopVectorizer: Optimize the vectorization of consecutive memory access when ↵Nadav Rotem2012-12-261-1/+2
| | | | | | the iteration step is -1 llvm-svn: 171114
* Loop Vectorize: optimize the vectorization of trunc(induction_var). The ↵Nadav Rotem2012-12-111-1/+1
| | | | | | truncation is now done on scalars. llvm-svn: 169904
* Add support for reverse induction variables. For example:Nadav Rotem2012-12-101-4/+2
| | | | | | | while (i--) sum+=A[i]; llvm-svn: 169752
* LoopVectorizer: Add initial support for pointer induction variables (for ↵Nadav Rotem2012-11-171-2/+1
| | | | | | | | example: *dst++ = *src++). At the moment we still require to have an integer induction variable (for example: i++). llvm-svn: 168231
* Relax the restrictions on vector of pointer types, and vector getelementptr.Duncan Sands2012-11-131-2/+2
| | | | | | | | | | | | | | | Previously in a vector of pointers, the pointer couldn't be any pointer type, it had to be a pointer to an integer or floating point type. This is a hassle for dragonegg because the GCC vectorizer happily produces vectors of pointers where the pointer is a pointer to a struct or whatever. Vector getelementptr was restricted to just one index, but now that vectors of pointers can have any pointer type it is more natural to allow arbitrary vector getelementptrs. There is however the issue of struct GEPs, where if each lane chose different struct fields then from that point on each lane will be working down into unrelated types. This seems like too much pain for too little gain, so when you have a vector struct index all the elements are required to be the same. llvm-svn: 167828
* LoopVectorize: Preserve NSW, NUW and IsExact flags.Nadav Rotem2012-10-311-1/+3
| | | | llvm-svn: 167174
* LoopVectorizer: Add a basic cost model which uses the VTTI interface.Nadav Rotem2012-10-241-1/+1
| | | | llvm-svn: 166620
* Vectorizer: Add support for loop reductions.Nadav Rotem2012-10-191-2/+1
| | | | | | | | | For example: for (i=0; i<n; i++) sum += A[i] + B[i] + i; llvm-svn: 166351
* Vectorizer: Add support for loops with an unknown count. For example:Nadav Rotem2012-10-181-4/+2
| | | | | | | | for (i=0; i<n; i++){ a[i] = b[i+1] + c[i+3]; } llvm-svn: 166165
* Add a loop vectorizer.Nadav Rotem2012-10-171-0/+651
llvm-svn: 166112
OpenPOWER on IntegriCloud