summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AArch64/load-combine.ll
Commit message (Collapse)AuthorAgeFilesLines
* Reland "[DAGCombiner] Allow zextended load combines."Clement Courbet2019-11-221-8/+4
| | | | Check that the generated type is simple.
* Revert "[DAGCombiner] Allow zextended load combines."Clement Courbet2019-11-221-4/+8
| | | | Breaks some bots.
* [DAGCombiner] Allow zextended load combines.Clement Courbet2019-11-221-8/+4
| | | | | | | | | | | | Summary: or(zext(load8(base)), zext(load8(base+1)) -> zext(load16 base) Reviewers: apilipenko, RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70487
* [CodeGen][NFC] Regenerate load-combine test with update_llc_test.Clement Courbet2019-11-201-63/+83
| | | | To prepare for D27861.
* Fix a reoccuring typo in load-combine testsArtur Pilipenko2018-03-271-2/+2
| | | | | | | | | | | %tmp = bitcast i32* %arg to i8* %tmp1 = getelementptr inbounds i8, i8* %tmp, i32 0 - %tmp2 = load i8, i8* %tmp, align 1 + %tmp2 = load i8, i8* %tmp1, align 1 This doesn't change the semantics of the tests but makes use of %tmp1 which was originally intended. llvm-svn: 328642
* [DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combineArtur Pilipenko2017-03-011-6/+1
| | | | | | | | | | | | Resubmit r295336 after the bug with non-zero offset patterns on BE targets is fixed (r296336). Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patters. Reviewed By: filcab Differential Revision: https://reviews.llvm.org/D29591 llvm-svn: 296651
* [DAGCombiner] revert r295336Bill Seurer2017-02-221-1/+6
| | | | | | | | | | | r295336 causes a bootstrapped clang to fail for many compilations on powerpc BE. See http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/2315 for example. Reverting as per the developer's request. llvm-svn: 295849
* [DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combineArtur Pilipenko2017-02-161-6/+1
| | | | | | | | | | | | Resubmit -r295314 with PowerPC and AMDGPU tests updated. Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patters. Reviewed By: filcab Differential Revision: https://reviews.llvm.org/D29591 llvm-svn: 295336
* Rever -r295314 "[DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in ↵Artur Pilipenko2017-02-161-1/+6
| | | | | | | | load combine" This change causes some of AMDGPU and PowerPC tests to fail. llvm-svn: 295316
* [DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combineArtur Pilipenko2017-02-161-6/+1
| | | | | | | | | | Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patters. Reviewed By: filcab Differential Revision: https://reviews.llvm.org/D29591 llvm-svn: 295314
* Add DAGCombiner load combine tests for partially available valuesArtur Pilipenko2017-02-091-1/+136
| | | | | | If some of the trailing or leading bytes of a load combine pattern are zeroes we can combine the pattern to a load + zext and shift. Currently we don't support it, so the tests check the current codegen without load combine. This change will make the patch to support this kind of combine a bit more clear. llvm-svn: 294591
* [DAGCombiner] Support non-zero offset in load combineArtur Pilipenko2017-02-091-39/+11
| | | | | | | | | | | | | | | Enable folding patterns which load the value from non-zero offset: i8 *a = ... i32 val = a[4] | (a[5] << 8) | (a[6] << 16) | (a[7] << 24) => i32 val = *((i32*)(a+4)) Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D29394 llvm-svn: 294582
* Add DAGCombiner load combine tests for {a|s}ext, {a|z|s}ext load nodesArtur Pilipenko2017-02-071-0/+103
| | | | | | | | Currently we don't support these nodes, so the tests check the current codegen without load combine. This change makes the review of the change to support these nodes more clear. Separated from https://reviews.llvm.org/D29591 review. llvm-svn: 294305
* [DAGCombiner] Support bswap as a part of load combine patternsArtur Pilipenko2017-02-061-0/+23
| | | | | | | | Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D29397 llvm-svn: 294201
* Add DAGCombiner load combine tests with non-zero offsetArtur Pilipenko2017-02-061-0/+140
| | | | | | This is separated from https://reviews.llvm.org/D29394 review. llvm-svn: 294185
* [DAGCombiner] Match load by bytes idiom and fold it into a single load. ↵Artur Pilipenko2017-01-251-0/+180
Attempt #2. The previous patch (https://reviews.llvm.org/rL289538) got reverted because of a bug. Chandler also requested some changes to the algorithm. http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20161212/413479.html This is an updated patch. The key difference is that collectBitProviders (renamed to calculateByteProvider) now collects the origin of one byte, not the whole value. It simplifies the implementation and allows to stop the traversal earlier if we know that the result won't be used. From the original commit: Match a pattern where a wide type scalar value is loaded by several narrow loads and combined by shifts and ors. Fold it into a single load or a load and a bswap if the targets supports it. Assuming little endian target: i8 *a = ... i32 val = a[0] | (a[1] << 8) | (a[2] << 16) | (a[3] << 24) => i32 val = *((i32)a) i8 *a = ... i32 val = (a[0] << 24) | (a[1] << 16) | (a[2] << 8) | a[3] => i32 val = BSWAP(*((i32)a)) This optimization was discussed on llvm-dev some time ago in "Load combine pass" thread. We came to the conclusion that we want to do this transformation late in the pipeline because in presence of atomic loads load widening is irreversible transformation and it might hinder other optimizations. Eventually we'd like to support folding patterns like this where the offset has a variable and a constant part: i32 val = a[i] | (a[i + 1] << 8) | (a[i + 2] << 16) | (a[i + 3] << 24) Matching the pattern above is easier at SelectionDAG level since address reassociation has already happened and the fact that the loads are adjacent is clear. Understanding that these loads are adjacent at IR level would have involved looking through geps/zexts/adds while looking at the addresses. The general scheme is to match OR expressions by recursively calculating the origin of individual bytes which constitute the resulting OR value. If all the OR bytes come from memory verify that they are adjacent and match with little or big endian encoding of a wider value. If so and the load of the wider type (and bswap if needed) is allowed by the target generate a load and a bswap if needed. Reviewed By: RKSimon, filcab, chandlerc Differential Revision: https://reviews.llvm.org/D27861 llvm-svn: 293036
OpenPOWER on IntegriCloud