summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/InterleavedAccess/ARM/interleaved-accesses.ll
Commit message (Collapse)AuthorAgeFilesLines
* [ARM] Disable VLD4 under MVEDavid Green2019-12-081-64/+21
| | | | | | | | | | Alas, using half the available vector registers in a single instruction is just too much for the register allocator to handle. The mve-vldst4.ll test here fails when these instructions are enabled at present. This patch disables the generation of VLD4 and VST4 by adding a mve-max-interleave-factor option, which we currently default to 2. Differential Revision: https://reviews.llvm.org/D71109
* [ARM] MVE interleaving load and stores.David Green2019-11-191-170/+258
| | | | | | | | | | | | | | | | | | | | Now that we have the intrinsics, we can add VLD2/4 and VST2/4 lowering for MVE. This works the same way as Neon, recognising the load/shuffles combination and converting them into intrinsics in a pre-isel pass, which just calls getMaxSupportedInterleaveFactor, lowerInterleavedLoad and lowerInterleavedStore. The main difference to Neon is that we do not have a VLD3 instruction. Otherwise most of the code works very similarly, with just some minor differences in the form of the intrinsics to work around. VLD3 is disabled by making isLegalInterleavedAccessType return false for those cases. We may need some other future adjustments, such as VLD4 take up half the available registers so should maybe cost more. This patch should get the basics in though. Differential Revision: https://reviews.llvm.org/D69392
* [ARM] Add and update a lot of VLDn tests. NFCDavid Green2019-11-191-509/+998
|
* Revert "Temporarily Revert "Add basic loop fusion pass.""Eric Christopher2019-04-171-0/+898
| | | | | | | | The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552
* Temporarily Revert "Add basic loop fusion pass."Eric Christopher2019-04-171-898/+0
| | | | | | | | As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546
* [InterleavedAccessPass] Don't increase the number of bytes loaded.Eli Friedman2019-03-281-6/+19
| | | | | | | | | | | | | | | | | Even if the interleaving transform would otherwise be legal, we shouldn't introduce an interleaved load that is wider than the original load: it might have undefined behavior. It might be possible to perform some sort of mask-narrowing transform in some cases (using a narrower interleaved load, then extending the results using shufflevectors). But I haven't tried to implement that, at least for now. Fixes https://bugs.llvm.org/show_bug.cgi?id=41245 . Differential Revision: https://reviews.llvm.org/D59954 llvm-svn: 357212
* [ARM] Implement interleaved access bug fix from r306334Matthew Simpson2017-07-071-0/+29
| | | | | | | r306334 fixed a bug in AArch64 dealing with wide interleaved accesses having pointer types. The bug also exists in ARM, so this patch copies over the fix. llvm-svn: 307409
* [ARM/AArch64] Ensure valid vector element types for interleaved accessesMatthew Simpson2017-04-101-0/+11
| | | | | | | | | | | This patch refactors and strengthens the type checks performed for interleaved accesses. The primary functional change is to ensure that the interleaved accesses have valid element types. The added test cases previously failed because the element type is f128. Differential Revision: https://reviews.llvm.org/D31817 llvm-svn: 299864
* [ARM/AArch64] Support wide interleaved accessesMatthew Simpson2017-03-021-0/+197
| | | | | | | | | | | | | This patch teaches (ARM|AArch64)ISelLowering.cpp to match illegal vector types to interleaved access intrinsics as long as the types are multiples of the vector register width. A "wide" access will now be mapped to multiple interleave intrinsics similar to the way in which non-interleaved accesses with illegal types are legalized into multiple accesses. I'll update the associated TTI costs (in getInterleavedMemoryOpCost) as a follow-on. Differential Revision: https://reviews.llvm.org/D29466 llvm-svn: 296750
* [ARM] Don't lower f16 interleaved accesses.Ahmed Bougacha2017-02-111-0/+21
| | | | | | | | | | | | There are no vldN/vstN f16 variants, even with +fullfp16. We could use the i16 variants, but, in practice, even with +fullfp16, the f16 sequence leading to the i16 shuffle usually gets scalarized. We'd need to improve our support for f16 codegen before getting there. Reject f16 interleaved accesses. If we try to emit the f16 intrinsics, we'll just end up with a selection failure. llvm-svn: 294818
* [ARM] Unique some redundant CHECK lines. NFC.Ahmed Bougacha2017-02-111-41/+22
| | | | llvm-svn: 294817
* InterleaveAccessPass: Avoid constructing invalid shuffle masksMatthias Braun2017-01-311-0/+18
| | | | | | | | | Fix a bug where we would construct shufflevector instructions addressing invalid elements. Differential Revision: https://reviews.llvm.org/D29313 llvm-svn: 293673
* [ARM/AArch64] Relocate and update InterleavedAccessPass tests (NFC)Matthew Simpson2017-01-271-0/+628
The interleaved access pass is an IR-to-IR transformation that runs before code generation. It matches interleaved memory operations to target-specific intrinsics (that are later lowered to load and store multiple instructions on ARM/AArch64). We place tests for similar passes (e.g., GlobalMergePass) under test/Transforms. This patch moves the InterleavedAccessPass tests out of test/CodeGen and into target-specific directories under test/Transforms/InterleavedAccess. Although the pass is an IR pass, many of the existing tests were llc tests rather opt tests. For example, the tests would check for ldN/stN instructions generated by llc rather than the intrinsic calls the pass actually inserts. Thus, this patch updates all tests to be opt tests that check for the inserted intrinsics. We already have separate CodeGen tests that ensure we lower the interleaved access intrinsics to their corresponding ldN/stN instructions. In addition to migrating the tests to opt, this patch also performs some minor clean-up (to ensure consistent naming, etc.). Differential Revision: https://reviews.llvm.org/D29184 llvm-svn: 293309
OpenPOWER on IntegriCloud