summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/ARM/ARMParallelDSP.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Revert r344693 ("[ARM] bottom-top mul support in ARMParallelDSP")Eli Friedman2018-10-181-194/+27
| | | | | | | Still causing failures on the polly-aosp buildbot; I'll follow up with a reduced testcase. llvm-svn: 344752
* [ARM] bottom-top mul support in ARMParallelDSPSam Parker2018-10-171-27/+194
| | | | | | | | | | | | | | Previously reverted in rL343082. Original commit message: On failing to find sequences that can be converted into dual macs, try to find sequential 16-bit loads that are used by muls which we can then use smultb, smulbt, smultt with a wide load. Differential Revision: https://reviews.llvm.org/D51983 llvm-svn: 344693
* Replace most users of UnknownSize with LocationSize::unknown(); NFCGeorge Burgess IV2018-10-101-1/+1
| | | | | | | | | | | | Moving away from UnknownSize is part of the effort to migrate us to LocationSizes (e.g. the cleanup promised in D44748). This doesn't entirely remove all of the uses of UnknownSize; some uses require tweaks to assume that UnknownSize isn't just some kind of int. This patch is intended to just be a trivial replacement for all places where LocationSize::unknown() will Just Work. llvm-svn: 344186
* Revert r342870 "[ARM] bottom-top mul support ARMParallelDSP"Hans Wennborg2018-09-261-154/+27
| | | | | | | | | | | | | | | | | | | | This broke Chromium's Android build (https://crbug.com/889390) and the polly-aosp buildbot (http://lab.llvm.org:8011/builders/aosp-O3-polly-before-vectorizer-unprofitable). > Originally committed in rL342210 but was reverted in rL342260 because > it was causing issues in vectorized code, because I had forgotten to > ensure that we're operating on scalar values. > > Original commit message: > > On failing to find sequences that can be converted into dual macs, > try to find sequential 16-bit loads that are used by muls which we > can then use smultb, smulbt, smultt with a wide load. > > Differential Revision: https://reviews.llvm.org/D51983 llvm-svn: 343082
* [ARM] bottom-top mul support ARMParallelDSPSam Parker2018-09-241-27/+154
| | | | | | | | | | | | | | | | Originally committed in rL342210 but was reverted in rL342260 because it was causing issues in vectorized code, because I had forgotten to ensure that we're operating on scalar values. Original commit message: On failing to find sequences that can be converted into dual macs, try to find sequential 16-bit loads that are used by muls which we can then use smultb, smulbt, smultt with a wide load. Differential Revision: https://reviews.llvm.org/D51983 llvm-svn: 342870
* Revert r342210 "[ARM] bottom-top mul support in ARMParallelDSP"Reid Kleckner2018-09-141-152/+27
| | | | | | | | | | It causes assertion failures while building Skia for Android in Chromium: https://ci.chromium.org/buildbot/chromium.clang/ToTAndroid/4550 Reduction forthcoming. llvm-svn: 342260
* [ARM] bottom-top mul support in ARMParallelDSPSam Parker2018-09-141-27/+152
| | | | | | | | | | On failing to find sequences that can be converted into dual macs, try to find sequential 16-bit loads that are used by muls which we can then use smultb, smulbt, smultt with a wide load. Differential Revision: https://reviews.llvm.org/D51983 llvm-svn: 342210
* [ARM] Follow-up to rL342033Sam Parker2018-09-121-1/+1
| | | | | | Fixed typo which can cause segfault. llvm-svn: 342040
* [ARM] Exchange MAC operands in ARMParallelDSPSam Parker2018-09-121-115/+154
| | | | | | | | | | | | | | | | SMLAD and SMLALD instructions also come in the form of SMLADX and SMLALDX which perform an exchange on their second operand. To support this, more of the loads in the MAC candidates are compared for sequential access and a boolean value has been added to BinOpChain. AddMACCandiate has been refactored into a small pattern matching state machine to reduce the amount of duplicated code, but also to enable the matching to be more flexible. CreateParallelMACPairs now iterates through all the candidates to find parallel ones. Differential Revision: https://reviews.llvm.org/D51424 llvm-svn: 342033
* [ARM] Add smlald support in ARMParallelDSPSam Parker2018-09-111-13/+41
| | | | | | | | | Search from i64 reducing phis, as well as i32, to allow the generation of smlald instructions. Differential Revision: https://reviews.llvm.org/D51101 llvm-svn: 341941
* [ARM] ParallelDSP: add option to enable/disable the passSjoerd Meijer2018-08-141-0/+6
| | | | | | Differential Revision: https://reviews.llvm.org/D50511 llvm-svn: 339645
* [ARM] Use unique_ptr to fix memory leak introduced in r337701Fangrui Song2018-07-231-11/+9
| | | | llvm-svn: 337714
* OpChain has subclasses, so add a virtual destructor.Jordan Rupprecht2018-07-231-0/+1
| | | | | | | | | | | | | | | Summary: OpChain has subclasses, so add a virtual destructor. This fixes an issue when deleting subclasses of OpChain (see MatchSMLAD() specifically) in r337701. Reviewers: javed.absar Subscribers: llvm-commits, SjoerdMeijer, samparker Differential Revision: https://reviews.llvm.org/D49681 llvm-svn: 337713
* [ARM][NFC] ParallelDSP reorganisationSam Parker2018-07-231-88/+103
| | | | | | | | | | | | | | | | | In preparing to allow ARMParallelDSP pass to parallelise more than smlads, I've restructed some elements: - The ParallelMAC struct has been renamed to BinOpChain. - The BinOpChain struct holds two value lists: LHS and RHS, as well as inheriting from the OpChain base class. - The OpChain struct holds all the values of the represented chain and has had the memory locations functionality inserted into it. - ParallelMACList becomes OpChainList and it now holds pointers instead of objects. Differential Revision: https://reviews.llvm.org/D49020 llvm-svn: 337701
* [ARM] ParallelDSP: multiple reduction stmts in loopSjoerd Meijer2018-07-111-40/+75
| | | | | | | | | | This fixes an issue that we were not properly supporting multiple reduction stmts in a loop, and not generating SMLADs for these cases. The alias analysis checks were done too early, making it too conservative. Differential revision: https://reviews.llvm.org/D49125 llvm-svn: 336795
* [ARM] ParallelDSP: added statistics, NFC.Sjoerd Meijer2018-07-061-3/+7
| | | | | | | | | Added statistics for the number of SMLAD instructions created, and als renamed the pass name to -arm-parallel-dsp. Differential Revision: https://reviews.llvm.org/D48971 llvm-svn: 336441
* [ARM] ParallelDSP: only support i16 loads for nowSjoerd Meijer2018-07-051-28/+25
| | | | | | | | | We were miscompiling i8 loads, so reject them as unsupported narrow operations for now. Differential Revision: https://reviews.llvm.org/D48944 llvm-svn: 336319
* [ARM] Fix inconsistent declaration parameter name in r336195Fangrui Song2018-07-031-1/+1
| | | | llvm-svn: 336223
* [ARM][NFC] Refactor sequential access for DSPSam Parker2018-07-031-18/+27
| | | | | | | | | | With a view to support parallel operations that have their results stored to memory, refactor the consecutive access helper out so it could support stores instructions. Differential Revision: https://reviews.llvm.org/D48872 llvm-svn: 336195
* Remove unnecessary semicolon. NFCI.Simon Pilgrim2018-06-281-2/+2
| | | | | | Fixes -Wpedantic warning. llvm-svn: 335901
* [ARM] Parallel DSP PassSjoerd Meijer2018-06-281-0/+613
Armv6 introduced instructions to perform 32-bit SIMD operations. The purpose of this pass is to do some straightforward IR pattern matching to create ACLE DSP intrinsics, which map on these 32-bit SIMD operations. Currently, only the SMLAD instruction gets recognised. This instruction performs two multiplications with 16-bit operands, and stores the result in an accumulator. We will follow this up with patches to recognise SMLAD in more cases, and also to generate other DSP instructions (like e.g. SADD16). Patch by: Sam Parker and Sjoerd Meijer Differential Revision: https://reviews.llvm.org/D48128 llvm-svn: 335850
OpenPOWER on IntegriCloud