summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/ARM/ParallelDSP/unroll-n-jam-smlad.ll
Commit message (Collapse)AuthorAgeFilesLines
* [ARM][ParallelDSP] Change smlad insertion orderSam Parker2019-10-161-3/+3
| | | | | | | | | | Instead of inserting everything after the 'root' of the reduction, insert all instructions as close to their operands as possible. This can help reduce register pressure. Differential Revision: https://reviews.llvm.org/D67392 llvm-svn: 374981
* [TableGen] Fix a bug that MCSchedClassDesc is interfered between different ↵QingShan Zhang2019-10-111-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SchedModel Assume that, ModelA has scheduling resource for InstA and ModelB has scheduling resource for InstB. This is what the llvm::MCSchedClassDesc looks like: llvm::MCSchedClassDesc ModelASchedClasses[] = { ... InstA, 0, ... InstB, -1,... }; llvm::MCSchedClassDesc ModelBSchedClasses[] = { ... InstA, -1,... InstB, 0,... }; The -1 means invalid num of macro ops, while it is valid if it is >=0. This is what we look like now: llvm::MCSchedClassDesc ModelASchedClasses[] = { ... InstA, 0, ... InstB, 0,... }; llvm::MCSchedClassDesc ModelBSchedClasses[] = { ... InstA, 0,... InstB, 0,... }; And compiler hit the assertion here because the SCDesc is valid now for both InstA and InstB. Differential Revision: https://reviews.llvm.org/D67950 llvm-svn: 374524
* [ARM] Cortex-M4 schedule additionsDavid Green2019-09-291-1/+0
| | | | | | | | | | | | | | | | | | | This is an attempt to fill in some of the missing instructions from the Cortex-M4 schedule, and make it easier to do the same for other ARM cpus. - Some instructions are marked as hasNoSchedulingInfo as they are pseudos or otherwise do not require scheduling info - A lot of features have been marked not supported - Some WriteRes's have been added for cvt instructions. - Some extra instruction latencies have been added, notably by relaxing the regex for dsp instruction to catch more cases, and some fp instructions. This goes a long way to get the CompleteModel working for this CPU. It does not go far enough as to get all scheduling info for all output operands correct. Differential Revision: https://reviews.llvm.org/D67957 llvm-svn: 373163
* [NFC][ARM] Add and modify testsSam Parker2019-09-111-2/+14
| | | | | | Add test for ParallelDSP. llvm-svn: 371594
* [ARM] Remove EarlyCSE from backendSam Parker2019-03-151-6/+2
| | | | | | | | | | | There is an issue with early CSE hitting an assert, so temporarily remove the pass from the Arm backend. Bug: https://bugs.llvm.org/show_bug.cgi?id=41081 Differential Revision: https://reviews.llvm.org/D59410 llvm-svn: 356259
* [ARM][ParallelDSP] Disable for big-endianSam Parker2019-03-151-0/+5
| | | | | | | | | Bail early when we don't have a preheader and also if the target is big endian because it's written with only little endian in mind! Differential Revision: https://reviews.llvm.org/D59368 llvm-svn: 356243
* [ARM][ParallelDSP] Enable multiple uses of loadsSam Parker2019-03-141-0/+217
When choosing whether a pair of loads can be combined into a single wide load, we check that the load only has a sext user and that sext also only has one user. But this can prevent the transformation in the cases when parallel macs use the same loaded data multiple times. To enable this, we need to fix up any other uses after creating the wide load: generating a trunc and a shift + trunc pair to recreate the narrow values. We also need to keep a record of which loads have already been widened. Differential Revision: https://reviews.llvm.org/D59215 llvm-svn: 356132
OpenPOWER on IntegriCloud