summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [AArch64] Fix clobber computation in A57LoadBalancing pass.Chad Rosier2014-11-241-1/+7
| | | | | | | Extremely difficult to reproduce, so no test case included. PR21637 llvm-svn: 222677
* Eliminate some deep std::vector copies. NFC.Benjamin Kramer2014-10-031-2/+2
| | | | llvm-svn: 218999
* [A57FPLoadBalancing] Modify r217689 - actually we do need to check defsJames Molloy2014-09-141-6/+6
| | | | | | | | ... Just make sure we check uses first so we see the kill first. It turns out ignoring defs gives some pretty nasty runtime failures. I'm certain this is the fix but I'm still reducing a testcase. llvm-svn: 217735
* [A57FPLoadBalancing] Remove support for vector typesJames Molloy2014-09-121-5/+0
| | | | | | | | Vector MUL/MLAs have tied operands, which gives us extra constraints that we currently can't handle. Instead of silently doing the wrong thing, remove support to be readded later properly. llvm-svn: 217690
* [A57FPLoadBalancing] Ignore <def>s when checking if a chain may be killed.James Molloy2014-09-121-0/+4
| | | | | | | | Defs are seen before uses, so a def without the kill flag doesn't necessarily mean that the register is not killed on that instruction. It may be killed in a later use operand. llvm-svn: 217689
* [A57LoadBalancing] unique_ptr-ify.James Molloy2014-09-121-25/+20
| | | | | | Thanks to David Blakie for the in-depth review! llvm-svn: 217682
* [AArch64] FPLoadBalancing: move ownership of the chain to its current ↵Arnaud A. de Grandmaison2014-08-291-5/+14
| | | | | | | | | | | | | | | | | accumulator register and forget about the previously used accumulator. Coming up with a simple testcase is not easy, as this highly depends on what the register allocator is doing: this issue showed up while working with the PBQP allocator, which produced a different allocation scheme. A testcase would need to come up with chain starting in D[0-7], then moving to D[8-15], followed by a call to a function whose regmask clobbers the starting accumulator in D[0-7], then another use of the chain. Fixed some formatting, added some invariant checks while there. llvm-svn: 216721
* Change the return value of "getEnd()" from a MachineInstr* to a ↵James Molloy2014-08-261-1/+1
| | | | | | | | MachineBasicBlock::iterator. It seems on Darwin the illegal round-trip ::iterator -> MachineInstr* -> ::iterator breaks execution horribly when the iterator is not a real MachineInstr, like ::end(). llvm-svn: 216455
* AArch64: avoid deleting the current iterator in a loop.Tim Northover2014-08-081-3/+4
| | | | | | | | std::map invalidates the iterator to any element that gets deleted, which means we can't increment it correctly afterwards. This was causing Darwin test failures. llvm-svn: 215233
* AArch64A57FPLoadBalancing.cpp: Define ColorNames in !NDEBUG.NAKAMURA Takumi2014-08-081-0/+2
| | | | llvm-svn: 215226
* [AArch64] Add an FP load balancing pass for Cortex-A57James Molloy2014-08-081-0/+694
For best-case performance on Cortex-A57, we should try to use a balanced mix of odd and even D-registers when performing a critical sequence of independent, non-quadword FP/ASIMD floating-point multiply or multiply-accumulate operations. This pass attempts to detect situations where the register allocation may adversely affect this load balancing and to change the registers used so as to better utilize the CPU. Ideally we'd just take each multiply or multiply-accumulate in turn and allocate it alternating even or odd registers. However, multiply-accumulates are most efficiently performed in the same functional unit as their accumulation operand. Therefore this pass tries to find maximal sequences ("Chains") of multiply-accumulates linked via their accumulation operand, and assign them all the same "color" (oddness/evenness). This optimization affects S-register and D-register floating point multiplies and FMADD/FMAs, as well as vector (floating point only) muls and FMADD/FMA. Q register instructions (and 128-bit vector instructions) are not affected. llvm-svn: 215199
OpenPOWER on IntegriCloud