path: root/llvm/docs/CommandGuide/llvm-profdata.rst
author: Matt Arsenault <Matthew.Arsenault@amd.com> 2016-07-19 00:35:03 +0000
committer: Matt Arsenault <Matthew.Arsenault@amd.com> 2016-07-19 00:35:03 +0000
commit: cb540bc03c29ad9e9c1982267135d2cee3033058
tree: c12c1a69b6c89f9635b7c557695e61d2ab78ce05 /llvm/docs/CommandGuide/llvm-profdata.rst
parent: 0de9b91a71db1a67e2c5c742a2a19c48c22d7f72
AMDGPU: Expand register indexing pseudos in custom inserter

This is to help move SILowerControlFlow to before regalloc. There are a couple of tradeoffs with this. The complete CFG is visible to more passes, the loop body avoids an extra copy of m0, vcc isn't required, and immediate offsets can be shrunk into s_movk_i32.

The disadvantage is the register allocator doesn't understand that the single lane's vector is dead within the loop body, so an extra register is used to outlive the loop block when expanding the VGPR -> m0 loop. This also now results in worse waitcnt insertion before the loop instead of after for pending operations at the point of the indexing, but that should be fixed by future improvements to cross block waitcnt insertion.

v_movreld_b32's operands are now modeled more correctly since vdst is not a true output. This is kind of a hack to treat vdst as a use operand. Extra checking is required in the verifier since I can't seem to get tablegen to emit an implicit operand for a virtual register.

llvm-svn: 275934

Diffstat (limited to 'llvm/docs/CommandGuide/llvm-profdata.rst')
0 files changed, 0 insertions, 0 deletions