author | Matt Arsenault <Matthew.Arsenault@amd.com> | 2017-07-28 18:40:05 +0000 |
---|---|---|
committer | Matt Arsenault <Matthew.Arsenault@amd.com> | 2017-07-28 18:40:05 +0000 |
commit | c06574ffc0053cd9bf54d174cc9467e8a8edd94d (patch) | |
tree | 0ede0fbb6791fb60a12b9545cfc637af236e83cf /llvm/lib/Target/AMDGPU/AMDGPU.h | |
parent | 1b179bc5ff845e49fe787783ca3c7c8f2949c5bb (diff) | |
AMDGPU: Add pass to replace out arguments
It is better to return arguments directly in registers
if we are making a call rather than introducing expensive
stack usage. In a sample compile of one of
Blender's many kernel variants, this fires on about
20 different functions. Future improvements may be to
recognize simple cases where the pointer is indexing a small
array. This also fails when the store to the out argument
is in a separate block from the return, which happens in
a few of the Blender functions. This should also probably
be using MemorySSA which might help with that.
I'm not sure this is correct as a FunctionPass, but
MemoryDependenceAnalysis seems to not work with
a ModulePass.
I'm also not sure where it should run. I think it should
run before DeadArgumentElimination, so maybe either
EP_CGSCCOptimizerLate or EP_ScalarOptimizerLate.
llvm-svn: 309416
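
For readers unfamiliar with the term, the sketch below is a hand-written C++ analogue of what "replacing an out argument" means at the source level. It is not taken from the pass, which operates on LLVM IR; the names `sincos_out`, `SinCos`, `sincos_ret`, and `sincos_out_wrapper` are hypothetical and chosen only for illustration. The wrapper shows one plausible way to keep the original signature available to existing callers, which is an assumption rather than something this commit message states.

```cpp
#include <cassert>
#include <cmath>

// Before: the second result is written through a pointer (an "out argument").
// The caller has to supply an address, which typically means a stack slot.
float sincos_out(float x, float *cos_out) {
  *cos_out = std::cos(x);
  return std::sin(x);
}

// After (illustrative): both results are returned by value in one struct,
// so they can be passed back in registers instead of through memory.
struct SinCos {
  float sin_value;
  float cos_value;
};

SinCos sincos_ret(float x) {
  return {std::sin(x), std::cos(x)};
}

// Hypothetical thin wrapper that preserves the original signature.
// Once inlined, the store happens in the caller and may be optimized away.
float sincos_out_wrapper(float x, float *cos_out) {
  SinCos r = sincos_ret(x);
  *cos_out = r.cos_value;
  return r.sin_value;
}

// Checks that the out-argument form and the wrapper agree for one input.
void demo() {
  float c_before = 0.0f;
  float c_after = 0.0f;
  float s_before = sincos_out(0.5f, &c_before);
  float s_after = sincos_out_wrapper(0.5f, &c_after);
  assert(s_before == s_after);
  assert(c_before == c_after);
}
```

Calling `demo()` exercises both forms on the same input. The case the commit message describes as failing, where the store to the out argument sits in a different block from the return, has no direct equivalent in this simplified sketch.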
Diffstat (limited to 'llvm/lib/Target/AMDGPU/AMDGPU.h')
-rw-r--r-- | llvm/lib/Target/AMDGPU/AMDGPU.h | 4 |
1 files changed, 4 insertions, 0 deletions
diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.h b/llvm/lib/Target/AMDGPU/AMDGPU.h
index bbb542eb4fc..f4df15e9500 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.h
@@ -51,6 +51,7 @@ FunctionPass *createSIInsertWaitsPass();
 FunctionPass *createSIInsertWaitcntsPass();
 FunctionPass *createAMDGPUCodeGenPreparePass();
 FunctionPass *createAMDGPUMachineCFGStructurizerPass();
+FunctionPass *createAMDGPURewriteOutArgumentsPass();
 
 void initializeAMDGPUMachineCFGStructurizerPass(PassRegistry&);
 extern char &AMDGPUMachineCFGStructurizerID;
@@ -65,6 +66,9 @@ ModulePass *createAMDGPULowerIntrinsicsPass();
 void initializeAMDGPULowerIntrinsicsPass(PassRegistry &);
 extern char &AMDGPULowerIntrinsicsID;
 
+void initializeAMDGPURewriteOutArgumentsPass(PassRegistry &);
+extern char &AMDGPURewriteOutArgumentsID;
+
 void initializeSIFoldOperandsPass(PassRegistry &);
 extern char &SIFoldOperandsID;