| author | Changpeng Fang <changpeng.fang@gmail.com> | 2019-01-15 23:12:36 +0000 |
|---|---|---|
| committer | Changpeng Fang <changpeng.fang@gmail.com> | 2019-01-15 23:12:36 +0000 |
| commit | 20fe3d2f35ac8474952340f7f4c32a754e8d5102 (patch) | |
| tree | 5a336c2223296df3270cbbed5c442afff4899d6c /llvm/lib/Target | |
| parent | a4fd381dc26e0002ad82132f7a45841c182d75f8 (diff) | |
AMDGPU: Raise the priority of MAD24 in instruction selection.
Summary:
We have seen a performance regression when v_add3 is generated. The main reason is that the v_mad pattern
is broken up when v_add3 is selected. We also see increased register pressure. Since we cannot properly
estimate register pressure during instruction selection, we instead give mad a higher priority.
This change raises the priority of mad24 in instruction selection, which resolves the performance regression.
Reviewers:
rampitec
Differential Revision:
https://reviews.llvm.org/D56745
llvm-svn: 351273
Diffstat (limited to 'llvm/lib/Target')
| -rw-r--r-- | llvm/lib/Target/AMDGPU/AMDGPUInstructions.td | 2 |
1 files changed, 2 insertions, 0 deletions
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUInstructions.td b/llvm/lib/Target/AMDGPU/AMDGPUInstructions.td
index 282d1c11833..eb8f2002ff2 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUInstructions.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPUInstructions.td
@@ -842,6 +842,7 @@ def cvt_flr_i32_f32 : PatFrag <
   [{ (void)N; return TM.Options.NoNaNsFPMath; }]
 >;
 
+let AddedComplexity = 2 in {
 class IMad24Pat<Instruction Inst, bit HasClamp = 0> : AMDGPUPat <
   (add (AMDGPUmul_i24 i32:$src0, i32:$src1), i32:$src2),
   !if(HasClamp, (Inst $src0, $src1, $src2, (i1 0)),
@@ -853,6 +854,7 @@ class UMad24Pat<Instruction Inst, bit HasClamp = 0> : AMDGPUPat <
   !if(HasClamp, (Inst $src0, $src1, $src2, (i1 0)),
                 (Inst $src0, $src1, $src2))
 >;
+} // AddedComplexity.
 
 class RcpPat<Instruction RcpInst, ValueType vt> : AMDGPUPat <
   (fdiv FP_ONE, vt:$src),
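The effect of `AddedComplexity` in the diff above can be illustrated with a toy model in Python. This is only a sketch, not LLVM's actual selector: the `Pattern` class, the `select` and `make_patterns` helpers, the base complexity value of 9, and the tie-breaking by list order are all assumptions made for illustration. The only behavior taken from TableGen is that patterns with a higher total complexity are tried first and that `AddedComplexity` is added to that total.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple, Union

# Toy expression tree: a leaf is a string, a node is (opcode, lhs, rhs).
Expr = Union[str, Tuple[str, "Expr", "Expr"]]


def is_op(e: Expr, opcode: str) -> bool:
    """True if e is an interior node with the given opcode."""
    return isinstance(e, tuple) and e[0] == opcode


@dataclass
class Pattern:
    name: str
    base_complexity: int             # hypothetical stand-in for the computed pattern size
    matches: Callable[[Expr], bool]
    added_complexity: int = 0        # models `let AddedComplexity = N in { ... }`

    @property
    def complexity(self) -> int:
        return self.base_complexity + self.added_complexity


def matches_mad24(e: Expr) -> bool:
    # Shape of IMad24Pat: (add (mul_i24 a, b), c)
    return is_op(e, "add") and is_op(e[1], "mul_i24")


def matches_add3(e: Expr) -> bool:
    # Hypothetical three-operand add: an add with another add as either operand.
    return is_op(e, "add") and (is_op(e[1], "add") or is_op(e[2], "add"))


def make_patterns(mad24_added: int) -> List[Pattern]:
    # Both patterns get the same made-up base complexity, so only
    # AddedComplexity (or list order on a tie) decides between them.
    return [
        Pattern("add3", 9, matches_add3),
        Pattern("mad24", 9, matches_mad24, added_complexity=mad24_added),
    ]


def select(e: Expr, patterns: List[Pattern]) -> Optional[str]:
    # Try patterns from highest to lowest total complexity. sorted() is
    # stable, so ties keep list order (an assumption of this toy model).
    for p in sorted(patterns, key=lambda p: p.complexity, reverse=True):
        if p.matches(e):
            return p.name
    return None


# (a *24 b) + (c + d): the root add fits both patterns.
expr = ("add", ("mul_i24", "a", "b"), ("add", "c", "d"))
print(select(expr, make_patterns(0)))  # tie: the pattern listed first wins
print(select(expr, make_patterns(2)))  # mad24 now outranks add3
```

In this model, raising the added complexity of the mad24 pattern is enough to make it win whenever both patterns match the same node, which is the behavior the commit message describes as giving mad24 a higher priority.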

