author | Nicolai Haehnle <nhaehnle@gmail.com> | 2018-11-30 22:55:38 +0000
committer | Nicolai Haehnle <nhaehnle@gmail.com> | 2018-11-30 22:55:38 +0000
commit | a7b00058e05f6862d4ef2c8f8bb287b09f7e41b1
tree | 3f571b7d7ba5368d8ca4dc8010ef04ffe0ee6eef /llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
parent | a9cc92c247ce5d0ecc3399e7af6e40a3d59bbf6c
AMDGPU: Divergence-driven selection of scalar buffer load intrinsics
Summary:
Moving SMRD to VMEM in SIFixSGPRCopies is rather bad for performance when
the load really is uniform. Instead, select the scalar load intrinsics
directly to either VMEM or SMRD buffer loads based on divergence analysis.

If an offset happens to end up in a VGPR -- either because a floating-point
calculation was involved, or due to other remaining deficiencies in
SIFixSGPRCopies -- we use v_readfirstlane to copy it back into an SGPR.
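
The decision itself boils down to something like the following self-contained
C++ sketch (illustrative names only -- the real logic lives in the AMDGPU
instruction selector, not in a helper like this):

#include <cstdio>

enum class Loc { SGPR, VGPR };

struct BufferLoad {
  bool UniformAddress; // what divergence analysis would report
  Loc OffsetReg;       // where the offset operand currently lives
};

// Pick the load flavor: scalar (SMRD) only when the address is provably
// uniform; a uniform load whose offset still sits in a VGPR first copies
// it to an SGPR with v_readfirstlane.
static const char *selectBufferLoad(const BufferLoad &L) {
  if (!L.UniformAddress)
    return "MUBUF (VMEM) load";
  if (L.OffsetReg == Loc::VGPR)
    return "v_readfirstlane, then SMRD load";
  return "SMRD load";
}

int main() {
  std::printf("%s\n", selectBufferLoad({true, Loc::VGPR}));  // readfirstlane path
  std::printf("%s\n", selectBufferLoad({false, Loc::SGPR})); // stays VMEM
}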
There is some unrelated churn in tests since we now select MUBUF offsets
in a unified way with non-scalar buffer loads.
Change-Id: I170e6816323beb1348677b358c9d380865cd1a19
Reviewers: arsenm, alex-t, rampitec, tpr
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D53283
llvm-svn: 348050
Diffstat (limited to 'llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp')
-rw-r--r-- | llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp | 7
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
index 1cce9812dd1..bbcb73dcbb5 100644
--- a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
@@ -908,9 +908,12 @@ bool isLegalSMRDImmOffset(const MCSubtargetInfo &ST, int64_t ByteOffset) {
 // Given Imm, split it into the values to put into the SOffset and ImmOffset
 // fields in an MUBUF instruction. Return false if it is not possible (due to a
 // hardware bug needing a workaround).
+//
+// The required alignment ensures that individual address components remain
+// aligned if they are aligned to begin with. It also ensures that additional
+// offsets within the given alignment can be added to the resulting ImmOffset.
 bool splitMUBUFOffset(uint32_t Imm, uint32_t &SOffset, uint32_t &ImmOffset,
-                      const GCNSubtarget *Subtarget) {
-  const uint32_t Align = 4;
+                      const GCNSubtarget *Subtarget, uint32_t Align) {
   const uint32_t MaxImm = alignDown(4095, Align);
 
   uint32_t Overflow = 0;
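
The invariant the new comment describes can be demonstrated with a standalone
sketch (splitOffsetSketch is a hypothetical simplification; the real
splitMUBUFOffset can also return false for the subtarget hardware bug, which
this toy omits):

#include <cassert>
#include <cstdint>
#include <cstdio>

// Mirror alignDown/alignUp from llvm/Support/MathExtras.h.
static uint32_t alignDown(uint32_t V, uint32_t A) { return V / A * A; }
static uint32_t alignUp(uint32_t V, uint32_t A) { return alignDown(V + A - 1, A); }

// Toy split: keep as much of Imm as fits in the 12-bit ImmOffset field and
// move an Align-multiple remainder into SOffset. Because MaxImm is aligned
// down from 4095, any additional offset below Align can later be folded
// into ImmOffset without overflowing the field.
static void splitOffsetSketch(uint32_t Imm, uint32_t &SOffset,
                              uint32_t &ImmOffset, uint32_t Align) {
  const uint32_t MaxImm = alignDown(4095, Align);
  SOffset = Imm <= MaxImm ? 0 : alignUp(Imm - MaxImm, Align);
  ImmOffset = Imm - SOffset;
  assert(ImmOffset <= MaxImm && SOffset % Align == 0);
}

int main() {
  uint32_t SOffset, ImmOffset;
  splitOffsetSketch(5000, SOffset, ImmOffset, /*Align=*/4);
  // With Align=4, MaxImm is 4092, so 5000 splits into SOffset=908 and
  // ImmOffset=4092; any extra offset below 4 still fits in the field.
  std::printf("SOffset=%u ImmOffset=%u\n", SOffset, ImmOffset);
}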