summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AMDGPU/local-stack-slot-bug.ll
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU: GFX9 GS and HS shaders always have the scratch wave offset in SGPR5Marek Olsak2017-05-041-26/+0
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D32645 llvm-svn: 302200
* AMDGPU: Always allocate emergency stack slot at offset 0Matt Arsenault2017-02-221-4/+3
| | | | | | | | | This allows us to ensure that 0 is never a valid pointer to a user object, and ensures that the offset is always legal without needing a register to access it. This comes at the cost of usable offsets and wasted stack space. llvm-svn: 295877
* Enable FeatureFlatForGlobal on Volcanic IslandsMatt Arsenault2017-01-241-1/+1
| | | | | | | | | | | This switches to the workaround that HSA defaults to for the mesa path. This should be applied to the 4.0 branch. Patch by Vedran Miletić <vedran@miletic.net> llvm-svn: 292982
* DAG: Avoid OOB when legalizing vector indexingMatt Arsenault2017-01-101-2/+5
| | | | | | | | | If a vector index is out of bounds, the result is supposed to be undefined but is not undefined behavior. Change the legalization for indexing the vector on the stack so that an out of bounds index does not create an out of bounds memory access. llvm-svn: 291604
* AMDGPU: Properly implement SIRegisterInfo::isFrameOffsetLegal and ↵Nicolai Haehnle2016-12-081-8/+3
| | | | | | | | | | | | | | | | | | | | | | | | | needsFrameBaseReg Summary: Without the fix to isFrameOffsetLegal to consider the instruction's immediate offset, the new test case hits the corresponding assertion in resolveFrameIndex, because the LocalStackSlotAllocation pass re-uses a different base register. With only the fix to isFrameOffsetLegal, code quality reduces in a bunch of places because frame base registers are added where they're not needed. This is addressed by properly implementing needsFrameBaseReg, which also helps to avoid unnecessary zero frame indices in a bunch of other places. Fixes piglit glsl-1.50/execution/variable-indexing/gs-output-array-vec4-index-wr.shader_test Reviewers: arsenm, tstellarAMD Subscribers: qcolombet, kzhuravl, wdng, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D27344 llvm-svn: 289048
* AMDGPU: Materialize frame index before addMatt Arsenault2016-11-291-3/+4
| | | | | | | | | | | It isn't generally safe to fold the frame index directly into the operand since it will possibly not be an inline immediate after it is expanded. This surprisingly seems to produce better code, since the FI doesn't prevent folding other immediate operands. llvm-svn: 288185
* Reapply "AMDGPU: Don't use offen if it is 0"Matt Arsenault2016-10-261-2/+8
| | | | | | This reverts r283003 llvm-svn: 285203
* Revert "AMDGPU: Don't use offen if it is 0"Mehdi Amini2016-10-011-8/+2
| | | | | | | This reverts commit r282999. Tests are not passing: http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/20038 llvm-svn: 283003
* AMDGPU: Don't use offen if it is 0Matt Arsenault2016-10-011-2/+8
| | | | | | This removes many re-initializations of a base register to 0. llvm-svn: 282999
* AMDGPU: Fix broken FrameIndex handlingMatt Arsenault2016-09-171-2/+4
| | | | | | | | | | | | | | | | | We were trying to avoid using a FrameIndex operand in non-pointer operands in a convoluted way, and would break because of using TargetFrameIndex. The TargetFrameIndex should only be used in the case where it makes sense to fold it as part of the addressing mode, otherwise it requires materialization like a normal constant. This wasn't working reliably and failed in the added testcase, hitting the assert when processing the frame index. The TargetFrameIndex was coming from trying to produce an AssertZext limiting the maximum stack size. I'm not sure this was correct to begin with, because it is apparently possible to have a single workitem dispatch that requires all 4G of private memory. llvm-svn: 281824
* AMDGPU: Support folding FrameIndex operandsMatt Arsenault2016-09-141-4/+2
| | | | | | This avoids test regressions in a future commit. llvm-svn: 281491
* AMDGPU: fix local stack slot allocation bugsNicolai Haehnle2016-07-111-0/+22
Summary: The main bug fix here is using the 32-bit encoding of V_ADD_I32 in materializeFrameBaseRegister and resolveFrameIndex, so that arbitrary immediates work. The second part is that we may now require the SegmentWaveByteOffset even when there are initially no stack objects and VGPR spilling isn't enabled, for stack slots that are allocated later. This means that some bits become effectively dead and can be cleaned up. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96602 Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21551 llvm-svn: 275108
OpenPOWER on IntegriCloud