path: root/llvm/lib/Target/AMDGPU
Commit message | Author | Age | Files | Lines
* AMDGPU: Fix global isel crashes | Matt Arsenault | 2016-06-28 | 2 | -6/+9
    llvm-svn: 274039
* AMDGPU: Fix typo | Matt Arsenault | 2016-06-28 | 1 | -7/+6
    llvm-svn: 274034
* AMDGPU: Remove unused function | Matt Arsenault | 2016-06-28 | 2 | -33/+0
    llvm-svn: 274033
* AMDGPU: Fix out of bounds indirect indexing errors | Matt Arsenault | 2016-06-28 | 1 | -8/+19
    This was producing accesses to registers beyond the super register's limits, resulting in verifier failures.
    llvm-svn: 273977
* AMDGPU: Fix global isel build | Matt Arsenault | 2016-06-28 | 2 | -15/+15
    llvm-svn: 273964
* AMDGPU: Set MinInstAlignment | Matt Arsenault | 2016-06-27 | 1 | -0/+1
    Not sure this actually changes anything.
    llvm-svn: 273947
* AMDGPU: Implement per-function subtargets | Matt Arsenault | 2016-06-27 | 4 | -42/+76
    llvm-svn: 273940
* AMDGPU: Move subtarget feature checks into passes | Matt Arsenault | 2016-06-27 | 6 | -28/+37
    llvm-svn: 273937
* AMDGPU: Fix verifier errors with undef vector indices | Matt Arsenault | 2016-06-27 | 1 | -27/+37
    Also fix pointlessly adding exec to liveins.
    llvm-svn: 273916
* Fix "not all control paths return a value" warning on MSVC | Simon Pilgrim | 2016-06-27 | 1 | -0/+2
    llvm-svn: 273872
* SIMachineFunctionInfo.cpp: Appease msc18 to use std::array. | NAKAMURA Takumi | 2016-06-27 | 2 | -4/+5
    llvm-svn: 273860
* Reformat. | NAKAMURA Takumi | 2016-06-27 | 1 | -1/+1
    llvm-svn: 273859
* Reformat blank lines. | NAKAMURA Takumi | 2016-06-27 | 2 | -3/+0
    llvm-svn: 273858
* AMDGPU/R600: Fix GlobalValue regressions. | Jan Vesely | 2016-06-25 | 2 | -2/+3
    Don't cast GV expression to MCSymbolRefExpr. r272705 changed GV to binary expressions by including the offset even if the offset is 0 (we haven't hit this sooner since tested workloads don't include static offsets). We don't really care about the type of expression, so set it directly.
    Fixes: r272705
    Consider section relative relocations. Since all const as data is in one buffer, section relative is equivalent to abs32.
    Fixes: r273166
    Differential Revision: http://reviews.llvm.org/D21633
    llvm-svn: 273785
* [AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header | Konstantin Zhuravlyov | 2016-06-25 | 11 | -5/+207
    Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue.
    Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format:
      - offset 0: work group ID x
      - offset 4: work group ID y
      - offset 8: work group ID z
      - offset 16: work item ID x
      - offset 20: work item ID y
      - offset 24: work item ID z
    Set
      - amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg
      - amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg
      - amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled
    Differential Revision: http://reviews.llvm.org/D20335
    llvm-svn: 273769
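    Note: the scratch layout listed in the commit message above can be pictured as a plain struct. The following is only a host-side sketch, not code from the commit: the struct name, field names, and the reader helper are hypothetical, the 4 bytes at offset 12 are assumed to be unused padding because the message does not list them, and each ID is assumed to be a 32-bit value in host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical mirror of the debugger prologue scratch layout.
// Names are illustrative only; they do not come from the LLVM sources.
struct DebuggerPrologueIds {
  uint32_t WorkGroupIdX; // offset 0
  uint32_t WorkGroupIdY; // offset 4
  uint32_t WorkGroupIdZ; // offset 8
  uint32_t Unused;       // offset 12: not listed in the message, assumed padding
  uint32_t WorkItemIdX;  // offset 16
  uint32_t WorkItemIdY;  // offset 20
  uint32_t WorkItemIdZ;  // offset 24
};

// Check the struct against the offsets given in the commit message.
static_assert(offsetof(DebuggerPrologueIds, WorkGroupIdX) == 0, "wg x");
static_assert(offsetof(DebuggerPrologueIds, WorkGroupIdY) == 4, "wg y");
static_assert(offsetof(DebuggerPrologueIds, WorkGroupIdZ) == 8, "wg z");
static_assert(offsetof(DebuggerPrologueIds, WorkItemIdX) == 16, "wi x");
static_assert(offsetof(DebuggerPrologueIds, WorkItemIdY) == 20, "wi y");
static_assert(offsetof(DebuggerPrologueIds, WorkItemIdZ) == 24, "wi z");

// Copy the IDs out of a raw byte buffer holding a dump of the scratch region.
// The buffer must contain at least sizeof(DebuggerPrologueIds) bytes.
inline DebuggerPrologueIds readDebuggerPrologueIds(const unsigned char *Scratch) {
  assert(Scratch != nullptr && "scratch dump must not be null");
  DebuggerPrologueIds Ids;
  std::memcpy(&Ids, Scratch, sizeof(Ids));
  return Ids;
}
```

    A tool inspecting a dumped scratch region could call readDebuggerPrologueIds on the first 28 bytes and read the fields by name instead of hard-coding the offsets.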
* AMDGPU/SI: Make sure not to fold offsets into local address space globals | Tom Stellard | 2016-06-25 | 2 | -0/+10
    Summary: Offset folding only works if you are emitting relocations, and we don't emit relocations for local address space globals.
    Reviewers: arsenm, nhaustov
    Subscribers: arsenm, llvm-commits, kzhuravl
    Differential Revision: http://reviews.llvm.org/D21647
    llvm-svn: 273765
* AMDGPU: Define a schedule class for COPY. | Matthias Braun | 2016-06-24 | 2 | -0/+24
    COPY was lacking a scheduling class; define it to avoid regressions in the upcoming change to the bidirectional MachineScheduler.
    Approved by tstellar on IRC.
    Differential Revision: http://reviews.llvm.org/D21540
    llvm-svn: 273751
* AMDGPU: Add stub custom CodeGenPrepare pass | Matt Arsenault | 2016-06-24 | 4 | -0/+88
    This will do various things including ones CodeGenPrepare does, but with knowledge of uniform values.
    llvm-svn: 273657
* AMDGPU: Remove disable-irstructurizer subtarget feature | Matt Arsenault | 2016-06-24 | 4 | -14/+7
    The only real reason to use it is for testing, so replace it with a command line option instead of a potentially function dependent feature.
    llvm-svn: 273653
* AMDGPU: Cleanup subtarget handling. | Matt Arsenault | 2016-06-24 | 58 | -706/+879
    Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target.
    llvm-svn: 273652
* Support/ELF: Add R_AMDGPU_GOTPCREL relocation | Tom Stellard | 2016-06-23 | 1 | -0/+7
    Summary: We will start generating this in a future patch.
    Reviewers: arsenm, kzhuravl, rafael, ruiu, tony-tye
    Subscribers: arsenm, llvm-commits, kzhuravl
    Differential Revision: http://reviews.llvm.org/D21482
    llvm-svn: 273628
* AMDGPU: Add option to disable spilling SGPRs to VGPRs. | Matt Arsenault | 2016-06-23 | 1 | -2/+9
    This can help debug spilling problems.
    llvm-svn: 273605
* [AMDGPU] Enable absolute expression initializer for amd_kernel_code_t fields. | Valery Pykhtin | 2016-06-23 | 4 | -23/+26
    Differential Revision: http://reviews.llvm.org/D21380
    llvm-svn: 273561
* [AMDGPU] Remove exit-on-error in test (PR27761) | Diana Picus | 2016-06-23 | 2 | -2/+5
    The exit-on-error flag was necessary in order to avoid an assertion when handling DYNAMIC_STACKALLOC nodes in SelectionDAGLegalize. We can avoid the assertion by creating some dummy nodes. This enables us to remove the exit-on-error flag on the first 2 run lines (SI), but on the third run line (R600) we would run into another assertion when trying to reserve indirect registers. This patch also replaces that assertion with an early exit from the function.
    Fixes PR27761.
    Differential Revision: http://reviews.llvm.org/D20852
    llvm-svn: 273550
* AMDGPU: readlane/writelane do not read exec | Matt Arsenault | 2016-06-23 | 2 | -2/+26
    llvm-svn: 273525
* AMDGPU: Fix liveness when expanding m0 loop | Matt Arsenault | 2016-06-22 | 2 | -23/+67
    llvm-svn: 273514
* AMDGPU/SI: Define an intrinsic to expose ds_swizzle_b32 | Changpeng Fang | 2016-06-22 | 1 | -0/+12
    Reviewers: tstellarAMD, arsenm
    Differential Revision: http://reviews.llvm.org/D21533
    llvm-svn: 273496
* AMDGPU: Run verifier after 2nd run of SIShrinkInstructions | Matt Arsenault | 2016-06-22 | 1 | -1/+1
    llvm-svn: 273469
* AMDGPU: Fix verifier errors in SILowerControlFlow | Matt Arsenault | 2016-06-22 | 10 | -133/+217
    The main sin this was committing was using terminator instructions in the middle of the block, and then not updating the block successors / predecessors. Split the blocks up to avoid this and introduce new pseudo instructions for branches taken with exec masking.
    Also use a pseudo instead of emitting s_endpgm and erasing it in the special case of a non-void return.
    llvm-svn: 273467
* AMDGPU: Make FrameLowering stack alignment 16 | Matt Arsenault | 2016-06-22 | 1 | -3/+4
    We don't need it to be that high. The natural alignment for a single workitem's stack is 16.
    llvm-svn: 273448
* AMDGPU: Fix gcc warnings | Matt Arsenault | 2016-06-22 | 4 | -197/+60
    Mostly removing dead code. Apparently gcc's warning for unused functions is better.
    llvm-svn: 273363
* Delete more dead code. | Rafael Espindola | 2016-06-21 | 3 | -191/+0
    Found by gcc 6.
    llvm-svn: 273322
* AMDGPU: Add implicitarg.ptr intrinsic. | Jan Vesely | 2016-06-21 | 3 | -11/+24
    Points to the start of implicit arguments (appended after explicit arguments).
    Differential Revision: http://reviews.llvm.org/D20297
    llvm-svn: 273317
* Delete some dead code. | Rafael Espindola | 2016-06-21 | 3 | -25/+0
    Found by gcc 6.
    llvm-svn: 273303
* AMDGPU: Preserve undef flag on vcc when shrinking v_cndmask_b32 | Matt Arsenault | 2016-06-20 | 1 | -16/+13
    The implicit operand is added by the initial instruction construction, so this was adding an additional vcc use. The original one was missing the undef flag the original condition had, so the verifier would complain.
    llvm-svn: 273182
* AMDGPU: Fold more custom nodes to undef | Matt Arsenault | 2016-06-20 | 1 | -11/+40
    This will help sneak undefs past GVN into the DAG for some tests.
    Also add missing intrinsic for rsq_legacy, even though the node was already selected to the instruction.
    Also start passing the debug location to intrinsic errors.
    llvm-svn: 273181
* Generalize DiagnosticInfoStackSize to support other limits | Matt Arsenault | 2016-06-20 | 1 | -3/+11
    Backends may want to report errors on resources other than stack size.
    llvm-svn: 273177
* AMDGPU: Use correct method for determining instruction size | Matt Arsenault | 2016-06-20 | 1 | -2/+4
    llvm-svn: 273172
* AMDGPU: Add support for R_AMDGPU_REL32 relocations | Tom Stellard | 2016-06-20 | 2 | -1/+8
    Reviewers: arsenm, kzhuravl, rafael
    Subscribers: arsenm, llvm-commits, kzhuravl
    Differential Revision: http://reviews.llvm.org/D21401
    llvm-svn: 273168
* AMDGPU: Emit R_AMDGPU_ABS32_{HI,LO} for scratch buffer relocations | Tom Stellard | 2016-06-20 | 1 | -4/+15
    Reviewers: arsenm, rafael, kzhuravl
    Subscribers: rafael, arsenm, llvm-commits, kzhuravl
    Differential Revision: http://reviews.llvm.org/D21400
    llvm-svn: 273166
* Reformat blank lines. | NAKAMURA Takumi | 2016-06-20 | 3 | -11/+0
    llvm-svn: 273131
* Untabify. | NAKAMURA Takumi | 2016-06-20 | 3 | -9/+6
    llvm-svn: 273129
* AMDGPU: Fix kernel argument alignment impacting stack size | Matt Arsenault | 2016-06-18 | 4 | -14/+29
    Don't use AllocateStack because kernel arguments have nothing to do with the stack. The ensureMaxAlignment call was still changing the stack alignment.
    llvm-svn: 273080
* AMDGPU: Temporarily select trap to s_endpgm | Matt Arsenault | 2016-06-17 | 3 | -0/+21
    This should select to s_trap, but that requires additional work to set up and enable the trap handler.
    For now emit s_endpgm so bugpoint stops getting stuck on the unsupported call to abort.
    Emit a warning that this will only terminate the wave and not really trap.
    llvm-svn: 273062
* AMDGPU/SI: Simplify code in SITargetLowering::LowerGlobalAddress() | Tom Stellard | 2016-06-17 | 1 | -1/+1
    This change was suggested in http://reviews.llvm.org/D21154.
    llvm-svn: 273059
* AMDGPU: Remove llvm.SI.tid intrinsic | Matt Arsenault | 2016-06-17 | 3 | -9/+0
    Mesa doesn't emit this for llvm >= 3.8 anymore.
    llvm-svn: 273050
* AMDGPU/SI: Propagate the Kill flag in storeRegToStackSlot and eliminateFrameIndex | Changpeng Fang | 2016-06-16 | 3 | -15/+29
    Reviewers: arsenm, tstellarAMD
    Differential Revision: http://reviews.llvm.org/21438
    llvm-svn: 272958
* AMDGPU: Fix maximum instruction size for amdgcn | Matt Arsenault | 2016-06-16 | 1 | -1/+3
    This was causing the conservative estimate of inline asm size to be twice as big as expected.
    llvm-svn: 272956
* AMDGPU: Add v_mad 16-bit instructions definition. | Wei Ding | 2016-06-16 | 2 | -0/+11
    Differential Revision: http://reviews.llvm.org/D21362
    llvm-svn: 272919
* [AMDGPU] Fix few coding style issues. NFC. | Valery Pykhtin | 2016-06-15 | 2 | -23/+23
    llvm-svn: 272785