summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU
Commit message (Collapse)AuthorAgeFilesLines
* Rename LiveIntervalAnalysis.h to LiveIntervals.hMatthias Braun2017-12-1310-10/+10
| | | | | | | | | | Headers/Implementation files should be named after the class they declare/define. Also eliminated an `#include "llvm/CodeGen/LiveIntervalAnalysis.h"` in favor of `class LiveIntarvals;` llvm-svn: 320546
* LSR: Check more intrinsic pointer operandsMatt Arsenault2017-12-112-0/+28
| | | | llvm-svn: 320424
* [AMDGPU][MC][GFX9] Corrected encoding of ttmp registers, disabled tba/tmaDmitry Preobrazhensky2017-12-118-69/+164
| | | | | | | | | | | | See bugs 35494 and 35559: https://bugs.llvm.org/show_bug.cgi?id=35494 https://bugs.llvm.org/show_bug.cgi?id=35559 Reviewers: vpykhtin, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D41007 llvm-svn: 320375
* AMDGPU/GCN: Bring processors in sync with AMDGPUUsageKonstantin Zhuravlyov2017-12-084-46/+21
| | | | | | | | | | | | - Add gfx704 - Change bonaire to gfx704 - Remove gfx804 - Remove gfx901 - Remove gfx903 Differential Revision: https://reviews.llvm.org/D40046 llvm-svn: 320194
* AMDGPU: Set IntrReadMem on memtime intrinsicsMatt Arsenault2017-12-081-5/+2
| | | | llvm-svn: 320188
* AMDGPU: image_getlod and image_getresinfo do not read memoryMatt Arsenault2017-12-082-13/+40
| | | | llvm-svn: 320187
* AMDGPU: Preserve MMO in adjustWritemaskMatt Arsenault2017-12-081-0/+2
| | | | | | | | Follow up to r319705. Currently the MMO is produced after this in the custom inserter, so this doesn't change anything yet. llvm-svn: 320186
* AMDGPU: Report Arg's Value name in metadata if kernel_arg_name metadata is ↵Konstantin Zhuravlyov2017-12-081-0/+2
| | | | | | | | not available Differential Revision: https://reviews.llvm.org/D40924 llvm-svn: 320176
* [AMDGPU] add labels to +DumpCode outputTim Renouf2017-12-082-4/+29
| | | | | | | | | | | | | | Summary: +DumpCode is a hack to embed disassembly in the ELF file. This commit fixes it to include labels, to make it slightly more useful. Reviewers: arsenm, kzhuravl Subscribers: nhaehnle, timcorringham, dstuttard, llvm-commits, t-tye, yaxunl, wdng, kzhuravl Differential Revision: https://reviews.llvm.org/D40169 llvm-svn: 320146
* [AMDGPU] Revert "[AMDGPU] Add options for waitcnt pass debugging; add instr ↵Mark Searles2017-12-071-62/+8
| | | | | | | | | | | | | | count in debug output." Patch caused a buildbot failure; http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/15733/steps/build_Lld/logs/stdio : lib/Target/AMDGPU/SIInsertWaitcnts.cpp:396:11: error: private field 'InstCnt' is not used [-Werror,-Wunused-private-field] int32_t InstCnt = 0; ^ 1 error generated. " This reverts commit 71627f79010aafe74fdcba901bba28dd7caa0869. llvm-svn: 320086
* [AMDGPU] Add options for waitcnt pass debugging; add instr count in debug ↵Mark Searles2017-12-071-8/+62
| | | | | | | | | | | | | output. -amdgpu-waitcnt-forcezero={1|0} Force all waitcnt instrs to be emitted as s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) -amdgpu-waitcnt-forceexp=<n> Force emit a s_waitcnt expcnt(0) before the first <n> instrs -amdgpu-waitcnt-forcelgkm=<n> Force emit a s_waitcnt lgkmcnt(0) before the first <n> instrs -amdgpu-waitcnt-forcevm=<n> Force emit a s_waitcnt vmcnt(0) before the first <n> instrs Differential Revision: https://reviews.llvm.org/D40091 llvm-svn: 320084
* [AMDGPU] Add GCNHazardRecognizer::checkInlineAsmHazards() and ↵Mark Searles2017-12-072-17/+64
| | | | | | | | GCNHazardRecognizer::checkVALUHazardsHelper(). checkInlineAsmHazards() checks INLINEASM for hazards that we particularly care about (so not exhaustive); this patch adds a check for INLINEASM that defs vregs that hold data-to-be stored by immediately preceding store of more than 8 bytes. If the instr were not within an INLINEASM, this scenario would be handled by checkVALUHazard(). Add checkVALUHazardsHelper(), which will be called by both checkVALUHazards() and checkInlineAsmHazards(). Differential Revision: https://reviews.llvm.org/D40098 llvm-svn: 320083
* [CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register.Francis Visoiu Mistrih2017-12-077-23/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Work towards the unification of MIR and debug output by refactoring the interfaces. For MachineOperand::print, keep a simple version that can be easily called from `dump()`, and a more complex one which will be called from both the MIRPrinter and MachineInstr::print. Add extra checks inside MachineOperand for detached operands (operands with getParent() == nullptr). https://reviews.llvm.org/D40836 * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/<def>//g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<def[ ]*,[ ]*dead>/dead \1/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ]*,[ ]*dead>/implicit-def dead \1/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g' llvm-svn: 320022
* AMDGPU: Fix SDWA crash on inline asmMatt Arsenault2017-12-051-1/+2
| | | | | | | | This was only searching for explicit defs, and asserting for any implicit or variadic instruction defs, like inline asm. llvm-svn: 319826
* AMDGPU: Fix infinite loop with dbg_valueMatt Arsenault2017-12-051-1/+4
| | | | | | | | | Surprisingly SIOptimizeExecMaskingPreRA can infinite loop in some case with DBG_VALUE. Most tests using dbg_value are run at -O0, so don't run this pass. This seems to only happen when the value argument is undef. llvm-svn: 319808
* AMDGPU: Fix missing subtarget feature initializerMatt Arsenault2017-12-051-0/+1
| | | | llvm-svn: 319733
* AMDGPU: Fix crash when scheduling DBG_VALUEMatt Arsenault2017-12-051-1/+5
| | | | | | | | | | | | | | This calls handleMove with a DBG_VALUE instruction, which isn't tracked by LiveIntervals. I'm not sure this is the correct place to fix this. The generic scheduler seems to have more deliberate region selection that skips dbg_value. The test is also really hard to reduce. I haven't been able to figure out what exactly causes this particular case to try moving the dbg_value. llvm-svn: 319732
* AMDGPU/EG: Add a new FeatureFMA and use it to selectively enable FMA instructionJan Vesely2017-12-046-3/+23
| | | | | | | | | Only used by pre-GCN targets v2: fix predicate setting for FMA_Common Differential Revision: https://reviews.llvm.org/D40692 llvm-svn: 319712
* AMDGPU: Disable fp64 support on pre GCN asicsJan Vesely2017-12-043-14/+19
| | | | | | | | | | | It's not implemented. Passing +fp64-fp16-denormal feature enables fp64 even on asics that don't support it v2: fix hasFP64 query Differential Revision: https://reviews.llvm.org/D39931 llvm-svn: 319709
* AMDGPU: Fix creating invalid copy when adjusting dmaskMatt Arsenault2017-12-046-61/+129
| | | | | | | | | Move the entire optimization to one place. Before it was possible to adjust dmask without changing the register class of the output instruction, since they were done in separate places. Fix all lane sizes and move all of the optimization into the DAG folding. llvm-svn: 319705
* AMDGPU: Use return value of MorphNodeToMatt Arsenault2017-12-041-3/+1
| | | | llvm-svn: 319704
* [CodeGen] Unify MBB reference format in both MIR and debug outputFrancis Visoiu Mistrih2017-12-046-64/+70
| | | | | | | | | | | | | | | | As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'. The MIR printer prints the IR name of a MBB only for block definitions. * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g' * find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g' * grep -nr 'BB#' and fix Differential Revision: https://reviews.llvm.org/D40422 llvm-svn: 319665
* [AMDGPU] SDWA: add support for PRESERVE into SDWA peephole.Sam Kolton2017-12-042-285/+534
| | | | | | | | | | | | Summary: Reviewers: arsenm, vpykhtin, rampitec Subscribers: kzhuravl, wdng, nhaehnle, mgorny, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D37817 llvm-svn: 319662
* AMDGPU: fix missing s_waitcntTim Corringham2017-12-041-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: The pass that inserts s_waitcnt instructions where needed propagated info used to track dependencies for each block by iterating over the predecessor blocks. The iteration was terminated when a predecessor that had not yet been processed was encountered. Any info in blocks later in the list was therefore not processed, leading to the possiblility of a required s_waitcnt not being inserted. The fix is simply to change the "break" to "continue" for the relevant loops, so that all visited blocks are processed. This is likely what was intended when the code was written. There is no test case provided for this fix because: 1) the only example that reproduces this is large and resistant to being reduced 2) the change is trivial Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D40544 llvm-svn: 319651
* [AMDGPU] SiFixSGPRCopies should not modify non-divergent PHIAlexander Timofeev2017-12-011-15/+64
| | | | | | Differential revision: https://reviews.llvm.org/D40556 llvm-svn: 319534
* AMDGPU: Use carry-less adds in FI eliminationMatt Arsenault2017-11-302-9/+6
| | | | llvm-svn: 319501
* AMDGPU: Use gfx9 carry-less add/sub instructionsMatt Arsenault2017-11-306-34/+118
| | | | llvm-svn: 319491
* [CodeGen] Print "%vreg0" as "%0" in both MIR and debug outputFrancis Visoiu Mistrih2017-11-305-37/+37
| | | | | | | | | | | | | | | | | As part of the unification of the debug format and the MIR format, avoid printing "vreg" for virtual registers (which is one of the current MIR possibilities). Basically: * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g" * grep -nr '%vreg' . and fix if needed * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g" * grep -nr 'vreg[0-9]\+' . and fix if needed Differential Revision: https://reviews.llvm.org/D40420 llvm-svn: 319427
* AMDGPU: Allow negative MUBUF vaddr for gfx9Matt Arsenault2017-11-302-4/+21
| | | | | | | | GFX9 does not enable bounds checking for the resource descriptors used for private access, so it should be OK to use vaddr with a potentially negative value. llvm-svn: 319393
* [AMDGPU][MC][GFX9] Corrected mapping of GFX9 v_add/sub/subrev_u32Dmitry Preobrazhensky2017-11-291-9/+14
| | | | | | | | | | | When translating pseudo to MC, v_add/sub/subrev_u32 shall be mapped via a separate table as GFX8 has opcodes with the same names. These instructions shall also be labelled as renamed for pseudoToMCOpcode to handle them correctly. Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D40550 llvm-svn: 319311
* DAG: Add nuw when splitting loads and storesMatt Arsenault2017-11-292-15/+8
| | | | | | | | | | | The object can't straddle the address space wrap around, so I think it's OK to assume any offsets added to the base object pointer can't overflow. Similar logic already appears to be applied in SelectionDAGBuilder when lowering aggregate returns. llvm-svn: 319272
* AMDGPU: Select DS insts without m0 initializationMatt Arsenault2017-11-296-77/+178
| | | | | | | | | GFX9 stopped using m0 for most DS instructions. Select a different instruction without the use. I think this will be less error prone than trying to manually maintain m0 uses as needed. llvm-svn: 319270
* AMDGPU: Enable IPRAMatt Arsenault2017-11-281-0/+4
| | | | llvm-svn: 319256
* AMDGPU: Add num spilled s/vgprs to metadataKonstantin Zhuravlyov2017-11-281-0/+2
| | | | | | | | This was requested by tools. Differential Revision: https://reviews.llvm.org/D40321 llvm-svn: 319192
* [CodeGen] Print register names in lowercase in both MIR and debug outputFrancis Visoiu Mistrih2017-11-285-20/+20
| | | | | | | | | | | As part of the unification of the debug format and the MIR format, always print registers as lowercase. * Only debug printing is affected. It now follows MIR. Differential Revision: https://reviews.llvm.org/D40417 llvm-svn: 319187
* [CodeGen] Rename functions PrintReg* to printReg*Francis Visoiu Mistrih2017-11-285-74/+74
| | | | | | | | | | | LLVM Coding Standards: Function names should be verb phrases (as they represent actions), and command-like function should be imperative. The name should be camel case, and start with a lower case letter (e.g. openFile() or isFoo()). Differential Revision: https://reviews.llvm.org/D40416 llvm-svn: 319168
* AMDGPU: Re-organize the outer loop of SILoadStoreOptimizerNicolai Haehnle2017-11-281-6/+5
| | | | | | | | | | | | | | | | | | | Summary: The entire algorithm operates per basic-block, so for cache locality it should be better to re-optimize a basic-block immediately rather than in a separate loop. I don't have performance measurements. Change-Id: I85106570bd623c4ff277faaa50ee43258e1ddcc5 Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D40344 llvm-svn: 319156
* AMDGPU: Consistently check for immediates in SIInstrInfo::FoldImmediateNicolai Haehnle2017-11-281-23/+22
| | | | | | | | | | | | | | | | | | | | | | | Summary: The PeepholeOptimizer pass calls this function solely based on checking DefMI->isMoveImmediate(), which only checks the MoveImm bit of the instruction description. So it's up to FoldImmediate itself to properly check that DefMI *actually* moves from an immediate. I don't have a separate test case for this, but the next patch introduces a test case which happens to crash without this change. This error is caught by the assertion in MachineOperand::getImm(). Change-Id: I88e7cdbcf54d75e1a296822e6fe5f9a5f095bbf8 Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D40342 llvm-svn: 319155
* [AMDGPU][MC][DISASSEMBLER][GFX9] Corrected decoding of GLOBAL/SCRATCH opcodesDmitry Preobrazhensky2017-11-273-6/+6
| | | | | | | | | See bug 35433: https://bugs.llvm.org/show_bug.cgi?id=35433 Differential Revision: https://reviews.llvm.org/D40493 Reviewers: artem.tamazov, SamWot, arsenm llvm-svn: 319050
* [AMDGPU] Add custom lowering for llvm.log{,10}.{f16,f32} intrinsicsVedran Miletic2017-11-272-0/+32
| | | | | | | | | | | | | | AMDGPU backend errors with "unsupported call to function" upon encountering a call to llvm.log{,10}.{f16,f32} intrinsics. This patch adds custom lowering to avoid that error on both R600 and SI. Reviewers: arsenm, jvesely Subscribers: tstellar Differential Revision: https://reviews.llvm.org/D29942 llvm-svn: 319025
* [AMDGPU][MC][GFX9] Added v_interp_p2_f16 and v_interp_p2_legacy_f16Dmitry Preobrazhensky2017-11-241-2/+18
| | | | | | | | | | See bug 33629: https://bugs.llvm.org//show_bug.cgi?id=33629 Reviewers: artem.tamazov, SamWot, arsenm Differential Revision: https://reviews.llvm.org/D39488 llvm-svn: 318955
* Make helpers static. NFC.Benjamin Kramer2017-11-241-4/+4
| | | | llvm-svn: 318953
* [AMDGPU][MC][GFX9] Added support of 'inst_offset' modifier for compatibility ↵Dmitry Preobrazhensky2017-11-241-2/+5
| | | | | | | | | | | | with SP3 See bug 35329: https://bugs.llvm.org//show_bug.cgi?id=35329 Reviewers: arsenm, vpykhtin, artem.tamazov Differential Revision: https://reviews.llvm.org/D40350 llvm-svn: 318947
* [AMDGPU] Fix SITargetLowering::LowerCall for pointer info of byval argumentYaxun Liu2017-11-221-2/+3
| | | | | | | | | | | SITargetLowering::LowerCall uses dummy pointer info for byval argument, which causes flat load instead of buffer load. This patch fixes that. Differential Revision: https://reviews.llvm.org/D40040 llvm-svn: 318844
* AMDGPU: Consider memory dependencies with moved instructions in ↵Nicolai Haehnle2017-11-221-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | SILoadStoreOptimizer Summary: This bug seems to have gone unnoticed because critical cases with LDS instructions are eliminated by the peephole optimizer. However, equivalent situations arise with buffer loads and stores as well, so this fixes regressions since r317751 ("AMDGPU: Merge S_BUFFER_LOAD_DWORD_IMM into x2, x4"). Fixes at least: KHR-GL45.shader_storage_buffer_object.basic-operations-case1-cs KHR-GL45.cull_distance.functional piglit tes-input-gl_ClipDistance.shader_test ... and probably more Change-Id: I0e371536288eb8e6afeaa241a185266fd45d129d Reviewers: arsenm, mareko, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D40303 llvm-svn: 318829
* [AMDGPU][MC][GFX8][GFX9] Corrected names of integer ↵Dmitry Preobrazhensky2017-11-207-100/+149
| | | | | | | | | | | | v_{add/addc/sub/subrev/subb/subbrev} See bug 34765: https://bugs.llvm.org//show_bug.cgi?id=34765 Reviewers: tamazov, SamWot, arsenm, vpykhtin Differential Revision: https://reviews.llvm.org/D40088 llvm-svn: 318675
* AMDGPU: Partial ILP scheduler port from SelectionDAG to SchedulingDAG ↵Valery Pykhtin2017-11-205-4/+431
| | | | | | | | (experimental) Differential revision: https://reviews.llvm.org/D39897 llvm-svn: 318649
* AMDGPU: Move hazard avoidance out of waitcnt pass.Matt Arsenault2017-11-174-93/+60
| | | | | | | | | | This is mostly moving VMEM clause breaking into the hazard recognizer. Also move another hazard currently handled in the waitcnt pass. Also stops breaking clauses unless xnack is enabled. llvm-svn: 318557
* [AMDGPU][MC][GFX9][disassembler] Corrected decoding of op_sel_hi for v_mad_mix*Dmitry Preobrazhensky2017-11-175-26/+26
| | | | | | | | | | See bug 35148: https://bugs.llvm.org//show_bug.cgi?id=35148 Reviewers: tamazov, SamWot, arsenm Differential Revision: https://reviews.llvm.org/D39492 llvm-svn: 318526
* AMDGPU: Replace list of SMEM buffer opcodesMatt Arsenault2017-11-172-10/+14
| | | | llvm-svn: 318506
OpenPOWER on IntegriCloud