summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU: Partially shrink 64-bit shifts if reduced to 16-bitMatt Arsenault2018-05-091-0/+30
| | | | | | | | | This is an extension of an existing combine to reduce wider shls if the result fits in the final result type. This introduces the same combine, but reduces the shift to a middle sized type to avoid the slow 64-bit shift. llvm-svn: 331916
* AMDGPU: Add combine for trunc of bitcast from build_vectorMatt Arsenault2018-05-092-0/+31
| | | | | | | | | | | | If the truncate is only accessing the first element of the vector, we can use the original source value. This helps with some combine ordering issues after operations are lowered to integer operations between bitcasts of build_vector. In particular it stops unnecessarily materializing the unused top half of a vector in some cases. llvm-svn: 331909
* AMDGPU: Stop special casing constant indexes of extract_vector_eltMatt Arsenault2018-05-091-15/+0
| | | | | | | The same result folds out of the dynamic expansion logic if the index is constant. llvm-svn: 331906
* [DebugInfo] Examine all uses of isDebugValue() for debug instructions.Shiva Chen2018-05-099-16/+16
| | | | | | | | | | | | | | | | | | Because we create a new kind of debug instruction, DBG_LABEL, we need to check all passes which use isDebugValue() to check MachineInstr is debug instruction or not. When expelling debug instructions, we should expel both DBG_VALUE and DBG_LABEL. So, I create a new function, isDebugInstr(), in MachineInstr to check whether the MachineInstr is debug instruction or not. This patch has no new test case. I have run regression test and there is no difference in regression test. Differential Revision: https://reviews.llvm.org/D45342 Patch by Hsiangkai Wang. llvm-svn: 331844
* [AMDGPU] Provide machine -> name mappingTim Renouf2018-05-082-51/+70
| | | | | | | | | | | | | | | Summary: AMDGPU stores a numerical code for the particular GPU variant in EFlags in the ELF file. This commit provides a mapping from that number into the machine name for use by objdump-type tools. Change-Id: Id37fc0bebad443bd89c0080985ce298c4e7e9319 Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46587 llvm-svn: 331798
* AMDGPU: Fix broken dynamic vector indexing for packed typesMatt Arsenault2018-05-081-4/+4
| | | | | | The intention of this was to multiply by 16, not shift by 16. llvm-svn: 331793
* AMDGPU: Use eraseFromParent to delete am instruction when it is no longer ↵Changpeng Fang2018-05-081-3/+6
| | | | | | | | | | | needed. Reviewer: Nicolai Differential Revision: https://reviews.llvm.org/D46438 llvm-svn: 331788
* [AMDGPU] Added checks for dpp_ctrl valueStanislav Mekhanoshin2018-05-084-36/+99
| | | | | | | | | | | | - Report error for invalid dpp_ctrl values. - Changed the way it is reported, now the error will be emitted into asm and will work with release build as well. - Added dpp_ctrl value verifier for codegen. - Added symbolic constants for dpp_ctrl. Differential Revision: https://reviews.llvm.org/D46565 llvm-svn: 331775
* AMDGPU/GlobalISel: Don't try to lower hull shadersTom Stellard2018-05-071-2/+3
| | | | | | | | | | | | Summary: The AMDGPU_HS calling convention is not supported yet. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46149 llvm-svn: 331691
* [AMDGPU][Waitcnt] Remove the old waitcnt passMark Searles2018-05-075-718/+15
| | | | | | | | | Remove the old waitcnt pass ( si-insert-waits ), which is no longer maintained and getting crufty Differential Revision: https://reviews.llvm.org/D46448 llvm-svn: 331641
* [AMDGPU] Don't force WQM for DS opTim Renouf2018-05-071-3/+1
| | | | | | | | | | | | | | | | | | | | | | Summary: Previously, all DS ops forced WQM in a pixel shader. That was a hack to allow for graphics frontends using ds_swizzle to implement explicit derivatives, on SI/CI at least where DPP is not available. But it forced WQM for _any_ DS op. With this commit, DS ops no longer force WQM. Both graphics frontends (Mesa and LLPC) need to change to issue an explicit llvm.amdgcn.wqm intrinsic call when calculating explicit derivatives. The required Mesa change is: "amd/common: use llvm.amdgcn.wqm for explicit derivatives". Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46051 Change-Id: I9b745b626fa91bbd66456e6cf41ee07eeea42f81 llvm-svn: 331633
* AMDGPU/NFC: Update D16PreservesUnusedBits description based Tony Tye's commentsKonstantin Zhuravlyov2018-05-041-1/+3
| | | | llvm-svn: 331564
* AMDGPU/NFC: Fix formatting for 900, 902 ISA Version featuresKonstantin Zhuravlyov2018-05-041-4/+2
| | | | llvm-svn: 331553
* AMDGPU: Add D16 instructions preserve unused bits featureKonstantin Zhuravlyov2018-05-046-9/+27
| | | | | | | | | - Predicate D16 patterns on this new feature - Added this new feature to gfx900/2/4 Differential Revision: https://reviews.llvm.org/D46366 llvm-svn: 331551
* Fast Math Flag mapping into SDNodeMichael Berg2018-05-041-4/+3
| | | | | | | | | | | | | | Summary: Adding support for Fast flags in the SDNode to leverage fast math sub flag usage. Reviewers: spatel, arsenm, jbhateja, hfinkel, escha, qcolombet, echristo, wristow, javed.absar Reviewed By: spatel Subscribers: llvm-commits, rampitec, nhaehnle, tstellar, FarhanaAleen, nemanjai, javed.absar, jbhateja, hfinkel, wdng Differential Revision: https://reviews.llvm.org/D45710 llvm-svn: 331547
* AMDGPU: Make getSubRegFromChannel a static member of AMDGPURegisterInfoTom Stellard2018-05-035-9/+9
| | | | | | | | | | | | | | | | Summary: This makes is possible to have R600RegisterInfo and SIRegisterInfo not inherit from AMDGPURegisterInfo. Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D46280 llvm-svn: 331490
* Rename invariant.group.barrier to launder.invariant.groupPiotr Padlewski2018-05-031-2/+2
| | | | | | | | | | | | | | Summary: This is one of the initial commit of "RFC: Devirtualization v2" proposal: https://docs.google.com/document/d/16GVtCpzK8sIHNc2qZz6RN8amICNBtvjWUod2SujZVEo/edit?usp=sharing Reviewers: rsmith, amharc, kuhar, sanjoy Subscribers: arsenm, nhaehnle, javed.absar, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45111 llvm-svn: 331448
* [AMDGPU] A trivial fix for a buildbot failure caused by "commit ↵Farhana Aleen2018-05-021-1/+1
| | | | | | | 224a839fcbbead221f872cd32a1dd0c308d37299". Author: FarhanaAleen llvm-svn: 331383
* Revert "[AMDGPU] performAddCombine should run after DAG is legalized."Farhana Aleen2018-05-021-1/+1
| | | | | | This reverts commit 6b97d2995566b4dddd6bf0d75579ff44501d4494. llvm-svn: 331371
* [AMDGPU] performAddCombine should run after DAG is legalized.Farhana Aleen2018-05-021-1/+1
| | | | | | | | | | | | | | | | Summary: performAddCombine should run after DAG is legalized; Otherwise generic optimization in the DAGCombiner can optimize an addcarry+trunc into an addcarry instruction with illegal types. Author: FarhanaAleen Reviewed By: rampitec Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D46337 llvm-svn: 331368
* [AMDGPU] Support horizontal vectorization.Farhana Aleen2018-05-013-0/+43
| | | | | | | | | | | | Author: FarhanaAleen Reviewed By: rampitec, arsenm Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D46213 llvm-svn: 331313
* AMDGPU: Remove remnants of gfx901 (it was deprecated some time ago)Konstantin Zhuravlyov2018-05-011-2/+1
| | | | llvm-svn: 331298
* Remove @brief commands from doxygen comments, too.Adrian Prantl2018-05-011-1/+1
| | | | | | | | | | | | | | | | | This is a follow-up to r331272. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done https://reviews.llvm.org/D46290 llvm-svn: 331275
* Remove \brief commands from doxygen comments.Adrian Prantl2018-05-0168-220/+220
| | | | | | | | | | | | | | | | We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272
* AMDGPU: Add Vega12 and Vega20Matt Arsenault2018-04-3018-90/+290
| | | | | | | | Changes by Matt Arsenault Konstantin Zhuravlyov llvm-svn: 331215
* AMDGPU: Remove some dead codeTom Stellard2018-04-301-4/+0
| | | | llvm-svn: 331196
* AMDGPU/GlobalISel: Don't try to lower geometry shadersTom Stellard2018-04-301-0/+3
| | | | | | | | | | | | | | Summary: The AMDGPU_GS calling convention is not supported yet. Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46041 llvm-svn: 331186
* IWYU for llvm-config.h in llvm, additions.Nico Weber2018-04-304-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | See r331124 for how I made a list of files missing the include. I then ran this Python script: for f in open('filelist.txt'): f = f.strip() fl = open(f).readlines() found = False for i in xrange(len(fl)): p = '#include "llvm/' if not fl[i].startswith(p): continue if fl[i][len(p):] > 'Config': fl.insert(i, '#include "llvm/Config/llvm-config.h"\n') found = True break if not found: print 'not found', f else: open(f, 'w').write(''.join(fl)) and then looked through everything with `svn diff | diffstat -l | xargs -n 1000 gvim -p` and tried to fix include ordering and whatnot. No intended behavior change. llvm-svn: 331184
* DAG: Fix not legalizing vector fcanonicalizesMatt Arsenault2018-04-261-0/+1
| | | | | | If an fcanoncialize was done on a vector type that was legal, llvm-svn: 330981
* AMDGPU: Extend extract_vector_elt fneg combine to fabsMatt Arsenault2018-04-261-2/+3
| | | | | | Fixes a regression in a future commit. llvm-svn: 330980
* AMDGPU: Consolidate SubtargetPredicate definitionsMatt Arsenault2018-04-262-7/+7
| | | | llvm-svn: 330979
* [AMDGPU][Waitcnt] As of gfx7, VMEM operations do not increment the export ↵Mark Searles2018-04-262-1/+5
| | | | | | | | counter and the input registers are available in the next instruction; update the waitcnt pass to take this into account. Differential Revision: https://reviews.llvm.org/D46067 llvm-svn: 330954
* AMDGPU/R600: Move int_r600_store_stream_output to the public intrinsic fileTom Stellard2018-04-251-4/+0
| | | | | | | | | | | | | | | | Summary: The TableGen'd GlobalISel instruction selector assumes all intrinsics are in the public Intrinsic:: namespace. Reviewers: jvesely, nhaehnle Reviewed By: jvesely, nhaehnle Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45989 llvm-svn: 330866
* [AMDGPU] Waitcnt pass: add debug optionsMark Searles2018-04-251-11/+75
| | | | | | | | | | | | | | | | | - Add "amdgpu-waitcnt-forcezero" to force all waitcnt instrs to be emitted as s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) - Add debug counters to control force emit of s_waitcnt instrs; debug counters: si-insert-waitcnts-forceexp: force emit s_waitcnt expcnt(0) instrs si-insert-waitcnts-forcevm: force emit s_waitcnt lgkmcnt(0) instrs si-insert-waitcnts-forcelgkm: force emit s_waitcnt vmcnt(0) instrs - Add some debug statements Note that a variant of this patch was previously committed/reverted. Differential Revision: https://reviews.llvm.org/D45888 llvm-svn: 330862
* [AMDGPU] Revert b0efc4fd6 (https://reviews.llvm.org/D40556)Alexander Timofeev2018-04-251-64/+15
| | | | llvm-svn: 330818
* AMDGPU: Remove deprecated llvm.AMDGPU.kilp intrinsicTom Stellard2018-04-243-11/+0
| | | | | | | | | | | | | | Summary: This is no longer used by mesa since its 18.0.0 release. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D45988 llvm-svn: 330775
* AMDGPU/GlobalISel: Fall-back to SelectionDAG for non-void functionsTom Stellard2018-04-241-0/+4
| | | | | | | | | | | | Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45843 llvm-svn: 330774
* AMDGPU/GlobalISel: Add support for amdgpu_ps calling conventionTom Stellard2018-04-241-14/+49
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45837 llvm-svn: 330767
* [AMDGPU] Truncate packed inline constantStanislav Mekhanoshin2018-04-242-1/+8
| | | | | | | | | | | | | | | | If a packed inline constant is sign extended it must be truncated after the shift. I.e. a constant (0xH0000, 0xHBC00), will be represented as 0xFFFFFFFFBC000000 in the IR because the immediate is sign extended to 64 bit. After the value shifted right by 16 to use it in a low part with op_sel_hi it becomes 0xFFFFFFFFBC00 and does not qualify as inline constant any longer. Fixed the error and added verification code. Without the fix and with the verification bug is causing pk_max_f16_literal.ll to fail. Differential Revision: https://reviews.llvm.org/D45987 llvm-svn: 330752
* [AMDGPU][Waitcnt] NFC. Cleanup some code/naming consistency:Mark Searles2018-04-241-38/+38
| | | | | | - s/SWaitcnt/Waitcnt s/WaitCnt/Waitcnt llvm-svn: 330730
* AMDGPU: Move a flawed assert when spilling SGPRsMatt Arsenault2018-04-232-4/+4
| | | | | | | | It's possible to validly spill the frame offset register in a call sequence to a VGPR. There are definitely issues with SGPR spilling to memory, so move the assert later. llvm-svn: 330612
* AMDGPU: Assign enum name to stack IDMatt Arsenault2018-04-233-2/+10
| | | | | | | | | Also assert that it is correct for SGPRs. There is currently a bug where stack slot coloring replaces SGPR spill FIs with one with the default ID, which results in a more confusing assert later about a dead object. llvm-svn: 330607
* AMDGPU: Fix SDWA peephole for V_AND_B32Nicolai Haehnle2018-04-231-1/+1
| | | | | | | | | | | | | | | | | | | Summary: Found by inspection. We care about the operand that *doesn't* contain the immediate. I believe this is currently not hit because we fold 0xff / 0xffff immediates only later. Change-Id: Ic3cf8538bc7da5eff3200d96eccf9d339e6345a7 Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45886 llvm-svn: 330586
* AMDGPU: Fix a corner case crash in SIOptimizeExecMaskingNicolai Haehnle2018-04-231-1/+1
| | | | | | | | | | | | | | | | Summary: See the new test case; this is really unlikely to happen with real code, but I ran into this while attempting to bugpoint-reduce a different issue. Change-Id: I9ade1dc1aa8fd9c4d9fc83661d7b80e310b5c4a6 Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45885 llvm-svn: 330585
* Consistently sort add_subdirectory calls in lib/Target/*/CMakeLists.txtNico Weber2018-04-231-2/+2
| | | | llvm-svn: 330584
* AMDGPU: Legalize the operand of SI_INIT_M0Nicolai Haehnle2018-04-201-0/+15
| | | | | | | | | | | | | | | | | | | | Summary: This fixes a case where the argument to a sendmsg intrinsic ends up in a VGPR, for whatever reason. The underlying performance issue is that a multiplication that can be an s_mul_i32 is instead needlessly generated as v_mul_u32_u24, but this is not addressed by this patch. Change-Id: I61fd4034314d5acdf6074632c30b65364dfa7328 Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45826 llvm-svn: 330393
* [AMDGPU] Use packed literals with zero either lower or hi partStanislav Mekhanoshin2018-04-192-2/+21
| | | | | | Differential Revision: https://reviews.llvm.org/D45790 llvm-svn: 330365
* [AMDGPU] Do not only rely on BB number when finding bottom loopMark Searles2018-04-191-20/+45
| | | | | | | | We should also check that the "bottom" basic block of a loopis a successor of the "header" basic block, otherwise we don't propagate the information correctly when the CFG is complex. This fixes an important rendering problem with Wolfsentein 2, because of one vector-memory wait was missing. Differential Revision: https://reviews.llvm.org/D43831 llvm-svn: 330337
* [AMDGPU] Fix issues for backend divergence trackingDavid Stuttard2018-04-181-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: A change to use divergence analysis in the AMDGPU backend was getting formal arguments incorrect (not tagged as divergent) unless they were VGPR0, VGPR1 or VGPR2 For graphics shaders it is possible to have more than these passed in as VGPR Modified the checking code to check for any VGPR registers passed in as formal arguments. Also, some intrinsics that are sources of divergence may have been lowered during instruction selection and are missed on subsequent calls to isSDNodeSourceOfDivergence - added the relevant AMDGPUISD checks as well. Finally, the FunctionLoweringInfo tracks virtual registers that are live across basic block boundaries. This is used to check for divergence of CopyFromRegister registers using the DivergenceAnalysis analysis. For multiple blocks the lazily evaluated inverted map VirtReg2Value was not cleared when the ValueMap map was. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45372 Change-Id: I112f3bd6dfe0f62e63ce9b43b893982778e4bee3 llvm-svn: 330257
* [AMDGPU] Enabled v2.16 literals for VOP3PStanislav Mekhanoshin2018-04-172-8/+19
| | | | | | | | Literal encoding needs op_sel_hi to select low 16 bit in this case. Differential Revision: https://reviews.llvm.org/D45745 llvm-svn: 330230
OpenPOWER on IntegriCloud