summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU
Commit message (Collapse)AuthorAgeFilesLines
...
* [AMDGPU] SI Load Store Optimizer: When merging with offset, use ↵Mark Searles2018-01-221-14/+18
| | | | | | | | | | | V_ADD_{I|U}32_e64 - Change inserted add ( V_ADD_{I|U}32_e32 ) to _e64 version ( V_ADD_{I|U}32_e64 ) so that the add uses a vreg for the carry; this prevents inserted v_add from killing VCC; the _e64 version doesn't accept a literal in its encoding, so we need to introduce a mov instr as well to get the imm into a register. - Change pass name to "SI Load Store Optimizer"; this removes the '/', which complicates scripts. Differential Revision: https://reviews.llvm.org/D42124 llvm-svn: 323153
* [NFC] fix trivial typos in commentsHiroshi Inoue2018-01-222-3/+3
| | | | | | "the the" -> "the" llvm-svn: 323074
* [AMDGPU][MC] Corrected parsing of image modifiers and encoding of image atomicsDmitry Preobrazhensky2018-01-191-6/+6
| | | | | | | | | | | See bugs 35962: https://bugs.llvm.org/show_bug.cgi?id=35962 35963: https://bugs.llvm.org/show_bug.cgi?id=35963 Differential Revision: https://reviews.llvm.org/D42184 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 322942
* Split MachineLICM into EarlyMachineLICM and MachineLICM; NFCMatthias Braun2018-01-191-1/+1
| | | | | | | | | | | | | This avoids playing games with pseudo pass IDs and avoids using an unreliable MRI::isSSA() check to determine whether register allocation has happened. Note that this renames: - MachineLICMID -> EarlyMachineLICM - PostRAMachineLICMID -> MachineLICMID to be consistent with the EarlyTailDuplicate/TailDuplicate naming. llvm-svn: 322927
* AMDGPU/SI: Fix typos in d16 support patch the buffer intrinsics.Changpeng Fang2018-01-181-4/+4
| | | | llvm-svn: 322906
* AMDGPU/SI: Add d16 support for image intrinsics.Changpeng Fang2018-01-188-202/+1009
| | | | | | | | | | | | | Summary: This patch implements d16 support for image load, image store and image sample intrinsics. Reviewers: Matt, Brian. Differential Revision: https://reviews.llvm.org/D3991 llvm-svn: 322903
* [GISel] Make constrainSelectedInstRegOperands() available to the legalizer. NFCAditya Nandakumar2018-01-171-0/+1
| | | | | | https://reviews.llvm.org/D42149 llvm-svn: 322743
* AMDGPU: Error in SIAnnotateControlFlow instead of assertMatt Arsenault2018-01-171-1/+5
| | | | | | | | This assert typically happens if an unstructured CFG is passed to the pass. This can happen if the pass is run independently without the structurizer. llvm-svn: 322685
* [AMDGPU] add LDS f32 intrinsicsDaniil Fukalov2018-01-177-10/+78
| | | | | | | | | | | | added llvm.amdgcn.atomic.{add|min|max}.f32 intrinsics to allow generate ds_{add|min|max}[_rtn]_f32 instructions needed for OpenCL float atomics in LDS Reviewed by: arsenm Differential Revision: https://reviews.llvm.org/D37985 llvm-svn: 322656
* [AMDGPU][MC][GFX9] Enable inline constants for SDWA operandsDmitry Preobrazhensky2018-01-175-65/+124
| | | | | | | | | See bug 35771: https://bugs.llvm.org/show_bug.cgi?id=35771 Differential Revision: https://reviews.llvm.org/D42058 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 322655
* [AMDGPU] Add HW_REG_SH_MEM_BASES symbolic name for s_getreg_b32Stanislav Mekhanoshin2018-01-154-4/+19
| | | | | | Differential Revision: https://reviews.llvm.org/D41617 llvm-svn: 322500
* [AMDGPU] Copy impdefs from pseudo to real instructionsStanislav Mekhanoshin2018-01-154-0/+4
| | | | | | | | | | In some cases we do not copy implicit defs from pseudo to real VOP instructions. It has no visible impact at the moment thus no tests are affected or added. Differential Revision: https://reviews.llvm.org/D41783 llvm-svn: 322496
* [AMDGPU] stop image_store being moved illegallyTim Renouf2018-01-121-6/+2
| | | | | | | | | | | | | | | | | | | | Summary: A recent change 321556: AMDGPU: Remove mayLoad/hasSideEffects from MIMG stores can allow the machine instruction scheduler to move an image store past an image load using the same descriptor. V2: Fixed by marking image ops as mayAlias and isAliased. This may be overly conservative, and we may need to revisit. V3: Reverted test change done on 321556. Reviewers: arsenm, nhaehnle, dstuttard Subscribers: llvm-commits, t-tye, yaxunl, wdng, kzhuravl Differential Revision: https://reviews.llvm.org/D41969 llvm-svn: 322419
* AMDGPU/SI: Add d16 support for buffer intrinsics.Changpeng Fang2018-01-1210-52/+446
| | | | | | | | | | Differential Revision: https://reviews.llvm.org/D38906 Reviewers: Matt and Brian. llvm-svn: 322402
* [AMDGPU][MC][GFX8][GFX9] Added XNACK_MASK supportDmitry Preobrazhensky2018-01-108-5/+54
| | | | | | | | | See bug 35764: https://bugs.llvm.org/show_bug.cgi?id=35764 Differential Revision: https://reviews.llvm.org/D41614 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 322189
* [AMDGPU] Fixed incorrect uniform branch conditionTim Renouf2018-01-091-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: I had a case where multiple nested uniform ifs resulted in code that did v_cmp comparisons, combining the results with s_and_b64, s_or_b64 and s_xor_b64 and using the resulting mask in s_cbranch_vccnz, without first ensuring that bits for inactive lanes were clear. There was already code for inserting an "s_and_b64 vcc, exec, vcc" to clear bits for inactive lanes in the case that the branch is instruction selected as s_cbranch_scc1 and is then changed to s_cbranch_vccnz in SIFixSGPRCopies. I have added the same code into SILowerControlFlow for the case that the branch is instruction selected as s_cbranch_vccnz. This de-optimizes the code in some cases where the s_and is not needed, because vcc is the result of a v_cmp, or multiple v_cmp instructions combined by s_and/s_or. We should add a pass to re-optimize those cases. Reviewers: arsenm, kzhuravl Subscribers: wdng, yaxunl, t-tye, llvm-commits, dstuttard, timcorringham, nhaehnle Differential Revision: https://reviews.llvm.org/D41292 llvm-svn: 322119
* AMDGPU: Remove dead fileMatt Arsenault2018-01-031-12/+0
| | | | llvm-svn: 321752
* Thread MCSubtargetInfo through Target::createMCAsmBackendAlex Bradbury2018-01-032-4/+5
| | | | | | | | | | | | | | | | | | | | | Currently it's not possible to access MCSubtargetInfo from a TgtMCAsmBackend. D20830 threaded an MCSubtargetInfo reference through MCAsmBackend::relaxInstruction, but this isn't the only function that would benefit from access. This patch removes the Triple and CPUString arguments from createMCAsmBackend and replaces them with MCSubtargetInfo. This patch just changes the interface without making any intentional functional changes. Once in, several cleanups are possible: * Get rid of the awkward MCSubtargetInfo handling in ARMAsmBackend * Support 16-bit instructions when valid in MipsAsmBackend::writeNopData * Get rid of the CPU string parsing in X86AsmBackend and just use a SubtargetFeature for HasNopl * Emit 16-bit nops in RISCVAsmBackend::writeNopData if the compressed instruction set extension is enabled (see D41221) This change initially exposed PR35686, which has since been resolved in r321026. Differential Revision: https://reviews.llvm.org/D41349 llvm-svn: 321692
* AMDGPU: Use unique PSVs for buffer resourcesMatt Arsenault2017-12-293-39/+87
| | | | | | | Also fixes using the wrong memory type for some intrinsics when custom lowering them. llvm-svn: 321557
* AMDGPU: Remove mayLoad/hasSideEffects from MIMG storesMatt Arsenault2017-12-291-5/+5
| | | | | | | Atomics still have hasSideEffects set on them because of the mess that is the memory properties. llvm-svn: 321556
* AMDGPU: Implement getTgtMemIntrinsic for imagesMatt Arsenault2017-12-293-20/+169
| | | | | | | | | | | | | Currently all images are lowered to have a single image PseudoSourceValue. Image stores happen to have overly strict mayLoad/mayStore/hasSideEffects flags set on them, so this happens to work. When these are fixed to be correct, the scheduler breaks this because the identical PSVs are assumed to be the same address. These need to be unique to the image resource value. llvm-svn: 321555
* [AMDGPU][MC] Incorrect parsing of flat/global atomic modifiersDmitry Preobrazhensky2017-12-291-1/+39
| | | | | | | | | See bug 35730: https://bugs.llvm.org/show_bug.cgi?id=35730 Differential Revision: https://reviews.llvm.org/D41598 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 321552
* (Re-landing) Expose a TargetMachine::getTargetTransformInfo functionSanjoy Das2017-12-222-6/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Re-land r321234. It had to be reverted because it broke the shared library build. The shared library build broke because there was a missing LLVMBuild dependency from lib/Passes (which calls TargetMachine::getTargetIRAnalysis) to lib/Target. As far as I can tell, this problem was always there but was somehow masked before (perhaps because TargetMachine::getTargetIRAnalysis was a virtual function). Original commit message: This makes the TargetMachine interface a bit simpler. We still need the std::function in TargetIRAnalysis to avoid having to add a dependency from Analysis to Target. See discussion: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html I avoided adding all of the backend owners to this review since the change is simple, but let me know if you feel differently about this. Reviewers: echristo, MatzeB, hfinkel Reviewed By: hfinkel Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D41464 llvm-svn: 321375
* [AMDGPU][MC] Corrected handling of negative expressionsDmitry Preobrazhensky2017-12-221-1/+6
| | | | | | | | | | See bug 35716: https://bugs.llvm.org/show_bug.cgi?id=35716 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D41488 llvm-svn: 321372
* [AMDGPU][MC] Corrected parsing of optional operands for ds_swizzle_b32Dmitry Preobrazhensky2017-12-221-1/+3
| | | | | | | | | | See bug 35645: https://bugs.llvm.org/show_bug.cgi?id=35645 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D41186 llvm-svn: 321367
* [AMDGPU][MC] Added support of 256- and 512-bit tuples of ttmp registersDmitry Preobrazhensky2017-12-227-43/+197
| | | | | | | | | | | | See bug 35561: https://bugs.llvm.org/show_bug.cgi?id=35561 This patch also affects implementation of SGPR and VGPR registers though changes are cosmetic. Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D41437 llvm-svn: 321359
* Revert "Expose a TargetMachine::getTargetTransformInfo function"Sanjoy Das2017-12-212-4/+6
| | | | | | This reverts commit r321234. It breaks the -DBUILD_SHARED_LIBS=ON build. llvm-svn: 321243
* Expose a TargetMachine::getTargetTransformInfo functionSanjoy Das2017-12-212-6/+4
| | | | | | | | | | | | | | | | | | | | | | | Summary: This makes the TargetMachine interface a bit simpler. We still need the std::function in TargetIRAnalysis to avoid having to add a dependency from Analysis to Target. See discussion: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html I avoided adding all of the backend owners to this review since the change is simple, but let me know if you feel differently about this. Reviewers: echristo, MatzeB, hfinkel Reviewed By: hfinkel Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D41464 llvm-svn: 321234
* [AMDGPU, AsmParser] Enable the mnemonic spell corrector.Matt Arsenault2017-12-201-2/+15
| | | | | | Patch by Dmitry Venikov llvm-svn: 321202
* [AMDGPU] Turn off MergeConsecutiveStores() before Instruction Selection for ↵Mark Searles2017-12-191-0/+10
| | | | | | | | AMDGPU. Commit dbbb6c5fc3642987430866dffdf710df4f616ac7 turned on MergeConsecutiveStores() before Instruction Selection for all targets. Enough AMDGPU compiles go into an infinite loop ( MergeConsecutiveStores() merges two stores; LegalizeStoreOps() un-merges; MergeConsecutiveStores() re-merges, etc. ) to warrant turning it off until the issues can be addressed. Differential Revision: https://reviews.llvm.org/D41377 llvm-svn: 321100
* MachineFunction: Return reference from getFunction(); NFCMatthias Braun2017-12-1529-111/+111
| | | | | | The Function can never be nullptr so we can return a reference. llvm-svn: 320884
* Recommit CodeGen: Fix assertion in machine inst sheduler due to llvm.dbg.valueYaxun Liu2017-12-151-8/+11
| | | | | | The regression on ppc64 was not due to this commit. llvm-svn: 320788
* TLI: Allow using PSV for intrinsic mem operandsMatt Arsenault2017-12-142-0/+2
| | | | llvm-svn: 320756
* DAG: Expose all MMO flags in getTgtMemIntrinsicMatt Arsenault2017-12-141-3/+4
| | | | | | | | | | | | | | Rather than adding more bits to express every MMO flag you could want, just directly use the MMO flags. Also fixes using a bunch of bool arguments to getMemIntrinsicNode. On AMDGPU, buffer and image intrinsics should always have MODereferencable set, but currently there is no way to do that directly during the initial intrinsic lowering. llvm-svn: 320746
* Revert CodeGen: Fix assertion in machine inst sheduler due to llvm.dbg.valueYaxun Liu2017-12-141-11/+8
| | | | | | This commit might have caused regression on ppc64. Revert it to verify that. llvm-svn: 320712
* CodeGen: Fix assertion in machine inst sheduler due to llvm.dbg.valueYaxun Liu2017-12-131-8/+11
| | | | | | | | | | | | | | | | | Two issues were found about machine inst scheduler when compiling ProRender with -g for amdgcn target: GCNScheduleDAGMILive::schedule tries to update LiveIntervals for DBG_VALUE, which it should not since DBG_VALUE is not mapped in LiveIntervals. when DBG_VALUE is the last instruction of MBB, ScheduleDAGInstrs::buildSchedGraph and ScheduleDAGMILive::scheduleMI does not move RPTracker properly, which causes assertion. This patch fixes that. Differential Revision: https://reviews.llvm.org/D41132 llvm-svn: 320650
* AMDGPU: Partially fix disassembly of MIMG instructionsMatt Arsenault2017-12-138-78/+128
| | | | | | | | | | | | | | | | | | | | | Stores failed to decode at all since they didn't have a DecoderNamespace set. Loads worked, but did not change the register width displayed to match the numbmer of enabled channels. The number of printed registers for vaddr is still wrong, but I don't think that's encoded in the instruction so there's not much we can do about that. Image atomics are still broken. MIMG is the same encoding for SI/VI, but the image atomic classes are split up into encoding specific versions unlike every other MIMG instruction. They have isAsmParserOnly set on them for some reason. dmask is also special for these, so we probably should not have it as an explicit operand as it is now. llvm-svn: 320614
* [Targets] Don't automatically include the scheduler class enum from ↵Craig Topper2017-12-131-0/+2
| | | | | | | | | | *GenInstrInfo.inc with GET_INSTRINFO_ENUM. Make targets request is separately. Most of the targets don't need the scheduler class enum. I have an X86 scheduler model change that causes some names in the enum to become about 18000 characters long. This is because using instregex in scheduler models causes the scheduler class to get named with every instruction that matches the regex concatenated together. MSVC has a limit of 4096 characters for an identifier name. Rather than trying to come up with way to reduce the name length, I'm just going to sidestep the problem by not including the enum in X86. llvm-svn: 320552
* Rename LiveIntervalAnalysis.h to LiveIntervals.hMatthias Braun2017-12-1310-10/+10
| | | | | | | | | | Headers/Implementation files should be named after the class they declare/define. Also eliminated an `#include "llvm/CodeGen/LiveIntervalAnalysis.h"` in favor of `class LiveIntarvals;` llvm-svn: 320546
* LSR: Check more intrinsic pointer operandsMatt Arsenault2017-12-112-0/+28
| | | | llvm-svn: 320424
* [AMDGPU][MC][GFX9] Corrected encoding of ttmp registers, disabled tba/tmaDmitry Preobrazhensky2017-12-118-69/+164
| | | | | | | | | | | | See bugs 35494 and 35559: https://bugs.llvm.org/show_bug.cgi?id=35494 https://bugs.llvm.org/show_bug.cgi?id=35559 Reviewers: vpykhtin, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D41007 llvm-svn: 320375
* AMDGPU/GCN: Bring processors in sync with AMDGPUUsageKonstantin Zhuravlyov2017-12-084-46/+21
| | | | | | | | | | | | - Add gfx704 - Change bonaire to gfx704 - Remove gfx804 - Remove gfx901 - Remove gfx903 Differential Revision: https://reviews.llvm.org/D40046 llvm-svn: 320194
* AMDGPU: Set IntrReadMem on memtime intrinsicsMatt Arsenault2017-12-081-5/+2
| | | | llvm-svn: 320188
* AMDGPU: image_getlod and image_getresinfo do not read memoryMatt Arsenault2017-12-082-13/+40
| | | | llvm-svn: 320187
* AMDGPU: Preserve MMO in adjustWritemaskMatt Arsenault2017-12-081-0/+2
| | | | | | | | Follow up to r319705. Currently the MMO is produced after this in the custom inserter, so this doesn't change anything yet. llvm-svn: 320186
* AMDGPU: Report Arg's Value name in metadata if kernel_arg_name metadata is ↵Konstantin Zhuravlyov2017-12-081-0/+2
| | | | | | | | not available Differential Revision: https://reviews.llvm.org/D40924 llvm-svn: 320176
* [AMDGPU] add labels to +DumpCode outputTim Renouf2017-12-082-4/+29
| | | | | | | | | | | | | | Summary: +DumpCode is a hack to embed disassembly in the ELF file. This commit fixes it to include labels, to make it slightly more useful. Reviewers: arsenm, kzhuravl Subscribers: nhaehnle, timcorringham, dstuttard, llvm-commits, t-tye, yaxunl, wdng, kzhuravl Differential Revision: https://reviews.llvm.org/D40169 llvm-svn: 320146
* [AMDGPU] Revert "[AMDGPU] Add options for waitcnt pass debugging; add instr ↵Mark Searles2017-12-071-62/+8
| | | | | | | | | | | | | | count in debug output." Patch caused a buildbot failure; http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/15733/steps/build_Lld/logs/stdio : lib/Target/AMDGPU/SIInsertWaitcnts.cpp:396:11: error: private field 'InstCnt' is not used [-Werror,-Wunused-private-field] int32_t InstCnt = 0; ^ 1 error generated. " This reverts commit 71627f79010aafe74fdcba901bba28dd7caa0869. llvm-svn: 320086
* [AMDGPU] Add options for waitcnt pass debugging; add instr count in debug ↵Mark Searles2017-12-071-8/+62
| | | | | | | | | | | | | output. -amdgpu-waitcnt-forcezero={1|0} Force all waitcnt instrs to be emitted as s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) -amdgpu-waitcnt-forceexp=<n> Force emit a s_waitcnt expcnt(0) before the first <n> instrs -amdgpu-waitcnt-forcelgkm=<n> Force emit a s_waitcnt lgkmcnt(0) before the first <n> instrs -amdgpu-waitcnt-forcevm=<n> Force emit a s_waitcnt vmcnt(0) before the first <n> instrs Differential Revision: https://reviews.llvm.org/D40091 llvm-svn: 320084
* [AMDGPU] Add GCNHazardRecognizer::checkInlineAsmHazards() and ↵Mark Searles2017-12-072-17/+64
| | | | | | | | GCNHazardRecognizer::checkVALUHazardsHelper(). checkInlineAsmHazards() checks INLINEASM for hazards that we particularly care about (so not exhaustive); this patch adds a check for INLINEASM that defs vregs that hold data-to-be stored by immediately preceding store of more than 8 bytes. If the instr were not within an INLINEASM, this scenario would be handled by checkVALUHazard(). Add checkVALUHazardsHelper(), which will be called by both checkVALUHazards() and checkInlineAsmHazards(). Differential Revision: https://reviews.llvm.org/D40098 llvm-svn: 320083
OpenPOWER on IntegriCloud