summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU: Split vi-insts subtarget featureMatt Arsenault2016-02-273-6/+24
| | | | | | | This will be more useful for marking builtins acceptable for which subtargets. llvm-svn: 262121
* AMDGPU: Add s_sleep intrinsicMatt Arsenault2016-02-272-1/+17
| | | | llvm-svn: 262120
* AMDGPU: Implement readcyclecounterMatt Arsenault2016-02-277-10/+68
| | | | | | | | | | This matches the behavior of the HSAIL clock instruction. s_realmemtime is used if the subtarget supports it, and falls back to s_memtime if not. Also introduces new intrinsics for each of s_memtime / s_memrealtime. llvm-svn: 262119
* CodeGen: Take MachineInstr& in SlotIndexes and LiveIntervals, NFCDuncan P. N. Exon Smith2016-02-272-13/+13
| | | | | | | | | | | | | | Take MachineInstr by reference instead of by pointer in SlotIndexes and the SlotIndex wrappers in LiveIntervals. The MachineInstrs here are never null, so this cleans up the API a bit. It also incidentally removes a few implicit conversions from MachineInstrBundleIterator to MachineInstr* (see PR26753). At a couple of call sites it was convenient to convert to a range-based for loop over MachineBasicBlock::instr_begin/instr_end, so I added MachineBasicBlock::instrs. llvm-svn: 262115
* [AMDGPU] Assembler: Basic support for MIMGNikolay Haustov2016-02-266-59/+202
| | | | | | | | | | | Add parsing and printing of image operands. Matches legacy sp3 assembler. Change image instruction order to have data/image/sampler operands in the beginning. This is needed because optional operands in MC are always last. Update SITargetLowering for new order. Add basic MC test. Update CodeGen tests. Review: http://reviews.llvm.org/D17574 llvm-svn: 261995
* [AMDGPU] Disassembler: Support for all VOP1 instructions.Nikolay Haustov2016-02-253-62/+242
| | | | | | | | | | | | | | | Support all instructions with VOP1 encoding with 32 or 64-bit operands for VI subtarget: VGPR_32 and VReg_64 operand register classes VS_32 and VS_64 operand register classes with inline and literal constants Tests for VOP1 instructions. Patch by: skolton Reviewers: arsenm, tstellarAMD Review: http://reviews.llvm.org/D17194 llvm-svn: 261878
* [AMDGPU] Assembler: Simplify handling of optional operandsNikolay Haustov2016-02-253-75/+77
| | | | | | | | | | | | | | | | | | | | | | Resubmit with index problem fixed. Verified with valgrind. Prepare to support DPP encodings. For DPP encodings, we want row_mask/bank_mask/bound_ctrl to be optional operands. However this means that when parsing instruction which has no mnemonic prefix, we cannot add both default values for VOP3 and for DPP optional operands to OperandVector - neither instructions would match. So add default values for optional operands to MCInst during conversion instead. Mark more operands as IsOptional = 1 in .td files. Do not add default values for optional operands to OperandVector in AMDGPUAsmParser. Add default values for optional operands during conversion using new helper addOptionalImmOperand. Change to cvtVOP3_2_mod to check instruction flag instead of presence of modifiers. In the future, cvtVOP3* functions can be combined into one. Separate cvtFlat and cvtFlatAtomic. Fix CNDMASK_B32 definition to have no modifiers. Review: http://reviews.llvm.org/D17445 llvm-svn: 261856
* Revert r261742, "[AMDGPU] Assembler: Simplify handling of optional operands"NAKAMURA Takumi2016-02-253-79/+75
| | | | | | It brought undefined behavior. llvm-svn: 261839
* [AMDGPU] Assembler: Simplify handling of optional operandsNikolay Haustov2016-02-243-75/+79
| | | | | | | | | | | | | | | | | | Prepare to support DPP encodings. For DPP encodings, we want row_mask/bank_mask/bound_ctrl to be optional operands. However this means that when parsing instruction which has no mnemonic prefix, we cannot add both default values for VOP3 and for DPP optional operands to OperandVector - neither instructions would match. So add default values for optional operands to MCInst during conversion instead. Mark more operands as IsOptional = 1 in .td files. Do not add default values for optional operands to OperandVector in AMDGPUAsmParser. Add default values for optional operands during conversion using new helper addOptionalImmOperand. Change to cvtVOP3_2_mod to check instruction flag instead of presence of modifiers. In the future, cvtVOP3* functions can be combined into one. Separate cvtFlat and cvtFlatAtomic. Fix CNDMASK_B32 definition to have no modifiers. Review: http://reviews.llvm.org/D17445 Reviewers: tstellarAMD llvm-svn: 261742
* [AMDGPU] fix amd_kernel_code_t bit field position as per spec (added missing ↵Nikolay Haustov2016-02-241-7/+15
| | | | | | | | | | | reserved fields) lit tests passed before and after because it doesn't test the binary representation of amd_kernel_code_t. Patch by: Valery Pykhtin (Valery.Pykhtin@amd.com) Reviewers: arsenm llvm-svn: 261732
* AMDGPU: Check cheaper condition before SignBitIsZeroMatt Arsenault2016-02-241-7/+6
| | | | | | | Don't do an expensive computeKnownBits call when we can do the cheap check for legal offsets first. llvm-svn: 261720
* [AMDGPU] Fix operands of S_BFE_U64 and S_BFM_B64Nikolay Haustov2016-02-232-2/+7
| | | | | | | | | | | src1 of s_bfe_u64 is 32-bit (same as s_bfe_i64). src0 and src1 of s_bfm_b64 are 32-bit. Update tests. Review: http://reviews.llvm.org/D17480 Reviewers: arsenm llvm-svn: 261621
* CodeGen: TII: Take MachineInstr& in predicate API, NFCDuncan P. N. Exon Smith2016-02-233-38/+33
| | | | | | | | | | | | | Change TargetInstrInfo API to take `MachineInstr&` instead of `MachineInstr*` in the functions related to predicated instructions (I'll try to come back later and get some of the rest). All of these functions require non-null parameters already, so references are more clear. As a bonus, this happens to factor away a host of implicit iterator => pointer conversions. No functionality change intended. llvm-svn: 261605
* CodeGen: Bring back MachineBasicBlock::iterator::getInstrIterator()...Duncan P. N. Exon Smith2016-02-222-2/+2
| | | | | | | | | | | | | | | | | | This is a little embarrassing. When I reverted r261504 (getIterator() => getInstrIterator()) in r261567, I did a `git grep` to see if there were new calls to `getInstrIterator()` that I needed to migrate. There were 10-20 hits, and I blindly did a `sed ...` before calling `ninja check`. However, these were `MachineInstrBundleIterator::getInstrIterator()`, which predated r261567. Perhaps coincidentally, these had an identical name and return type. This commit undoes my careless sed and restores `MachineBasicBlock::iterator::getInstrIterator()`. llvm-svn: 261577
* AMDGPU/R600: Implement allowsMisalignedMemoryAccessMatt Arsenault2016-02-222-0/+24
| | | | | | | | This avoids some test regressions in a future commit when unaligned operations are expanded when they have custom lowering. llvm-svn: 261570
* Revert "CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC"Duncan P. N. Exon Smith2016-02-223-3/+3
| | | | | | | | | | This reverts commit r261504, since it's not obvious the new name is better: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160222/334298.html I'll recommit if we get consensus that it's the right direction. llvm-svn: 261567
* [AMDGPU][llvm-mc] Support for 32-bit inline literalsTom Stellard2016-02-222-34/+59
| | | | | | | | | | | | | | | | | | | | | | | Patch by: Artem Tamazov Summary: Note: Support for 64-bit inline literals TBD Added: Support of abs/neg modifiers for literals (incomplete; parsing TBD). Added: Some TODO comments. Reworked/clarity: rename isInlineImm() to isInlinableImm() Reworked/robustness: disallow BitsToFloat() with undefined value in isInlinableImm() Reworked/reuse: isSSrc32/64(), isVSrc32/64() Tests added. Reviewers: tstellarAMD, arsenm Subscribers: vpykhtin, nhaustov, SamWot, arsenm Projects: #llvm-amdgpu-spb Differential Revision: http://reviews.llvm.org/D17204 llvm-svn: 261559
* [AMDGPU] [llvm-mc] [VI] Fix encoding of LDS/GDS instructions.Tom Stellard2016-02-221-1/+3
| | | | | | | | | | | | | | | | Patch by: Artem Tamazov Summary: Tests added. Reviewers: tstellarAMD, arsenm Subscribers: vpykhtin, SamWot, #llvm-amdgpu-spb Projects: #llvm-amdgpu-spb Differential Revision: http://reviews.llvm.org/D17271 llvm-svn: 261558
* CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFCDuncan P. N. Exon Smith2016-02-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | Delete MachineInstr::getIterator(), since the term "iterator" is overloaded when talking about MachineInstr. - Downcast to ilist_node in iplist::getNextNode() and getPrevNode() so that ilist_node::getIterator() is still available. - Add it back as MachineInstr::getInstrIterator(). This matches the naming in MachineBasicBlock. - Add MachineInstr::getBundleIterator(). This is explicitly called "bundle" (not matching MachineBasicBlock) to disintinguish it clearly from ilist_node::getIterator(). - Update all calls. Some of these I switched to `auto` to remove boiler-plate, since the new name is clear about the type. There was one call I updated that looked fishy, but it wasn't clear what the right answer was. This was in X86FrameLowering::inlineStackProbe(), added in r252578 in lib/Target/X86/X86FrameLowering.cpp. I opted to leave the behaviour unchanged, but I'll reply to the original commit on the list in a moment. llvm-svn: 261504
* AMDGPU/SI: Use v_readfirstlane to legalize SMRD with VGPR base pointerTom Stellard2016-02-202-238/+22
| | | | | | | | | | | | | | | | | | | | | | Summary: Instead of trying to replace SMRD instructions with a VGPR base pointer with an equivalent MUBUF instruction, we now copy the base pointer to SGPRs using v_readfirstlane. This is safe to do, because any load selected as an SMRD instruction has been proven to have a uniform base pointer, so each thread in the wave will have the same pointer value in VGPRs. This will fix some errors on VI from trying to replace SMRD instructions with addr64-enabled MUBUF instructions that don't exist. Reviewers: arsenm, cfang, nhaehnle Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17305 llvm-svn: 261385
* AMDGPU/SI: Fix s_waitcnt insertion for flat instructionsTom Stellard2016-02-191-2/+4
| | | | | | | | | | | | | | | | Summary: This was broken in r260694 which swapped the address and data operands for flat store instructions. The code in SIInsertWaits assumes that the data operand always comes before the address operand, so we need to add a special case for flat. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17366 llvm-svn: 261330
* AMDGPU/SI: add llvm.amdgcn.image.load/store[.mip] intrinsicsNicolai Haehnle2016-02-183-30/+75
| | | | | | | | | | | | | Summary: These correspond to IMAGE_LOAD/STORE[_MIP] and are going to be used by Mesa for the GL_ARB_shader_image_load_store extension. IMAGE_LOAD is already matched by llvm.SI.image.load. That intrinsic has a legacy name and pretends not to read memory. Differential Revision: http://reviews.llvm.org/D17276 llvm-svn: 261224
* Test commit access.Nikolay Haustov2016-02-181-1/+0
| | | | llvm-svn: 261199
* [AMDGPU] Disassembler: Added basic disassembler for AMDGPU targetTom Stellard2016-02-1812-49/+551
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changes: - Added disassembler project - Fixed all decoding conflicts in .td files - Added DecoderMethod=“NONE” option to Target.td that allows to disable decoder generation for an instruction. - Created decoding functions for VS_32 and VReg_32 register classes. - Added stubs for decoding all register classes. - Added several tests for disassembler Disassembler only supports: - VI subtarget - VOP1 instruction encoding - 32-bit register operands and inline constants [Valery] One of the point that requires to pay attention to is how decoder conflicts were resolved: - Groups of target instructions were separated by using different DecoderNamespace (SICI, VI, CI) using similar to AssemblerPredicate approach. - There were conflicts in IMAGE_<> instructions caused by two different reasons: 1. dmask wasn’t specified for the output (fixed) 2. There are image instructions that differ only by the number of the address components but have the same encoding by the HW spec. The actual number of address components is determined by the HW at runtime using image resource descriptor starting from the VGPR encoded in an IMAGE instruction. This means that we should choose only one instruction from conflicting group to be the rule for decoder. I didn’t find the way to disable decoder generation for an arbitrary instruction and therefore made a onelinear fix to tablegen generator that would suppress decoder generation when DecoderMethod is set to “NONE”. This is a change that should be reviewed and submitted first. Otherwise I would need to specify different DecoderNamespace for every instruction in the conflicting group. I haven’t checked yet if DecoderMethod=“NONE” is not used in other targets. 3. IMAGE_GATHER decoder generation is for now disabled and to be done later. [/Valery] Patch By: Sam Kolton Differential Revision: http://reviews.llvm.org/D16723 llvm-svn: 261185
* [AMDGPU] Rename $dst operand to $vdst for VOP instructions.Tom Stellard2016-02-166-77/+120
| | | | | | | | | | | | | | Summary: This change renames output operand for VOP instructions from dst to vdst. This is needed to enable decoding named operands for disassembler. Reviewers: vpykhtin, tstellarAMD, arsenm Subscribers: arsenm, llvm-commits, nhaustov Projects: #llvm-amdgpu-spb Differential Revision: http://reviews.llvm.org/D16920 llvm-svn: 260986
* AMDGPU: Prepare for reducing private element size.Matt Arsenault2016-02-131-14/+48
| | | | | | | | | | | | Tests for the new scalarize all private access options will be included with a future commit. The only functional change is to make the split/scalarize behavior for private access of > 4 element vectors to be consistent with the flat/global handling. This makes the spilling worse in the two changed tests. llvm-svn: 260804
* AMDGPU/SI: Add llvm.amdgcn.mov.dpp intrinsicTom Stellard2016-02-131-0/+11
| | | | | | | This intrinsic will be used to expose dpp functionality to higher-level languages. It will map to the dpp version of v_mov_b32. llvm-svn: 260792
* AMDGPU: Cleanup includes and random macrosMatt Arsenault2016-02-131-11/+4
| | | | llvm-svn: 260784
* AMDGPU: Add intrinsics for sin/cosMatt Arsenault2016-02-132-1/+18
| | | | | | | These provide direct access to the hardware instruction without the unit version required like llvm.sin/llvm.cos lowering requires. llvm-svn: 260782
* AMDGPU: Rename intrinsic to better match instruction nameMatt Arsenault2016-02-137-9/+9
| | | | | | Also fixes missing f32 test. llvm-svn: 260780
* AMDGPU/SI: Add instruction defs for VOP1 DPP instructionsTom Stellard2016-02-132-0/+107
| | | | | | | | | | Reviewers: nhaustov, cfang, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17159 llvm-svn: 260774
* AMDGPU: Fix broken condition causing warningMatt Arsenault2016-02-131-1/+1
| | | | llvm-svn: 260773
* AMDGPU/SI: Detect uniform branches and emit s_cbranch instructionsTom Stellard2016-02-1215-41/+266
| | | | | | | | | | Reviewers: arsenm Subscribers: mareko, MatzeB, qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16603 llvm-svn: 260765
* [AMDGPU] Assembler: Swap operands of flat_store instructions to match AMD ↵Tom Stellard2016-02-122-3/+3
| | | | | | | | | | | | | | assembler Historically, AMD internal sp3 assembler has flat_store* addr, data format. To match existing code and to enable reuse, change LLVM definitions to match. Also update MC and CodeGen tests. Differential Revision: http://reviews.llvm.org/D16927 Patch by: Nikolay Haustov llvm-svn: 260694
* AMDGPU/SI: Annotate Loops with Constant Condition in SIAnnotateControlFlow pass.Changpeng Fang2016-02-121-4/+10
| | | | | | | | | | | | | | | Summary: It is possible that the loop condition can be a boolean constant (infinite loop, for example). So we sould handle constant condition in annotating a loop. This patch adds this functionality to support annotating constant condition. Reviewers: tstellarAMD, arsenm Subscribers: llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D15093 llvm-svn: 260692
* Fix uninitialized memory read.Benjamin Kramer2016-02-121-2/+2
| | | | | | Found by msan. llvm-svn: 260676
* AMDGPU: Set flat_scratch from flat_scratch_init regMatt Arsenault2016-02-127-56/+102
| | | | | | | | | | | | | | This was hardcoded to the static private size, but this would be missing the offset and additional size for someday when we have dynamic sizing. Also stops always initializing flat_scratch even when unused. In the future we should stop emitting this unless flat instructions are used to access private memory. For example this will initialize it almost always on VI because flat is used for global access. llvm-svn: 260658
* AMDGPU: Set element_size in private resource descriptorMatt Arsenault2016-02-127-2/+56
| | | | | | | | | Introduce a subtarget feature for this, and leave the default with the current behavior which assumes up to 16-byte loads/stores can be used. The field also seems to have the ability to be set to 2 bytes, but I'm not sure what that would be used for. llvm-svn: 260651
* AMDGPU: Fix mishandling alignment when scalarizing vector loads/storesMatt Arsenault2016-02-121-2/+5
| | | | | | | I don't think this was causing any real problems, so I'm not sure how to test for this. llvm-svn: 260646
* AMDGPU: Initialize SILowerControlFlowMatt Arsenault2016-02-123-30/+43
| | | | llvm-svn: 260645
* AMDGPU: Remove trailing whitespaceMatt Arsenault2016-02-121-4/+4
| | | | llvm-svn: 260644
* AMDGPU/SI: Make sure MIMG descriptors and samplers stay in SGPRsTom Stellard2016-02-115-0/+72
| | | | | | | | | | | | | | | | Summary: It's possible to have resource descriptors and samplers stored in VGPRs, either by a VMEM instruction or in the case of samplers, floating-point calculations. When this happens, we need to use v_readfirstlane to copy these values back to sgprs. Reviewers: mareko, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17102 llvm-svn: 260599
* AMDGPU/SI: When splitting SMRD instructions, add its users to VALU worklistTom Stellard2016-02-111-0/+2
| | | | | | | | | | | | | | | | | | | | | | Summary: When we split SMRD instructions into two MUBUFs we were adding the users of the newly created MUBUFs to the VALU worklist. However, the only users these instructions had was the REG_SEQUENCE that was inserted by splitSMRD when the original SMRD instruction was split. We need to make sure to add the users of the original SMRD to the VALU worklist before it is split. I have a test case, but it requires one other bug fix, so it will be added in a later commt. Reviewers: mareko, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17101 llvm-svn: 260588
* [AMDGPU] Fix for "v_div_scale_f64 reg, vcc, ..." parsingTom Stellard2016-02-113-10/+10
| | | | | | | | | | | | | | | | | | Summary: Added support for "VOP3Only" attribute in VOP3bInst encoding. Set VOP3Only=1 for V_DIV_SCALE_F64/32 insns. Added support for multi-dest instructions in AMDGPUAs::cvt*(). Added lit test for "V_DIV_SCALE_F64|F32 vreg,vcc|sreg,vreg,vreg,vreg". Reviewers: tstellarAMD, arsenm Subscribers: arsenm, SamWot, nhaustov, vpykhtin Differential Revision: http://reviews.llvm.org/D16995 Patch By: Artem Tamazov llvm-svn: 260560
* AMDGPU: Fix constant bus use check with subregistersMatt Arsenault2016-02-111-4/+8
| | | | | | | | | | | If the two operands to an instruction were both subregisters of the same super register, it would incorrectly think this counted as the same constant bus use. This fixes the verifier error in fmin_legacy.ll which was missing -verify-machineinstrs. llvm-svn: 260495
* AMDGPU: Fix passes depending on dominator tree for no reasonMatt Arsenault2016-02-112-16/+4
| | | | llvm-svn: 260494
* AMDGPU: Fix not handling new workitem intrinsics in DivergenceAnalysisMatt Arsenault2016-02-111-0/+3
| | | | llvm-svn: 260491
* AMDGPU: Split R600 and SI store loweringMatt Arsenault2016-02-115-90/+89
| | | | | | | These were only sharing some somewhat incorrect logic for when to scalarize or split vectors. llvm-svn: 260490
* [AMDGPU] Assembler: Fix VOP3 only instructionsTom Stellard2016-02-114-92/+145
| | | | | | | | | | | | | | | | | | | | | Separate methods to convert parsed instructions to MCInst: - VOP3 only instructions (always create modifiers as operands in MCInst) - VOP2 instrunctions with modifiers (create modifiers as operands in MCInst when e64 encoding is forced or modifiers are parsed) - VOP2 instructions without modifiers (do not create modifiers as operands in MCInst) - Add VOP3Only flag. Pass HasMods flag to VOP3Common. - Simplify code that deals with modifiers (-1 is now same as 0). This is no longer needed. - Add few tests (more will be added separately). Update error message now correct. Patch By: Nikolay Haustov Differential Revision: http://reviews.llvm.org/D16778 llvm-svn: 260483
* AMDGPU: Release the scavenged offset register during VGPR spillNicolai Haehnle2016-02-101-1/+8
| | | | | | | | | | | | | | | | | | | Summary: This fixes a crash where subsequent spills would be unable to scavenge a register. In particular, it fixes a crash in piglit's spec@glsl-1.50@execution@geometry@max-input-components (the test still has a shader that fails to compile because of too many SGPR spills, but at least it doesn't crash any more). This is a candidate for the release branch. Reviewers: arsenm, tstellarAMD Subscribers: qcolombet, arsenm Differential Revision: http://reviews.llvm.org/D16558 llvm-svn: 260427
OpenPOWER on IntegriCloud