summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU
Commit message (Collapse)AuthorAgeFilesLines
* [AMDGPU][llvm-mc] Predefined symbols to access -mcpu from the assembly ↵Artem Tamazov2016-06-141-2/+15
| | | | | | | | | | | | source (.option.machine_version...) The feature allows for conditional assembly etc. TODO: make those symbols read-only. Test added. Differential Revision: http://reviews.llvm.org/D21238 llvm-svn: 272673
* AMDGPU/SI: Set INDEX_STRIDE for scratch coalescingMarek Olsak2016-06-132-3/+6
| | | | | | | | | | | | | | | | | Summary: Mesa and other users must set this to enable coalescing: - STRIDE = 0 - SWIZZLE_ENABLE = 1 This makes one particular compute shader 8x faster. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D21136 llvm-svn: 272556
* AMDGPU: Fix post-RA verifier errors with trackLivenessAfterRegAllocMatt Arsenault2016-06-131-14/+16
| | | | | | | The condition reg of the cndmask_b64 expansion can't be killed by the first one, and the implicit super register implicit def is needed. llvm-svn: 272554
* Move instances of std::function.Benjamin Kramer2016-06-122-5/+5
| | | | | | Or replace with llvm::function_ref if it's never stored. NFC intended. llvm-svn: 272513
* Pass DebugLoc and SDLoc by const ref.Benjamin Kramer2016-06-1212-138/+109
| | | | | | | | This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512
* AMDGPU/SI: Don't use fixup_si_rodata for scratch rsrc relocationsTom Stellard2016-06-102-1/+7
| | | | | | | | | | | | | | | Summary: We need to set the fixup type to FK_Data_4 for the SCRATCH_RSRC_DWORD[01] symbols, since these require absolute relocations, and fixup_si_rodata is for relative relocations. Reviewers: arsenm, kzhuravl Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21153 llvm-svn: 272417
* [AMDGPU] AsmParser: Support for sext() modifier in SDWA. Some code cleaning ↵Sam Kolton2016-06-105-257/+348
| | | | | | | | | | | | | | | | | | in AMDGPUOperand. Summary: sext() modifier is supported in SDWA instructions only for integer operands. Spec is unclear should integer operands support abs and neg modifiers with sext - for now they are not supported. Renamed InputModsWithNoDefault to FloatInputMods. Added SextInputMods for operands that support sext() modifier. Added AMDGPUOperand::Modifier struct to handle register and immediate modifiers. Code cleaning in AMDGPUOperand class: organize method in groups (render-, predicate-methods...). Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D20968 llvm-svn: 272384
* AMDGPU: Fix trailing whitespaceMatt Arsenault2016-06-1010-40/+40
| | | | llvm-svn: 272364
* AMDGPU: v_cndmask_b32 does not def vccMatt Arsenault2016-06-101-2/+2
| | | | | | Fixes verifier errors after SIShrinkInstructions. llvm-svn: 272351
* AMDGPU/SI: Make sure to emit TargetConstant nodes when matching ds_*permuteTom Stellard2016-06-101-2/+2
| | | | | | | | | | | | | | | Summary: This fixes a bug with ds_*permute instructions where if it was passed a constant address, then the offset operand would get assigned a register operand instead of an immediate. Reviewers: scchan, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19994 llvm-svn: 272349
* AMDGPU/SI: Use common topological sort algorithm in SIScheduleDAGMITom Stellard2016-06-091-64/+3
| | | | | | | | | | Reviewers: arsenm, axeldavy Subscribers: MatzeB, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19823 llvm-svn: 272346
* AMDGPU: Fix flat atomicsMatt Arsenault2016-06-094-41/+90
| | | | | | | | The flat atomics could already be selected, but only when using flat instructions for global memory. Add patterns for flat addresses. llvm-svn: 272345
* AMDGPU: Fix i64 global cmpxchgMatt Arsenault2016-06-094-40/+82
| | | | | | | | | | This was using extract_subreg sub0 to extract the low register of the result instead of sub0_sub1, producing an invalid copy. There doesn't seem to be a way to use the compound subreg indices in tablegen since those are generated, so manually select it. llvm-svn: 272344
* AMDGPU: Run verifer after insert waits passMatt Arsenault2016-06-091-1/+1
| | | | llvm-svn: 272338
* AMDGPU: Remove incorrect assertionMatt Arsenault2016-06-091-4/+0
| | | | | | | I'm still not sure under what circumstances the offset here is non-0, but private memory is not limited to 27-bits. llvm-svn: 272337
* AMDGPU: Properly initialize SIShrinkInstructionsMatt Arsenault2016-06-093-8/+6
| | | | llvm-svn: 272336
* AMDGPU/SI: Fix 32-bit fdiv loweringWei Ding2016-06-091-16/+53
| | | | | | | | | We were using the fast fdiv lowering for all division, implementation of IEEE754 fdiv is added. http://reviews.llvm.org/D20557 llvm-svn: 272292
* [AMDGPU] Disassembler: Support for sdwa instructionsSam Kolton2016-06-091-1/+5
| | | | | | | | | | Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D21129 llvm-svn: 272255
* AMDGPU: Add amdgpu-ps-wqm-outputs function attributesNicolai Haehnle2016-06-071-2/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The presence of this attribute indicates that VGPR outputs should be computed in whole quad mode. This will be used by Mesa for prolog pixel shaders, so that derivatives can be taken of shader inputs computed by the prolog, fixing a bug. The generated code could certainly be improved: if a prolog pixel shader is used (which isn't common in modern OpenGL - they're used for gl_Color, polygon stipples, and forcing per-sample interpolation), Mesa will use this attribute unconditionally, because it has to be conservative. So WQM may be used in the prolog when it isn't really needed, and furthermore a silly back-and-forth switch is likely to happen at the boundary between prolog and main shader parts. Fixing this is a bit involved: we'd first have to add a mechanism by which LLVM writes the WQM-related input requirements to the main shader part binary, and then Mesa specializes the prolog part accordingly. At that point, we may as well just compile a monolithic shader... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130 Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D20839 llvm-svn: 272063
* Revert "Differential Revision: http://reviews.llvm.org/D20557"Eric Christopher2016-06-071-55/+17
| | | | | | | | | | | | | | | | Author: Wei Ding <wei.ding2@amd.com> Date: Tue Jun 7 19:04:44 2016 +0000 Differential Revision: http://reviews.llvm.org/D20557 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272044 91177308-0d34-0410-b5e6-96231b3b80d8 as it was breaking the bots. This reverts commit r272044. llvm-svn: 272056
* Differential Revision: http://reviews.llvm.org/D20557Wei Ding2016-06-071-17/+55
| | | | llvm-svn: 272044
* AMDGPU: Add function for getting instruction sizeMatt Arsenault2016-06-062-0/+51
| | | | llvm-svn: 271936
* AMDGPU: Fix constantexpr addrspacecastsMatt Arsenault2016-06-062-4/+71
| | | | | | | | If we had a constant group address space cast the queue pointer wasn't enabled for the function, resulting in a crash on noreg later. llvm-svn: 271935
* [AMDGPU][llvm-mc] v_cndmask_b32: src2 is mandatory; do not enforce VOP2 when ↵Artem Tamazov2016-06-062-14/+76
| | | | | | | | | | | | src2 == VCC. Another step for unification llvm assembler/disassembler with sp3. Besides, CodeGen output is a bit improved, thus changes in CodeGen tests. Assembler/Disassembler tests updated/added. Differential Revision: http://reviews.llvm.org/D20796 llvm-svn: 271900
* [test/AMDGPU] Square-braced-syntax for registers: add macro test/example.Artem Tamazov2016-06-031-19/+45
| | | | | | | | | | Test added as per discussion in http://reviews.llvm.org/D20588. The macro is just a demonstration, useless in practice. Coding style fixes. Differential Revision: http://reviews.llvm.org/D20797 llvm-svn: 271675
* [AMDGPU] Assembler: More tests for SDWA instructions. Fix for SDWA float ↵Sam Kolton2016-06-032-16/+23
| | | | | | | | | | | | | | modifiers. Summary: Depends on D20625 Reviewers: tstellarAMD, vpykhtin, artem.tamazov Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D20674 llvm-svn: 271662
* [AMDGPU] Assembler: Custom converters for SDWA instructions. Support for ↵Sam Kolton2016-06-032-62/+162
| | | | | | | | | | | | | | | | _dpp and _sdwa suffixes in mnemonics. Summary: Added custom converters for SDWA instruction to support optional operands and modifiers. Support for _dpp and _sdwa suffixes that allows to force DPP or SDWA encoding for instructions. Reviewers: tstellarAMD, vpykhtin, artem.tamazov Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D20625 llvm-svn: 271655
* AMDGPU: Handle flat in getMemOpBaseRegImmOfsMatt Arsenault2016-06-021-0/+7
| | | | | | | It can still report the base register, and the uses give up when it fails. llvm-svn: 271575
* AMDGPU: Cleanup load testsMatt Arsenault2016-06-021-0/+14
| | | | | | | | | There are a lot of different kinds of loads to test for, and these were scattered around inconsistently with some redundancy. Try to comprehensively test all loads in a consistent way. llvm-svn: 271571
* AMDGPU: Temporary fix for broken store combineMatt Arsenault2016-06-021-0/+2
| | | | llvm-svn: 271567
* AMDGPU: Fix crashes on unknown processor nameMatt Arsenault2016-06-024-7/+9
| | | | | | | | | | | | | | If the processor name failed to parse for amdgcn, the resulting output would have R600 ISA in it. If the processor name was missing or invalid for R600, the wavefront size would not be set and there would be crashes from missing itinerary data. Fixes crashes in future commit caused by dividing by the unset/0 wavefront size. llvm-svn: 271561
* AMDGPU: Fix incorrectly setting kill flag when copying register tuplesMatt Arsenault2016-06-021-1/+1
| | | | | | | This fixes some verifier errors when trackLivenessAfterRegAlloc is enabled. llvm-svn: 271446
* AMDGPU: SIDebuggerInsertNops preserves CFGMatt Arsenault2016-06-022-0/+6
| | | | | | | This saves an additional run of the DominatorTree and MachineLoopInfo llvm-svn: 271444
* AMDGPU: Remove unused address spaceMatt Arsenault2016-05-312-12/+10
| | | | | | Also return a single StringRef instead of building a string. llvm-svn: 271296
* AMDGPU: Fix trailing whitespaceMatt Arsenault2016-05-281-5/+5
| | | | llvm-svn: 271081
* AMDGPU: Add fract intrinsicMatt Arsenault2016-05-283-36/+21
| | | | | | | | | Remove broken patterns matching it. This was matching the unsafe math pattern and expanding the fix for the buggy instruction from the pattern. The problems are also on CI. Remove the workarounds and only use fract with unsafe math or from the intrinsic. llvm-svn: 271078
* [AMDGPU][llvm-mc] Square-braced-syntax for registers - make ":expr2" optional.Artem Tamazov2016-05-271-6/+10
| | | | | | | | | | | | | | | | | Register numbers may be specified as assembly-time expressions. This feature can be useful in macros and alike. However, expressions are supported within sqare braces only. Sqare braces were initially intended to support specifying of multiple (pairs/quads...) registers. Syntax like v[8:8] which specifies single register is also supported. That allows expressions but looks a bit unnatural. This change supports syntax REG[EXPR]. Tests added. Differential Revision: http://reviews.llvm.org/D20588 llvm-svn: 270990
* Avoid some copies by using const references.Benjamin Kramer2016-05-271-3/+1
| | | | | | | clang-tidy's performance-unnecessary-copy-initialization with some manual fixes. No functional changes intended. llvm-svn: 270988
* Apply clang-tidy's misc-static-assert where it makes sense.Benjamin Kramer2016-05-271-12/+6
| | | | | | | Also fold conditions into assert(0) where it makes sense. No functional change intended. llvm-svn: 270982
* AMDGPU/SI: Enable load-store-opt by default.Changpeng Fang2016-05-261-1/+1
| | | | | | | | | | Summary: Enable load-store-opt by default, and update LIT tests. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D20694 llvm-svn: 270894
* [AMDGPU][llvm-mc] s_getreg/setreg* - hwreg - factor out strings/literals etc.Artem Tamazov2016-05-269-159/+196
| | | | | | | | | | | Hwreg(...) syntax implementation unified with sendmsg(...). Common strings moved to Utils MathExtras.h functionality utilized. Added missing build dependency in Disassembler. Differential Revision: http://reviews.llvm.org/D20381 llvm-svn: 270871
* Fix build warning introduced in r270552 "[AMDGPU][llvm-mc] Disassembler: ↵Artem Tamazov2016-05-261-1/+2
| | | | | | support for TTMP/TBA/TMA registers." llvm-svn: 270859
* [AMDGPU] Remove exit-on-error flag from test (PR27762)Diana Picus2016-05-261-1/+1
| | | | | | | | | | Similar to r269948, but for argument lowering. Fixes PR27762 Differential Revision: http://reviews.llvm.org/D20430 llvm-svn: 270856
* AMDGPU: Fix v2i64/v2f64 bitcastsMatt Arsenault2016-05-251-0/+2
| | | | | | | These operations tend to get promoted away to v4i32 so this doesn't happen often. llvm-svn: 270740
* AMDGPU: Fix inconsistent lowering of select of vectorsMatt Arsenault2016-05-251-1/+9
| | | | | | | | | f32 vectors would use a sequence of BFI instructions instead of unrolled cmp + select. This was better in the case of a VALU select with SGPR inputs, but we don't have a way of dealing with that in the DAG. llvm-svn: 270731
* Soften assertion in AMDGPU emitPrologue.Nirav Dave2016-05-251-2/+3
| | | | | | | | | | | | | | [AMDGPU] emitPrologue looks for an unused unallocated SGPR that is not the scratch descriptor. Continue search if unused register found fails other requirements. Reviewers: arsenm, tstellarAMD, nhaehnle Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D20526 llvm-svn: 270646
* [AMDGPU][NFC] Rename ReserveTrapVGPRs -> ReserveRegsKonstantin Zhuravlyov2016-05-247-23/+25
| | | | | | Differential Revision: http://reviews.llvm.org/D20081 llvm-svn: 270594
* [AMDGPU] Assembler: rework parsing of optional operands.Sam Kolton2016-05-242-282/+86
| | | | | | | | | | | | | | | Summary: Change process of parsing of optional operands. All optional operands use same parsing method - parseOptionalOperand(). No default values are added to OperandsVector. Get rid of WORKAROUND_USE_DUMMY_OPERANDS_INSTEAD_MUTIPLE_DEFAULT_OPERANDS. Reviewers: tstellarAMD, vpykhtin, artem.tamazov, nhaustov Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D20527 llvm-svn: 270556
* [AMDGPU][llvm-mc] Disassembler: support for TTMP/TBA/TMA registers.Artem Tamazov2016-05-243-44/+116
| | | | | | Differential Revision: http://reviews.llvm.org/D20476 llvm-svn: 270552
* [AMDGPU] Assembler: refactor parsing of modifiers and immediates. Allow ↵Sam Kolton2016-05-232-175/+245
| | | | | | | | | | | | modifiers for imms. Reviewers: nhaustov, tstellarAMD Subscribers: kzhuravl, arsenm Differential Revision: http://reviews.llvm.org/D20166 llvm-svn: 270415
OpenPOWER on IntegriCloud