summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AMDGPU/v_mac.ll
Commit message (Collapse)AuthorAgeFilesLines
* [AMDGPU] Support for v3i32/v3f32Tim Renouf2019-03-211-1/+1
| | | | | | | | | | | | | | | Added support for dwordx3 for most load/store types, but not DS, and not intrinsics yet. SI (gfx6) does not have dwordx3 instructions, so they are not enabled there. Some of this patch is from Matt Arsenault, also of AMD. Differential Revision: https://reviews.llvm.org/D58902 Change-Id: I913ef54f1433a7149da8d72f4af54dbb13436bd9 llvm-svn: 356659
* AMDGPU: Allow SIShrinkInstructions to work in non-SSAMatt Arsenault2017-07-101-2/+2
| | | | | | | | Immediates can be folded as long as the immediate is a vreg. Also undo commuting instructions if it didn't fold an immediate. llvm-svn: 307575
* [AMDGPU] Switch scalarize global loads ON by defaultAlexander Timofeev2017-07-041-3/+3
| | | | | | Differential revision: https://reviews.llvm.org/D34407 llvm-svn: 307097
* Revert r307026, "[AMDGPU] Switch scalarize global loads ON by default"NAKAMURA Takumi2017-07-041-3/+3
| | | | | | | | | It broke a testcase. Failing Tests (1): LLVM :: CodeGen/AMDGPU/alignbit-pat.ll llvm-svn: 307054
* [AMDGPU] Switch scalarize global loads ON by defaultAlexander Timofeev2017-07-031-3/+3
| | | | | | Differential revision: https://reviews.llvm.org/D34407 llvm-svn: 307026
* AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernelMatt Arsenault2017-03-211-13/+13
| | | | | | | | | | | | Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444
* DAG: Check no signed zeros instead of unsafe math attributeMatt Arsenault2017-03-091-6/+6
| | | | llvm-svn: 297354
* DAG: Constant fold fp16_to_fp/fp16_to_fpMatt Arsenault2017-01-301-2/+6
| | | | | | | This fixes emitting conversions of constants on targets without legal f16 that need to use these for legalization. llvm-svn: 293499
* Enable FeatureFlatForGlobal on Volcanic IslandsMatt Arsenault2017-01-241-2/+2
| | | | | | | | | | | This switches to the workaround that HSA defaults to for the mesa path. This should be applied to the 4.0 branch. Patch by Vedran Miletić <vedran@miletic.net> llvm-svn: 292982
* AMDGPU: Combine fp16/fp64 subtarget featuresMatt Arsenault2017-01-231-3/+4
| | | | | | | The same control register controls both, and are set to the same defaults. Keep the old names around as aliases. llvm-svn: 292837
* AMDGPU: Fix folding immediates into mac src2Matt Arsenault2017-01-111-0/+66
| | | | | | | Whether it is legal or not needs to check for the instruction it will be replaced with. llvm-svn: 291711
* AMDGPU: Remove superfluous string attributes from testsMatt Arsenault2016-07-111-7/+68
| | | | | | Also fix v_mac.ll not testing right thing for fneg llvm-svn: 275129
* AMDGPU/SI: Assembler: Unify parsing/printing of operands.Nikolay Haustov2016-04-291-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The goal is for each operand type to have its own parse function and at the same time share common code for tracking state as different instruction types share operand types (e.g. glc/glc_flat, etc). Introduce parseAMDGPUOperand which can parse any optional operand. DPP and Clamp/OMod have custom handling for now. Sam also suggested to have class hierarchy for operand types instead of table. This can be done in separate change. Remove parseVOP3OptionalOps, parseDS*OptionalOps, parseFlatOptionalOps, parseMubufOptionalOps, parseDPPOptionalOps. Reduce number of definitions of AsmOperand's and MatchClasses' by using common base class. Rename AsmMatcher/InstPrinter methods accordingly. Print immediate type when printing parsed immediate operand. Use 'off' if offset/index register is unused instead of skipping it to make it more readable (also agreed with SP3). Update tests. Reviewers: tstellarAMD, SamWot, artem.tamazov Subscribers: qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19584 llvm-svn: 268015
* AMDGPU: Run SIFoldOperands after PeepholeOptimizerMatt Arsenault2016-04-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PeepholeOptimizer cleans up redundant copies, which makes the operand folding more effective. shader-db stats: Totals: SGPRS: 34200 -> 34336 (0.40 %) VGPRS: 22118 -> 21655 (-2.09 %) Code Size: 632144 -> 633460 (0.21 %) bytes LDS: 11 -> 11 (0.00 %) blocks Scratch: 10240 -> 11264 (10.00 %) bytes per wave Max Waves: 8822 -> 8918 (1.09 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 7704 -> 7840 (1.77 %) VGPRS: 5169 -> 4706 (-8.96 %) Code Size: 234444 -> 235760 (0.56 %) bytes LDS: 2 -> 2 (0.00 %) blocks Scratch: 0 -> 1024 (0.00 %) bytes per wave Max Waves: 1188 -> 1284 (8.08 %) Wait states: 0 -> 0 (0.00 %) Increases: SGPRS: 35 (0.01 %) VGPRS: 1 (0.00 %) Code Size: 59 (0.02 %) LDS: 0 (0.00 %) Scratch: 1 (0.00 %) Max Waves: 48 (0.02 %) Wait states: 0 (0.00 %) Decreases: SGPRS: 26 (0.01 %) VGPRS: 54 (0.02 %) Code Size: 68 (0.03 %) LDS: 0 (0.00 %) Scratch: 0 (0.00 %) Max Waves: 4 (0.00 %) Wait states: 0 (0.00 %) llvm-svn: 266378
* AMDGPU: Add volatile to test loads and storesMatt Arsenault2016-04-121-8/+8
| | | | | | | | When the memory vectorizer is enabled, these tests break. These tests don't really care about the memory instructions, and it's easier to write check lines with the unmerged loads. llvm-svn: 266071
* AMDGPU/SI: Select mad patterns to v_mac_f32Tom Stellard2015-07-131-0/+155
The two-address instruction pass will convert these back to v_mad_f32 if necessary. Differential Revision: http://reviews.llvm.org/D11060 llvm-svn: 242038
OpenPOWER on IntegriCloud