path: root/llvm/test/CodeGen/AMDGPU/shift-i128.ll
* [MachineScheduler] Reduce reordering due to mem op clustering (Jay Foad, 2020-01-14, 1 file changed, -3/+3 lines)

Summary: Mem op clustering adds a weak edge in the DAG between two loads or stores that should be clustered, but the direction of this edge is pretty arbitrary (it depends on the sort order of MemOpInfo, which represents the operands of a load or store). This often means that two loads or stores will get reordered even if they would naturally have been scheduled together anyway, which leads to test case churn and goes against the scheduler's "do no harm" philosophy. The fix makes sure that the direction of the edge always matches the original code order of the instructions.

Reviewers: atrick, MatzeB, arsenm, rampitec, t.p.northover

Subscribers: jvesely, wdng, nhaehnle, kristof.beyls, hiraditya, javed.absar, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72706
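As an illustration of the kind of code affected (a hypothetical snippet, not taken from the commit or from shift-i128.ll), two loads from consecutive offsets of the same base pointer are a typical clustering candidate:

    ; The scheduler adds a weak cluster edge between these two loads so they
    ; are issued back to back; with this fix the edge follows their original
    ; order (%lo before %hi) rather than an order derived from the MemOpInfo sort.
    define i64 @cluster_candidate(i32 addrspace(1)* %p) {
      %p.hi = getelementptr inbounds i32, i32 addrspace(1)* %p, i64 1
      %lo = load i32, i32 addrspace(1)* %p
      %hi = load i32, i32 addrspace(1)* %p.hi
      %lo.ext = zext i32 %lo to i64
      %hi.ext = zext i32 %hi to i64
      %hi.shifted = shl i64 %hi.ext, 32
      %r = or i64 %hi.shifted, %lo.ext
      ret i64 %r
    }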
* AMDGPU: Decompose all values to 32-bit pieces for calling conventions (Matt Arsenault, 2019-07-19, 1 file changed, -99/+102 lines)

This is the more natural lowering, and presents more opportunities to reduce 64-bit ops to 32-bit. This should also help avoid issues graphics shaders have had with 64-bit values, and simplify argument lowering in globalisel.

llvm-svn: 366578
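For illustration only (the function below is hypothetical, not part of shift-i128.ll): under this lowering, wide integer arguments are broken into 32-bit pieces before registers are assigned, so an i64 argument occupies two pieces and an i128 argument four.

    ; Hypothetical signature: the calling-convention lowering now splits %a
    ; into two 32-bit pieces and %b into four, rather than treating them as
    ; single wide values.
    define i128 @wide_args(i64 %a, i128 %b) {
      %a.ext = zext i64 %a to i128
      %sum = add i128 %a.ext, %b
      ret i128 %sum
    }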
* AMDGPU: Use getTargetConstant (Matt Arsenault, 2019-07-17, 1 file changed, -4/+4 lines)

Avoids creating an extra intermediate mov.

llvm-svn: 366340
* AMDGPU: Make s34 the FP register (Matt Arsenault, 2019-07-08, 1 file changed, -278/+62 lines)

Make the FP register callee saved. This is tricky because now the FP needs to be spilled in the prolog relative to the incoming SP register, rather than the frame register used throughout the rest of the function. I don't like how this bypasses the standard mechanism for CSR spills just to get the correct insert point. I may look for a better solution, since all CSR VGPRs may also need to have all lanes activated. Another option might be to make getFrameIndexReference change the base register if the frame index is a CSR, and then try to figure out the right insertion point in emitProlog.

If there is a free VGPR lane available for SGPR spilling, try to use it for the FP. If that would require introducing a new VGPR spill, try to use a free call-clobbered SGPR. Only fall back to introducing a new VGPR spill as a last resort.

This also doesn't attempt to handle SGPR spilling with scalar stores.

llvm-svn: 365372
* AMDGPU: Don't use the default cpu in a few tests (Matt Arsenault, 2019-04-03, 1 file changed, -796/+628 lines)

Avoids unnecessary test changes in a future commit.

llvm-svn: 357539
* [AMDGPU] Divergence driven instruction selection. Part 1. (Alexander Timofeev, 2018-09-21, 1 file changed, -50/+47 lines)

Summary: This change is the first part of the AMDGPU target description change. Its aim is to split the vector and scalar flows effectively at the selection stage. Selection uses predicate functions based on the framework implemented earlier: https://reviews.llvm.org/D35267

Differential revision: https://reviews.llvm.org/D52019

Reviewers: rampitec

llvm-svn: 342719
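A sketch of the uniform/divergent split this commit is about, using a hypothetical kernel (which scalar s_* or vector v_* instructions are actually chosen is an assumption): a shift amount coming from a kernel argument is uniform across the wave, while one derived from the workitem id differs per lane and is divergent.

    declare i32 @llvm.amdgcn.workitem.id.x()

    define amdgpu_kernel void @uniform_vs_divergent(i64 addrspace(1)* %out, i64 %v, i32 %amt) {
      ; %amt is a kernel argument, identical in every lane, so this shift is
      ; uniform and a candidate for scalar (s_*) selection.
      %amt.ext = zext i32 %amt to i64
      %uniform = shl i64 %v, %amt.ext

      ; The workitem id differs per lane, so this shift is divergent and has
      ; to be selected to vector (v_*) instructions.
      %tid = call i32 @llvm.amdgcn.workitem.id.x()
      %tid.ext = zext i32 %tid to i64
      %divergent = shl i64 %v, %tid.ext

      %r = xor i64 %uniform, %divergent
      store i64 %r, i64 addrspace(1)* %out
      ret void
    }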
* AMDGPU: Fix shifts for i128 (Matt Arsenault, 2018-08-08, 1 file changed, -0/+1047 lines)
llvm-svn: 339270
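This is the commit that added shift-i128.ll. The kind of function such a test exercises looks roughly like the sketch below (illustrative only, not the actual test contents); a 128-bit shift is wider than the target's shift instructions, so legalization expands each one into a sequence of narrower shifts, selects, and or operations.

    define i128 @shl_i128(i128 %v, i128 %amt) {
      %r = shl i128 %v, %amt
      ret i128 %r
    }

    define i128 @ashr_i128(i128 %v, i128 %amt) {
      %r = ashr i128 %v, %amt
      ret i128 %r
    }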