path: root/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
Commit message | Author | Age | Files | Lines
...
* AMDGPU/SI: Move some ISel helpers into utils so they can be shared with GISelTom Stellard2017-01-271-11/+1
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D29068 llvm-svn: 293321
* AMDGPU add support for spilling to a user sgpr pointed buffersTom Stellard2017-01-251-4/+14
| | | | | | | | | | | | | | | | | Summary: This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1]. Patch By: Dave Airlie Reviewers: nhaehnle, arsenm, tstellarAMD Reviewed By: arsenm Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D25428 llvm-svn: 293000
* AMDGPU : Add trap handler support.Wei Ding2017-01-241-19/+24
| | | | llvm-svn: 292893
* AMDGPU: Custom lower more vector operationsMatt Arsenault2017-01-231-0/+100
| | | | | | This avoids stack usage. llvm-svn: 292846
* AMDGPU: Remove unnecessary checkMatt Arsenault2017-01-231-3/+0
| | | | | | There are no scalar FP types that can be extended. llvm-svn: 292816
* [AMDGPU] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).Eugene Zelenko2017-01-211-38/+63
| | | | | | llvm-svn: 292688
* AMDGPU: Add replacement export intrinsicsMatt Arsenault2017-01-171-9/+58
| | | | llvm-svn: 292205
* Apply clang-tidy's performance-unnecessary-value-param to LLVM.Benjamin Kramer2017-01-131-5/+8
| | | | | | | With some minor manual fixes for using function_ref instead of std::function. No functional change intended. llvm-svn: 291904
* [CodeGen] Rename MachineInstrBuilder::addOperand. NFCDiana Picus2017-01-131-28/+27
| | | | | | | | | | | Rename from addOperand to just add, to match the other method that has been added to MachineInstrBuilder for adding more than just 1 operand. See https://reviews.llvm.org/D28057 for the whole discussion. Differential Revision: https://reviews.llvm.org/D28556 llvm-svn: 291891
* AMDGPU: Add Assert[SZ]Ext during argument load creationMatt Arsenault2017-01-091-12/+15
| | | | | | | | | | | For i16 zeroext arguments when i16 was a legal type, the known bits information from the truncate was lost. Insert a zeroext so the known bits optimizations work with the 32-bit loads. Fixes code quality regressions vs. SI in min.ll test. llvm-svn: 291461
* AMDGPU/R600: Don't use REGISTER_{LOAD,STORE} ISD nodesJan Vesely2017-01-061-0/+12
| | | | | | | | This will make transition to SCRATCH_MEMORY easier Differential Revision: https://reviews.llvm.org/D24746 llvm-svn: 291279
* AMDGPU/SI: Implement sendmsghalt intrinsicJan Vesely2017-01-041-1/+8
| | | | | | | | v2: expose using amdgcn prefix Differential Revision: https://reviews.llvm.org/D23511 llvm-svn: 290977
* AMDGPU: Use i16 for i16 shift amountMatt Arsenault2016-12-221-2/+4
| | | | llvm-svn: 290351
* AMDGPU: Use i16 comparison instructionsMatt Arsenault2016-12-221-3/+1
| | | | llvm-svn: 290348
* AMDGPU: Swap order of operands in fadd/fsub combineMatt Arsenault2016-12-221-4/+4
| | | | | | | FMA is canonicalized to constant in the middle operand. Do the same so fmad matches and avoid an extra combine step. llvm-svn: 290313
* AMDGPU: Check fast math flags in fadd/fsub combinesMatt Arsenault2016-12-221-6/+13
| | | | llvm-svn: 290312
* AMDGPU: Form more FMAs if fusion is allowedMatt Arsenault2016-12-221-30/+45
| | | | | | | Extend the existing fadd/fsub->fmad combines to produce FMA if allowed. llvm-svn: 290311
* AMDGPU: Move combines into separate functionsMatt Arsenault2016-12-221-152/+169
| | | | llvm-svn: 290309
* AMDGPU: Enable some f32 fadd/fsub combines for f16Matt Arsenault2016-12-221-7/+12
| | | | llvm-svn: 290308
* AMDGPU: Implement isFMAFasterThanFMulAndFAdd for f16Matt Arsenault2016-12-221-0/+2
| | | | llvm-svn: 290307
* AMDGPU: Allow rcp and rsq usage with f16Matt Arsenault2016-12-221-3/+8
| | | | llvm-svn: 290302
* AMDGPU: Custom lower f16 fdivMatt Arsenault2016-12-221-1/+21
| | | | llvm-svn: 290301
* AMDGPU: Implement f16 fcanonicalizeMatt Arsenault2016-12-221-0/+3
| | | | llvm-svn: 290300
* AMDGPU: Allow 16-bit types in inline asm constraintsMatt Arsenault2016-12-201-0/+2
| | | | llvm-svn: 290193
* AMDGPU/SI: Add a MachineMemOperand when lowering llvm.amdgcn.buffer.load.*Tom Stellard2016-12-201-0/+26
| | | | | | | | | | Reviewers: arsenm, nhaehnle, mareko Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D27834 llvm-svn: 290184
* AMDGPU/SI: Add a MachineMemOperand to MIMG instructionsTom Stellard2016-12-201-6/+24
| | | | | | | | | | | | | | | Summary: Without a MachineMemOperand, the scheduler was assuming MIMG instructions were ordered memory references, so no loads or stores could be reordered across them. Reviewers: arsenm Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27536 llvm-svn: 290179
* AMDGPU: Select branch on undef to uniform scc branchMatt Arsenault2016-12-151-0/+9
| | | | llvm-svn: 289877
* Fix for regression after Global Load Scalarization patchAlexander Timofeev2016-12-151-1/+2
| | | | llvm-svn: 289822
* AMDGPU: Fix isTypeDesirableForOp for i16Matt Arsenault2016-12-091-4/+16
| | | | | | This should do nothing for targets without i16. llvm-svn: 289235
* AMDGPU: Make f16 ConstantFP legalMatt Arsenault2016-12-081-13/+1
| | | | | | | | | | | | | | Not having this legal led to combine failures, resulting in dumb things like bitcasts of constants not being folded away. The only reason I'm leaving the v_mov_b32 hack that f32 already uses is to avoid madak formation test regressions. PeepholeOptimizer has an ordering issue where the immediate fold attempt is into the sgpr->vgpr copy instead of the actual use. Running it twice avoids that problem. llvm-svn: 289096
* [AMDGPU] Scalarization of global uniform loads.Alexander Timofeev2016-12-081-2/+17
| | | | | | | | | | | | | | | | | | Summary: LC can currently select a scalar load for a uniform memory access based only on the read-only memory address space. This restriction originated from the fact that in HW prior to VI, the vector and scalar caches are not coherent. With MemoryDependenceAnalysis we can check that the memory location corresponding to the memory operand of the LOAD is not clobbered along any path from the function entry. Reviewers: rampitec, tstellarAMD, arsenm Subscribers: wdng, arsenm, nhaehnle Differential Revision: https://reviews.llvm.org/D26917 llvm-svn: 289076
* AMDGPU : Add S_SETREG instructions to fix fdiv precision issues.Tom Stellard2016-12-071-11/+101
| | | | | | | | | | | | | | Patch By: Wei Ding Summary: This patch fixes the fdiv precision issues. Reviewers: b-sumner, cfang, wdng, arsenm Subscribers: kzhuravl, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D26424 llvm-svn: 288879
* AMDGPU: Add llvm.amdgcn.interp.mov intrinsicTom Stellard2016-12-061-0/+6
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D26725 llvm-svn: 288865
* AMDGPU: Refactor exp instructionsMatt Arsenault2016-12-051-0/+23
| | | | | | | | | | | | | Structure the definitions a bit more like the other classes. The main change here is to split EXP with the done bit set to a separate opcode, so we can set mayLoad = 1 so that it won't be reordered before the other exp stores, since this has the special constraint that if the done bit is set then this should be the last exp in the shader. Previously all exp instructions were inferred to have unmodeled side effects. llvm-svn: 288695
* AMDGPU: Implement isCheapAddrSpaceCastMatt Arsenault2016-12-021-2/+12
| | | | llvm-svn: 288523
* AMDGPU: Use SGPR_64 for argument loweringsMatt Arsenault2016-11-291-7/+7
| | | | llvm-svn: 288190
* AMDGPU/SI: Use float as the operand type for amdgcn.interp intrinsicsTom Stellard2016-11-261-0/+2
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D26724 llvm-svn: 287962
* AMDGPU/SI: Add back reverted SGPR spilling code, but disable itMarek Olsak2016-11-251-11/+11
| | | | | | suggested as a better solution by Matt llvm-svn: 287942
* Revert "AMDGPU: Make m0 unallocatable"Marek Olsak2016-11-251-11/+11
| | | | | | This reverts commit 124ad83dae04514f943902446520c859adee0e96. llvm-svn: 287932
* AMDGPU: Make m0 unallocatableMatt Arsenault2016-11-241-11/+11
| | | | | | | | | | | m0 may need to be written for spill code, so we don't want general code uses relying on the value stored in it. This introduces a few code quality regressions where copies from m0 are not coalesced into copies of a copy of m0. llvm-svn: 287841
* AMDGPU: Fix unused variable warningMatt Arsenault2016-11-181-5/+4
| | | | llvm-svn: 287362
* AMDGPU: Fix crash on illegal type for inlineasmMatt Arsenault2016-11-181-0/+2
| | | | | | | There are still crashes on non-MVT types in other places. llvm-svn: 287310
* [AMDGPU] Custom lower f16 = fp_round f64Konstantin Zhuravlyov2016-11-171-0/+20
| | | | llvm-svn: 287203
* [AMDGPU] Promote f16/i16 conversions to f32/i32Konstantin Zhuravlyov2016-11-171-52/+8
| | | | llvm-svn: 287201
* [AMDGPU] Expand `br_cc` for f16Konstantin Zhuravlyov2016-11-171-0/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D26732 llvm-svn: 287199
* [AMDGPU] Handle f16 select{_cc}Konstantin Zhuravlyov2016-11-161-0/+1
| | | | | | | | | | - Select `select` to `v_cndmask_b32` - Expand `select_cc` - Refactor patterns Differential Revision: https://reviews.llvm.org/D26714 llvm-svn: 287074
* AMDGPU/SI: Support data types other than V4f32 in image intrinsicsChangpeng Fang2016-11-141-2/+5
| | | | | | | | | | | | | | | | Summary: Extend image intrinsics to support data types of V1F32 and V2F32. TODO: we should define a mapping table to change the opcode for data type of V2F32 but just one channel is active, even though such case should be very rare. Reviewers: tstellarAMD Differential Revision: http://reviews.llvm.org/D26472 llvm-svn: 286860
* [AMDGPU] Add f16 support (VI+)Konstantin Zhuravlyov2016-11-131-13/+103
| | | | | | Differential Revision: https://reviews.llvm.org/D25975 llvm-svn: 286753
* AMDGPU/SI: Promote i16 = fp_[us]int f32 for VITom Stellard2016-11-121-0/+6
| | | | | | | | | | | | Summary: This fixes a regression caused by r286464. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D26570 llvm-svn: 286687
* AMDGPU: Add VI i16 supportTom Stellard2016-11-101-4/+72
| | | | | | | | Patch By: Wei Ding Differential Revision: https://reviews.llvm.org/D18049 llvm-svn: 286464