summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU: Fix folding immediate into readfirstlane through reg_sequenceMatt Arsenault2019-06-192-2/+6
| | | | | | | | | | | | | The def instruction for the vreg may not match, because it may be folding through a reg_sequence. The assert was overly conservative and not necessary. It's not actually important if DefMI really defined the register, because the fold that will be done cares about the def of the value that will be folded. For some reason copies aren't making it through the reg_sequence, although they should. llvm-svn: 363876
* Reapply "AMDGPU: Add ds_gws_init / ds_gws_barrier intrinsics"Matt Arsenault2019-06-196-21/+155
| | | | | | | | This reapplies r363678, using the correct chain for the CopyToReg for v0. glueCopyToM0 counterintuitively changes the operands of the original node. llvm-svn: 363870
* Revert rL363678 : AMDGPU: Add ds_gws_init / ds_gws_barrier intrinsicsSimon Pilgrim2019-06-196-156/+21
| | | | | | | | | There may or may not be additional work to handle this correctly on SI/CI. ........ Breaks EXPENSIVE_CHECKS buildbots - http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/78/ llvm-svn: 363797
* Rename ExpandISelPseudo->FinalizeISel, delay register reservationMatt Arsenault2019-06-191-1/+2
| | | | | | | | | | | This allows targets to make more decisions about reserved registers after isel. For example, now it should be certain there are calls or stack objects in the frame or not, which could have been introduced by legalization. Patch by Matthias Braun llvm-svn: 363757
* [AMDGPU] gfx10 wave32 patternsStanislav Mekhanoshin2019-06-183-15/+86
| | | | | | Differential Revision: https://reviews.llvm.org/D63511 llvm-svn: 363729
* [AMDGPU] gfx1010 disassembler changes for wave32Stanislav Mekhanoshin2019-06-182-3/+13
| | | | | | Differential Revision: https://reviews.llvm.org/D63506 llvm-svn: 363721
* AMDGPU: Add ds_gws_init / ds_gws_barrier intrinsicsMatt Arsenault2019-06-186-21/+156
| | | | | | | There may or may not be additional work to handle this correctly on SI/CI. llvm-svn: 363678
* AMDGPU: Change API for checking for exec modificationMatt Arsenault2019-06-184-27/+55
| | | | | | | | | | | | | | | | | | Invert the name and return value to better reflect the imprecise nature. Force passing in the DefMI, since it's known in the 2 users and could possibly fail for an arbitrary vreg. Allow specifying a specific user instruction. Scan through use instructions, instead of use operands. Add scan thresholds instead of searching infinitely. Stop using a set to track seen uses. I didn't understand this usage, or why it would not check the last use. I don't think the use list has any particular order. llvm-svn: 363675
* AMDGPU: Fold readlane from copy of SGPR or immMatt Arsenault2019-06-182-0/+42
| | | | | | These may be inserted to assert uniformity somewhere. llvm-svn: 363670
* AMDGPU: Remove unnecessary check for virtual registerMatt Arsenault2019-06-181-17/+4
| | | | | | | The copy was found by searching the uses of a virtual register, so it's already known to be virtual. llvm-svn: 363669
* AMDGPU: Fix iterator crash in AMDGPUPromoteAllocaMatt Arsenault2019-06-181-5/+9
| | | | | | The lifetime intrinsic was erased, which was the next iterator. llvm-svn: 363668
* AMDGPU/GlobalISel: RegBankSelect for amdgcn.div.scaleMatt Arsenault2019-06-181-0/+14
| | | | llvm-svn: 363667
* [AMDGPU] Speed up live-in virtual register set computaion in ↵Valery Pykhtin2019-06-184-5/+80
| | | | | | | | GCNScheduleDAGMILive. Differential revision: https://reviews.llvm.org/D62401 llvm-svn: 363661
* [AMDGPU] Use custom inserter for gfx10 VOP2bStanislav Mekhanoshin2019-06-171-1/+3
| | | | | | | | | This is part of the approved D63204 pending parent revision. This small change is in fact a part of the VOP2b legalization which does not technically belong to wave32 support, so extracted separately. llvm-svn: 363625
* [AMDGPU] Propagate function attributes thru bitcastsStanislav Mekhanoshin2019-06-171-3/+4
| | | | | | | | | AMDGPUPropagateAttributes will not work on function bitcatsts, so move AMDGPUFixFunctionBitcasts before it. Differential Revision: https://reviews.llvm.org/D63455 llvm-svn: 363614
* AMDGPU/GFX10: Don't generate s_code_end padding in the asm-printerNicolai Haehnle2019-06-171-1/+7
| | | | | | | | | | | | | | | | | | | | | | Summary: The purpose of the padding is to guard against stale code being fetched into the instruction cache by the lowest level prefetching. We're generating relocatable ELF here, and so the padding should arguably be added by the linker. This is in fact what Mesa does. This also fixes multi-part shaders for Mesa. Change-Id: I6bfede58f20e9f337762ccf39ef9e0e263e69e82 Reviewers: arsenm, rampitec, t-tye Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63427 llvm-svn: 363602
* [AMDGPU] gfx1010 wavefrontsize intrinsic foldingStanislav Mekhanoshin2019-06-173-16/+59
| | | | | | Differential Revision: https://reviews.llvm.org/D63206 llvm-svn: 363588
* [AMDGPU] Pass to propagate ABI attributes from kernels to the functionsStanislav Mekhanoshin2019-06-174-4/+356
| | | | | | | | | | | | | | | | The pass works in two modes: Mode 1: Just set attributes starting from kernels. This can work at the very beginning of opt and llc pipeline, but cannot clone functions because it must be a function pass. Mode 2: Actually clone functions for new attributes. This can only work after all function passes in the opt pipeline because it has to be a module pass. Differential Revision: https://reviews.llvm.org/D63208 llvm-svn: 363586
* GlobalISel: Verify intrinsicsMatt Arsenault2019-06-171-26/+34
| | | | | | | | I keep using the wrong instruction when manually writing tests. This really needs to check the number of operands, but I don't see an easy way to do that right now. llvm-svn: 363579
* AMDGPU/GlobalISel: Account for multiple defs when finding intrinsic IDMatt Arsenault2019-06-171-2/+1
| | | | llvm-svn: 363578
* [AMDGPU] gfx1010 wave32 metadataStanislav Mekhanoshin2019-06-178-2/+85
| | | | | | Differential Revision: https://reviews.llvm.org/D63207 llvm-svn: 363577
* AMDGPU/GlobalISel: Implement select for G_ICMP and G_SELECTTom Stellard2019-06-173-1/+195
| | | | | | | | | | | | Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60640 llvm-svn: 363576
* AMDGPU: Ignore subtarget for InferAddressSpacesMatt Arsenault2019-06-171-2/+1
| | | | | | | Even if the target doesn't have flat instructions, addrspace(0) is still flat. It just happens to not work. llvm-svn: 363561
* AMDGPU/GlobalISel: Fix default mapping for non-register operandsMatt Arsenault2019-06-171-1/+5
| | | | | | Tests will be in future commits when new intrinsics are handled here. llvm-svn: 363559
* AMDGPU: Cleanup custom PseudoSourceValue definitionsMatt Arsenault2019-06-171-16/+23
| | | | | | | Use separate enums for each kind, avoid repeating overloads, and add missing classof implementation. llvm-svn: 363558
* Fix clang -Wcovered-switch-default after stack-id change by D60137Fangrui Song2019-06-171-8/+7
| | | | llvm-svn: 363543
* Describe stack-id as an enumSander de Smalen2019-06-174-10/+16
| | | | | | | | | | | | | | | | | This patch changes MIR stack-id from an integer to an enum, and adds printing/parsing support for this in MIR files. The default stack-id '0' is now renamed to 'default'. This should make MIR tests that have stack objects with different stack-ids more descriptive. It also clarifies code operating on StackID. Reviewers: arsenm, thegameg, qcolombet Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D60137 llvm-svn: 363533
* AMDGPU: Prepare for explicit absolute relocations in code generationNicolai Haehnle2019-06-164-8/+24
| | | | | | | | | | | | | | | | | Summary: We will use absolute relocations for LDS symbols. Change-Id: I9a32795ed0ea835e433a787129cfe3c57ee9a325 Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61492 llvm-svn: 363517
* AMDGPU: Be explicit about whether the high-word in SI_PC_ADD_REL_OFFSET is 0Nicolai Haehnle2019-06-163-10/+19
| | | | | | | | | | | | | | | | | | | | | Summary: Instead of encoding a high-word of 0 using a fake TargetGlobalAddress, just use a literal target constant. This simplifies some subsequent changes. The generated assembly is now more explicit about the kind of relocation that is to be used. Change-Id: I066835202d23b5941fa7a358eb4b89e9b71ab6f8 Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61491 llvm-svn: 363516
* AMDGPU/GFX10: Support DLC bit in llvm.amdgcn.s.buffer.load intrinsicNicolai Haehnle2019-06-164-13/+22
| | | | | | | | | | | | | | | Summary: Change-Id: Ie4c971462a7749740938c687144e77441dac2539 Reviewers: rampitec, arsenm Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62486 Change-Id: Iae59523edd75c74918d2118df6571a7b671717a0 llvm-svn: 363514
* [AMDGPU] gfx10 conditional registers handlingStanislav Mekhanoshin2019-06-1618-211/+599
| | | | | | | | | This is cpp source part of wave32 support, excluding overriden getRegClass(). Differential Revision: https://reviews.llvm.org/D63351 llvm-svn: 363513
* Reapply "GlobalISel: Avoid producing Illegal copies in RegBankSelect"Matt Arsenault2019-06-151-3/+71
| | | | | | | This reapplies r363410, avoiding null dereference if there is no AltRegBank. llvm-svn: 363478
* Revert "GlobalISel: Avoid producing Illegal copies in RegBankSelect"Mitch Phillips2019-06-141-71/+3
| | | | | | | | | | | This patch breaks UBSan build bots. See https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild for a guide as to how to reproduce the error. This reverts commit c2864c0de07efb5451d32d27a7d4ff2984830929. This reverts rL363410. llvm-svn: 363476
* AMDGPU: Avoid most waitcnts before callsMatt Arsenault2019-06-141-66/+90
| | | | | | | | | | | Currently you get extra waits, because waits are inserted for the register dependencies of the call, and the function prolog waits on everything. Currently waits are still inserted on returns. It may make sense to not do this, and wait in the caller instead. llvm-svn: 363465
* AMDGPU: Fix dropping memref for ds append/consumeMatt Arsenault2019-06-141-1/+3
| | | | | | | | | The way SelectionDAG treats memory operands is very frustrating, and by default drops them unless a property is set on the pattern. There is no pattern for manually selected instructions, so this requires manually setting them. llvm-svn: 363455
* AMDGPU: Set isTrap on S_TRAPMatt Arsenault2019-06-141-1/+4
| | | | | | | This seems to only be used for generating some kind of documentation, but might as well set it. llvm-svn: 363454
* [AMDGPU] Don't constrain callees with inlinehint from inlining on MaxBB checkValery Pykhtin2019-06-141-1/+1
| | | | | | | | | | | | | | Summary: Function bodies marked inline in an opencl source are eliminated but MaxBB check may prevent inlining them leaving undefined references. Reviewers: rampitec, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, Anastasia, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63337 llvm-svn: 363418
* [AMDGPU] gfx1010 BoolReg definition. NFC.Stanislav Mekhanoshin2019-06-142-0/+32
| | | | | | | | | | | Earlier commit has added AMDGPUOperand::isBoolReg(). Turns out gcc issues warning about unused function since D63204 is not yet submitted. Added NFC part of D63204 to have a use of that function and mute the warning. llvm-svn: 363416
* GlobalISel: Avoid producing Illegal copies in RegBankSelectMatt Arsenault2019-06-141-3/+71
| | | | | | | | | | | | | | | | | | | | | | Avoid producing illegal register bank copies for reg_sequence and phi. The default implementation assumes it is possible to pick any operand's bank and use that for the result, introducing a copy for operands with a different bank. This does not check for illegal copies. It is not legal to introduce a VGPR->SGPR copy, so any VGPR operand requires the result to be a VGPR. The changes in getInstrMappingImpl aren't strictly necessary, since AMDGPU now just bypasses this for reg_sequence/phi. This could be replaced with an assert in case other targets run into this. It is currently responsible for producing the error for unsatisfiable copies, but this will be better served with a verifier check. For phis, for now assume any undetermined operands must be VGPRs. Eventually, this needs to be able to defer mapping these operations. This also does not yet have a way to check for whether the block is in a divergent region. llvm-svn: 363410
* AMDGPU: Fix input chain when gluing copies to m0Matt Arsenault2019-06-141-2/+5
| | | | | | | I don't think this was causing any observable issues, but was making reading the DAG dump confusing. llvm-svn: 363389
* AMDGPU: Refactor to prepare for manually selecting more intrinsicsMatt Arsenault2019-06-141-9/+19
| | | | llvm-svn: 363385
* AMDGPU: Fix printing trailing whitespace after s_endpgmMatt Arsenault2019-06-142-2/+2
| | | | llvm-svn: 363384
* AMDGPU: Fix missing constMatt Arsenault2019-06-142-2/+2
| | | | llvm-svn: 363383
* [AMDGPU] gfx1011/gfx1012 targetsStanislav Mekhanoshin2019-06-147-0/+147
| | | | | | Differential Revision: https://reviews.llvm.org/D63307 llvm-svn: 363344
* [AMDGPU] gfx1010 wave32 icmp/fcmp intrinsic changes for wave32Stanislav Mekhanoshin2019-06-135-24/+66
| | | | | | Differential Revision: https://reviews.llvm.org/D63301 llvm-svn: 363339
* [AMDGPU] gfx1010 AMDGPUSetCCOp definitionStanislav Mekhanoshin2019-06-131-1/+1
| | | | | | | It was missing from D63293 and breaks in a debug tablegen w/o this part. llvm-svn: 363323
* [AMDGPU] gfx1010 base changes for wave32Stanislav Mekhanoshin2019-06-1312-25/+209
| | | | | | Differential Revision: https://reviews.llvm.org/D63293 llvm-svn: 363299
* [AMDGPU] ImmArg and SourceOfDivergence for permlane/dppStanislav Mekhanoshin2019-06-131-0/+5
| | | | | | | | | Added missing ImmArg and SourceOfDivergence to the crosslane intrinsics. Differential Revision: https://reviews.llvm.org/D63216 llvm-svn: 363276
* [AMDGPU][MC] Enabled constant expressions as operands of s_getreg/s_setregDmitry Preobrazhensky2019-06-135-107/+174
| | | | | | | | | | See bug 40820: https://bugs.llvm.org/show_bug.cgi?id=40820 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D61125 llvm-svn: 363255
* [AMDGPU] gfx1010 dpp16 and dpp8Stanislav Mekhanoshin2019-06-1214-34/+585
| | | | | | Differential Revision: https://reviews.llvm.org/D63203 llvm-svn: 363186
OpenPOWER on IntegriCloud