path: root/llvm/lib/Target/AMDGPU
Commit message | Author | Age | Files | Lines
...
* AMDGPU: Try to use op_sel when selecting packed instructions | Matt Arsenault | 2017-05-17 | 1 | -1/+29
| Avoids instructions to pack a vector when the source is really a scalar being broadcast. Also be smarter and look for per-component fneg. Doesn't yet handle scalar from upper half of register or other swizzles.
| llvm-svn: 303291
* AMDGPU: Use appropriate soffset for spilling | Matt Arsenault | 2017-05-17 | 2 | -20/+20
| This needs to be the frame offset register, and not the global scratch wave offset register. For kernels, these are the same.
| llvm-svn: 303287
* AMDGPU: Fix min3/max3 combines for f16/i16 | Matt Arsenault | 2017-05-17 | 3 | -2/+25
| Fix missing instruction definitions for min3/max3.
| llvm-svn: 303284
* [AMDGPU] Use GCNRPTracker dumper methods in scheduler | Stanislav Mekhanoshin | 2017-05-16 | 3 | -18/+21
| Differential Revision: https://reviews.llvm.org/D33244
| llvm-svn: 303186
* [AMDGPU] Cache live-ins and register pressure in scheduler | Stanislav Mekhanoshin | 2017-05-16 | 2 | -75/+154
| Using LIS can be quite expensive, so caching of calculated region live-ins and pressure is implemented. It does two things:
| 1. Caches the info for the second stage when we schedule with decreased target occupancy.
| 2. Tracks the basic block from top to bottom thus eliminating the need to scan whole register file liveness at every region split in the middle of the block.
| The scheduling is now done in 3 stages instead of two, with the first one being really a no-op and only used to collect scheduling regions as sent by the scheduler driver.
| There is no functional change to the current behavior, only compilation speed is affected.
| In general computeBlockPressure() could be simplified if we switch to backward RP tracker, because scheduler sends regions within a block starting from the last upward. We could use a natural order of upward tracker to seamlessly change between regions of the same block, since live reg set of a previous tracked region would become a live-out of the next region. That however requires fixing upward tracker to properly account defs and uses of the same instruction as both are contributing to the current pressure. When we converge on the produced pressure we should be able to switch between them back and forth. In addition, backward tracker is less expensive as it uses LIS in recede less often than forward uses it in advance.
| At the moment the worst known case compilation time has improved from 26 minutes to 8.5.
| Differential Revision: https://reviews.llvm.org/D33117
| llvm-svn: 303184
* [AMDGPU] Turn register pressure estimation into forward tracker | Stanislav Mekhanoshin | 2017-05-16 | 4 | -135/+196
| This factors register pressure estimation mechanism from the GCNSchedStrategy into the forward tracker to unify interface with other strategies and expose it to other interested phases.
| Differential Revision: https://reviews.llvm.org/D33105
| llvm-svn: 303179
* AMDGPUCodeGen: Fix warnings in r303111. [-Wunused-variable] | NAKAMURA Takumi | 2017-05-16 | 2 | -2/+4
| llvm-svn: 303137
* [AMDGPU] Kill now unused phiInfoElementGetDebugLoc(). NFCI. | Davide Italiano | 2017-05-15 | 1 | -5/+0
| llvm-svn: 303122
* Re-submit AMDGPUMachineCFGStructurizer. | Jan Sjodin | 2017-05-15 | 7 | -12/+3245
| Differential Revision: https://reviews.llvm.org/D23209
| llvm-svn: 303111
* Revert 303091. | Jan Sjodin | 2017-05-15 | 7 | -3380/+12
| llvm-svn: 303098
* Add AMDGPUMachineCFGStructurizer. | Jan Sjodin | 2017-05-15 | 7 | -12/+3380
| Differential Revision: https://reviews.llvm.org/D23209
| llvm-svn: 303091
* [AMDGPU][MC] Corrected several VI opcodes to avoid printing _e64 | Dmitry Preobrazhensky | 2017-05-15 | 1 | -11/+22
| See bug 32936: https://bugs.llvm.org//show_bug.cgi?id=32936
| Reviewers: artem.tamazov, vpykhtin
| Differential Revision: https://reviews.llvm.org/D33123
| llvm-svn: 303070
* [AMDGPU][MC] Removed V_MQSAD_U16_U8 | Dmitry Preobrazhensky | 2017-05-15 | 1 | -3/+0
| This instruction does not really exist
| See Bug 33018: https://bugs.llvm.org//show_bug.cgi?id=33018
| Reviewers: vpykhtin, artem.tamazov
| Differential Revision: https://reviews.llvm.org/D33126
| llvm-svn: 303055
* AMDGPU/SI: Don't promote to vector if the load/store is volatile. | Changpeng Fang | 2017-05-12 | 1 | -2/+5
| Summary: We should not change volatile loads/stores in promoting alloca to vector.
| Reviewers: arsenm
| Differential Revision: http://reviews.llvm.org/D33107
| llvm-svn: 302943
* [KnownBits] Add bit counting methods to KnownBits struct and use them where possible | Craig Topper | 2017-05-12 | 1 | -1/+1
| This patch adds min/max population count, leading/trailing zero/one bit counting methods. The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give.
| Differential Revision: https://reviews.llvm.org/D32931
| llvm-svn: 302925
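The min/max distinction in the KnownBits entry above can be illustrated with a small self-contained sketch. `ToyKnownBits` and the free functions below are hypothetical 32-bit stand-ins written for this illustration; they are not the LLVM `KnownBits` class, and only the population-count and leading-zero cases are shown.

```cpp
#include <cstdint>

// Hypothetical 32-bit stand-in for a known-bits record (not llvm::KnownBits).
// A bit set in Zero is known to be 0; a bit set in One is known to be 1;
// a bit set in neither mask is unknown.
struct ToyKnownBits {
  uint32_t Zero = 0;
  uint32_t One = 0;
};

// Number of set bits in V.
static unsigned popCount(uint32_t V) {
  unsigned N = 0;
  for (; V != 0; V &= V - 1)
    ++N;
  return N;
}

// Number of consecutive set bits starting from the most significant bit.
static unsigned leadingOnes(uint32_t V) {
  unsigned N = 0;
  for (uint32_t Bit = 0x80000000u; Bit != 0 && (V & Bit) != 0; Bit >>= 1)
    ++N;
  return N;
}

// Min: only bits known to be one are counted.
unsigned countMinPopulation(const ToyKnownBits &K) { return popCount(K.One); }

// Max: every bit that is not known to be zero might turn out to be one.
unsigned countMaxPopulation(const ToyKnownBits &K) { return popCount(~K.Zero); }

// Min: only the leading run of bits known to be zero is counted.
unsigned countMinLeadingZeros(const ToyKnownBits &K) {
  return leadingOnes(K.Zero);
}

// Max: the leading run stops at the first bit known to be one.
unsigned countMaxLeadingZeros(const ToyKnownBits &K) {
  return leadingOnes(~K.One);
}
```

For a value whose upper 16 bits are known zero and whose bits 0 and 2 are known one, the sketch reports a population between 2 and 16 and between 16 and 29 leading zeros.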
* AMDGPU/GlobalISel: Mark 32-bit integer constants as legal | Tom Stellard | 2017-05-12 | 1 | -0/+1
| Reviewers: arsenm
| Reviewed By: arsenm
| Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits
| Differential Revision: https://reviews.llvm.org/D33115
| llvm-svn: 302919
* [AMDGPU] Placate unused variable warning in release builds. | Davide Italiano | 2017-05-11 | 1 | -0/+1
| llvm-svn: 302821
* AMDGPU: Remove tfe bit from flat instruction definitions | Matt Arsenault | 2017-05-11 | 3 | -23/+22
| We don't use it, and it was removed in gfx9 with the encoding bit repurposed. Additionally, actually using it requires changing the output register class, which wasn't done anyway.
| llvm-svn: 302814
* AMDGPU: Pull fneg out of extract_vector_elt | Matt Arsenault | 2017-05-11 | 4 | -1/+31
| This allows folding source modifiers in more f16 cases. Makes it easier to select per-component packed neg modifiers.
| llvm-svn: 302813
* [AMDGPU] Fix incorrect register pressure calculation | Stanislav Mekhanoshin | 2017-05-11 | 1 | -2/+3
| Earlier fix D32572 introduced a bug where live-ins were calculated for basic block instead of scheduling region. This change fixes it.
| Differential Revision: https://reviews.llvm.org/D33086
| llvm-svn: 302812
* Remove now useless trailing nullptr in StructType::get | Serge Guelton | 2017-05-11 | 1 | -1/+1
| llvm-svn: 302779
* AMDGPU: Make some packed shuffles free | Matt Arsenault | 2017-05-10 | 2 | -1/+36
| VOP3P instructions can encode access to either half of the register.
| llvm-svn: 302730
* AMDGPU: Add new subtarget features for gfx9 flat instructions | Matt Arsenault | 2017-05-10 | 3 | -1/+38
| Flat instructions gain an immediate offset, and 2 new sets of segment specific flat instructions are added.
| llvm-svn: 302729
* [AMDGPU][MC] Corrected v_madak/madmk to avoid printing "_e32" in disassembler output | Dmitry Preobrazhensky | 2017-05-10 | 1 | -6/+12
| See bug 32927: https://bugs.llvm.org//show_bug.cgi?id=32927
| Reviewers: vpykhtin, artem.tamazov, arsenm
| Differential Revision: https://reviews.llvm.org/D32913
| llvm-svn: 302648
* [AMDGPU] Fixed typo in GCNRegPressure, NFC | Stanislav Mekhanoshin | 2017-05-09 | 2 | -15/+15
| VGRP -> VGPR, SGRP -> SGPR
| llvm-svn: 302586
* [RegisterBankInfo] Uniquely allocate instruction mapping. | Quentin Colombet | 2017-05-05 | 2 | -47/+49
| This is a step toward having statically allocated instruction mapping. We are going to tablegen them eventually, so let us reflect that in the API.
| NFC.
| llvm-svn: 302316
* [AMDGPU] In the new waitcnt insertion pass, use getHeader | Kannan Narayanan | 2017-05-05 | 1 | -5/+5
| instead of getTopBlock to find the loop header.
| Differential Revision: https://reviews.llvm.org/D32831
| llvm-svn: 302290
* AMDGPU/AMDHSA: Set COMPUTE_PGM_RSRC2:LDS_SIZE to 0 | Konstantin Zhuravlyov | 2017-05-05 | 1 | -1/+2
| This field is populated by the CP
| Differential Revision: https://reviews.llvm.org/D32619
| llvm-svn: 302277
* [KnownBits] Add wrapper methods for setting and clear all bits in the underlying APInts in KnownBits. | Craig Topper | 2017-05-05 | 1 | -1/+1
| This adds routines for resetting KnownBits to unknown, making the value all zeros or all ones. It also adds methods for querying if the value is zero, all ones or unknown.
| Differential Revision: https://reviews.llvm.org/D32637
| llvm-svn: 302262
* AMDGPU: GFX9 GS and HS shaders always have the scratch wave offset in SGPR5 | Marek Olsak | 2017-05-04 | 3 | -5/+20
| Reviewers: arsenm, nhaehnle
| Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
| Differential Revision: https://reviews.llvm.org/D32645
| llvm-svn: 302200
* AMDGPU: Don't promote alloca to LDS for leaf functions | Matt Arsenault | 2017-05-02 | 1 | -1/+8
| LDS use in leaf functions is not currently handled.
| llvm-svn: 301958
* AMDGPU: Refactor AsmPrinter | Matt Arsenault | 2017-05-02 | 2 | -128/+245
| Avoid analyzing functions multiple times. This allows asserting that each function is only analyzed once.
| llvm-svn: 301938
* AMDGPU: Make intrinsics speculatable | Matt Arsenault | 2017-05-02 | 1 | -1/+1
| llvm-svn: 301937
* AMDGPU: Add AMDGPU_HS calling convention | Marek Olsak | 2017-05-02 | 7 | -0/+7
| Reviewers: arsenm, nhaehnle
| Subscribers: mehdi_amini, kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
| Differential Revision: https://reviews.llvm.org/D32644
| llvm-svn: 301930
* Rename WeakVH to WeakTrackingVH; NFC | Sanjoy Das | 2017-05-01 | 1 | -9/+9
| This relands r301424.
| llvm-svn: 301812
* Generalize the specialized flag-carrying SDNodes by moving flags into SDNode. | Amara Emerson | 2017-05-01 | 3 | -13/+13
| This removes BinaryWithFlagsSDNode, and flags are now all passed by value.
| Differential Revision: https://reviews.llvm.org/D32527
| llvm-svn: 301803
* AMDGPU: Fix copies from physical registers in SIFixSGPRCopies | Matt Arsenault | 2017-04-29 | 1 | -4/+9
| This would assert when there were multiple defs of a physical register. We just need to move all of the users of it.
| llvm-svn: 301730
* AMDGPU: Add new amdgcn.init.exec intrinsics | Marek Olsak | 2017-04-28 | 5 | -0/+101
| v2: More tests, bug fixes, cosmetic changes.
| Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye
| Differential Revision: https://reviews.llvm.org/D31762
| llvm-svn: 301677
* [SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and simplifyDemandedBits | Craig Topper | 2017-04-28 | 3 | -19/+17
| This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently.
| This is largely a mechanical transformation from KnownZero to Known.Zero.
| Differential Revision: https://reviews.llvm.org/D32569
| llvm-svn: 301620
* [AMDGPU] DPP: add support for GFX9 | Sam Kolton | 2017-04-27 | 1 | -1/+1
| Reviewers: artem.tamazov
| Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
| Differential Revision: https://reviews.llvm.org/D32588
| llvm-svn: 301551
* AMDGPU: Fix assert in scheduler | Konstantin Zhuravlyov | 2017-04-27 | 1 | -1/+2
| Assert is triggered if DBG_VALUE is the first instruction in a BB
| Differential Revision: https://reviews.llvm.org/D32572
| llvm-svn: 301511
* [AMDGPU][MC] Added arg checks for vmcnt, expcnt, lgkmcnt helpers | Dmitry Preobrazhensky | 2017-04-26 | 1 | -16/+48
| Summary of changes:
| - corrected vmcnt, expcnt, lgkmcnt helpers to check their arguments for truncation;
| - added saturated versions of these helpers.
| See bug 32711 for details: https://bugs.llvm.org//show_bug.cgi?id=32711
| Reviewers: artem.tamazov, vpykhtin
| Differential Revision: https://reviews.llvm.org/D32546
| llvm-svn: 301439
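The difference between a truncation-checked helper and its saturated version, as described in the entry above, can be sketched as follows. `encodeCounter` is a hypothetical function written for this illustration, not the assembler's actual helper, and the field position and width in the usage note are assumptions chosen only to make the example concrete.

```cpp
#include <cstdint>

// Hypothetical helper (not the LLVM API): writes Value into the bit field
// [Shift, Shift + Width) of Encoded. Assumes 0 < Width < 32 and that the
// field lies within 32 bits.
//
// If Value does not fit in Width bits:
//   - Saturate == false: return false and leave Encoded unchanged, so the
//     caller can report an error instead of silently truncating;
//   - Saturate == true: clamp Value to the largest value the field can hold.
bool encodeCounter(uint32_t &Encoded, uint32_t Value, unsigned Shift,
                   unsigned Width, bool Saturate) {
  const uint32_t Max = (1u << Width) - 1;
  if (Value > Max) {
    if (!Saturate)
      return false;
    Value = Max;
  }
  // Clear the old field contents, then insert the new value.
  Encoded = (Encoded & ~(Max << Shift)) | (Value << Shift);
  return true;
}
```

With an assumed 4-bit field at bit 0, a request for 16 is rejected by the checked form and stored as 15 by the saturated form; bits outside the field are left alone in both cases.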
* Reverts commit r301424, r301425 and r301426 | Sanjoy Das | 2017-04-26 | 1 | -9/+9
| Commits were:
| "Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnkownInsts"
| "Add a new WeakVH value handle; NFC"
| "Rename WeakVH to WeakTrackingVH; NFC"
| The changes assumed pointers are 8 byte aligned on all architectures.
| llvm-svn: 301429
* Rename WeakVH to WeakTrackingVH; NFC | Sanjoy Das | 2017-04-26 | 1 | -9/+9
| Summary: I plan to use WeakVH to mean "nulls itself out on deletion, but does not track RAUW" in a subsequent commit.
| Reviewers: dblaikie, davide
| Reviewed By: davide
| Subscribers: arsenm, mehdi_amini, mcrosier, mzolotukhin, jfb, llvm-commits, nhaehnle
| Differential Revision: https://reviews.llvm.org/D32266
| llvm-svn: 301424
* [AMDGPU][MC] Added check for truncation of SOPK imm operand | Dmitry Preobrazhensky | 2017-04-26 | 3 | -17/+45
| See bug 30827: https://bugs.llvm.org//show_bug.cgi?id=30827
| Reviewers: artem.tamazov, vpykhtin
| Differential Revision: https://reviews.llvm.org/D32535
| llvm-svn: 301418
* [AMDGPU] Garbage collect dead code. NFCI. | Davide Italiano | 2017-04-26 | 1 | -10/+0
| llvm-svn: 301375
* AMDGPU: Shift down reserved SP register like scratch wave offset | Matt Arsenault | 2017-04-25 | 2 | -17/+59
| llvm-svn: 301367
* AMDGPU: Clean up VOP3NoMods pattern | Matt Arsenault | 2017-04-25 | 3 | -35/+22
| There is no need to copy the operands or inspect the sources. Also remove some unnecessary clamp/omod usage.
| llvm-svn: 301363
* AMDGPU: Fix ValueKind code object metadata for images | Konstantin Zhuravlyov | 2017-04-25 | 1 | -12/+12
| Differential Revision: https://reviews.llvm.org/D32504
| llvm-svn: 301360
* AMDGPU: Slightly simplify prolog reserved register handling | Matt Arsenault | 2017-04-24 | 1 | -25/+27
| Rely on MachineRegisterInfo's knowledge of used physical registers. Move flat_scratch initialization earlier, so the uses are visible when making these decisions.
| This will make it easier to add another reserved register at the end for the stack pointer rather than handling another special case.
| llvm-svn: 301254