summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/amdgpu
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | | | drm/amdgpu: Use mmu_interval_insert instead of hmm_mirrorJason Gunthorpe2019-11-236-281/+77
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the interval tree in the driver and rely on the tree maintained by the mmu_notifier for delivering mmu_notifier invalidation callbacks. For some reason amdgpu has a very complicated arrangement where it tries to prevent duplicate entries in the interval_tree, this is not necessary, each amdgpu_bo can be its own stand alone entry. interval_tree already allows duplicates and overlaps in the tree. Also, there is no need to remove entries upon a release callback, the mmu_interval API safely allows objects to remain registered beyond the lifetime of the mm. The driver only has to stop touching the pages during release. Link: https://lore.kernel.org/r/20191112202231.3856-12-jgg@ziepe.ca Reviewed-by: Philip Yang <Philip.Yang@amd.com> Tested-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | drm/amdgpu: Call find_vma under mmap_semJason Gunthorpe2019-11-231-16/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | find_vma() must be called under the mmap_sem, reorganize this code to do the vma check after entering the lock. Further, fix the unlocked use of struct task_struct's mm, instead use the mm from hmm_mirror which has an active mm_grab. Also the mm_grab must be converted to a mm_get before acquiring mmap_sem or calling find_vma(). Fixes: 66c45500bfdc ("drm/amdgpu: use new HMM APIs and helpers") Fixes: 0919195f2b0d ("drm/amdgpu: Enable amdgpu_ttm_tt_get_user_pages in worker threads") Link: https://lore.kernel.org/r/20191112202231.3856-11-jgg@ziepe.ca Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Tested-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | | | | drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10changzhu2019-11-223-2/+116
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It may lose gpuvm invalidate acknowldege state across power-gating off cycle. To avoid this issue in gmc9/gmc10 invalidation, add semaphore acquire before invalidation and semaphore release after invalidation. After adding semaphore acquire before invalidation, the semaphore register become read-only if another process try to acquire semaphore. Then it will not be able to release this semaphore. Then it may cause deadlock problem. If this deadlock problem happens, it needs a semaphore firmware fix. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
* | | | | | drm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhubchangzhu2019-11-226-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SW must acquire/release one of the vm_invalidate_eng*_sem around the invalidation req/ack. Through this way,it can avoid losing invalidate acknowledge state across power-gating off cycle. To use vm_invalidate_eng*_sem, it needs to initialize vm_invalidate_eng*_sem firstly. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
* | | | | | drm/amd/amdgpu/sriov skip RLCG s/r list for arcturus VF.Jack Zhang2019-11-221-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After rlcg fw 2.1, kmd driver starts to load extra fw for LIST_CNTL,GPM_MEM,SRM_MEM. We needs to skip the three fw because all rlcg related fw have been loaded by host driver. Guest driver would load the three fw fail without this change. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | | | | drm/amd/amdgpu/sriov temporarily skip ras,dtm,hdcp for arcturus VFJack Zhang2019-11-221-0/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Temporarily skip ras,dtm,hdcp initialize and terminate for arcturus VF Currently the three features haven't been enabled at SRIOV, it would trigger guest driver load fail with the bare-metal path of the three features. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | | | | drm/amdgpu/gfx10: re-init clear state buffer after gpu resetXiaojie Yuan2019-11-221-6/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes 2nd baco reset failure with gfxoff enabled on navi1x. clear state buffer (resides in vram) is corrupted after 1st baco reset, upon gfxoff exit, CPF gets garbage header in CSIB and hangs. Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
* | | | | | merge fix for "ftrace: Rework event_create_dir()"Stephen Rothwell2019-11-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | | | | drm/amdgpu: Update Arcturus golden registersJay Cornwall2019-11-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | | | | drm/amdgpu/gfx10: fix out-of-bound mqd_backup array accessXiaojie Yuan2019-11-221-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes: 0900a9efdb7909 ("drm/amdgpu/gfx10: fix mqd backup/restore for gfx rings (v2)") Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | | | | drm/amdgpu/gfx10: explicitly wait for cp idle after halt/unhaltXiaojie Yuan2019-11-221-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 50us is not enough to wait for cp ready after gpu reset on some navi asics. Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Suggested-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
* | | | | | Revert "drm/amd/display: enable S/G for RAVEN chip"Alex Deucher2019-11-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 1c4259159132ae4ceaf7c6db37a6cf76417f73d9. S/G display is not stable with the IOMMU enabled on some platforms. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205523 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | | | | drm/amdgpu: disable gfxoff on original ravenAlex Deucher2019-11-221-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are still combinations of sbios and firmware that are not stable. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204689 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | | | | drm/amdgpu: remove experimental flag for Navi14Alex Deucher2019-11-221-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 5.4 and newer works fine with navi14. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | | | | drm/amdgpu: disable gfxoff when using register read interfaceAlex Deucher2019-11-221-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When gfxoff is enabled, accessing gfx registers via MMIO can lead to a hang. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205497 Acked-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | | | | drm/amdgpu: fix bad DMA from INTERRUPT_CNTL2Sam Bobroff2019-11-221-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The INTERRUPT_CNTL2 register expects a valid DMA address, but is currently set with a GPU MC address. This can cause problems on systems that detect the resulting DMA read from an invalid address (found on a Power8 guest). Instead, use the DMA address of the dummy page because it will always be safe. Fixes: 27ae10641e9c ("drm/amdgpu: add interupt handler implementation for si v3") Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | | | | drm/amdgpu/nv: add asic func for fetching vbios from rom directlyAlex Deucher2019-11-221-2/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Needed as a fallback if the vbios can't be fetched by other means. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | | | | drm/amdgpu: put flush_delayed_work at firstYintian Tao2019-11-221-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is one regression from 042f3d7b745cd76aa To put flush_delayed_work after adev->shutdown = true which will make amdgpu_ih_process not response the irq At last, all ib ring tests will be failed just like below [drm] amdgpu: finishing device. [drm] Fence fallback timer expired on ring gfx [drm] Fence fallback timer expired on ring comp_1.0.0 [drm] Fence fallback timer expired on ring comp_1.1.0 [drm] Fence fallback timer expired on ring comp_1.2.0 [drm] Fence fallback timer expired on ring comp_1.3.0 [drm] Fence fallback timer expired on ring comp_1.0.1 amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.1.1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.2.1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.3.1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma0 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on uvd_enc_0.0 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on vce0 (-110). [drm:amdgpu_device_delayed_init_work_handler [amdgpu]] *ERROR* ib ring test failed (-110). v2: replace cancel_delayed_work_sync() with flush_delayed_work() Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | | | | drm/amdgpu/vcn2.5: fix the enc loop with hw finiLeo Liu2019-11-221-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | | | | drm/amdgpu/gfx10: fix mqd backup/restore for gfx rings (v2)Xiaojie Yuan2019-11-223-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. no need to allocate an extra member for 'mqd_backup' array 2. backup/restore mqd to/from the correct 'mqd_backup' array slot v2: warning fix (Alex) Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | | | | Merge tag 'drm-next-5.5-2019-11-15' of ↵Dave Airlie2019-11-215-1/+28
|\ \ \ \ \ \ | | |_|_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://people.freedesktop.org/~agd5f/linux into drm-next drm-next-5.5-2019-11-15: amdgpu: - Fix AVFS handling on SMU7 parts with custom power tables - Enable Overdrive sysfs interface for Navi parts - Fix power limit handling on smu11 parts - Fix pcie link sysfs output for Navi - Probably cancel MM worker threads on shutdown radeon: - Cleanup for ppc change Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191115163516.3714-1-alexander.deucher@amd.com
| * | | | | drm/amdgpu/vcn: finish delay work before release resourcesAlex Deucher2019-11-131-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | flush/cancel delayed works before doing finalization to avoid concurrently requests. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amd/amdgpu: finish delay works before release resourcesJesse Zhang2019-11-113-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | flush/cancel delayed works before doing finalization to avoid concurrently requests. Signed-off-by: Jesse Zhang <zhexi.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: avoid upload corrupted ta ucode to pspHawking Zhang2019-11-111-1/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xgmi, ras, hdcp and dtm ta are actually separated ucode and need to handled case by case to upload to psp. We support the case that ta binary have one or multiple of them built-in. As a result, the driver should check each ta binariy's availablity before decide to upload them to psp. In the terminate (unload) case, the driver will check the context readiness before perform unload activity. It's fine to keep it as is. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | | | | | Merge tag 'drm-next-5.5-2019-11-08' of ↵Dave Airlie2019-11-1414-27/+203
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://people.freedesktop.org/~agd5f/linux into drm-next drm-next-5.5-2019-11-08: amdgpu: - Enable VCN dynamic powergating on RV/RV2 - Fixes for Navi14 - Misc Navi fixes - Fix MSI-X tear down - Misc Arturus fixes - Fix xgmi powerstate handling - Documenation fixes scheduler: - Fix static code checker warning - Fix possible thread reactivation while thread is stopped - Avoid cleanup if thread is parked radeon: - SI dpm fix ported from amdgpu Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191108212713.5078-1-alexander.deucher@amd.com
| * | | | | drm/amdgpu: allow direct upload save restore list for raven2changzhu2019-11-081-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It will cause modprobe atombios stuck problem in raven2 if it doesn't allow direct upload save restore list from gfx driver. So it needs to allow direct upload save restore list for raven2 temporarily. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: Avoid accidental thread reactivation.Andrey Grodzovsky2019-11-071-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: During GPU reset we call the GPU scheduler to suspend it's thread, those two functions in amdgpu also suspend and resume the sceduler for their needs but this can collide with GPU reset in progress and accidently restart a suspended thread before time. Fix: Serialize with GPU reset. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | Revert "drm/amdgpu: dont schedule jobs while in reset"Andrey Grodzovsky2019-11-071-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 89b3d86403f1025f6b430d8f9ffc590efbadce62. We will do a proper fix in next patch. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: fix vega20 pstate status changeJonathan Kim2019-11-071-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vega20 only requires all devices be set to same pstate level for low pstate and not high. Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Evan Quan <Evan.Quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: fix sysfs interface pcie_replay_count error on navi asicKevin Wang2019-11-071-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the asic callback function of get_pcie_replay_count is not implement on navi asic, it will cause null pinter error when read this interface. Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: Need to disable msix when unloading driverEmily Deng2019-11-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For driver reload test, it will report "can't enable MSI (MSI-X already enabled)". Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: Add comments to gmc structureOak Zeng2019-11-071-0/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Explain fields like aper_base, agp_start etc. The definition of those fields are confusing as they are from different view (CPU or GPU). Add comments for easier understand. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Alex Deucher <Alex.Deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: Improve RAS documentation (v2)Alex Deucher2019-11-061-7/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clarify some areas, clean up formatting, add section for unrecoverable error handling. v2: fix grammatical errors Reviewed-by: Yong Zhao <yong.zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu/renoir: move gfxoff handling into gfx9 moduleAlex Deucher2019-11-062-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To properly handle the option parsing ordering. Reviewed-by: Yong Zhao <yong.zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: fix double reference droppingPan Bian2019-11-061-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The reference to object fence is dropped at the end of the loop. However, it is dropped again outside the loop. The reference can be dropped immediately after calling dma_fence_wait() in the loop and thus the dropping operation outside the loop can be removed. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: fix potential double drop fence referencePan Bian2019-11-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The object fence is not set to NULL after its reference is dropped. As a result, its reference may be dropped again if error occurs after that, which may lead to a use after free bug. To avoid the issue, fence is explicitly set to NULL after dropping its reference. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: change read of GPU clock counter on Vega10 VFEric Huang2019-11-061-3/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using unified VBIOS has performance drop in sriov environment. The fix is switching to another register instead. Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: add warning for GRBM 1-cycle delay issue in gfx9changzhu2019-11-061-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It needs to add warning to update firmware in gfx9 in case that firmware is too old to have function to realize dummy read in cp firmware. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: add dummy read by engines for some GCVM status registers in gfx10changzhu2019-11-064-6/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The GRBM register interface is now capable of bursting 1 cycle per register wr->wr, wr->rd much faster than previous muticycle per transaction done interface. This has caused a problem where status registers requiring HW to update have a 1 cycle delay, due to the register update having to go through GRBM. For cp ucode, it has realized dummy read in cp firmware.It covers the use of WAIT_REG_MEM operation 1 case only.So it needs to call gfx_v10_0_wait_reg_mem in gfx10. Besides it also needs to add warning to update firmware in case firmware is too old to have function to realize dummy read in cp firmware. For sdma ucode, it hasn't realized dummy read in sdma firmware. sdma is moved to gfxhub in gfx10. So it needs to add dummy read in driver between amdgpu_ring_emit_wreg and amdgpu_ring_emit_reg_wait for sdma_v5_0. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: perform p-state switch after the whole hive initializedEvan Quan2019-11-061-12/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | P-state switch should be performed after all devices from the hive get initialized. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: fix possible pstate switch race conditionEvan Quan2019-11-062-2/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Added lock protection so that the p-state switch will be guarded to be sequential. Also update the hive pstate only all device from the hive are in the same state. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: register gpu instance before fan boost feature enablmentEvan Quan2019-11-062-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise, the feature enablement will be skipped due to wrong count. Fixes: beff74bc6e0fa91 ("drm/amdgpu: fix a race in GPU reset with IB test (v2)") Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: disallow direct upload save restore list from gfx driverHawking Zhang2019-11-061-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Direct uploading save/restore list via mmio register writes breaks the security policy. Instead, the driver should pass s&r list to psp. For all the ASICs that use rlc v2_1 headers, the driver actually upload s&r list twice, in non-psp ucode front door loading phase and gfx pg initialization phase. The latter is not allowed. VG12 is the only exception where the driver still keeps legacy approach for S&R list uploading. In theory, this can be elimnated if we have valid srcntl ucode for VG12. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Candice Li <Candice.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu/discovery: Need to free discovery memoryEmily Deng2019-11-061-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When unloading driver, need to free discovery memory. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu/gpuvm: add some additional comments in amdgpu_vm_update_ptesAlex Deucher2019-11-061-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To better clarify what is happening in this function. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: enable VCN DPG on Raven and Raven2Alex Deucher2019-11-061-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's safe to enable dynamic VCN powergating on raven and raven2 for increased power savings. Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: add navi14 PCI IDTianci.Yin2019-11-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the navi14 PCI device id. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amd/powerplay: support xgmi pstate setting on powerplay routine V2Evan Quan2019-11-061-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add xgmi pstate setting on powerplay routine. V2: split the change of is_support_sw_smu_xgmi into a separate patch Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: change pstate only after all XGMI device initializedEvan Quan2019-11-061-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pstate settings should be performed after all device of the XGMI setup get initialized. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * | | | | drm/amdgpu: dont schedule jobs while in resetShirish S2019-11-061-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [Why] doing kthread_park()/unpark() from drm_sched_entity_fini while GPU reset is in progress defeats all the purpose of drm_sched_stop->kthread_park. If drm_sched_entity_fini->kthread_unpark() happens AFTER drm_sched_stop->kthread_park nothing prevents from another (third) thread to keep submitting job to HW which will be picked up by the unparked scheduler thread and try to submit to HW but fail because the HW ring is deactivated. [How] grab the reset lock before calling drm_sched_entity_fini() Signed-off-by: Shirish S <shirish.s@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
OpenPOWER on IntegriCloud