summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd
Commit message (Collapse)AuthorAgeFilesLines
* drm/amdgpu: add sdma v5 packet header fileHawking Zhang2019-06-201-0/+4806
| | | | | | | | Defines the SDMA packet formats. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add gfx v10 clear state header v2Huang Rui2019-06-201-0/+975
| | | | | | | | | | | Clear state for gfx pipe. v2: squash in updates Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add v10 structs header (v2)Huang Rui2019-06-201-0/+1258
| | | | | | | | | | Header for CP structures (MQD, etc.) V2: squash in updates Signed-off-by: Huang Rui <ray.huang@amd.com> Acked-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: parse the new members added by gpu_info ucode v1_1Hawking Zhang2019-06-201-0/+9
| | | | | | | | Parse the new parameters for gfx10. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add gpu_info_firmware v1_1 structure for navi10Hawking Zhang2019-06-201-0/+6
| | | | | | | | | two new members that specific for navi10 are included in v2_0: num_sc_per_sh and num_packer_per_sc Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add navi10 gpu info firmwareHuang Rui2019-06-201-0/+4
| | | | | | | | | | gpu info firmware stores configuration data for various IP blocks. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add gfx10 specific new member pa_sc_tile_steering_overrideHawking Zhang2019-06-201-0/+1
| | | | | | | | New gfx config parameter. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add gfx10 specific config in amdgpu_gfx_configHawking Zhang2019-06-201-0/+3
| | | | | | | | The two members are used to cache the values from gpu_info fw accordingly Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: Add GDDR6 in vram_name arraryHawking Zhang2019-06-201-0/+1
| | | | | | | | For printing vram type. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <Tao.Zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add navi10 asic typeHuang Rui2019-06-201-0/+1
| | | | | | | Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add navi10 ip offset headerHawking Zhang2019-06-202-1/+858
| | | | | Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add doorbell assignement for navi10Hawking Zhang2019-06-201-0/+40
| | | | | | | | | Update mappings for Navi10. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: atomfirmware.h updates for navi10Hawking Zhang2019-06-201-8/+180
| | | | | | | Updated tables for Navi10. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add navi10 enums headerHawking Zhang2019-06-201-0/+22764
| | | | | Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add SMUIO 11.0 register headersHawking Zhang2019-06-202-0/+1012
| | | | | Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add OSS 5.0 register headersHawking Zhang2019-06-202-0/+1658
| | | | | Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add MMHUB 2.0 register headersHawking Zhang2019-06-203-0/+10293
| | | | | Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add GC 10.1 register headers (v4)Hawking Zhang2019-06-203-0/+61330
| | | | | | | | | v2: Update regs (Alex) v3: More updates (Alex) v4: more updates (Alex) Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add VCN 2.0 register headersHawking Zhang2019-06-202-0/+4823
| | | | | Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add NBIO 2.3 register headersHawking Zhang2019-06-203-0/+153523
| | | | | Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add MP 11.0 register headersHawking Zhang2019-06-201-0/+429
| | | | | Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add HDP 5.0 register headersHawking Zhang2019-06-202-0/+876
| | | | | Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add DCN 2.0 register headersHawking Zhang2019-06-202-0/+85539
| | | | | Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add CLK 11.0 register headersHawking Zhang2019-06-202-0/+71
| | | | | Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add ATHUB 2.0 register headersHawking Zhang2019-06-203-0/+3050
| | | | | Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* Revert "drm/amd/display: Copy stream updates onto streams"Alex Deucher2019-06-201-69/+0
| | | | | | | | This reverts commit 6e5155ae6b66054db35d8f3c64f9863b9d0466c1. Revert this to apply the version that includes DCN2 support. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* Revert "drm/amd/display: Use macro for invalid OPP ID"Alex Deucher2019-06-203-6/+4
| | | | | | | | This reverts commit 1760bd06c8e94e1b184139ae35201856403638cf. Revert this to apply the version that includes DCN2 support. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* Revert "drm/amd/display: Rework CRTC color management"Alex Deucher2019-06-203-356/+159
| | | | | | | | This reverts commit 7cd4b70091a5cfa1f58d3a529535304a116acc95. Revert this to apply the version that includes DCN2 support. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* Revert "drm/amd/display: move vmid determination logic out of dc"Alex Deucher2019-06-209-13/+143
| | | | | | | | This reverts commit 11cd74cdb98aa6f4d6f54a0082dd28e0d4743746. Revert this to apply the version that includes DCN2 support. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* Revert "drm/amd/display: Add Underflow Asserts to dc"Alex Deucher2019-06-205-38/+2
| | | | | | | | This reverts commit 9ed43ef84d9d1e668acdf43c95510fb7b11f8d71. Revert this to apply the version that includes DCN2 support. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* Revert "drm/amd/display: make clk_mgr call enable_pme_wa"Alex Deucher2019-06-202-20/+19
| | | | | | | | This reverts commit a1651530a3bacf1d796fdb7bc587faef9f305d36. Revert this to apply the version that includes DCN2 support. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* Revert "drm/amd/display: Enable fast plane updates when state->allow_modeset ↵Nicholas Kazlauskas2019-06-201-0/+8
| | | | | | | | | | | | | | | | | = true" This reverts commit ebc8c6f18322ad54275997a888ca1731d74b711f. There are still missing corner cases with cursor interaction and these fast plane updates on Picasso and Raven2 leading to endless PSTATE warnings for typical desktop usage depending on the userspace. This change should be reverted until these issues have been resolved. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110949 Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: David Francis <david.francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu/sriov: fix Tonga load driver failedJack Zhang2019-06-201-2/+0
| | | | | | | | | Tonga sriov need to use smu to load firmware. Remove sriov flag because the default return value is zero. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Reviewed-by: Trigger Huang <Trigger.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: add pmu countersJonathan Kim2019-06-204-1/+324
| | | | | | | | adding perf event counters Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: update df_v3_6 for xgmi perfmons (v2)Jonathan Kim2019-06-204-256/+224
| | | | | | | | | | | | add pmu attribute groups and structures for perf events. add sysfs to track available df perfmon counters fix overflow handling in perfmon counter reads. v2: squash in fix (Alex) Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amd/display: Fix null-deref on vega20 with xgmiRoman Li2019-06-203-15/+11
| | | | | | | | | | | | | | | | [Why] After clkmgr rework it gets initialized after resource pool. The clkmgr is used in resource pool init for xgmi path. That causes driver crash on Vega20 with xgmi due to NULL deref. [How] Move xgmi compensation code to dce121_clk_mgr_construct() That also allows to make dce121_clock_patch_xgmi_ss_info() internal static function. Signed-off-by: Roman Li <Roman.Li@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdkfd: Add procfs-style information for KFD processesKent Russell2019-06-203-2/+113
| | | | | | | | | | | Add a folder structure to /sys/class/kfd/kfd/ called proc which contains subfolders, each representing an active KFD process' PID, containing 1 file: pasid. Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: improve HMM error -ENOMEM and -EBUSY handlingPhilip Yang2019-06-201-32/+6
| | | | | | | | | | | | | Under memory pressure, hmm_range_fault may return error code -ENOMEM or -EBUSY, change pr_info to pr_debug to remove unnecessary kernel log message because we will retry restore again. Call get_user_pages_done if TTM get user pages failed will have WARN_ONCE kernel calling stack dump log. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amd/amdgpu: cast mem->num_pages to 64-bits when shifting (v2)Tom St Denis2019-06-201-4/+5
| | | | | | | | | | On 32-bit hosts mem->num_pages is 32-bits and can overflow when shifted. Add a cast to avoid this. (v2): Style fix. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: Do error injection even vram reserve failsxinhui pan2019-06-201-6/+7
| | | | | | | | | As long as the address is mapped with vram, we can do an error injection. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: wait to fetch the vbios until after common initAlex Deucher2019-06-171-11/+13
| | | | | | | | We need the asic_funcs set for the get rom callbacks in some cases. Tested-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amd/powerplay: Delete a redundant memory setting in ↵Markus Elfring2019-06-171-1/+0
| | | | | | | | | | | | vega20_set_default_od8_setttings() The memory was set to zero already by a call of the function “kzalloc”. Thus remove an extra call of the function “memset” for this purpose. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amd/display: Delete a redundant memory setting in ↵Markus Elfring2019-06-171-2/+0
| | | | | | | | | | | | amdgpu_dm_irq_register_interrupt() The memory was set to zero already by a call of the function “kzalloc”. Thus remove an extra call of the function “memset” for this purpose. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdgpu: fix error handling in df_v3_6_pmc_startArnd Bergmann2019-06-171-4/+9
| | | | | | | | | | | | | | | | | | | When df_v3_6_pmc_get_ctrl_settings() fails for some reason, we store uninitialized data in a register, as gcc points out: drivers/gpu/drm/amd/amdgpu/df_v3_6.c: In function 'df_v3_6_pmc_start': drivers/gpu/drm/amd/amdgpu/amdgpu.h:1012:29: error: 'lo_val' may be used uninitialized in this function [-Werror=maybe-uninitialized] #define WREG32_PCIE(reg, v) adev->pcie_wreg(adev, (reg), (v)) ^~~~ drivers/gpu/drm/amd/amdgpu/df_v3_6.c:334:39: note: 'lo_val' was declared here uint32_t lo_base_addr, hi_base_addr, lo_val, hi_val; ^~~~~~ Make it return a proper error code that we can catch in the caller. Fixes: 992af942a6cf ("drm/amdgpu: add df perfmon regs and funcs for xgmi") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amd/display: Add missing newline at end of fileGeert Uytterhoeven2019-06-171-1/+1
| | | | | | | | | | | "git diff" says: \ No newline at end of file after modifying the file. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amd/powerplay: detect version of smu backend (v2)Prike Liang2019-06-1713-0/+13
| | | | | | | | | | Print the backend type. v2: whitespace fixes (Alex) Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdkfd: Fix sdma queue allocate race conditionOak Zeng2019-06-171-1/+6
| | | | | | | | | SDMA queue allocation requires the dqm lock as it modify the global dqm members. Enclose it in the dqm_lock. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdkfd: Fix a circular lock dependencyOak Zeng2019-06-171-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The idea to break the circular lock dependency is to temporarily drop dqm lock before calling allocate_mqd. See callstack #1 below. [ 59.510149] [drm] Initialized amdgpu 3.30.0 20150101 for 0000:04:00.0 on minor 0 [ 513.604034] ====================================================== [ 513.604205] WARNING: possible circular locking dependency detected [ 513.604375] 4.18.0-kfd-root #2 Tainted: G W [ 513.604530] ------------------------------------------------------ [ 513.604699] kswapd0/611 is trying to acquire lock: [ 513.604840] 00000000d254022e (&dqm->lock_hidden){+.+.}, at: evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.605150] but task is already holding lock: [ 513.605307] 00000000961547fc (&anon_vma->rwsem){++++}, at: page_lock_anon_vma_read+0xe4/0x250 [ 513.605540] which lock already depends on the new lock. [ 513.605747] the existing dependency chain (in reverse order) is: [ 513.605944] -> #4 (&anon_vma->rwsem){++++}: [ 513.606106] __vma_adjust+0x147/0x7f0 [ 513.606231] __split_vma+0x179/0x190 [ 513.606353] mprotect_fixup+0x217/0x260 [ 513.606553] do_mprotect_pkey+0x211/0x380 [ 513.606752] __x64_sys_mprotect+0x1b/0x20 [ 513.606954] do_syscall_64+0x50/0x1a0 [ 513.607149] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 513.607380] -> #3 (&mapping->i_mmap_rwsem){++++}: [ 513.607678] rmap_walk_file+0x1f0/0x280 [ 513.607887] page_referenced+0xdd/0x180 [ 513.608081] shrink_page_list+0x853/0xcb0 [ 513.608279] shrink_inactive_list+0x33b/0x700 [ 513.608483] shrink_node_memcg+0x37a/0x7f0 [ 513.608682] shrink_node+0xd8/0x490 [ 513.608869] balance_pgdat+0x18b/0x3b0 [ 513.609062] kswapd+0x203/0x5c0 [ 513.609241] kthread+0x100/0x140 [ 513.609420] ret_from_fork+0x24/0x30 [ 513.609607] -> #2 (fs_reclaim){+.+.}: [ 513.609883] kmem_cache_alloc_trace+0x34/0x2e0 [ 513.610093] reservation_object_reserve_shared+0x139/0x300 [ 513.610326] ttm_bo_init_reserved+0x291/0x480 [ttm] [ 513.610567] amdgpu_bo_do_create+0x1d2/0x650 [amdgpu] [ 513.610811] amdgpu_bo_create+0x40/0x1f0 [amdgpu] [ 513.611041] amdgpu_bo_create_reserved+0x249/0x2d0 [amdgpu] [ 513.611290] amdgpu_bo_create_kernel+0x12/0x70 [amdgpu] [ 513.611584] amdgpu_ttm_init+0x2cb/0x560 [amdgpu] [ 513.611823] gmc_v9_0_sw_init+0x400/0x750 [amdgpu] [ 513.612491] amdgpu_device_init+0x14eb/0x1990 [amdgpu] [ 513.612730] amdgpu_driver_load_kms+0x78/0x290 [amdgpu] [ 513.612958] drm_dev_register+0x111/0x1a0 [ 513.613171] amdgpu_pci_probe+0x11c/0x1e0 [amdgpu] [ 513.613389] local_pci_probe+0x3f/0x90 [ 513.613581] pci_device_probe+0x102/0x1c0 [ 513.613779] driver_probe_device+0x2a7/0x480 [ 513.613984] __driver_attach+0x10a/0x110 [ 513.614179] bus_for_each_dev+0x67/0xc0 [ 513.614372] bus_add_driver+0x1eb/0x260 [ 513.614565] driver_register+0x5b/0xe0 [ 513.614756] do_one_initcall+0xac/0x357 [ 513.614952] do_init_module+0x5b/0x213 [ 513.615145] load_module+0x2542/0x2d30 [ 513.615337] __do_sys_finit_module+0xd2/0x100 [ 513.615541] do_syscall_64+0x50/0x1a0 [ 513.615731] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 513.615963] -> #1 (reservation_ww_class_mutex){+.+.}: [ 513.616293] amdgpu_amdkfd_alloc_gtt_mem+0xcf/0x2c0 [amdgpu] [ 513.616554] init_mqd+0x223/0x260 [amdgpu] [ 513.616779] create_queue_nocpsch+0x4d9/0x600 [amdgpu] [ 513.617031] pqm_create_queue+0x37c/0x520 [amdgpu] [ 513.617270] kfd_ioctl_create_queue+0x2f9/0x650 [amdgpu] [ 513.617522] kfd_ioctl+0x202/0x350 [amdgpu] [ 513.617724] do_vfs_ioctl+0x9f/0x6c0 [ 513.617914] ksys_ioctl+0x66/0x70 [ 513.618095] __x64_sys_ioctl+0x16/0x20 [ 513.618286] do_syscall_64+0x50/0x1a0 [ 513.618476] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 513.618695] -> #0 (&dqm->lock_hidden){+.+.}: [ 513.618984] __mutex_lock+0x98/0x970 [ 513.619197] evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.619459] kfd_process_evict_queues+0x3b/0xb0 [amdgpu] [ 513.619710] kgd2kfd_quiesce_mm+0x1c/0x40 [amdgpu] [ 513.620103] amdgpu_amdkfd_evict_userptr+0x38/0x70 [amdgpu] [ 513.620363] amdgpu_mn_invalidate_range_start_hsa+0xa6/0xc0 [amdgpu] [ 513.620614] __mmu_notifier_invalidate_range_start+0x70/0xb0 [ 513.620851] try_to_unmap_one+0x7fc/0x8f0 [ 513.621049] rmap_walk_anon+0x121/0x290 [ 513.621242] try_to_unmap+0x93/0xf0 [ 513.621428] shrink_page_list+0x606/0xcb0 [ 513.621625] shrink_inactive_list+0x33b/0x700 [ 513.621835] shrink_node_memcg+0x37a/0x7f0 [ 513.622034] shrink_node+0xd8/0x490 [ 513.622219] balance_pgdat+0x18b/0x3b0 [ 513.622410] kswapd+0x203/0x5c0 [ 513.622589] kthread+0x100/0x140 [ 513.622769] ret_from_fork+0x24/0x30 [ 513.622957] other info that might help us debug this: [ 513.623354] Chain exists of: &dqm->lock_hidden --> &mapping->i_mmap_rwsem --> &anon_vma->rwsem [ 513.623900] Possible unsafe locking scenario: [ 513.624189] CPU0 CPU1 [ 513.624397] ---- ---- [ 513.624594] lock(&anon_vma->rwsem); [ 513.624771] lock(&mapping->i_mmap_rwsem); [ 513.625020] lock(&anon_vma->rwsem); [ 513.625253] lock(&dqm->lock_hidden); [ 513.625433] *** DEADLOCK *** [ 513.625783] 3 locks held by kswapd0/611: [ 513.625967] #0: 00000000f14edf84 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30 [ 513.626309] #1: 00000000961547fc (&anon_vma->rwsem){++++}, at: page_lock_anon_vma_read+0xe4/0x250 [ 513.626671] #2: 0000000067b5cd12 (srcu){....}, at: __mmu_notifier_invalidate_range_start+0x5/0xb0 [ 513.627037] stack backtrace: [ 513.627292] CPU: 0 PID: 611 Comm: kswapd0 Tainted: G W 4.18.0-kfd-root #2 [ 513.627632] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 513.627990] Call Trace: [ 513.628143] dump_stack+0x7c/0xbb [ 513.628315] print_circular_bug.isra.37+0x21b/0x228 [ 513.628581] __lock_acquire+0xf7d/0x1470 [ 513.628782] ? unwind_next_frame+0x6c/0x4f0 [ 513.628974] ? lock_acquire+0xec/0x1e0 [ 513.629154] lock_acquire+0xec/0x1e0 [ 513.629357] ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.629587] __mutex_lock+0x98/0x970 [ 513.629790] ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.630047] ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.630309] ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.630562] evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.630816] kfd_process_evict_queues+0x3b/0xb0 [amdgpu] [ 513.631057] kgd2kfd_quiesce_mm+0x1c/0x40 [amdgpu] [ 513.631288] amdgpu_amdkfd_evict_userptr+0x38/0x70 [amdgpu] [ 513.631536] amdgpu_mn_invalidate_range_start_hsa+0xa6/0xc0 [amdgpu] [ 513.632076] __mmu_notifier_invalidate_range_start+0x70/0xb0 [ 513.632299] try_to_unmap_one+0x7fc/0x8f0 [ 513.632487] ? page_lock_anon_vma_read+0x68/0x250 [ 513.632690] rmap_walk_anon+0x121/0x290 [ 513.632875] try_to_unmap+0x93/0xf0 [ 513.633050] ? page_remove_rmap+0x330/0x330 [ 513.633239] ? rcu_read_unlock+0x60/0x60 [ 513.633422] ? page_get_anon_vma+0x160/0x160 [ 513.633613] shrink_page_list+0x606/0xcb0 [ 513.633800] shrink_inactive_list+0x33b/0x700 [ 513.633997] shrink_node_memcg+0x37a/0x7f0 [ 513.634186] ? shrink_node+0xd8/0x490 [ 513.634363] shrink_node+0xd8/0x490 [ 513.634537] balance_pgdat+0x18b/0x3b0 [ 513.634718] kswapd+0x203/0x5c0 [ 513.634887] ? wait_woken+0xb0/0xb0 [ 513.635062] kthread+0x100/0x140 [ 513.635231] ? balance_pgdat+0x3b0/0x3b0 [ 513.635414] ? kthread_delayed_work_timer_fn+0x80/0x80 [ 513.635626] ret_from_fork+0x24/0x30 [ 513.636042] Evicting PASID 32768 queues [ 513.936236] Restoring PASID 32768 queues [ 524.708912] Evicting PASID 32768 queues [ 524.999875] Restoring PASID 32768 queues Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* Revert "drm/amdkfd: Fix a circular lock dependency"Oak Zeng2019-06-171-10/+11
| | | | | | | | | | | This reverts commit 06b89b38f3cc518a761164f9f958a9607bbb3587. This fix is not proper. allocate_mqd can't be moved before allocate_sdma_queue as it depends on q->properties->sdma_id set in later. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* Revert "drm/amdkfd: Fix sdma queue allocate race condition"Oak Zeng2019-06-171-12/+14
| | | | | | | | | | | This reverts commit f77dac6cd62e5d4bcadd740620af1218bfb54cc6. This fix is not proper. allocate_mqd can't be moved before allocate_sdma_queue as it depends on q->properties->sdma_id set in later. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
OpenPOWER on IntegriCloud