summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/amdkfd
Commit message (Collapse)AuthorAgeFilesLines
...
| * drm/amdkfd: Support flat memory apertures for GFXv9Felix Kuehling2018-04-101-28/+87
| | | | | | | | | | | | Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Remove limit on number of GPUs (follow-up)Felix Kuehling2018-04-101-3/+1
| | | | | | | | | | | | | | | | This condition was missed in a previous commit with the same title. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Add 64-bit doorbell and wptr support to kernel queueFelix Kuehling2018-04-087-7/+63
| | | | | | | | | | | | | | | | v2: Removed redundant 0x before %p. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Fix kernel queue rollback_packetFelix Kuehling2018-04-101-1/+1
| | | | | | | | | | | | | | | | | | | | kq->queue->properties.write_ptr is a GPU address which can'd be derefenced in the kernel. Use kq->wptr_kernel instead, which is the kernel CPU address of the same buffer. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Fix goto usageFelix Kuehling2018-04-101-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Missed a spot in previous cleanup commit: Remove gotos that do not feature any common cleanup, and use gotos instead of repeating cleanup commands. According to kernel.org: "The goto statement comes in handy when a function exits from multiple locations and some common work such as cleanup has to be done. If there is no cleanup needed then just return directly." Signed-off-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Add SOC15 interrupt processing supportFelix Kuehling2018-04-104-1/+134
| | | | | | | | | | | | | | | | Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Add GFXv9 device queue managerFelix Kuehling2018-04-106-2/+106
| | | | | | | | | | | | | | Signed-off-by: John Bridgman <john.bridgman@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Add GFXv9 MQD managerFelix Kuehling2018-04-105-1/+451
| | | | | | | | | | | | | | | | Signed-off-by: John Bridgman <john.bridgman@amd.com> Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Add GFXv9 PM4 packet writer functionsFelix Kuehling2018-04-106-12/+937
| | | | | | | | | | | | | | Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Move packet writer functions into ASIC-specific fileFelix Kuehling2018-04-104-316/+420
| | | | | | | | | | | | | | | | | | | | This is in preparation for GFXv9 (Vega10) which uses incompatible PM4 packet formats from previous ASIC generations. Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Implement doorbell allocation for SOC15Felix Kuehling2018-04-106-17/+139
| | | | | | | | | | | | | | | | | | | | | | Allocate doorbells according to the doorbell routing information on SOC15 ASICs (Vega10 and later). On older ASICs we continue to use the queue_id as the doorbell ID to maintain compatibility with the Thunk. Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Clean up KFD_MMAP_ offset handlingHarish Kasiviswanathan2018-04-105-33/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use bit-rotate for better clarity and remove _MASK from the #defines as these represent mmap types. Centralize all the parsing of the mmap offset in kfd_mmap and add device parameter to doorbell and reserved_mem map functions. Encode gpu_id into upper bits of vm_pgoff. This frees up the lower bits for encoding the the doorbell ID on Vega10. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Make doorbell size ASIC-dependentFelix Kuehling2018-04-103-26/+39
| | | | | | | | | | | | | | | | This prepares for GFXv9 (Vega10), which has 64-bit doorbells. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Add quiesce_mm and resume_mm to kgd2kfd_callsFelix Kuehling2018-03-234-5/+49
| | | | | | | | | | | | | | | | | | | | These interfaces allow KGD to stop and resume all GPU user mode queue access to a process address space. This is needed for handling MMU notifiers of userptrs mapped for GPU access in KFD VMs. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: GFP_NOIO while holding locks taken in MMU notifierFelix Kuehling2018-03-233-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When an MMU notifier runs in memory reclaim context, it can deadlock trying to take locks that are already held in the thread causing the memory reclaim. The solution is to avoid memory reclaim while holding locks that are taken in MMU notifiers by using GFP_NOIO. This commit fixes memory allocations done while holding the dqm->lock which is needed in the MMU notifier (dqm->ops.evict_process_queues). Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* | drm/amdkfd: fix build, select MMU_NOTIFIERRandy Dunlap2018-04-241-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CONFIG_MMU_NOTIFIER is not enabled, struct mmu_notifier has an incomplete type definition, which causes build errors. ../drivers/gpu/drm/amd/amdkfd/kfd_priv.h:607:22: error: field 'mmu_notifier' has incomplete type ../include/linux/kernel.h:979:32: error: dereferencing pointer to incomplete type ../include/linux/kernel.h:980:18: error: dereferencing pointer to incomplete type ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:434:2: error: implicit declaration of function 'mmu_notifier_unregister_no_release' [-Werror=implicit-function-declaration] ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:435:2: error: implicit declaration of function 'mmu_notifier_call_srcu' [-Werror=implicit-function-declaration] ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:438:21: error: variable 'kfd_process_mmu_notifier_ops' has initializer but incomplete type ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:439:2: error: unknown field 'release' specified in initializer ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:439:2: warning: excess elements in struct initializer [enabled by default] ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:439:2: warning: (near initialization for 'kfd_process_mmu_notifier_ops') [enabled by default] ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:534:2: error: implicit declaration of function 'mmu_notifier_register' [-Werror=implicit-function-declaration] Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Anders Roxell <anders.roxell@linaro.org> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* | drm/amdkfd: fix clock counter retrieval for node without GPUAndres Rodriguez2018-04-241-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently if a user requests clock counters for a node without a GPU resource we will always return EINVAL. Instead if no GPU resource is attached, fill the gpu_clock_counter argument with zeroes so that we may proceed and return valid CPU counters. Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* | drm/amdkfd: Fix the error return code in kfd_ioctl_unmap_memory_from_gpu()Wei Yongjun2018-04-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Passing NULL pointer to PTR_ERR will result in return value of 0 indicating success which is clearly not what it is intended here. This patch returns -EINVAL instead. v2: change ret code to -ENODEV Fixes: 5ec7e02854b3 ("drm/amdkfd: Add ioctls for GPUVM memory management") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* | drm/amdkfd: kfd_dev_is_large_bar() can be statickbuild test robot2018-04-241-1/+1
|/ | | | | | Fixes: 5ec7e02854b3 ("drm/amdkfd: Add ioctls for GPUVM memory management") Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* Merge tag 'drm-for-v4.17' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds2018-04-0225-326/+2538
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull drm updates from Dave Airlie: "Cannonlake and Vega12 support are probably the two major things. This pull lacks nouveau, Ben had some unforseen leave and a few other blockers so we'll see how things look or maybe leave it for this merge window. core: - Device links to handle sound/gpu pm dependency - Color encoding/range properties - Plane clipping into plane check helper - Backlight helpers - DP TP4 + HBR3 helper support amdgpu: - Vega12 support - Enable DC by default on all supported GPUs - Powerplay restructuring and cleanup - DC bandwidth calc updates - DC backlight on pre-DCE11 - TTM backing store dropping support - SR-IOV fixes - Adding "wattman" like functionality - DC crc support - Improved DC dual-link handling amdkfd: - GPUVM support for dGPU - KFD events for dGPU - Enable PCIe atomics for dGPUs - HSA process eviction support - Live-lock fixes for process eviction - VM page table allocation fix for large-bar systems panel: - Raydium RM68200 - AUO G104SN02 V2 - KEO TX31D200VM0BAA - ARM Versatile panels i915: - Cannonlake support enabled - AUX-F port support added - Icelake base enabling until internal milestone of forcewake support - Query uAPI interface (used for GPU topology information currently) - Compressed framebuffer support for sprites - kmem cache shrinking when GPU is idle - Avoid boosting GPU when waited item is being processed already - Avoid retraining LSPCON link unnecessarily - Decrease request signaling latency - Deprecation of I915_SET_COLORKEY_NONE - Kerneldoc and compiler warning cleanup for upcoming CI enforcements - Full range ycbcr toggling - HDCP support i915/gvt: - Big refactor for shadow ppgtt - KBL context save/restore via LRI cmd (Weinan) - Properly unmap dma for guest page (Changbin) vmwgfx: - Lots of various improvements etnaviv: - Use the drm gpu scheduler - prep work for GC7000L support vc4: - fix alpha blending - Expose perf counters to userspace pl111: - Bandwidth checking/limiting - Versatile panel support sun4i: - A83T HDMI support - A80 support - YUV plane support - H3/H5 HDMI support omapdrm: - HPD support for DVI connector - remove lots of static variables msm: - DSI updates from 10nm / SDM845 - fix for race condition with a3xx/a4xx fence completion irq - some refactoring/prep work for eventual a6xx support (ie. when we have a userspace) - a5xx debugfs enhancements - some mdp5 fixes/cleanups to prepare for eventually merging writeback - support (ie. when we have a userspace) tegra: - mmap() fixes for fbdev devices - Overlay plane for hw cursor fix - dma-buf cache maintenance support mali-dp: - YUV->RGB conversion support rockchip: - rk3399/chromebook fixes and improvements rcar-du: - LVDS support move to drm bridge - DT bindings for R8A77995 - Driver/DT support for R8A77970 tilcdc: - DRM panel support" * tag 'drm-for-v4.17' of git://people.freedesktop.org/~airlied/linux: (1646 commits) drm/i915: Fix hibernation with ACPI S0 target state drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt drm/i915: Specify which engines to reset following semaphore/event lockups drm/i915/dp: Write to SET_POWER dpcd to enable MST hub. drm/amdkfd: Use ordered workqueue to restore processes drm/amdgpu: Fix acquiring VM on large-BAR systems drm/amd/pp: clean header file hwmgr.h drm/amd/pp: use mlck_table.count for array loop index limit drm: Fix uabi regression by allowing garbage mode->type from userspace drm/amdgpu: Add an ATPX quirk for hybrid laptop drm/amdgpu: fix spelling mistake: "asssert" -> "assert" drm/amd/pp: Add new asic support in pp_psm.c drm/amd/pp: Clean up powerplay code on Vega12 drm/amd/pp: Add smu irq handlers for legacy asics drm/amd/pp: Fix set wrong temperature range on smu7 drm/amdgpu: Don't change preferred domian when fallback GTT v5 drm/vmwgfx: Bump version patchlevel and date drm/vmwgfx: use monotonic event timestamps drm/vmwgfx: Unpin the screen object backup buffer when not used drm/vmwgfx: Stricter count of legacy surface device resources ...
| * drm/amdkfd: Use ordered workqueue to restore processesFelix Kuehling2018-03-233-6/+32
| | | | | | | | | | | | | | | | | | | | | | Restoring multiple processes concurrently can lead to live-locks where each process prevents the other from validating all its BOs. v2: fix duplicate check of same variable Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Add module option for testing large-BAR functionalityFelix Kuehling2018-03-154-0/+19
| | | | | | | | | | | | | | | | | | | | Simulate large-BAR system by exporting only visible memory. This limits the amount of available VRAM to the size of the BAR, but enables CPU access to VRAM. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Kmap event page for dGPUsFelix Kuehling2018-03-153-2/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The events page must be accessible in user mode by the GPU and CPU as well as in kernel mode by the CPU. On dGPUs user mode virtual addresses are managed by the Thunk's GPU memory allocation code. Therefore we can't allocate the memory in kernel mode like we do on APUs. But KFD still needs to map the memory for kernel access. To facilitate this, the Thunk provides the buffer handle of the events page to KFD when creating the first event. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Add ioctls for GPUVM memory managementFelix Kuehling2018-03-152-0/+385
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | v2: * Fix error handling after kfd_bind_process_to_device in kfd_ioctl_map_memory_to_gpu v3: * Add ioctl to acquire VM from a DRM FD v4: * Return number of successful map/unmap operations in failure cases * Facilitate partial retry after failed map/unmap * Added comments with parameter descriptions to new APIs * Defined AMDKFD_IOC_FREE_MEMORY_OF_GPU write-only Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Add TC flush on VMID deallocation for HawaiiFelix Kuehling2018-03-154-1/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | On GFX7 the CP does not perform a TC flush when queues are unmapped. To avoid TC eviction from accessing an invalid VMID, flush it explicitly before releasing a VMID. v2: Fix unnecessary list_for_each_entry_safe v3: Moved allocation to kfd_process_device_init_vm Signed-off-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Allocate CWSR trap handler memory for dGPUsFelix Kuehling2018-03-151-10/+127
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Add helpers for allocating GPUVM memory in kernel mode and use them to allocate memory for the CWSR trap handler. v2: Use dev instead of pdd->dev in kfd_process_free_gpuvm v3: * Cleaned up and simplified kfd_process_alloc_gpuvm * Moved allocation for dGPU to kfd_process_device_init_vm Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Add per-process IDR for buffer handlesFelix Kuehling2018-03-152-0/+84
| | | | | | | | | | | | | | | | | | | | Also used for cleaning up on process termination. v2: Refactored cleanup on process termination Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Aperture setup for dGPUsFelix Kuehling2018-03-152-8/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set up the GPUVM aperture for SVM (shared virtual memory) that allows sharing a part of virtual address space between GPUs and CPUs. Report the size of the GPUVM aperture that is supported by KGD accurately. The low part of the GPUVM aperture is reserved for kernel use. This is for kernel-allocated buffers that are only accessed on the GPU: - CWSR trap handler - IB for submitting commands in user-mode context from kernel mode Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Remove limit on number of GPUsFelix Kuehling2018-03-152-12/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the number of GPUs is limited by aperture placement options available on GFX7 and GFX8 hardware. This limitation is not necessary. Scratch and LDS represent per-work-item and per-work-group storage respectively. Different work-items and work-groups use the same virtual address to access their own data. Work running on different GPUs is by definition in different work-groups (different dispatches, in fact). That means the same virtual addresses can be used for these apertures on different GPUs. Add a new AMDKFD_IOC_GET_PROCESS_APERTURES_NEW ioctl that removes the artificial limitation on the number of GPUs that can be supported. The new ioctl allows user mode to query the number of GPUs to allocate enough memory for all GPUs to be reported. This deprecates AMDKFD_IOC_GET_PROCESS_APERTURES. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Populate DRM render device minorOak Zeng2018-03-152-0/+5
| | | | | | | | | | | | | | | | | | Populate DRM render device minor in kfd topology Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Create KFD VMs on demandFelix Kuehling2018-03-152-10/+53
| | | | | | | | | | | | | | | | | | | | | | | | Instead of creating all VMs on process creation, create them when a process is bound to a device. This will later allow registering an existing VM from a DRM render node FD at runtime, before the process is bound to the device. This way the render node VM can be used for KFD instead of creating our own redundant VM. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: fix uninitialized variable useArnd Bergmann2018-03-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CONFIG_ACPI is disabled, we never initialize the acpi_table structure in kfd_create_crat_image_virtual: drivers/gpu/drm/amd/amdkfd/kfd_crat.c: In function 'kfd_create_crat_image_virtual': drivers/gpu/drm/amd/amdkfd/kfd_crat.c:888:40: error: 'acpi_table' may be used uninitialized in this function [-Werror=maybe-uninitialized] The undefined behavior also happens for any other acpi_get_table() failure, but then the compiler can't warn about it. This adds an error check that prevents the structure from being used in error, avoiding both the undefined behavior and the warning about it. Fixes: 520b8fb755cc ("drm/amdkfd: Add topology support for CPUs") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Implement KFD process eviction/restoreFelix Kuehling2018-02-068-8/+547
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the TTM memory manager in KGD evicts BOs, all user mode queues potentially accessing these BOs must be evicted temporarily. Once user mode queues are evicted, the eviction fence is signaled, allowing the migration of the BO to proceed. A delayed worker is scheduled to restore all the BOs belonging to the evicted process and restart its queues. During suspend/resume of the GPU we also evict all processes to allow KGD to save BOs in system memory, since VRAM will be lost. v2: * Account for eviction when updating of q->is_active in MQD manager Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Add GPUVM virtual address space to PDDFelix Kuehling2018-02-063-0/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | Create/destroy the GPUVM context during PDD creation/destruction. Get VM page table base and program it during process registration (HWS) or VMID allocation (non-HWS). v2: * Used dev instead of pdd->dev in kfd_flush_tlb Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Remove unaligned memory accessHarish Kasiviswanathan2018-02-061-16/+9
| | | | | | | | | | | | | | | | | | | | | | | | Unaligned atomic operations can cause problems on some CPU architectures. Use simpler bitmask operations instead. Atomic bit manipulations are not necessary since dqm->lock is held during these operations. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Centralize IOMMUv2 code and make it conditionalFelix Kuehling2017-12-0811-265/+495
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on ASIC information. Also allow building KFD without IOMMUv2 support. This is still useful for dGPUs and prepares for enabling KFD on architectures that don't support AMD IOMMUv2. v2: * Centralize IOMMUv2 code to avoid #ifdefs in too many places v3: * Imply AMD_IOMMU_V2 in Kconfig Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian Konig <christian.koenig@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Add dGPU device IDs and device infoFelix Kuehling2018-01-041-2/+143
| | | | | | | | | | | | | | | | | | v2: remove needs_iommu field as it doesn't exists CC: linux-pci@vger.kernel.org Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Add dGPU support to kernel_queue_initFelix Kuehling2018-01-041-0/+5
| | | | | | | | | | | | | | | | Recognize dGPU ASIC families. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Add dGPU support to the MQD managerFelix Kuehling2018-01-044-3/+64
| | | | | | | | | | | | | | | | | | On dGPUs don't set ATC addressing bits and use MTYPE_UC for coherent memory. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Add dGPU support to the device queue managerFelix Kuehling2018-01-044-0/+164
| | | | | | | | | | | | | | | | | | | | GFXv7 and v8 dGPUs use a different addressing mode for KFD compared to APUs (GPUVM64 vs HSA64). And dGPUs don't support MTYPE_CC. They use MTYPE_UC instead for memory that requires coherency. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Make sched_policy a per-device settingFelix Kuehling2018-01-046-9/+29
| | | | | | | | | | | | | | | | | | Some dGPUs don't support HWS. Allow them to use a per-device sched_policy that may be different from the global default. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Conditionally enable PCIe atomicsFelix Kuehling2018-01-042-0/+18
| | | | | | | | | | | | | | | | | | This will be needed for most dGPUs. CC: linux-pci@vger.kernel.org Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
| * drm/amdkfd: Use ARRAY_SIZE macro in kfd_build_sysfs_node_entryGustavo A. R. Silva2018-01-181-1/+1
| | | | | | | | | | | | | | | | | | | | Use ARRAY_SIZE instead of dividing sizeof array with sizeof an element. This issue was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Reviewed-by: Felix Kuehling<Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* | drm/amdkfd: Deallocate SDMA queues correctlyFelix Kuehling2018-03-231-6/+14
| | | | | | | | | | | | | | | | | | Deallocate SDMA queues during abnormal process termination and when queue creation fails after the SDMA allocation. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* | drm/amdkfd: Fix scratch memory with HWS enabledFelix Kuehling2018-03-231-2/+1
|/ | | | | | | | | Program sh_hidden_private_base_vmid correctly in the map-process PM4 packet. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* drm/amdkfd: Fix potential NULL pointer dereferencesGustavo A. R. Silva2018-01-101-1/+7
| | | | | | | | | | | | | | | | In case kfd_get_process_device_data returns null, there are some null pointer dereferences in functions kfd_bind_processes_to_device and kfd_unbind_processes_from_device. Fix this by printing a WARN_ON for PDDs that aren't found and skip them with continue statements. Addresses-Coverity-ID: 1463794 ("Dereference null return value") Addresses-Coverity-ID: 1463772 ("Dereference null return value") Suggested-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* drm/amdkfd: add ull suffix to 64bit definesOded Gabbay2018-01-101-3/+3
| | | | | Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
* drm/amdkfd: don't always call execute_queues_cpsch()Yong Zhao2018-01-021-5/+5
| | | | | | | | | When destroying an inactive queue, we don't need to call execute_queues_cpsch. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Oak Zeng <oak.zeng@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* drm/amdkfd: Fix return value 0 when execute_queues_cpsch failsYong Zhao2018-01-021-1/+1
| | | | | | Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Oak Zeng <oak.zeng@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* drm/amdkfd: Ignore ACPI CRAT for non-APU systemsHarish Kasiviswanathan2017-12-081-2/+22
| | | | | | | | | | | | Some AMD motherboards without an APU have a broken CRAT table which causes KFD initialization failures or incorrect information about NUMA nodes, CPU cores or system memory. Ignore CRAT tables without GPUs and rely on KFD's code to create a CRAT table for the CPU. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
OpenPOWER on IntegriCloud