blackbird-op-linux - Blackbird™ Linux sources for OpenPOWER

	Commit message (Collapse)	Author	Age	Files	Lines
*	drm/i915: Hold reference to intel_frontbuffer as we track activity	Chris Wilson	2019-12-18	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since obj->frontbuffer is no longer protected by the struct_mutex, as we are processing the execbuf, it may be removed. Mark the intel_frontbuffer as rcu protected, and so acquire a reference to the struct as we track activity upon it. Closes: https://gitlab.freedesktop.org/drm/intel/issues/827 Fixes: 8e7cb1799b4f ("drm/i915: Extract intel_frontbuffer active tracking") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v5.4+ Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191218104043.3539458-1-chris@chris-wilson.co.uk
*	drm/i915/gem: Serialise object before changing cache-level	Chris Wilson	2019-12-14	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Wait for the object to be idle before changing its cache-level and unbinding. This was dropped as supposedly superfluous from commit 8b1c78e06e61 ("drm/i915: Avoid calling i915_gem_object_unbind holding object lock"), but it turns out to prevent some cache dirt escaping. Smells like papering over a race... Closes: https://gitlab.freedesktop.org/drm/intel/issues/820 Fixes: 8b1c78e06e61 ("drm/i915: Avoid calling i915_gem_object_unbind holding object lock") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191213223140.1830738-1-chris@chris-wilson.co.uk
*	drm/i915/gem: Avoid rcu_barrier() from shrinker paths	Chris Wilson	2019-12-09	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As i915_gem_object_unbind() waits on an rcu_barrier() to flush vm releases (and destruction of their bound vma), we have to be careful not to invoke that barrier from beneath the shrinker: <4> [430.222671] WARNING: possible circular locking dependency detected <4> [430.222673] 5.4.0-rc8-CI-CI_DRM_7508+ #1 Tainted: G U <4> [430.222675] ------------------------------------------------------ <4> [430.222677] gem_pwrite/2317 is trying to acquire lock: <4> [430.222678] ffffffff82248218 (rcu_state.barrier_mutex){+.+.}, at: rcu_barrier+0x23/0x190 <4> [430.222685] but task is already holding lock: <4> [430.222687] ffffffff82263a40 (fs_reclaim){+.+.}, at: fs_reclaim_acquire.part.117+0x0/0x30 <4> [430.222691] which lock already depends on the new lock. <4> [430.222693] the existing dependency chain (in reverse order) is: <4> [430.222695] -> #2 (fs_reclaim){+.+.}: <4> [430.222698] fs_reclaim_acquire.part.117+0x24/0x30 <4> [430.222702] kmem_cache_alloc_trace+0x2a/0x2c0 <4> [430.222705] intel_cpuc_prepare+0x37/0x1a0 <4> [430.222709] cpuhp_invoke_callback+0x9b/0x9d0 <4> [430.222712] _cpu_up+0xa2/0x140 <4> [430.222714] do_cpu_up+0x61/0xa0 <4> [430.222718] smp_init+0x57/0x96 <4> [430.222722] kernel_init_freeable+0xac/0x1c7 <4> [430.222725] kernel_init+0x5/0x100 <4> [430.222728] ret_from_fork+0x24/0x50 <4> [430.222729] -> #1 (cpu_hotplug_lock.rw_sem){++++}: <4> [430.222733] cpus_read_lock+0x34/0xd0 <4> [430.222734] rcu_barrier+0xaa/0x190 <4> [430.222736] kernel_init+0x21/0x100 <4> [430.222737] ret_from_fork+0x24/0x50 <4> [430.222739] -> #0 (rcu_state.barrier_mutex){+.+.}: <4> [430.222742] __lock_acquire+0x1328/0x15d0 <4> [430.222743] lock_acquire+0xa7/0x1c0 <4> [430.222746] __mutex_lock+0x9a/0x9d0 <4> [430.222747] rcu_barrier+0x23/0x190 <4> [430.222850] i915_gem_object_unbind+0x264/0x3d0 [i915] <4> [430.222882] i915_gem_shrink+0x297/0x5f0 [i915] <4> [430.222912] i915_gem_shrink_all+0x38/0x60 [i915] <4> [430.222934] i915_drop_caches_set+0x1f0/0x240 [i915] <4> [430.222938] simple_attr_write+0xb0/0xd0 <4> [430.222941] full_proxy_write+0x51/0x80 <4> [430.222943] vfs_write+0xb9/0x1d0 <4> [430.222944] ksys_write+0x9f/0xe0 <4> [430.222946] do_syscall_64+0x4f/0x210 <4> [430.222948] entry_SYSCALL_64_after_hwframe+0x49/0xbe <4> [430.222950] other info that might help us debug this: <4> [430.222952] Chain exists of: rcu_state.barrier_mutex --> cpu_hotplug_lock.rw_sem --> fs_reclaim <4> [430.222955] Possible unsafe locking scenario: <4> [430.222957] CPU0 CPU1 <4> [430.222958] ---- ---- <4> [430.222960] lock(fs_reclaim); <4> [430.222961] lock(cpu_hotplug_lock.rw_sem); <4> [430.222963] lock(fs_reclaim); <4> [430.222964] lock(rcu_state.barrier_mutex); <4> [430.222966] * DEADLOCK * <4> [430.222968] 3 locks held by gem_pwrite/2317: <4> [430.222969] #0: ffff88849e2d9408 (sb_writers#14){.+.+}, at: vfs_write+0x1a4/0x1d0 <4> [430.222973] #1: ffff888496976db0 (&attr->mutex){+.+.}, at: simple_attr_write+0x36/0xd0 <4> [430.222976] #2: ffffffff82263a40 (fs_reclaim){+.+.}, at: fs_reclaim_acquire.part.117+0x0/0x30 <4> [430.222980] stack backtrace: <4> [430.222982] CPU: 1 PID: 2317 Comm: gem_pwrite Tainted: G U 5.4.0-rc8-CI-CI_DRM_7508+ #1 <4> [430.222985] Hardware name: Intel Corporation Tiger Lake Client Platform/TigerLake U DDR4 SODIMM RVP, BIOS TGLSFWI1.R00.2321.A08.1909162051 09/16/2019 <4> [430.222989] Call Trace: <4> [430.222992] dump_stack+0x71/0x9b <4> [430.222995] check_noncircular+0x19b/0x1c0 <4> [430.222998] ? __lock_acquire+0x1328/0x15d0 <4> [430.222999] __lock_acquire+0x1328/0x15d0 <4> [430.223001] ? mark_held_locks+0x49/0x70 <4> [430.223003] lock_acquire+0xa7/0x1c0 <4> [430.223005] ? rcu_barrier+0x23/0x190 <4> [430.223008] __mutex_lock+0x9a/0x9d0 <4> [430.223009] ? rcu_barrier+0x23/0x190 <4> [430.223011] ? rcu_barrier+0x23/0x190 <4> [430.223013] ? find_held_lock+0x2d/0x90 <4> [430.223045] ? i915_gem_object_unbind+0x24a/0x3d0 [i915] <4> [430.223048] ? rcu_barrier+0x23/0x190 <4> [430.223049] rcu_barrier+0x23/0x190 <4> [430.223081] i915_gem_object_unbind+0x264/0x3d0 [i915] <4> [430.223119] i915_gem_shrink+0x297/0x5f0 [i915] Closes: https://gitlab.freedesktop.org/drm/intel/issues/743 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191208161252.3015727-1-chris@chris-wilson.co.uk
*	drm/i915: Avoid calling i915_gem_object_unbind holding object lock	Chris Wilson	2019-12-07	1	-30/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the extreme case, we may wish to wait on an rcu-barrier to reap stale vm to purge the last of the object bindings. However, we are not allowed to use rcu_barrier() beneath the dma_resv (i.e. object) lock and do not take lightly the prospect of unlocking a mutex deep in the bowels of the routine. i915_gem_object_unbind() itself does not need the object lock, and it turns out the callers do not need to the unbind as part of a locked sequence around set-cache-level, so rearrange the code to avoid taking the object lock in the callers. <4> [186.816311] ====================================================== <4> [186.816313] WARNING: possible circular locking dependency detected <4> [186.816316] 5.4.0-rc8-CI-CI_DRM_7486+ #1 Tainted: G U <4> [186.816318] ------------------------------------------------------ <4> [186.816320] perf_pmu/1321 is trying to acquire lock: <4> [186.816322] ffff88849487c4d8 (&mm->mmap_sem#2){++++}, at: __might_fault+0x39/0x90 <4> [186.816331] but task is already holding lock: <4> [186.816333] ffffe8ffffa05008 (&cpuctx_mutex){+.+.}, at: perf_event_ctx_lock_nested+0xa9/0x1b0 <4> [186.816339] which lock already depends on the new lock. <4> [186.816341] the existing dependency chain (in reverse order) is: <4> [186.816343] -> #6 (&cpuctx_mutex){+.+.}: <4> [186.816349] __mutex_lock+0x9a/0x9d0 <4> [186.816352] perf_event_init_cpu+0xa4/0x140 <4> [186.816357] perf_event_init+0x19d/0x1cd <4> [186.816362] start_kernel+0x372/0x4f4 <4> [186.816365] secondary_startup_64+0xa4/0xb0 <4> [186.816381] -> #5 (pmus_lock){+.+.}: <4> [186.816385] __mutex_lock+0x9a/0x9d0 <4> [186.816387] perf_event_init_cpu+0x6b/0x140 <4> [186.816404] cpuhp_invoke_callback+0x9b/0x9d0 <4> [186.816406] _cpu_up+0xa2/0x140 <4> [186.816409] do_cpu_up+0x61/0xa0 <4> [186.816411] smp_init+0x57/0x96 <4> [186.816413] kernel_init_freeable+0xac/0x1c7 <4> [186.816416] kernel_init+0x5/0x100 <4> [186.816419] ret_from_fork+0x24/0x50 <4> [186.816421] -> #4 (cpu_hotplug_lock.rw_sem){++++}: <4> [186.816424] cpus_read_lock+0x34/0xd0 <4> [186.816427] rcu_barrier+0xaa/0x190 <4> [186.816429] kernel_init+0x21/0x100 <4> [186.816431] ret_from_fork+0x24/0x50 <4> [186.816433] -> #3 (rcu_state.barrier_mutex){+.+.}: <4> [186.816436] __mutex_lock+0x9a/0x9d0 <4> [186.816438] rcu_barrier+0x23/0x190 <4> [186.816502] i915_gem_object_unbind+0x3a6/0x400 [i915] <4> [186.816537] i915_gem_object_set_cache_level+0x32/0x90 [i915] <4> [186.816571] i915_gem_object_pin_to_display_plane+0x5d/0x160 [i915] <4> [186.816612] intel_pin_and_fence_fb_obj+0x9e/0x200 [i915] <4> [186.816679] intel_plane_pin_fb+0x3f/0xd0 [i915] <4> [186.816717] intel_prepare_plane_fb+0x130/0x520 [i915] <4> [186.816722] drm_atomic_helper_prepare_planes+0x85/0x110 <4> [186.816761] intel_atomic_commit+0xc6/0x350 [i915] <4> [186.816764] drm_atomic_helper_update_plane+0xed/0x110 <4> [186.816768] setplane_internal+0x97/0x190 <4> [186.816770] drm_mode_setplane+0xcd/0x190 <4> [186.816773] drm_ioctl_kernel+0xa7/0xf0 <4> [186.816775] drm_ioctl+0x2e1/0x390 <4> [186.816778] do_vfs_ioctl+0xa0/0x6f0 <4> [186.816780] ksys_ioctl+0x35/0x60 <4> [186.816782] __x64_sys_ioctl+0x11/0x20 <4> [186.816785] do_syscall_64+0x4f/0x210 <4> [186.816787] entry_SYSCALL_64_after_hwframe+0x49/0xbe <4> [186.816789] -> #2 (reservation_ww_class_mutex){+.+.}: <4> [186.816793] __ww_mutex_lock.constprop.15+0xc3/0x1090 <4> [186.816795] ww_mutex_lock+0x39/0x70 <4> [186.816798] dma_resv_lockdep+0x10e/0x1f7 <4> [186.816800] do_one_initcall+0x58/0x2ff <4> [186.816802] kernel_init_freeable+0x137/0x1c7 <4> [186.816804] kernel_init+0x5/0x100 <4> [186.816806] ret_from_fork+0x24/0x50 <4> [186.816808] -> #1 (reservation_ww_class_acquire){+.+.}: <4> [186.816811] dma_resv_lockdep+0xec/0x1f7 <4> [186.816813] do_one_initcall+0x58/0x2ff <4> [186.816815] kernel_init_freeable+0x137/0x1c7 <4> [186.816817] kernel_init+0x5/0x100 <4> [186.816819] ret_from_fork+0x24/0x50 <4> [186.816820] -> #0 (&mm->mmap_sem#2){++++}: <4> [186.816824] __lock_acquire+0x1328/0x15d0 <4> [186.816826] lock_acquire+0xa7/0x1c0 <4> [186.816828] __might_fault+0x63/0x90 <4> [186.816831] _copy_to_user+0x1e/0x80 <4> [186.816834] perf_read+0x200/0x2b0 <4> [186.816836] vfs_read+0x96/0x160 <4> [186.816838] ksys_read+0x9f/0xe0 <4> [186.816839] do_syscall_64+0x4f/0x210 <4> [186.816841] entry_SYSCALL_64_after_hwframe+0x49/0xbe <4> [186.816843] other info that might help us debug this: <4> [186.816846] Chain exists of: &mm->mmap_sem#2 --> pmus_lock --> &cpuctx_mutex <4> [186.816849] Possible unsafe locking scenario: <4> [186.816851] CPU0 CPU1 <4> [186.816853] ---- ---- <4> [186.816854] lock(&cpuctx_mutex); <4> [186.816856] lock(pmus_lock); <4> [186.816858] lock(&cpuctx_mutex); <4> [186.816860] lock(&mm->mmap_sem#2); <4> [186.816861] * DEADLOCK * Closes: https://gitlab.freedesktop.org/drm/intel/issues/728 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191206105527.1130413-5-chris@chris-wilson.co.uk
*	drm/i915/gem: Hold the obj->vma.lock while walking the vma.list	Chris Wilson	2019-12-04	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	Remember to take the lock before walking the obj->vma.list so that the nodes do not change beneath us! E.g., i915_gem_object_bump_inactive_ggtt:387 GEM_BUG_ON(vma->vm != &i915->ggtt.vm) Closes: https://gitlab.freedesktop.org/drm/intel/issues/691 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191204164527.3872783-1-chris@chris-wilson.co.uk
*	drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET	Abdiel Janulgue	2019-12-04	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is really just an alias of mmap_gtt. The 'mmap offset' nomenclature comes from the value returned by this ioctl which is the offset into the device fd which userpace uses with mmap(2). mmap_gtt was our initial mmap_offset implementation, this extends our CPU mmap support to allow additional fault handlers that depends on the object's backing pages. Note that we multiplex mmap_gtt and mmap_offset through the same ioctl, and use the zero extending behaviour of drm to differentiate between them, when we inspect the flags. To support multiple mmap types on an object we need to support multiple mmap_offsets for an object (each offset in the global device address space corresponding to a unique instance of the object for a file + mmap type). As we drop the simplified drm core idea of a single mmap_offset, we need to provide replacement hooks for the dumb mmap interface as well. Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1675 Testcase: igt/gem_mmap_offset Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191204120032.3682839-1-chris@chris-wilson.co.uk
*	drm/i915/gem: Unbind all current vma on changing cache-level	Chris Wilson	2019-12-02	1	-119/+4
\| \| \| \| \| \| \| \| \| \| \| \|	Avoid dangerous race handling of destroyed vma by unbinding all vma instead. Unfortunately, this stops us from trying to be clever and only doing the minimal change required, so on first use of scanout we may encounter an annoying stall as it transitions to a new cache level. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112413 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191202174310.2630302-1-chris@chris-wilson.co.uk
*	drm/i915/gem: Track ggtt writes from userspace on the bound vma	Chris Wilson	2019-11-19	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \|	When userspace writes into the GTT itself, it is supposed to call set-domain to let the kernel keep track and so manage the CPU/GPU caches. As we track writes on the individual i915_vma, we should also be sure to mark them as dirty. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191119112515.2766748-1-chris@chris-wilson.co.uk
*	drm/i915: FB backing gem obj should reside in LMEM	Ramalingam C	2019-11-07	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If Local memory is supported by hardware, we want framebuffer backing gem objects from local memory. if the backing obj is not from LMEM, pin_to_display is failed. v2: memory regions are correctly assigned to obj->memory_regions [tvrtko] migration failure is reported as debug log [Tvrtko] v3: Migration is dropped. only error is reported [Daniel] mem region check is move to pin_to_display [Chris] v4: s/dev_priv/i915 [chris] v5: i915_gem_object_is_lmem is used for detecting the obj mem type. [Matt] cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Ramalingam C <ramalingam.c@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191105144414.30470-1-ramalingam.c@intel.com
*	drm/i915: Pull i915_vma_pin under the vm->mutex	Chris Wilson	2019-10-04	1	-6/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Replace the struct_mutex requirement for pinning the i915_vma with the local vm->mutex instead. Note that the vm->mutex is tainted by the shrinker (we require unbinding from inside fs-reclaim) and so we cannot allocate while holding that mutex. Instead we have to preallocate workers to do allocate and apply the PTE updates after we have we reserved their slot in the drm_mm (using fences to order the PTE writes with the GPU work and with later unbind). In adding the asynchronous vma binding, one subtle requirement is to avoid coupling the binding fence into the backing object->resv. That is the asynchronous binding only applies to the vma timeline itself and not to the pages as that is a more global timeline (the binding of one vma does not need to be ordered with another vma, nor does the implicit GEM fencing depend on a vma, only on writes to the backing store). Keeping the vma binding distinct from the backing store timelines is verified by a number of async gem_exec_fence and gem_exec_schedule tests. The way we do this is quite simple, we keep the fence for the vma binding separate and only wait on it as required, and never add it to the obj->resv itself. Another consequence in reducing the locking around the vma is the destruction of the vma is no longer globally serialised by struct_mutex. A natural solution would be to add a kref to i915_vma, but that requires decoupling the reference cycles, possibly by introducing a new i915_mm_pages object that is own by both obj->mm and vma->pages. However, we have not taken that route due to the overshadowing lmem/ttm discussions, and instead play a series of complicated games with trylocks to (hopefully) ensure that only one destruction path is called! v2: Add some commentary, and some helpers to reduce patch churn. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-4-chris@chris-wilson.co.uk
*	drm/i915: Make shrink/unshrink be atomic	Chris Wilson	2019-09-11	1	-1/+2
\| \| \| \| \| \| \| \| \|	Add an atomic counter and always take the spinlock around the pin/unpin events, so that we can perform the list manipulation concurrently. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190910212204.17190-1-chris@chris-wilson.co.uk
*	drm/i915: cleanup cache-coloring	Matthew Auld	2019-09-09	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Try to tidy up the cache-coloring such that we rid the code of any mm.color_adjust assumptions, this should hopefully make it more obvious in the code when we need to actually use the cache-level as the color, and as a bonus should make adding a different color-scheme simpler. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190909124052.22900-3-matthew.auld@intel.com
*	drm/i915: Replace obj->pin_global with obj->frontbuffer	Chris Wilson	2019-09-03	1	-24/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	obj->pin_global was originally used as a means to keep the shrinker off the active scanout, but we use the vma->pin_count itself for that and the obj->frontbuffer to delay shrinking active framebuffers. The other role that obj->pin_global gained was for spotting display objects inside GEM and working harder to keep those coherent; for which we can again simply inspect obj->frontbuffer directly. Coming up next, we will want to manipulate the pin_global counter outside of the principle locks, so would need to make pin_global atomic. However, since obj->frontbuffer is already managed atomically, it makes sense to use that the primary key for display objects instead of having pin_global. Ville pointed out the principle difference is that obj->frontbuffer is set for as long as an intel_framebuffer is attached to an object, but obj->pin_global was only raised for as long as the object was active. In practice, this means that we consider the object as being on the scanout for longer than is strictly required, causing us to be more proactive in flushing -- though it should be true that we would have flushed eventually when the back became the front, except that on the flip path that flush is async but when hit from another ioctl it will be synchronous. v2: i915_gem_object_is_framebuffer() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190902040303.14195-5-chris@chris-wilson.co.uk
*	drm/i915: Replace i915_vma_put_fence()	Chris Wilson	2019-08-22	1	-8/+27
\| \| \| \| \| \| \| \| \| \| \|	Avoid calling i915_vma_put_fence() by using our alternate paths that bind a secondary vma avoiding the original fenced vma. For the few instances where we need to release the fence (i.e. on binding when the GGTT range becomes invalid), replace the put_fence with a revoke_fence. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190822061557.18402-1-chris@chris-wilson.co.uk
*	drm/i915: Extract intel_frontbuffer active tracking	Chris Wilson	2019-08-16	1	-11/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	Move the active tracking for the frontbuffer operations out of the i915_gem_object and into its own first class (refcounted) object. In the process of detangling, we switch from low level request tracking to the easier i915_active -- with the plan that this avoids any potential atomic callbacks as the frontbuffer tracking wishes to sleep as it flushes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190816074635.26062-1-chris@chris-wilson.co.uk
*	drm/i915: move modesetting core code under display/	Jani Nikula	2019-06-17	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we have a new subdirectory for display code, continue by moving modesetting core code. display/intel_frontbuffer.h sticks out like a sore thumb, otherwise this is, again, a surprisingly clean operation. v2: - don't move intel_sideband.[ch] (Ville) - use tabs for Makefile file lists and sort them Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190613084416.6794-3-jani.nikula@intel.com
*	drm/i915: Combine unbound/bound list tracking for objects	Chris Wilson	2019-06-12	1	-8/+3
\| \| \| \| \| \| \| \| \| \| \| \|	With async binding, we don't want to manage a bound/unbound list as we may end up running before we even acquire the pages. All that is required is keeping track of shrinkable objects, so reduce it to the minimum list. Fixes: 6951e5893b48 ("drm/i915: Move GEM object domain management from struct_mutex to local") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190612105720.30310-1-chris@chris-wilson.co.uk
*	drm/i915: Promote i915->mm.obj_lock to be irqsafe	Chris Wilson	2019-06-10	1	-9/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The intent is to be able to update the mm.lists from inside an irqsoff section (e.g. from a softirq rcu workqueue), ergo we need to make the i915->mm.obj_lock irqsafe. v2: can_discard_pages() ensures we are shrinkable v3: Beware shadowing of 'flags' Fixes: 3b4fa9640ccd ("drm/i915: Track the purgeable objects on a separate eviction list") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110869 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190610145430.17717-1-chris@chris-wilson.co.uk
*	drm/i915: Report all objects with allocated pages to the shrinker	Chris Wilson	2019-05-31	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, we try to report to the shrinker the precise number of objects (pages) that are available to be reaped at this moment. This requires searching all objects with allocated pages to see if they fulfill the search criteria, and this count is performed quite frequently. (The shrinker tries to free ~128 pages on each invocation, before which we count all the objects; counting takes longer than unbinding the objects!) If we take the pragmatic view that with sufficient desire, all objects are eventually reapable (they become inactive, or no longer used as framebuffer etc), we can simply return the count of pinned pages maintained during get_pages/put_pages rather than walk the lists every time. The downside is that we may (slightly) over-report the number of objects/pages we could shrink and so penalize ourselves by shrinking more than required. This is mitigated by keeping the order in which we shrink objects such that we avoid penalizing active and frequently used objects, and if memory is so tight that we need to free them we would need to anyway. v2: Only expose shrinkable objects to the shrinker; a small reduction in not considering stolen and foreign objects. v3: Restore the tracking from a "backup" copy from before the gem/ split Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190530203500.26272-2-chris@chris-wilson.co.uk
*	drm/i915: Track the purgeable objects on a separate eviction list	Chris Wilson	2019-05-31	1	-5/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the purgeable objects, I915_MADV_DONTNEED, are mixed in the normal bound/unbound lists. Every shrinker pass starts with an attempt to purge from this set of unneeded objects, which entails us doing a walk over both lists looking for any candidates. If there are none, and since we are shrinking we can reasonably assume that the lists are full!, this becomes a very slow futile walk. If we separate out the purgeable objects into own list, this search then becomes its own phase that is preferentially handled during shrinking. Instead the cost becomes that we then need to filter the purgeable list if we want to distinguish between bound and unbound objects. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190530203500.26272-1-chris@chris-wilson.co.uk
*	drm/i915: Move GEM object domain management from struct_mutex to local	Chris Wilson	2019-05-28	1	-31/+39
\| \| \| \| \| \| \| \| \| \| \| \|	Use the per-object local lock to control the cache domain of the individual GEM objects, not struct_mutex. This is a huge leap forward for us in terms of object-level synchronisation; execbuffers are coordinated using the ww_mutex and pread/pwrite is finally fully serialised again. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-10-chris@chris-wilson.co.uk
*	drm/i915: Move GEM domain management to its own file	Chris Wilson	2019-05-28	1	-0/+782
	Continuing the decluttering of i915_gem.c, that of the read/write domains, perhaps the biggest of GEM's follies? Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-7-chris@chris-wilson.co.uk