summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/i915/i915_gem.c
Commit message (Collapse)AuthorAgeFilesLines
...
* drm/i915: Stop skipping the final clflush back to system pagesChris Wilson2016-11-111-4/+5
| | | | | | | | | | | | | When we release the shmem backing storage, we make sure that the pages are coherent with the cpu cache. However, our clflush routine was skipping the flush as the object had no pages at release time. Fix this by explicitly flushing the sg_table we are decoupling. Fixes: 03ac84f1830e ("drm/i915: Pass around sg_table to get_pages/put_pages backend") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161111145809.9701-2-chris@chris-wilson.co.uk
* drm/i915: Only wait upon the execution timeline when unlockedChris Wilson2016-11-111-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to walk the list of all timelines, we currently require the struct_mutex. We are sometimes called prior to the struct_mutex being taken by the caller (i.e !I915_WAIT_LOCKED) in which case we can only trust the global execution timelines (as these are owned by the device). This means in the unlocked phase we can only wait upon the currently executing requests and not all queued. [ 175.743243] general protection fault: 0000 [#1] SMP [ 175.743263] Modules linked in: nls_iso8859_1 intel_rapl x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel iwlwifi aesni_intel aes_x86_64 lrw snd_soc_rt5640 gf128mul snd_soc_rl6231 snd_soc_core glue_helper snd_compress snd_pcm_dmaengine snd_hda_codec_hdmi ablk_helper snd_hda_codec_realtek cryptd snd_hda_codec_generic serio_raw cfg80211 snd_hda_intel snd_hda_codec ir_lirc_codec snd_hda_core lirc_dev snd_hwdep snd_pcm lpc_ich mei_me mei snd_seq_midi shpchp snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer rc_rc6_mce acpi_als nuvoton_cir kfifo_buf rc_core snd industrialio snd_soc_sst_acpi soundcore snd_soc_sst_match i2c_designware_platform 8250_dw i2c_designware_core dw_dmac spi_pxa2xx_platform mac_hid acpi_pad parport_pc ppdev lp parport [ 175.743509] autofs4 i915 e1000e psmouse ptp pps_core xhci_pci ehci_pci ahci xhci_hcd ehci_hcd libahci video sdhci_acpi sdhci i2c_hid hid [ 175.743560] CPU: 2 PID: 2386 Comm: wtdg_monitor.sh Tainted: G U 4.9.0-rc4-nightly+ #2 [ 175.743581] Hardware name: /NUC5i7RYB, BIOS RYBDWi35.86A.0358.2016.0606.1423 06/06/2016 [ 175.743603] task: ffff88024509ba80 task.stack: ffffc9007bd18000 [ 175.743618] RIP: 0010:[<ffffffffa01af29b>] [<ffffffffa01af29b>] i915_gem_wait_for_idle+0x3b/0x140 [i915] [ 175.743660] RSP: 0000:ffffc9007bd1b9b8 EFLAGS: 00010297 [ 175.743674] RAX: ffff88024489d248 RBX: 0000000000000000 RCX: 0000000000000000 [ 175.743691] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880244898000 [ 175.743708] RBP: ffffc9007bd1b9f0 R08: 0000000000000000 R09: 0000000000000001 [ 175.743724] R10: 00000028eaf42792 R11: 0000000000000001 R12: dead000000000100 [ 175.743741] R13: dead000000000148 R14: ffffc9007bd1ba5f R15: 0000000000000005 [ 175.743758] FS: 00007f2638330700(0000) GS:ffff880256d00000(0000) knlGS:0000000000000000 [ 175.743777] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 175.743791] CR2: 00007f885c8cea40 CR3: 00000002416b5000 CR4: 00000000003406e0 [ 175.743808] Stack: [ 175.743816] ffff88024489d248 000000004509ba80 ffff880244898000 ffff88024509ba80 [ 175.743840] 00000000ffff8b69 ffffc9007bd1ba5f ffffc9007bd1ba5e ffffc9007bd1ba28 [ 175.743863] ffffffffa01b661d 00000000ffffffff 0000000000000000 ffff880244898000 [ 175.743886] Call Trace: [ 175.743906] [<ffffffffa01b661d>] i915_gem_shrinker_lock_uninterruptible.constprop.5+0x5d/0xc0 [i915] [ 175.743937] [<ffffffffa01b6cd0>] i915_gem_shrinker_oom+0x30/0x1b0 [i915] [ 175.743955] [<ffffffff8109ca79>] notifier_call_chain+0x49/0x70 [ 175.743971] [<ffffffff8109cd9d>] __blocking_notifier_call_chain+0x4d/0x70 [ 175.743988] [<ffffffff8109cdd6>] blocking_notifier_call_chain+0x16/0x20 [ 175.744005] [<ffffffff811885dc>] out_of_memory+0x22c/0x480 [ 175.744020] [<ffffffff81205542>] __alloc_pages_slowpath+0x851/0x8ec [ 175.744037] [<ffffffff8118ca51>] __alloc_pages_nodemask+0x2c1/0x310 [ 175.744054] [<ffffffff811d8ea8>] alloc_pages_current+0x88/0x120 [ 175.744070] [<ffffffff811833a4>] __page_cache_alloc+0xb4/0xc0 [ 175.744086] [<ffffffff811865ca>] filemap_fault+0x29a/0x500 [ 175.744101] [<ffffffff81299aa6>] ext4_filemap_fault+0x36/0x50 [ 175.744117] [<ffffffff811b3d4a>] __do_fault+0x6a/0xe0 [ 175.744131] [<ffffffff811b97ee>] handle_mm_fault+0xd0e/0x1330 [ 175.744147] [<ffffffff8106738c>] __do_page_fault+0x23c/0x4d0 [ 175.744162] [<ffffffff81067650>] do_page_fault+0x30/0x80 [ 175.744177] [<ffffffff817ffbe8>] page_fault+0x28/0x30 [ 175.744191] Code: 41 57 41 56 41 55 41 54 53 48 83 ec 10 4c 8b a7 48 52 00 00 89 75 d4 48 89 45 c8 49 39 c4 74 78 4d 8d 6c 24 48 41 bf 05 00 00 00 <49> 8b 5d 00 48 85 db 74 50 8b 83 20 01 00 00 85 c0 74 15 48 8b [ 175.744320] RIP [<ffffffffa01af29b>] i915_gem_wait_for_idle+0x3b/0x140 [i915] [ 175.744351] RSP <ffffc9007bd1b9b8> Fixes: 80b204bce8f2 ("drm/i915: Enable multiple timelines") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161111145809.9701-1-chris@chris-wilson.co.uk
* drm/i915: Assorted dev_priv cleanupsTvrtko Ursulin2016-11-111-6/+7
| | | | | | | | A small selection of macros which can only accept dev_priv from now on and a resulting trickle of fixups. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
* drm/i915: Split out i915_vma.cJoonas Lahtinen2016-11-111-371/+0
| | | | | | | | | | | | | | | | | | | | | | | | | As a side product, had to split two other files; - i915_gem_fence_reg.h - i915_gem_object.h (only parts that needed immediate untanglement) I tried to move code in as big chunks as possible, to make review easier. i915_vma_compare was moved to a header temporarily. v2: - Use i915_gem_fence_reg.{c,h} v3: - Rebased v4: - Fix building when DEBUG_GEM is enabled by reordering a bit. Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1478861034-30643-1-git-send-email-joonas.lahtinen@linux.intel.com
* drm/i915: Trim the object sg tableTvrtko Ursulin2016-11-101-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the moment we allocate enough sg table entries assuming we will not be able to do any coalescing. But since in practice we most often can, and more so very effectively, this ends up wasting a lot of memory. A simple and effective way of trimming the over-allocated entries is to copy the table over to a new one allocated to the exact size. Experiments on my freshly logged and idle desktop (KDE) showed that by doing this we can save approximately 1 MiB of RAM, or when running a typical benchmark like gl_manhattan I have even seen a 6 MiB saving. More complicated techniques such as only copying the last used page and freeing the rest are left to the reader. v2: * Update commit message. * Use temporary sg_table on stack. (Chris Wilson) v3: * Commit message update. * Comment added. * Replace memcpy with copy assignment. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1478704423-7447-1-git-send-email-tvrtko.ursulin@linux.intel.com
* drm/i915: Remove chipset flush after cache flushChris Wilson2016-11-081-15/+8
| | | | | | | | | | | | | | We always flush the chipset prior to executing with the GPU, so we can skip the flush during ordinary domain management. This should help mitigate some of the potential performance regressions, but likely trivial, from doing the flush unconditionally before execbuf introduced in commit dcd79934b0dd ("drm/i915: Unconditionally flush any chipset buffers before execbuf") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20161106130001.9509-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
* drm/i915: Add assert for no pending GPU requests during suspend/resume in LR ↵Imre Deak2016-11-071-0/+3
| | | | | | | | | | | | | | | | | | | | | | | mode During resume we will reset the SW/HW tracking for each ring head/tail pointers and so are not prepared to replay any pending requests (as opposed to GPU reset time). Add an assert for this both to the suspend and the resume code. v2: - Check for ELSP port idle already during suspend and check !gt.awake during resume. (Chris) v3: - Move the !gt.awake check to i915_gem_resume(). v4: - s/intel_lr_engines_idle/intel_execlists_idle/ (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1478510405-11799-4-git-send-email-imre.deak@intel.com
* drm/i915: Make sure engines are idle during GPU idling in LR modeImre Deak2016-11-071-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We assume that the GPU is idle once receiving the seqno via the last request's user interrupt. In execlist mode the corresponding context completed interrupt can be delayed though and until this latter interrupt arrives we consider the request to be pending on the ELSP submit port. This can cause a problem during system suspend where this last request will be seen by the resume code as still pending. Such pending requests are normally replayed after a GPU reset, but during resume we reset both SW and HW tracking of the ring head/tail pointers, so replaying the pending request with its stale tail pointer will leave the ring in an inconsistent state. A subsequent request submission can lead then to the GPU executing from uninitialized area in the ring behind the above stale tail pointer. Fix this by making sure any pending request on the ELSP port is completed before suspending. I used a polling wait since the completion time I measured was <1ms and since normally we only need to wait during system suspend. GPU idling during runtime suspend is scheduled with a delay (currently 50-100ms) after the retirement of the last request at which point the context completed interrupt must have arrived already. The chance of this bug was increased by commit 1c777c5d1dcdf8fa0223fcff35fb387b5bb9517a Author: Imre Deak <imre.deak@intel.com> Date: Wed Oct 12 17:46:37 2016 +0300 drm/i915/hsw: Fix GPU hang during resume from S3-devices state but it could happen even without the explicit GPU reset, since we disable interrupts afterwards during the suspend sequence. v2: - Do an unlocked poll-wait first. (Chris) v3-4: - s/intel_lr_engines_idle/intel_execlists_idle/ and move i915.enable_execlists check to the new helper. (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98470 Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1478510405-11799-3-git-send-email-imre.deak@intel.com
* drm/i915: Avoid early GPU idling due to race with new requestImre Deak2016-11-071-0/+7
| | | | | | | | | | | | | | There is a small race where a new request can be submitted and retired after the idle worker started to run which leads to idling the GPU too early. Fix this by deferring the idling to the pending instance of the worker. This scenario was pointed out by Chris. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1478510405-11799-2-git-send-email-imre.deak@intel.com
* drm/i915: Limit Valleyview and earlier to only using mappable scanoutChris Wilson2016-11-071-2/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Valleyview appears to be limited to only scanning out from the first 512MiB of the Global GTT. Lets presume that this behaviour was inherited from the display block copied from g4x (not Ironlake) and all earlier generations are similarly affected, though testing suggests different symptoms. For simplicity, impose that these platforms must scanout from the mappable region. (For extra simplicity, use HAS_GMCH_DISPLAY even though this catches Cherryview which does not appear to be limited to the low aperture for its scanout.) v2: Use HAS_GMCH_DISPLAY() to more clearly convey my intent about limiting this workaround to the old style of display engine. v3: Update changelog to reflect testing by Ville Syrjälä v4: Include the changes to the comments as well Reported-by: Luis Botello <luis.botello.ortega@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98036 Fixes: 2efb813d5388 ("drm/i915: Fallback to using unmappable memory for scanout") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Akash Goel <akash.goel@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.9-rc1+ Link: http://patchwork.freedesktop.org/patch/msgid/20161107110128.28762-1-chris@chris-wilson.co.uk Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com
* drm/i915: Round tile chunks up for constructing partial VMAsChris Wilson2016-11-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | When we split a large object up into chunks for GTT faulting (because we can't fit the whole object into the aperture) we have to align our cuts with the fence registers. Each partial VMA must cover a complete set of tile rows or the offset into each partial VMA is not aligned with the whole image. Currently we enforce a minimum size on each partial VMA, but this minimum size itself was not aligned to the tile row causing distortion. Reported-by: Andreas Reis <andreas.reis@gmail.com> Reported-by: Chris Clayton <chris2553@googlemail.com> Reported-by: Norbert Preining <preining@logic.at> Tested-by: Norbert Preining <preining@logic.at> Tested-by: Chris Clayton <chris2553@googlemail.com> Fixes: 03af84fe7f48 ("drm/i915: Choose partial chunksize based on tile row size") Fixes: a61007a83a46 ("drm/i915: Fix partial GGTT faulting") # enabling patch Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98402 Testcase: igt/gem_mmap_gtt/medium-copy-odd Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.9-rc1+ Link: http://patchwork.freedesktop.org/patch/msgid/20161107105443.27855-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
* drm/i915: Fix pages pin counting around swizzle quirkChris Wilson2016-11-041-21/+26
| | | | | | | | | | | | | | | | | | | | | | commit bc0629a76726 ("drm/i915: Track pages pinned due to swizzling quirk") fixed one problem, but revealed a whole lot more. The root cause of the pin count mismatch for the swizzle quirk (for L-shaped memory on gen3/4) was that we were incrementing the pages_pin_count upon getting the backing pages but then overwriting the pages_pin_count to set it to 1 afterwards. With a little bit of adjustment to satisfy the GEM_BUG_ON sanitychecks, the fix is to replace the explicit atomic_set with an atomic_inc. v2: Consistently use atomics (not mix atomics and helpers) within the lowlevel get_pages routines. This makes the atomic operations much clearer. Fixes: 1233e2db199d ("drm/i915: Move object backing storage manipulation") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161104103001.27643-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
* drm/i915: Tidy slab cache allocationsTvrtko Ursulin2016-11-031-27/+10
| | | | | | | | | | | | | | | We can use the preferred KMEM_CACHE helper for brevity. Also simplifiy error unwind by only setting the ENOMEM error code once. v2: Add forgotten changes. (Joonas Lahtinen) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v1) Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1478099699-28652-1-git-send-email-tvrtko.ursulin@linux.intel.com
* drm/i915: Unify global_list into global_linkJoonas Lahtinen2016-11-021-6/+6
| | | | | | | | | $ sed -i -r 's/\bglobal_list\b/global_link/g' *.c *.h Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1478081764-8058-1-git-send-email-joonas.lahtinen@linux.intel.com
* drm/i915: Allow shrinking of userptr objects once againTvrtko Ursulin2016-11-011-1/+2
| | | | | | | | | | | | | | | | | | | | | | | Commit 1bec9b0bda3d ("drm/i915/shrinker: Only shmemfs objects are backed by swap") stopped considering the userptr objects in shrinker callbacks. Restore that so idle userptr objects can be discarded in order to free up memory. One change further to what was introduced in 1bec9b0bda3d is to start considering userptr objects in oom but that should also be a correct thing to do. v2: Introduce I915_GEM_OBJECT_IS_SHRINKABLE. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 1bec9b0bda3d ("drm/i915/shrinker: Only shmemfs objects are backed by swap") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: <stable@vger.kernel.org> Link: http://patchwork.freedesktop.org/patch/msgid/1478011450-6634-1-git-send-email-tvrtko.ursulin@linux.intel.com
* drm/i915: Pass dev_priv to IS_BROADWATER/IS_CRESTLINEVille Syrjälä2016-11-011-1/+2
| | | | | | | | Unify our approach to things by passing around dev_priv instead of dev. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1477946245-14134-21-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Improve lockdep tracking for obj->mm.lockChris Wilson2016-11-011-4/+5
| | | | | | | | | | | | | | | | | | The shrinker may appear to recurse into obj->mm.lock as the shrinker may be called from a direct reclaim path whilst handling get_pages. We filter out recursing on the same obj->mm.lock by inspecting obj->mm.pages, but we do want to take the lock on a second object in order to reap their pages. lockdep spots the recursion on the same lockclass and needs annotation to avoid a false positive. To keep the two paths distinct, create an enum to indicate which subclass of obj->mm.lock we are using. This removes the false positive and avoids masking real bugs. Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161101121134.27504-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
* drm/i915: Store the vma in an rbtree under the objectChris Wilson2016-11-011-0/+2
| | | | | | | | | | | | | | | | | | | | | | With full-ppgtt one of the main bottlenecks is the lookup of the VMA underneath the object. For execbuf there is merit in having a very fast direct lookup of ctx:handle to the vma using a hashtree, but that still leaves a large number of other lookups. One way to speed up the lookup would be to use a rhashtable, but that requires extra allocations and may exhibit poor worse case behaviour. An alternative is to use an embedded rbtree, i.e. no extra allocations and deterministic behaviour, but at the slight cost of O(lgN) lookups (instead of O(1) for rhashtable). The major of such tree will be very shallow and so not much slower, and still scales much, much better than the current unsorted list. v2: Bump vma_compare() to return a long, as we return the result of comparing two pointers. References: https://bugs.freedesktop.org/show_bug.cgi?id=87726 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161101115400.15647-1-chris@chris-wilson.co.uk
* drm/i915: Track pages pinned due to swizzling quirkChris Wilson2016-11-011-8/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we have a tiled object and an unknown CPU swizzle pattern, we pin the pages to prevent the object from being swapped out (and us corrupting the contents as we do not know the access pattern and so cannot convert it to linear and back to tiled on reuse). This requires us to remember to drop the extra pinning when freeing the object, or else we trigger warnings about the pin leak. In commit fbbd37b36fa5 ("drm/i915: Move object release to a freelist + worker"), the object free path was deferred to a worker, but the unpinning of the quirk, along with marking the object as reclaimable, was left on the immediate path (so that if required we could reclaim the pages under memory pressure as early as possible). However, this split introduced a bug where the pages were no longer being unpinned if they were marked as unneeded. [ 231.800401] WARNING: CPU: 1 PID: 90 at drivers/gpu/drm/i915/i915_gem.c:4275 __i915_gem_free_objects+0x326/0x3c0 [i915] [ 231.800403] WARN_ON(i915_gem_object_has_pinned_pages(obj)) [ 231.800405] Modules linked in: [ 231.800406] snd_hda_intel i915 snd_hda_codec_generic mei_me snd_hda_codec coretemp snd_hwdep mei lpc_ich snd_hda_core snd_pcm e1000e ptp pps_core [last unloaded: i915] [ 231.800426] CPU: 1 PID: 90 Comm: kworker/1:4 Tainted: G U 4.9.0-rc2-CI-CI_DRM_1780+ #1 [ 231.800428] Hardware name: LENOVO 7465CTO/7465CTO, BIOS 6DET44WW (2.08 ) 04/22/2009 [ 231.800456] Workqueue: events __i915_gem_free_work [i915] [ 231.800459] ffffc9000034fc80 ffffffff8142dd65 ffffc9000034fcd0 0000000000000000 [ 231.800465] ffffc9000034fcc0 ffffffff8107e4e6 000010b300000001 0000000000001000 [ 231.800469] ffff88011d3db740 ffff880130ef0000 0000000000000000 ffff880130ef5ea0 [ 231.800474] Call Trace: [ 231.800479] [<ffffffff8142dd65>] dump_stack+0x67/0x92 [ 231.800484] [<ffffffff8107e4e6>] __warn+0xc6/0xe0 [ 231.800487] [<ffffffff8107e54a>] warn_slowpath_fmt+0x4a/0x50 [ 231.800491] [<ffffffff811d12ac>] ? kmem_cache_free+0x2dc/0x340 [ 231.800520] [<ffffffffa009ef36>] __i915_gem_free_objects+0x326/0x3c0 [i915] [ 231.800548] [<ffffffffa009effe>] __i915_gem_free_work+0x2e/0x50 [i915] [ 231.800552] [<ffffffff8109c27c>] process_one_work+0x1ec/0x6b0 [ 231.800555] [<ffffffff8109c1f6>] ? process_one_work+0x166/0x6b0 [ 231.800558] [<ffffffff8109c789>] worker_thread+0x49/0x490 [ 231.800561] [<ffffffff8109c740>] ? process_one_work+0x6b0/0x6b0 [ 231.800563] [<ffffffff8109c740>] ? process_one_work+0x6b0/0x6b0 [ 231.800566] [<ffffffff810a2aab>] kthread+0xeb/0x110 [ 231.800569] [<ffffffff810a29c0>] ? kthread_park+0x60/0x60 [ 231.800573] [<ffffffff818164a7>] ret_from_fork+0x27/0x40 Moving to a separate flag for tracking the quirked pin is overkill for the bug (since we only have to interchange the two tests in i915_gem_free_object) but it does reduce a complicated test on all objects and provide a sanitycheck for uncommon code paths. Fixes: fbbd37b36fa5 ("drm/i915: Move object release to a freelist + worker") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161101100317.11129-2-chris@chris-wilson.co.uk
* drm/i915: Avoid accessing request->timeline outside of its lifetimeChris Wilson2016-11-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Whilst waiting on a request, we may do so without holding any locks or any guards beyond a reference to the request. In order to avoid taking locks within request deallocation, we drop references to its timeline (via the context and ppgtt) upon retirement. We should avoid chasing such pointers outside of their control, in particular we inspect the request->timeline to see if we may restore the RPS waitboost for a client. If we instead look at the engine->timeline, we will have similar behaviour on both full-ppgtt and !full-ppgtt systems and reduce the amount of reward we give towards stalling clients (i.e. only if the client stalls and the GPU is uncontended does it reclaim its boost). This restores behaviour back to pre-timelines, whilst fixing: [ 645.078485] BUG: KASAN: use-after-free in i915_gem_object_wait_fence+0x1ee/0x2e0 at addr ffff8802335643a0 [ 645.078577] Read of size 4 by task gem_exec_schedu/28408 [ 645.078638] CPU: 1 PID: 28408 Comm: gem_exec_schedu Not tainted 4.9.0-rc2+ #64 [ 645.078724] Hardware name: / , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015 [ 645.078816] ffff88022daef9a0 ffffffff8143d059 ffff880235402a80 ffff880233564200 [ 645.078998] ffff88022daef9c8 ffffffff81229c5c ffff88022daefa48 ffff880233564200 [ 645.079172] ffff880235402a80 ffff88022daefa38 ffffffff81229ef0 000000008110a796 [ 645.079345] Call Trace: [ 645.079404] [<ffffffff8143d059>] dump_stack+0x68/0x9f [ 645.079467] [<ffffffff81229c5c>] kasan_object_err+0x1c/0x70 [ 645.079534] [<ffffffff81229ef0>] kasan_report_error+0x1f0/0x4b0 [ 645.079601] [<ffffffff8122a244>] kasan_report+0x34/0x40 [ 645.079676] [<ffffffff81634f5e>] ? i915_gem_object_wait_fence+0x1ee/0x2e0 [ 645.079741] [<ffffffff81229951>] __asan_load4+0x61/0x80 [ 645.079807] [<ffffffff81634f5e>] i915_gem_object_wait_fence+0x1ee/0x2e0 [ 645.079876] [<ffffffff816364bf>] i915_gem_object_wait+0x19f/0x590 [ 645.079944] [<ffffffff81636320>] ? i915_gem_object_wait_priority+0x500/0x500 [ 645.080016] [<ffffffff8110fb30>] ? debug_show_all_locks+0x1e0/0x1e0 [ 645.080084] [<ffffffff8110abdc>] ? check_chain_key+0x14c/0x210 [ 645.080157] [<ffffffff8110a796>] ? __lock_is_held+0x46/0xc0 [ 645.080226] [<ffffffff8163bc61>] ? i915_gem_set_domain_ioctl+0x141/0x690 [ 645.080296] [<ffffffff8163bcc2>] i915_gem_set_domain_ioctl+0x1a2/0x690 [ 645.080366] [<ffffffff811f8f85>] ? __might_fault+0x75/0xe0 [ 645.080433] [<ffffffff815a55f7>] drm_ioctl+0x327/0x640 [ 645.080508] [<ffffffff8163bb20>] ? i915_gem_obj_prepare_shmem_write+0x3a0/0x3a0 [ 645.080603] [<ffffffff815a52d0>] ? drm_ioctl_permit+0x120/0x120 [ 645.080670] [<ffffffff8110abdc>] ? check_chain_key+0x14c/0x210 [ 645.080738] [<ffffffff81275717>] do_vfs_ioctl+0x127/0xa20 [ 645.080804] [<ffffffff8120268c>] ? do_mmap+0x47c/0x580 [ 645.080871] [<ffffffff811da567>] ? vm_mmap_pgoff+0x117/0x140 [ 645.080938] [<ffffffff812755f0>] ? ioctl_preallocate+0x150/0x150 [ 645.081011] [<ffffffff81108c53>] ? up_write+0x23/0x50 [ 645.081078] [<ffffffff811da567>] ? vm_mmap_pgoff+0x117/0x140 [ 645.081145] [<ffffffff811da450>] ? vma_is_stack_for_current+0x90/0x90 [ 645.081214] [<ffffffff8110d853>] ? mark_held_locks+0x23/0xc0 [ 645.082030] [<ffffffff81288408>] ? __fget+0x168/0x250 [ 645.082106] [<ffffffff819ad517>] ? entry_SYSCALL_64_fastpath+0x5/0xb1 [ 645.082176] [<ffffffff81288592>] ? __fget_light+0xa2/0xc0 [ 645.082242] [<ffffffff8127604c>] SyS_ioctl+0x3c/0x70 [ 645.082309] [<ffffffff819ad52e>] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 645.082374] Object at ffff880233564200, in cache kmalloc-8192 size: 8192 [ 645.082431] Allocated: [ 645.082480] PID = 28408 [ 645.082535] [ 645.082566] [<ffffffff8103ae66>] save_stack_trace+0x16/0x20 [ 645.082623] [ 645.082656] [<ffffffff81228b06>] save_stack+0x46/0xd0 [ 645.082716] [ 645.082756] [<ffffffff812292fd>] kasan_kmalloc+0xad/0xe0 [ 645.082817] [ 645.082848] [<ffffffff81631752>] i915_ppgtt_create+0x52/0x220 [ 645.082908] [ 645.082941] [<ffffffff8161db96>] i915_gem_create_context+0x396/0x560 [ 645.083027] [ 645.083059] [<ffffffff8161f857>] i915_gem_context_create_ioctl+0x97/0xf0 [ 645.083152] [ 645.083183] [<ffffffff815a55f7>] drm_ioctl+0x327/0x640 [ 645.083243] [ 645.083274] [<ffffffff81275717>] do_vfs_ioctl+0x127/0xa20 [ 645.083334] [ 645.083372] [<ffffffff8127604c>] SyS_ioctl+0x3c/0x70 [ 645.083432] [ 645.083464] [<ffffffff819ad52e>] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 645.083551] Freed: [ 645.083599] PID = 27629 [ 645.083648] [ 645.083676] [<ffffffff8103ae66>] save_stack_trace+0x16/0x20 [ 645.083738] [ 645.083770] [<ffffffff81228b06>] save_stack+0x46/0xd0 [ 645.083830] [ 645.083862] [<ffffffff81229203>] kasan_slab_free+0x73/0xc0 [ 645.083922] [ 645.083961] [<ffffffff812279c9>] kfree+0xa9/0x170 [ 645.084021] [ 645.084053] [<ffffffff81629f60>] i915_ppgtt_release+0x100/0x180 [ 645.084139] [ 645.084171] [<ffffffff8161d414>] i915_gem_context_free+0x1b4/0x230 [ 645.084257] [ 645.084288] [<ffffffff816537b2>] intel_lr_context_unpin+0x192/0x230 [ 645.084380] [ 645.084413] [<ffffffff81645250>] i915_gem_request_retire+0x620/0x630 [ 645.084500] [ 645.085226] [<ffffffff816473d1>] i915_gem_retire_requests+0x181/0x280 [ 645.085313] [ 645.085352] [<ffffffff816352ba>] i915_gem_retire_work_handler+0xca/0xe0 [ 645.085440] [ 645.085471] [<ffffffff810c725b>] process_one_work+0x4fb/0x920 [ 645.085532] [ 645.085562] [<ffffffff810c770d>] worker_thread+0x8d/0x840 [ 645.085622] [ 645.085653] [<ffffffff810d21e5>] kthread+0x185/0x1b0 [ 645.085718] [ 645.085750] [<ffffffff819ad7a7>] ret_from_fork+0x27/0x40 [ 645.085811] Memory state around the buggy address: [ 645.085869] ffff880233564280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 645.085956] ffff880233564300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 645.086053] >ffff880233564380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 645.086138] ^ [ 645.086193] ffff880233564400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 645.086283] ffff880233564480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb v2: Add a comment to document the hint like nature of intel_engine_last_submit() Fixes: 73cb97010d4f ("drm/i915: Combine seqno + tracking into a global timeline struct") Fixes: 80b204bce8f2 ("drm/i915: Enable multiple timelines") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161101100317.11129-1-chris@chris-wilson.co.uk
* drm/i915: Use the full hammer when shutting down the rcu tasksChris Wilson2016-11-011-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To flush all call_rcu() tasks (here from i915_gem_free_object()) we need to call rcu_barrier() (not synchronize_rcu()). If we don't then we may still have objects being freed as we continue to teardown the driver - in particular, the recently released rings may race with the memory manager shutdown resulting in sporadic: [ 142.217186] WARNING: CPU: 7 PID: 6185 at drivers/gpu/drm/drm_mm.c:932 drm_mm_takedown+0x2e/0x40 [ 142.217187] Memory manager not clean during takedown. [ 142.217187] Modules linked in: i915(-) x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel lpc_ich snd_hda_codec_realtek snd_hda_codec_generic mei_me mei snd_hda_codec_hdmi snd_hda_codec snd_hwdep snd_hda_core snd_pcm e1000e ptp pps_core [last unloaded: snd_hda_intel] [ 142.217199] CPU: 7 PID: 6185 Comm: rmmod Not tainted 4.9.0-rc2-CI-Trybot_242+ #1 [ 142.217199] Hardware name: LENOVO 10AGS00601/SHARKBAY, BIOS FBKT34AUS 04/24/2013 [ 142.217200] ffffc90002ecfce0 ffffffff8142dd65 ffffc90002ecfd30 0000000000000000 [ 142.217202] ffffc90002ecfd20 ffffffff8107e4e6 000003a40778c2a8 ffff880401355c48 [ 142.217204] ffff88040778c2a8 ffffffffa040f3c0 ffffffffa040f4a0 00005621fbf8b1f0 [ 142.217206] Call Trace: [ 142.217209] [<ffffffff8142dd65>] dump_stack+0x67/0x92 [ 142.217211] [<ffffffff8107e4e6>] __warn+0xc6/0xe0 [ 142.217213] [<ffffffff8107e54a>] warn_slowpath_fmt+0x4a/0x50 [ 142.217214] [<ffffffff81559e3e>] drm_mm_takedown+0x2e/0x40 [ 142.217236] [<ffffffffa035c02a>] i915_gem_cleanup_stolen+0x1a/0x20 [i915] [ 142.217246] [<ffffffffa034c581>] i915_ggtt_cleanup_hw+0x31/0xb0 [i915] [ 142.217253] [<ffffffffa0310311>] i915_driver_cleanup_hw+0x31/0x40 [i915] [ 142.217260] [<ffffffffa0312001>] i915_driver_unload+0x141/0x1a0 [i915] [ 142.217268] [<ffffffffa031c2c4>] i915_pci_remove+0x14/0x20 [i915] [ 142.217269] [<ffffffff8147d214>] pci_device_remove+0x34/0xb0 [ 142.217271] [<ffffffff8157b14c>] __device_release_driver+0x9c/0x150 [ 142.217272] [<ffffffff8157bcc6>] driver_detach+0xb6/0xc0 [ 142.217273] [<ffffffff8157abe3>] bus_remove_driver+0x53/0xd0 [ 142.217274] [<ffffffff8157c787>] driver_unregister+0x27/0x50 [ 142.217276] [<ffffffff8147c265>] pci_unregister_driver+0x25/0x70 [ 142.217287] [<ffffffffa03d764c>] i915_exit+0x1a/0x71 [i915] [ 142.217289] [<ffffffff811136b3>] SyS_delete_module+0x193/0x1e0 [ 142.217291] [<ffffffff818174ae>] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 142.217292] ---[ end trace 6fd164859c154772 ]--- [ 142.217505] [drm:show_leaks] *ERROR* node [6b6b6b6b6b6b6b6b + 6b6b6b6b6b6b6b6b]: inserted at [<ffffffff81559ff3>] save_stack.isra.1+0x53/0xa0 [<ffffffff8155a98d>] drm_mm_insert_node_in_range_generic+0x2ad/0x360 [<ffffffffa035bf23>] i915_gem_stolen_insert_node_in_range+0x93/0xe0 [i915] [<ffffffffa035c855>] i915_gem_object_create_stolen+0x75/0xb0 [i915] [<ffffffffa036a51a>] intel_engine_create_ring+0x9a/0x140 [i915] [<ffffffffa036a921>] intel_init_ring_buffer+0xf1/0x440 [i915] [<ffffffffa036be1b>] intel_init_render_ring_buffer+0xab/0x1b0 [i915] [<ffffffffa0363d08>] intel_engines_init+0xc8/0x210 [i915] [<ffffffffa0355d7c>] i915_gem_init+0xac/0xf0 [i915] [<ffffffffa0311454>] i915_driver_load+0x9c4/0x1430 [i915] [<ffffffffa031c2f8>] i915_pci_probe+0x28/0x40 [i915] [<ffffffff8147d315>] pci_device_probe+0x85/0xf0 [<ffffffff8157b7ff>] driver_probe_device+0x21f/0x430 [<ffffffff8157baee>] __driver_attach+0xde/0xe0 In particular note that the node was being poisoned as we inspected the list, a clear indication that the object is being freed as we make the assertion. v2: Don't loop, just assert that we do all the work required as that will be better at detecting further errors. Fixes: fbbd37b36fa5 ("drm/i915: Move object release to a freelist + worker") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161101084843.3961-1-chris@chris-wilson.co.uk
* drm/i915: Enable multiple timelinesChris Wilson2016-10-281-4/+6
| | | | | | | | | | | | | With the infrastructure converted over to tracking multiple timelines in the GEM API whilst preserving the efficiency of using a single execution timeline internally, we can now assign a separate timeline to every context with full-ppgtt. v2: Add a comment to indicate the xfer between timelines upon submission. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-35-chris@chris-wilson.co.uk
* drm/i915: Reserve space in the global seqno during request allocationChris Wilson2016-10-281-4/+3
| | | | | | | | | | | | | | | A restriction on our global seqno is that they cannot wrap, and that we cannot use the value 0. This allows us to detect when a request has not yet been submitted, its global seqno is still 0, and ensures that hardware semaphores are monotonic as required by older hardware. To meet these restrictions when we defer the assignment of the global seqno, we must check that we have an available slot in the global seqno space during request construction. If that test fails, we wait for all requests to be completed and reset the hardware back to 0. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-33-chris@chris-wilson.co.uk
* drm/i915: Introduce a global_seqno for each requestChris Wilson2016-10-281-1/+1
| | | | | | | | | | | | | | Though we will have multiple timelines, we still have a single timeline of execution. This we can use to provide an execution and retirement order of requests. This keeps tracking execution of requests simple, and vital for preserving a single waiter (i.e. so that we can order the waiters so that only the earliest to wakeup need be woken). To accomplish this we distinguish the seqno used to order requests per-context (external) and that used internally for execution. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-26-chris@chris-wilson.co.uk
* drm/i915: Queue the idling context switch after all other timelinesChris Wilson2016-10-281-0/+10
| | | | | | | | | | | | Before suspend, we wait for the switch to the kernel context. In order for all the other context images to be complete upon suspend, that switch must be the last operation by the GPU (i.e. this idling request must not overtake any pending requests). To make this request execute last, we make it depend on every other inflight request. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-24-chris@chris-wilson.co.uk
* drm/i915: Combine seqno + tracking into a global timeline structChris Wilson2016-10-281-14/+58
| | | | | | | | | | | | | | | | Our timelines are more than just a seqno. They also provide an ordered list of requests to be executed. Due to the restriction of handling individual address spaces, we are limited to a timeline per address space but we use a fence context per engine within. Our first step to introducing independent timelines per context (i.e. to allow each context to have a queue of requests to execute that have a defined set of dependencies on other requests) is to provide a timeline abstraction for the global execution queue. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-23-chris@chris-wilson.co.uk
* drm/i915: Move GEM activity tracking into a common struct reservation_objectChris Wilson2016-10-281-188/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation to support many distinct timelines, we need to expand the activity tracking on the GEM object to handle more than just a request per engine. We already use the struct reservation_object on the dma-buf to handle many fence contexts, so integrating that into the GEM object itself is the preferred solution. (For example, we can now share the same reservation_object between every consumer/producer using this buffer and skip the manual import/export via dma-buf.) v2: Reimplement busy-ioctl (by walking the reservation object), postpone the ABI change for another day. Similarly use the reservation object to find the last_write request (if active and from i915) for choosing display CS flips. Caveats: * busy-ioctl: busy-ioctl only reports on the native fences, it will not warn of stalls (in set-domain-ioctl, pread/pwrite etc) if the object is being rendered to by external fences. It also will not report the same busy state as wait-ioctl (or polling on the dma-buf) in the same circumstances. On the plus side, it does retain reporting of which *i915* engines are engaged with this object. * non-blocking atomic modesets take a step backwards as the wait for render completion blocks the ioctl. This is fixed in a subsequent patch to use a fence instead for awaiting on the rendering, see "drm/i915: Restore nonblocking awaits for modesetting" * dynamic array manipulation for shared-fences in reservation is slower than the previous lockless static assignment (e.g. gem_exec_lut_handle runtime on ivb goes from 42s to 66s), mainly due to atomic operations (maintaining the fence refcounts). * loss of object-level retirement callbacks, emulated by VMA retirement tracking. * minor loss of object-level last activity information from debugfs, could be replaced with per-vma information if desired Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-21-chris@chris-wilson.co.uk
* drm/i915: Use lockless object freeChris Wilson2016-10-281-15/+15
| | | | | | | | | | Having moved the locked phase of freeing an object to a separate worker, we can now declare to the core that we only need the unlocked variant of driver->gem_free_object, and can use the simple unreference internally. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-20-chris@chris-wilson.co.uk
* drm/i915: Move object release to a freelist + workerChris Wilson2016-10-281-50/+116
| | | | | | | | | | | | | | | | | | | | | | | | We want to hide the latency of releasing objects and their backing storage from the submission, so we move the actual free to a worker. This allows us to switch to struct_mutex freeing of the object in the next patch. Furthermore, if we know that the object we are dereferencing remains valid for the duration of our access, we can forgo the usual synchronisation barriers and atomic reference counting. To ensure this we defer freeing an object til after an RCU grace period, such that any lookup of the object within an RCU read critical section will remain valid until after we exit that critical section. We also employ this delay for rate-limiting the serialisation on reallocation - we have to slow down object creation in order to prevent resource starvation (in particular, files). v2: Return early in i915_gem_tiling() ioctl to skip over superfluous work on error. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-19-chris@chris-wilson.co.uk
* drm/i915: Acquire the backing storage outside of struct_mutex in set-domainChris Wilson2016-10-281-38/+61
| | | | | | | | | | As we can locklessly (well struct_mutex-lessly) acquire the backing storage, do so in set-domain-ioctl to reduce the contention on the struct_mutex. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-18-chris@chris-wilson.co.uk
* drm/i915: Implement pwrite without struct-mutexChris Wilson2016-10-281-231/+122
| | | | | | | | | | | We only need struct_mutex within pwrite for a brief window where we need to serialise with rendering and control our cache domains. Elsewhere we can rely on the backing storage being pinned, and forgive userspace any races against us. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-17-chris@chris-wilson.co.uk
* drm/i915: Implement pread without struct-mutexChris Wilson2016-10-281-208/+157
| | | | | | | | | | | We only need struct_mutex within pread for a brief window where we need to serialise with rendering and control our cache domains. Elsewhere we can rely on the backing storage being pinned, and forgive userspace any races against us. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-16-chris@chris-wilson.co.uk
* drm/i915: Move object backing storage manipulation to its own lockingChris Wilson2016-10-281-31/+62
| | | | | | | | | | | | | | | Break the allocation of the backing storage away from struct_mutex into a per-object lock. This allows parallel page allocation, provided we can do so outside of struct_mutex (i.e. set-domain-ioctl, pwrite, GTT fault), i.e. before execbuf! The increased cost of the atomic counters are hidden behind i915_vma_pin() for the typical case of execbuf, i.e. as the object is typically bound between execbufs, the page_pin_count is static. The cost will be felt around set-domain and pwrite, but offset by the improvement from reduced struct_mutex contention. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-14-chris@chris-wilson.co.uk
* drm/i915: Pass around sg_table to get_pages/put_pages backendChris Wilson2016-10-281-89/+83
| | | | | | | | | | | The plan is to move obj->pages out from under the struct_mutex into its own per-object lock. We need to prune any assumption of the struct_mutex from the get_pages/put_pages backends, and to make it easier we pass around the sg_table to operate on rather than indirectly via the obj. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-13-chris@chris-wilson.co.uk
* drm/i915: Refactor object page APIChris Wilson2016-10-281-110/+95
| | | | | | | | | | | | The plan is to make obtaining the backing storage for the object avoid struct_mutex (i.e. use its own locking). The first step is to update the API so that normal users only call pin/unpin whilst working on the backing storage. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-12-chris@chris-wilson.co.uk
* drm/i915: Use a radixtree for random access to the object's backing storageChris Wilson2016-10-281-17/+168
| | | | | | | | | | | | | | | | | | | | | | | A while ago we switched from a contiguous array of pages into an sglist, for that was both more convenient for mapping to hardware and avoided the requirement for a vmalloc array of pages on every object. However, certain GEM API calls (like pwrite, pread as well as performing relocations) do desire access to individual struct pages. A quick hack was to introduce a cache of the last access such that finding the following page was quick - this works so long as the caller desired sequential access. Walking backwards, or multiple callers, still hits a slow linear search for each page. One solution is to store each successful lookup in a radix tree. v2: Rewrite building the radixtree for clarity, hopefully. v3: Rearrange execbuf to avoid calling i915_gem_object_get_sg() from within an atomic section and so relax the allocation context to a simple GFP_KERNEL and mutex. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-10-chris@chris-wilson.co.uk
* drm/i915: Markup GEM API with lockdep assertsChris Wilson2016-10-281-0/+21
| | | | | | | | | Add lockdep_assert_held(struct_mutex) to the API preamble of the internal GEM interfaces. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-9-chris@chris-wilson.co.uk
* drm/i915: Defer active reference until requiredChris Wilson2016-10-281-1/+21
| | | | | | | | | | | We only need the active reference to keep the object alive after the handle has been deleted (so as to prevent a synchronous gem_close). Why then pay the price of a kref on every execbuf when we can insert that final active ref just in time for the handle deletion? Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-6-chris@chris-wilson.co.uk
* drm/i915: Rearrange i915_wait_request() accounting with callersChris Wilson2016-10-281-78/+231
| | | | | | | | | | | | | | | | | | | Our low-level wait routine has evolved from our generic wait interface that handled unlocked, RPS boosting, waits with time tracking. If we push our GEM fence tracking to use reservation_objects (required for handling multiple timelines), we lose the ability to pass the required information down to i915_wait_request(). However, if we push the extra functionality from i915_wait_request() to the individual callsites (i915_gem_object_wait_rendering and i915_gem_wait_ioctl) that make use of those extras, we can both simplify our low level wait and prepare for extending the GEM interface for use of reservation_objects. v2: Rewrite i915_wait_request() kerneldocs Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-4-chris@chris-wilson.co.uk
* drm/i915: Remove superfluous wait_for_error() from throttle-ioctlChris Wilson2016-10-281-4/+0
| | | | | | | | | | | | | The throttle-ioctl never touches the struct_mutex. It does, however, as part of its ABI report whether the hardware is terminally wedged. For that purposes, it only has to report the current state and not incur the cost of checking/waiting every invocation, as we do not have to wait for a reset before waiting on a request to ensure completion (that is baked into the wait request implementation). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-3-chris@chris-wilson.co.uk
* drm/i915: Support asynchronous waits on struct fence from i915_gem_requestChris Wilson2016-10-281-1/+1
| | | | | | | | | | | | | | We will need to wait on DMA completion (as signaled via struct fence) before executing our i915_gem_request. Therefore we want to expose a method for adding the await on the fence itself to the request. v2: Add a comment detailing a failure to handle a signal-on-any fence-array. v3: Pretend that magic numbers don't exist. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-1-chris@chris-wilson.co.uk
* drm/i915: Remove two invalid warnsTvrtko Ursulin2016-10-261-3/+0
| | | | | | | | | | | | | | | | Objects can have multiple VMAs used for display in which case assertion that objects must not be pinned for display more times than the current VMA is incorrect. v2: Commit message update. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 058d88c4330f ("drm/i915: Track pinned VMA") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1477413635-3876-1-git-send-email-tvrtko.ursulin@linux.intel.com
* drm/i915: Rotated view does not need a fenceTvrtko Ursulin2016-10-261-1/+6
| | | | | | | | | | | | | | We do not need to set up a fence for the rotated view. Display does not need it and no one can access it. v2: Move code to __i915_vma_set_map_and_fenceable. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 05a20d098db1 ("drm/i915: Move map-and-fenceable tracking to the VMA") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Include the kernel uptime in the error stateChris Wilson2016-10-251-0/+2
| | | | | | | | | | As well as knowing when the error occurred, it is more interesting to me to know how long after booting the error occurred, and for good measure record the time since last hw initialisation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161025121602.1457-1-chris@chris-wilson.co.uk
* Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queuedDaniel Vetter2016-10-251-3/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backmerge because Chris Wilson needs the very latest&greates of Gustavo Padovan's sync_file work, specifically the refcounting changes from: commit 30cd85dd6edc86ea8d8589efb813f1fad41ef233 Author: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Date: Wed Oct 19 15:48:32 2016 -0200 dma-buf/sync_file: hold reference to fence when creating sync_file Also good to sync in general since git tends to get confused with the cherry-picking going on. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
| * Merge tag 'drm-intel-next-2016-10-24' of ↵Dave Airlie2016-10-251-70/+147
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://anongit.freedesktop.org/drm-intel into drm-next - first slice of the gvt device model (Zhenyu et al) - compression support for gpu error states (Chris) - sunset clause on gpu errors resulting in dmesg noise telling users how to report them - .rodata diet from Tvrtko - switch over lots of macros to only take dev_priv (Tvrtko) - underrun suppression for dp link training (Ville) - lspcon (hmdi 2.0 on skl/bxt) support from Shashank Sharma, polish from Jani - gen9 wm fixes from Paulo&Lyude - updated ddi programming for kbl (Rodrigo) - respect alternate aux/ddc pins (from vbt) for all ddi ports (Ville) * tag 'drm-intel-next-2016-10-24' of git://anongit.freedesktop.org/drm-intel: (227 commits) drm/i915: Update DRIVER_DATE to 20161024 drm/i915: Stop setting SNB min-freq-table 0 on powersave setup drm/i915/dp: add lane_count check in intel_dp_check_link_status drm/i915: Fix whitespace issues drm/i915: Clean up DDI DDC/AUX CH sanitation drm/i915: Respect alternate_ddc_pin for all DDI ports drm/i915: Respect alternate_aux_channel for all DDI ports drm/i915/gen9: Remove WaEnableYV12BugFixInHalfSliceChicken7 drm/i915: KBL - Recommended buffer translation programming for DisplayPort drm/i915: Move down skl/kbl ddi iboost and n_edp_entires fixup drm/i915: Add a sunset clause to GPU hang logging drm/i915: Stop reporting error details in dmesg as well as the error-state drm/i915/gvt: do not ignore return value of create_scratch_page drm/i915/gvt: fix spare warnings on odd constant _Bool cast drm/i915/gvt: mark symbols static where possible drm/i915/gvt: fix sparse warnings on different address spaces drm/i915/gvt: properly access enabled intel_engine_cs drm/i915/gvt: Remove defunct vmap_batch() drm/i915/gvt: Use common mapping routines for shadow_bb object drm/i915/gvt: Use common mapping routines for indirect_ctx object ...
| * \ Merge tag 'drm-for-v4.9' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds2016-10-111-2122/+1309
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull drm updates from Dave Airlie: "Core: - Fence destaging work - DRIVER_LEGACY to split off legacy drm drivers - drm_mm refactoring - Splitting drm_crtc.c into chunks and documenting better - Display info fixes - rbtree support for prime buffer lookup - Simple VGA DAC driver Panel: - Add Nexus 7 panel - More simple panels i915: - Refactoring GEM naming - Refactored vma/active tracking - Lockless request lookups - Better stolen memory support - FBC fixes - SKL watermark fixes - VGPU improvements - dma-buf fencing support - Better DP dongle support amdgpu: - Powerplay for Iceland asics - Improved GPU reset support - UVD/VEC powergating support for CZ/ST - Preinitialised VRAM buffer support - Virtual display support - Initial SI support - GTT rework - PCI shutdown callback support - HPD IRQ storm fixes amdkfd: - bugfixes tilcdc: - Atomic modesetting support mediatek: - AAL + GAMMA engine support - Hook up gamma LUT - Temporal dithering support imx: - Pixel clock from devicetree - drm bridge support for LVDS bridges - active plane reconfiguration - VDIC deinterlacer support - Frame synchronisation unit support - Color space conversion support analogix: - PSR support - Better panel on/off support rockchip: - rk3399 vop/crtc support - PSR support vc4: - Interlaced vblank timing - 3D rendering CPU overhead reduction - HDMI output fixes tda998x: - HDMI audio ASoC support sunxi: - Allwinner A33 support - better TCON support msm: - DT binding cleanups - Explicit fence-fd support sti: - remove sti415/416 support etnaviv: - MMUv2 refactoring - GC3000 support exynos: - Refactoring HDMI DCC/PHY - G2D pm regression fix - Page fault issues with wait for vblank There is no nouveau work in this tree, as Ben didn't get a pull request in, and he was fighting moving to atomic and adding mst support, so maybe best it waits for a cycle" * tag 'drm-for-v4.9' of git://people.freedesktop.org/~airlied/linux: (1412 commits) drm/crtc: constify drm_crtc_index parameter drm/i915: Fix conflict resolution from backmerge of v4.8-rc8 to drm-next drm/i915/guc: Unwind GuC workqueue reservation if request construction fails drm/i915: Reset the breadcrumbs IRQ more carefully drm/i915: Force relocations via cpu if we run out of idle aperture drm/i915: Distinguish last emitted request from last submitted request drm/i915: Allow DP to work w/o EDID drm/i915: Move long hpd handling into the hotplug work drm/i915/execlists: Reinitialise context image after GPU hang drm/i915: Use correct index for backtracking HUNG semaphores drm/i915: Unalias obj->phys_handle and obj->userptr drm/i915: Just clear the mmiodebug before a register access drm/i915/gen9: only add the planes actually affected by ddb changes drm/i915: Allow PCH DPLL sharing regardless of DPLL_SDVO_HIGH_SPEED drm/i915/bxt: Fix HDMI DPLL configuration drm/i915/gen9: fix the watermark res_blocks value drm/i915/gen9: fix plane_blocks_per_line on watermarks calculations drm/i915/gen9: minimum scanlines for Y tile is not always 4 drm/i915/gen9: fix the WaWmMemoryReadLatency implementation drm/i915/kbl: KBL also needs to run the SAGV code ...
| | * | drm/i915: Fix conflict resolution from backmerge of v4.8-rc8 to drm-nextChris Wilson2016-10-101-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The conflict resolution of v4.8-rc8 backmerge to drm-next pulled back in a few lines of dead code due to the code movement around i915_gem_reset(), fix that up. Fixes: ca09fb9f60b5 ("Merge tag 'v4.8-rc8' into drm-next") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Airlie <airlied@gmail.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161010125017.23911-1-chris@chris-wilson.co.uk
| | * | drm/i915: Only shrink the unbound objects during freezeChris Wilson2016-10-101-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the point of creating the hibernation image, the runtime power manage core is disabled - and using the rpm functions triggers a warn. i915_gem_shrink_all() tries to unbind objects, which requires device access and so tries to how an rpm reference triggering a warning: [ 44.235420] ------------[ cut here ]------------ [ 44.235424] WARNING: CPU: 2 PID: 2199 at drivers/gpu/drm/i915/intel_runtime_pm.c:2688 intel_runtime_pm_get_if_in_use+0xe6/0xf0 [ 44.235426] WARN_ON_ONCE(ret < 0) [ 44.235445] Modules linked in: ctr ccm arc4 rt2800usb rt2x00usb rt2800lib rt2x00lib crc_ccitt mac80211 cmac cfg80211 btusb rfcomm bnep btrtl btbcm btintel bluetooth dcdbas x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_realtek crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_generic aesni_intel snd_hda_codec_hdmi aes_x86_64 lrw gf128mul snd_hda_intel glue_helper ablk_helper cryptd snd_hda_codec hid_multitouch joydev snd_hda_core binfmt_misc i2c_hid serio_raw snd_pcm acpi_pad snd_timer snd i2c_designware_platform 8250_dw nls_iso8859_1 i2c_designware_core lpc_ich mfd_core soundcore usbhid hid psmouse ahci libahci [ 44.235447] CPU: 2 PID: 2199 Comm: kworker/u8:8 Not tainted 4.8.0-rc5+ #130 [ 44.235447] Hardware name: Dell Inc. XPS 13 9343/0310JH, BIOS A07 11/11/2015 [ 44.235450] Workqueue: events_unbound async_run_entry_fn [ 44.235453] 0000000000000000 ffff8801b2f7fb98 ffffffff81306c2f ffff8801b2f7fbe8 [ 44.235454] 0000000000000000 ffff8801b2f7fbd8 ffffffff81056c01 00000a801f50ecc0 [ 44.235456] ffff88020ce50000 ffff88020ce59b60 ffffffff81a60b5c ffffffff81414840 [ 44.235456] Call Trace: [ 44.235459] [<ffffffff81306c2f>] dump_stack+0x4d/0x6e [ 44.235461] [<ffffffff81056c01>] __warn+0xd1/0xf0 [ 44.235464] [<ffffffff81414840>] ? i915_pm_suspend_late+0x30/0x30 [ 44.235465] [<ffffffff81056c6f>] warn_slowpath_fmt+0x4f/0x60 [ 44.235468] [<ffffffff814e73ce>] ? pm_runtime_get_if_in_use+0x6e/0xa0 [ 44.235469] [<ffffffff81433526>] intel_runtime_pm_get_if_in_use+0xe6/0xf0 [ 44.235471] [<ffffffff81458a26>] i915_gem_shrink+0x306/0x360 [ 44.235473] [<ffffffff81343fd4>] ? pci_platform_power_transition+0x24/0x90 [ 44.235475] [<ffffffff81414840>] ? i915_pm_suspend_late+0x30/0x30 [ 44.235476] [<ffffffff81458dfb>] i915_gem_shrink_all+0x1b/0x30 [ 44.235478] [<ffffffff814560b3>] i915_gem_freeze_late+0x33/0x90 [ 44.235479] [<ffffffff81414877>] i915_pm_freeze_late+0x37/0x40 [ 44.235481] [<ffffffff814e9b8e>] dpm_run_callback+0x4e/0x130 [ 44.235483] [<ffffffff814ea5db>] __device_suspend_late+0xdb/0x1f0 [ 44.235484] [<ffffffff814ea70f>] async_suspend_late+0x1f/0xa0 [ 44.235486] [<ffffffff81077557>] async_run_entry_fn+0x37/0x150 [ 44.235488] [<ffffffff8106f518>] process_one_work+0x148/0x3f0 [ 44.235490] [<ffffffff8106f8eb>] worker_thread+0x12b/0x490 [ 44.235491] [<ffffffff8106f7c0>] ? process_one_work+0x3f0/0x3f0 [ 44.235492] [<ffffffff81074d09>] kthread+0xc9/0xe0 [ 44.235495] [<ffffffff816e257f>] ret_from_fork+0x1f/0x40 [ 44.235496] [<ffffffff81074c40>] ? kthread_park+0x60/0x60 [ 44.235497] ---[ end trace e438706b97c7f132 ]--- Alternatively, to actually shrink everything we have to do so slightly earlier in the hibernation process. To keep lockdep silent, we need to take struct_mutex for the shrinker even though we know that we are the only user during the freeze. Fixes: 7aab2d534e35 ("drm/i915: Shrink objects prior to hibernation") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160921135108.29574-2-chris@chris-wilson.co.uk (cherry picked from commit 6a800eabba34945c2986d70114b41d564bad52a8) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
| | * | drm/i915: Restore current RPS state after resetChris Wilson2016-10-101-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Following commit 821ed7df6e2a ("drm/i915: Update reset path to fix incomplete requests") we no longer mark the context as lost on reset as we keep the requests (and contexts) alive. However, RPS remains reset and we need to restore the current state to match the in-flight requests. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97824 Fixes: 821ed7df6e2a ("drm/i915: Update reset path to fix incomplete requests") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160921135108.29574-1-chris@chris-wilson.co.uk (cherry picked from commit f2a91d1a6f5960c08f1ca60bd076f4dc020c50c6) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
OpenPOWER on IntegriCloud