blackbird-op-linux - Blackbird™ Linux sources for OpenPOWER

	Commit message (Collapse)	Author	Age	Files	Lines
*	drm/i915/pmu: Correct the rc6 offset upon enabling	Chris Wilson	2020-02-10	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The rc6 residency starts ticking from 0 from BIOS POST, but the kernel starts measuring the time from its boot. If we start measuruing I915_PMU_RC6_RESIDENCY while the GT is idle, we start our sampling from 0 and then upon first activity (park/unpark) add in all the rc6 residency since boot. After the first park with the sampler engaged, the sleep/active counters are aligned. v2: With a wakeref to be sure Closes: https://gitlab.freedesktop.org/drm/intel/issues/973 Fixes: df6a42053513 ("drm/i915/pmu: Ensure monotonic rc6") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200114105648.2172026-1-chris@chris-wilson.co.uk (cherry picked from commit f4e9894b6952a2819937f363cd42e7cd7894a1e4) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
*	drm/i915/pmu: Do not use colons or dashes in PMU names	Tvrtko Ursulin	2020-01-13	1	-3/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We use PCI device path in the registered PMU name in order to distinguish between multiple GPUs. But since tools/perf reserves a special meaning to dash and colon characters we need to transliterate them to something else. We choose an underscore. v2: * Use strreplace. (Chris) * Dashes are not good either. (Chris) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reported-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Fixes: 05488673a4d4 ("drm/i915/pmu: Support multiple GPUs") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200110113253.12535-1-tvrtko.ursulin@linux.intel.com
*	drm/i915/pmu: Ensure monotonic rc6	Tvrtko Ursulin	2019-12-18	1	-53/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Avoid rc6 counter going backward in close to 0% RC6 scenarios like: 15.005477996 114,246,613 ns i915/rc6-residency/ 16.005876662 667,657 ns i915/rc6-residency/ 17.006131417 7,286 ns i915/rc6-residency/ 18.006615031 18,446,744,073,708,914,688 ns i915/rc6-residency/ 19.007158361 18,446,744,073,709,447,168 ns i915/rc6-residency/ 20.007806498 0 ns i915/rc6-residency/ 21.008227495 1,440,403 ns i915/rc6-residency/ There are two aspects to this fix. First is not assuming rc6 value zero means GT is asleep since that can also mean GPU is fully busy and we do not want to enter the estimation path in that case. Second is ensuring monotonicity on the estimation path itself. I suspect what is happening is with extremely rapid park/unpark cycles we get no updates on the real rc6 and therefore have to careful not to unconditionally trust use last known real rc6 when creating a new estimation. v2: * Simplify logic by not tracking the estimate but last reported value. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 16ffe73c186b ("drm/i915/pmu: Use GT parked for estimating RC6 while asleep") Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191217142057.1000-1-tvrtko.ursulin@linux.intel.com
*	drm/i915/pmu: Skip sampling engines if gt is asleep	Chris Wilson	2019-12-18	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	If the whole GT is asleep, we know that each engine must also be asleep and so we can quickly return without checking them all. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191218000756.3475668-1-chris@chris-wilson.co.uk
*	drm/i915/rps: Add frequency translation helpers	Andi Shyti	2019-12-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add two helpers that for reading the actual GT's frequency. The two helpers are: - intel_rps_read_cagf: reads the frequency and returns it not normalized - intel_rps_read_actual_frequency: provides the frequency in Hz. Use the above helpers in sysfs and debugfs. Signed-off-by: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191213183736.31992-2-andi@etezian.org
*	drm/i915/pmu: Report frequency as zero while GPU is sleeping	Tvrtko Ursulin	2019-12-06	1	-21/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We used to report the minimum possible frequency as both requested and active while GPU was in sleep state. This was a consequence of sampling the value from the "current frequency" field in our software tracking. This was strictly speaking wrong, but given that until recently the current frequency in sleeping state used to be equal to minimum, it did not stand out sufficiently to be noticed as such. After some recent changes have made the current frequency be reported as last active before GPU went to sleep, meaning both requested and active frequencies could end up being reported at their maximum values for the duration of the GPU idle state, it became much more obvious that this does not make sense. To fix this we will now sample the frequency counters only when the GPU is awake. As a consequence reported frequencies could be reported as below the GPU reported minimum but that should be much less confusing that the current situation. v2: * Split out early exit conditions for readability. (Chris) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Closes: https://gitlab.freedesktop.org/drm/intel/issues/675 Link: https://patchwork.freedesktop.org/patch/msgid/20191129105436.20100-1-tvrtko.ursulin@linux.intel.com
*	drm/i915: Mark up the calling context for intel_wakeref_put()	Chris Wilson	2019-11-20	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, we assumed we could use mutex_trylock() within an atomic context, falling back to a worker if contended. However, such trickery is illegal inside interrupt context, and so we need to always use a worker under such circumstances. As we normally are in process context, we can typically use a plain mutex, and only defer to a work when we know we are being called from an interrupt path. Fixes: 51fbd8de87dc ("drm/i915/pmu: Atomically acquire the gt_pm wakeref") References: a0855d24fc22d ("locking/mutex: Complain upon mutex API misuse in IRQ contexts") References: https://bugs.freedesktop.org/show_bug.cgi?id=111626 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191120125433.3767149-1-chris@chris-wilson.co.uk
*	drm/i915/pmu: "Frequency" is reported as accumulated cycles	Chris Wilson	2019-11-11	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	We report "frequencies" (actual-frequency, requested-frequency) as the number of accumulated cycles so that the average frequency over that period may be determined by the user. This means the units we report to the user are Mcycles (or just M), not MHz. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191109105356.5273-1-chris@chris-wilson.co.uk
*	drm/i915/pmu: Only use exclusive mmio access for gen7	Chris Wilson	2019-11-08	1	-2/+19
\| \| \| \| \| \| \| \| \| \| \| \| \|	On gen7, we have to avoid concurrent access to the same mmio cacheline, and so coordinate all mmio access with the uncore->lock. However, for pmu, we want to avoid perturbing the system and disabling interrupts unnecessarily, so restrict the w/a to gen7 where it is requied to prevent machine hangs. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191108103511.20951-2-chris@chris-wilson.co.uk
*	drm/i915/pmu: Cheat when reading the actual frequency to avoid fw	Chris Wilson	2019-11-08	1	-2/+15
\| \| \| \| \| \| \| \| \| \| \| \|	We want to avoid taking forcewake when querying the performance stats, as we wish to avoid perturbing the system under observation. (And with the forcewake being kept alive for 1ms after use, sampling the frequency from a 200Hz timer keeps forcewake 40% active.) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191108103511.20951-1-chris@chris-wilson.co.uk
*	drm/i915: Extract GT render power state management	Andi Shyti	2019-10-26	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	i915_irq.c is large. One reason for this is that has a large chunk of the GT render power management stashed away in it. Extract that logic out of i915_irq.c and intel_pm.c and put it under one roof. Based on a patch by Chris Wilson. Signed-off-by: Andi Shyti <andi.shyti@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191024211642.7688-1-chris@chris-wilson.co.uk
*	drm/i915/pmu: Initialise the spinlock before registering	Chris Wilson	2019-10-25	1	-18/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	As the GT may be running in parallel with the module initialisation code, we may enter i915_pmu_gt_parked() as we are executing i915_pmu_register(). We have to init the spinlock before we mark pmu.event_init so that it is available for use by i915_pmu_gt_parked() (which may run as soon as event_init is set). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112127 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191025165442.23356-1-chris@chris-wilson.co.uk
*	drm/i915/gt: Convert the leftover for_each_engine(gt)	Chris Wilson	2019-10-18	1	-1/+1
\| \| \| \| \| \| \| \| \|	Use the local gt for iterating over the available set of engines. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191018115331.8980-1-chris@chris-wilson.co.uk
*	drm/i915/pmu: Fix uninitialized variable on error path	Tvrtko Ursulin	2019-10-18	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \|	If name allocation failed the log message will contain an uninitialized error code which can be confusing. Fixes: 05488673a4d4 ("drm/i915/pmu: Support multiple GPUs") Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191018090514.1818-1-tvrtko.ursulin@linux.intel.com [tursulin: Commit message spelling fix.]
*	drm/i915/pmu: Support multiple GPUs	Tvrtko Ursulin	2019-10-17	1	-2/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With discrete graphics system can have both integrated and discrete GPU handled by i915. Currently we use a fixed name ("i915") when registering as the uncore PMU provider which stops working in this case. To fix this we add the PCI device name string to non-integrated devices handled by us. Integrated devices keep the legacy name preserving backward compatibility. v2: * Detect IGP and keep legacy name. (Michal) * Use PCI device name as suffix. (Michal, Chris) v3: * Constify the name. (Chris) * Use pci_domain_nr. (Chris) v4: * Fix kfree_const usage. (Chris) v5: * kfree_const does not work for modules. (Chris) * Changed is_igp helper to take i915. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191016093802.12483-1-tvrtko.ursulin@linux.intel.com
*	drm/i915: Extract GT render sleep (rc6) management	Andi Shyti	2019-09-27	1	-4/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Continuing the theme of breaking intel_pm.c up in a reasonable chunk of powermanagement utilities, pull out the rc6 setup into its GT handler. Based on a patch by Chris Wilson. Signed-off-by: Andi Shyti <andi.shyti@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190919143840.20384-1-andi.shyti@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20190927110849.28734-1-chris@chris-wilson.co.uk
*	drm/i915/pmu: Use GT parked for estimating RC6 while asleep	Chris Wilson	2019-09-12	1	-107/+135
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As we track when we put the GT device to sleep upon idling, we can use that callback to sample the current rc6 counters and record the timestamp for estimating samples after that point while asleep. v2: Stick to using ktime_t v3: Track user_wakerefs that interfere with the new intel_gt_pm_wait_for_idle v4: No need for parked/unparked estimation if !CONFIG_PM v5: Keep timer park/unpark logic as was v6: Refactor duplicated estimate/update rc6 logic v7: Pull intel_get_pm_get_if_awake() out from the pmu->lock. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105010 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190912124813.19225-1-chris@chris-wilson.co.uk
*	drm/i915/pmu: Skip busyness sampling when and where not needed	Tvrtko Ursulin	2019-09-12	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since d0aa694b9239 ("drm/i915/pmu: Always sample an active ringbuffer") the cost of sampling the engine state on execlists platforms became a little bit higher when both engine busyness and one of the wait states are being monitored. (Previously the busyness sampling on legacy platforms was done via seqno comparison so there was no cost of mmio read.) We can avoid that by skipping busyness sampling when engine supports software busy stats and so avoid the cost of potential mmio read and sample accumulation. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190911160730.22687-1-tvrtko.ursulin@linux.intel.com
*	drm/i915: Convert a few more bland dmesg info to be device specific	Chris Wilson	2019-08-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Looking around the GT initialisation, we have a few log messages we think are interesting enough present to the user (such as the amount of L4 cache) and a few to inform them of the result of actions or conflicting HW restrictions (i.e. quirks). These are device specific messages, so use the dev family of printk. v2: shave off a few bytes of .rodata! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190815093604.3618-1-chris@chris-wilson.co.uk
*	drm/i915/gt: Move the [class][inst] lookup for engines onto the GT	Chris Wilson	2019-08-06	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To maintain a fast lookup from a GT centric irq handler, we want the engine lookup tables on the intel_gt. To avoid having multiple copies of the same multi-dimension lookup table, move the generic user engine lookup into an rbtree (for fast and flexible indexing). v2: Split uabi_instance cf uabi_class v3: Set uabi_class/uabi_instance after collating all engines to provide a stable uabi across parallel unordered construction. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2 Link: https://patchwork.freedesktop.org/patch/msgid/20190806124300.24945-2-chris@chris-wilson.co.uk
*	drm/i915/pmu: Atomically acquire the gt_pm wakeref	Chris Wilson	2019-08-02	1	-23/+17
\| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, we only sample if the intel_gt is awake, but we acquire our own runtime_pm wakeref. Since intel_gt has transitioned to tracking its own wakeref, we can atomically test and acquire that wakeref instead. v2: Take engine->wakeref for engine sampling Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190801233616.23007-1-chris@chris-wilson.co.uk
*	drm/i915/pmu: Make get_rc6 take intel_gt	Tvrtko Ursulin	2019-08-01	1	-5/+7
\| \| \| \| \| \| \| \| \|	RC6 is a GT state so make the function parameter reflect that. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190801162330.2729-4-tvrtko.ursulin@linux.intel.com
*	drm/i915/pmu: Convert sampling to gt	Tvrtko Ursulin	2019-08-01	1	-20/+23
\| \| \| \| \| \| \| \| \| \|	Engines and frequencies are a GT thing so adjust sampling routines to match. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190801162330.2729-3-tvrtko.ursulin@linux.intel.com
*	drm/i915/pmu: Convert engine sampling to uncore mmio	Tvrtko Ursulin	2019-08-01	1	-10/+11
\| \| \| \| \| \| \| \| \| \| \| \|	Drops one macro using implicit dev_priv. v2: * Use ENGINE_READ_FW. (Chris) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190801162330.2729-2-tvrtko.ursulin@linux.intel.com
*	drm/i915/pmu: Make more struct i915_pmu centric	Tvrtko Ursulin	2019-08-01	1	-90/+104
\| \| \| \| \| \| \| \| \|	Just tidy the code a bit by removing a sea of overly verbose i915->pmu.*. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190801162330.2729-1-tvrtko.ursulin@linux.intel.com
*	drm/i915: Show support for accurate sw PMU busyness tracking	Chris Wilson	2019-07-04	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Expose whether or not we support the PMU software tracking in our scheduler capabilities, so userspace can query at runtime. v2: Use I915_SCHEDULER_CAP_ENGINE_BUSY_STATS for a less ambiguous capability name. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190703143702.11339-1-chris@chris-wilson.co.uk
*	drm/i915: update with_intel_runtime_pm to use the rpm structure	Daniele Ceraolo Spurio	2019-06-14	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \|	Matching the underlying get/put functions. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-8-daniele.ceraolospurio@intel.com
*	drm/i915: update rpm_get/put to use the rpm structure	Daniele Ceraolo Spurio	2019-06-14	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The functions where internally already only using the structure, so we need to just flip the interface. v2: rebase Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-7-daniele.ceraolospurio@intel.com
*	drm/i915: Remove I915_READ_NOTRACE	Tvrtko Ursulin	2019-06-12	1	-3/+5
\| \| \| \| \| \| \| \| \|	Only a few call sites remain which have been converted to uncore mmio accessors and so the macro can be removed. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190611104548.30545-5-tvrtko.ursulin@linux.intel.com
*	drm/i915: move some leftovers to intel_pm.h from i915_drv.h	Jani Nikula	2019-04-30	1	-1/+2
\| \| \| \| \| \| \| \| \|	Commit 696173b064c6 ("drm/i915: extract intel_pm.h from intel_drv.h") missed the declarations in i915_drv.h. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/770f5f1c2dd99e4d6a314b70184e71b928a6d362.1556540890.git.jani.nikula@intel.com
*	drm/i915: Move GraphicsTechnology files under gt/	Chris Wilson	2019-04-24	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Start partitioning off the code that talks to the hardware (GT) from the uapi layers and move the device facing code under gt/ One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to subdivide that header and body further (and split out the submission code from the ringbuffer and logical context handling). This patch aims to be simple motion so git can fixup inflight patches with little mess. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk
*	Merge drm/drm-next into drm-intel-next-queued	Joonas Lahtinen	2019-03-27	1	-10/+6
\|\ \| \| \| \| \| \| \| \| \| \|	This is needed to get the fourcc code merged without conflicts. Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
\| *	Merge tag 'drm-next-2019-03-06' of git://anongit.freedesktop.org/drm/drm	Linus Torvalds	2019-03-08	1	-9/+14
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull drm updates from Dave Airlie: "This is the main drm pull request for the 5.1 merge window. The big changes I'd highlight are: - nouveau has HMM support now, there is finally an in-tree user so we can quieten down the rip it out people. - i915 now enables fastboot by default on Skylake+ - Displayport Multistream support has been refactored and should hopefully be more reliable. Core: - header cleanups aiming towards removing drmP.h - dma-buf fence seqnos to 64-bits - common helper for DP mst hotplug for radeon,i915,amdgpu + new refcounting scheme - MST i2c improvements - drm_syncobj_cb removal - ARM FB compression fourcc - P010 + P016 fourcc - allwinner tiled format modifier - i2c over aux I2C_M_STOP support - DRM_AUTH handling fixes TTM: - ref/unref renaming New driver: - ARM komeda display driver scheduler: - refactor mirror list handling - rework hw fence processing - 0 run queue entity fix bridge: - TI DS90C185 LVDS bridge - thc631lvdm83d bridge improvements - cadence + allwinner DSI ported to generic phy panels: - Sitronix ST7701 panel - Kingdisplay KD097D04 - LeMaker BL035-RGB-002 - PDA 91-00156-A0 - Innolux EE101IA-01D i915: - Enable fastboot by default on SKL+/VLV/CHV - Export RPCS configuration for ICL media driver - Coffelake PCI ID - CNL clocks setup fixes - ACPI/PMIC support for MIPI/DSI - Per-engine WA init for all engines - Shrinker locking fixes - Kerneldoc updates - Lots of ring improvements and reset fixes - Coffeelake GVT Support - VFIO GVT EDID Region support - runtime PM wakeref tracking - ILK->IVB primary plane enable delays - userptr mutex locking fixes - DSI fixes - LVDS/TV cleanups - HW readout fixes - LUT robustness fixes - ICL display and watermark fixes - gem mmap race fix amdgpu: - add scheduled dependencies interface - DCC on scanout surfaces - vega10/20 BACO support - Multiple IH rings on soc15 - XGMI locking fixes - DC i2c/aux cleanups - runtime SMU debug interface - Kexec improvmeents - SR-IOV fixes - DC freesync + ABM fixes - GDS fixes - GPUVM fixes - vega20 PCIE DPM switching fixes - Context priority handling fixes radeon: - fix missing break in evergreen parser nouveau: - SVM support via HMM msm: - QCOM Compressed modifier support exynos: - s5pv210 rotator support imx: - zpos property support - pending update fixes v3d: - cache flush improvments vc4: - reflection support - HDMI overscan support tegra: - CEC refactoring - HDMI audio fixes - Tegra186 prep work - SOR crossbar device tree fixes sun4i: - implicit fencing support - YUV and scalar support improvements - A23 support - tiling fixes atmel-hlcdc: - clipping and rotation property fixes qxl: - BO and PRIME improvements - generic fbdev emulation dw-hdmi: - HDMI 2.0 2160p - YUV420 ouput rockchip: - implicit fencing support - reflection proerties virtio-gpu: - use generic fbdev emulation tilcdc: - cpufreq vs crtc init fix rcar-du: - R8A774C0 support - D3/E3 RGB output routing fixes and DPAD0 support - RA87744 LVDS support bochs: - atomic and generic fbdev emulation - ID mismatch error on bochs load meson: - remove firmware fbs" * tag 'drm-next-2019-03-06' of git://anongit.freedesktop.org/drm/drm: (1130 commits) drm/amd/display: Use vrr friendly pageflip throttling in DC. drm/imx: only send commit done event when all state has been applied drm/imx: allow building under COMPILE_TEST drm/imx: imx-tve: depend on COMMON_CLK drm/imx: ipuv3-plane: add zpos property drm/imx: ipuv3-plane: add function to query atomic update status gpu: ipu-v3: prg: add function to get channel configure status gpu: ipu-v3: pre: add double buffer status readback drm/amdgpu: Bump amdgpu version for context priority override. drm/amdgpu/powerplay: fix typo in BACO header guards drm/amdgpu/powerplay: fix return codes in BACO code drm/amdgpu: add missing license on baco files drm/bochs: Fix the ID mismatch error drm/nouveau/dmem: use dma addresses during migration copies drm/nouveau/dmem: use physical vram addresses during migration copies drm/nouveau/dmem: extend copy function to allow direct use of physical addresses drm/nouveau/svm: new ioctl to migrate process memory to GPU memory drm/nouveau/dmem: device memory helpers for SVM drm/nouveau/svm: initial support for shared virtual memory drm/nouveau: prepare for enabling svm with existing userspace interfaces ...
\| * \	Merge back earlier PM core material for v5.1.	Rafael J. Wysocki	2019-02-24	1	-10/+6
\| \|\ \
\| \| * \|	drm/i915: Move on the new pm runtime interface	Vincent Guittot	2019-01-15	1	-10/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the new PM-runtime interface to get the accounted suspended time: pm_runtime_suspended_time(). This new interface helps to simplify and cleanup the code that computes __I915_SAMPLE_RC6_ESTIMATED and to remove direct access to internals of PM-runtime. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| * \| \|	drm/i915/pmu: Fix enable count array size and bounds checking	Tvrtko Ursulin	2019-02-12	1	-7/+15
\| \|/ / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Enable count array is supposed to have one counter for each possible engine sampler. As such, array sizing and bounds checking is not correct and would blow up the asserts if more samplers were added. No ill-effect in the current code base but lets fix it for correctness. At the same time tidy the assert for readability and robustness. v2: * One check per assert. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: b46a33e271ed ("drm/i915/pmu: Expose a PMU interface for perf queries") Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190205130353.21105-1-tvrtko.ursulin@linux.intel.com (cherry picked from commit 26a11deea685b41a43edb513194718aa1f461c9a) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
* \| \|	drm/i915: Store the BIT(engine->id) as the engine's mask	Chris Wilson	2019-03-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the next patch, we are introducing a broad virtual engine to encompass multiple physical engines, losing the 1:1 nature of BIT(engine->id). To reflect the broader set of engines implied by the virtual instance, lets store the full bitmask. v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/) v3: Tvrtko voted for moah churn so teach everyone to not mention ring and use $class$instance throughout. v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and VCS[0-4] in later gen. We opt to keep the code consistent and use 0-index naming throughout. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk
* \| \|	drm/i915/pmu: Always sample an active ringbuffer	Chris Wilson	2019-02-23	1	-37/+28
\| \|/ \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As we no longer have a precise indication of requests queued to an engine, make no presumptions and just sample the ring registers to see if the engine is busy. v2: Report busy while the ring is idling on a semaphore/event. v3: Give the struct a name! v4: Always 0 outside the powerwell; trusting the powerwell is accurate enough for our sampling pmu. v5: Protect against gen7 mmio madness and try to improve grammar Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190223000102.14290-1-chris@chris-wilson.co.uk
* \|	drm/i915/pmu: Fix enable count array size and bounds checking	Tvrtko Ursulin	2019-02-06	1	-7/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Enable count array is supposed to have one counter for each possible engine sampler. As such, array sizing and bounds checking is not correct and would blow up the asserts if more samplers were added. No ill-effect in the current code base but lets fix it for correctness. At the same time tidy the assert for readability and robustness. v2: * One check per assert. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: b46a33e271ed ("drm/i915/pmu: Expose a PMU interface for perf queries") Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190205130353.21105-1-tvrtko.ursulin@linux.intel.com
* \|	drm/i915: Syntatic sugar for using intel_runtime_pm	Chris Wilson	2019-01-14	1	-5/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Frequently, we use intel_runtime_pm_get/_put around a small block. Formalise that usage by providing a macro to define such a block with an automatic closure to scope the intel_runtime_pm wakeref to that block, i.e. macro abuse smelling of python. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-15-chris@chris-wilson.co.uk
* \|	drm/i915/pmu: Track rpm wakeref	Chris Wilson	2019-01-14	1	-9/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Track the wakeref used for temporary access to the device, and discard it upon release so that leaks can be identified. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-8-chris@chris-wilson.co.uk
* \|	drm/i915: Markup paired operations on wakerefs	Chris Wilson	2019-01-14	1	-3/+3
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The majority of runtime-pm operations are bounded and scoped within a function; these are easy to verify that the wakeref are handled correctly. We can employ the compiler to help us, and reduce the number of wakerefs tracked when debugging, by passing around cookies provided by the various rpm_get functions to their rpm_put counterpart. This makes the pairing explicit, and given the required wakeref cookie the compiler can verify that we pass an initialised value to the rpm_put (quite handy for double checking error paths). For regular builds, the compiler should be able to eliminate the unused local variables and the program growth should be minimal. Fwiw, it came out as a net improvement as gcc was able to refactor rpm_get and rpm_get_if_in_use together, v2: Just s/rpm_put/rpm_put_unchecked/ everywhere, leaving the manual mark up for smaller more targeted patches. v3: Mention the cookie in Returns Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-2-chris@chris-wilson.co.uk
*	Merge tag 'drm-next-2018-08-15' of git://anongit.freedesktop.org/drm/drm	Linus Torvalds	2018-08-15	1	-20/+47
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull drm updates from Dave Airlie: "This is the main drm pull request for 4.19. Rob has some new hardware support for new qualcomm hw that I'll send along separately. This has the display part of it, the remaining pull is for the acceleration engine. This also contains a wound-wait/wait-die mutex rework, Peter has acked it for merging via my tree. Otherwise mostly the usual level of activity. Summary: core: - Wound-wait/wait-die mutex rework - Add writeback connector type - Add "content type" property for HDMI - Move GEM bo to drm_framebuffer - Initial gpu scheduler documentation - GPU scheduler fixes for dying processes - Console deferred fbcon takeover support - Displayport support for CEC tunneling over AUX panel: - otm8009a panel driver fixes - Innolux TV123WAM and G070Y2-L01 panel driver - Ilitek ILI9881c panel driver - Rocktech RK070ER9427 LCD - EDT ETM0700G0EDH6 and EDT ETM0700G0BDH6 - DLC DLC0700YZG-1 - BOE HV070WSA-100 - newhaven, nhd-4.3-480272ef-atxl LCD - DataImage SCF0700C48GGU18 - Sharp LQ035Q7DB03 - p079zca: Refactor to support multiple panels tinydrm: - ILI9341 display panel New driver: - vkms - virtual kms driver to testing. i915: - Icelake: Display enablement DSI support IRQ support Powerwell support - GPU reset fixes and improvements - Full ppgtt support refactoring - PSR fixes and improvements - Execlist improvments - GuC related fixes amdgpu: - Initial amdgpu documentation - JPEG engine support on VCN - CIK uses powerplay by default - Move to using core PCIE functionality for gens/lanes - DC/Powerplay interface rework - Stutter mode support for RV - Vega12 Powerplay updates - GFXOFF fixes - GPUVM fault debugging - Vega12 GFXOFF - DC improvements - DC i2c/aux changes - UVD 7.2 fixes - Powerplay fixes for Polaris12, CZ/ST - command submission bo_list fixes amdkfd: - Raven support - Power management fixes udl: - Cleanups and fixes nouveau: - misc fixes and cleanups. msm: - DPU1 support display controller in sdm845 - GPU coredump support. vmwgfx: - Atomic modesetting validation fixes - Support for multisample surfaces armada: - Atomic modesetting support completed. exynos: - IPPv2 fixes - Move g2d to component framework - Suspend/resume support cleanups - Driver cleanups imx: - CSI configuration improvements - Driver cleanups - Use atomic suspend/resume helpers - ipu-v3 V4L2 XRGB32/XBGR32 support pl111: - Add Nomadik LCDC variant v3d: - GPU scheduler jobs management sun4i: - R40 display engine support - TCON TOP driver mediatek: - MT2712 SoC support rockchip: - vop fixes omapdrm: - Workaround for DRA7 errata i932 - Fix mm_list locking mali-dp: - Writeback implementation PM improvements - Internal error reporting debugfs tilcdc: - Single fix for deferred probing hdlcd: - Teardown fixes tda998x: - Converted to a bridge driver. etnaviv: - Misc fixes" * tag 'drm-next-2018-08-15' of git://anongit.freedesktop.org/drm/drm: (1506 commits) drm/amdgpu/sriov: give 8s for recover vram under RUNTIME drm/scheduler: fix param documentation drm/i2c: tda998x: correct PLL divider calculation drm/i2c: tda998x: get rid of private fill_modes function drm/i2c: tda998x: move mode_valid() to bridge drm/i2c: tda998x: register bridge outside of component helper drm/i2c: tda998x: cleanup from previous changes drm/i2c: tda998x: allocate tda998x_priv inside tda998x_create() drm/i2c: tda998x: convert to bridge driver drm/scheduler: fix timeout worker setup for out of order job completions drm/amd/display: display connected to dp-1 does not light up drm/amd/display: update clk for various HDMI color depths drm/amd/display: program display clock on cache match drm/amd/display: Add NULL check for enabling dp ss drm/amd/display: add vbios table check for enabling dp ss drm/amd/display: Don't share clk source between DP and HDMI drm/amd/display: Fix DP HBR2 Eye Diagram Pattern on Carrizo drm/amd/display: Use calculated disp_clk_khz value for dce110 drm/amd/display: Implement custom degamma lut on dcn drm/amd/display: Destroy aux_engines only once ...
\| *	drm/i915/pmu: Do not assume fixed hrtimer period	Tvrtko Ursulin	2018-06-05	1	-20/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As Chris has discovered on his Ivybridge, and later automated test runs have confirmed, on most of our platforms hrtimer faced with heavy GPU load can occasionally become sufficiently imprecise to affect PMU sampling calculations. This means we cannot assume sampling frequency is what we asked for, but we need to measure the interval ourselves. This patch is similar to Chris' original proposal for per-engine counters, but instead of introducing a new set to work around the problem with frequency sampling, it swaps around the way internal frequency accounting is done. Instead of accumulating current frequency and dividing by sampling frequency on readout, it accumulates frequency scaled by each period. v2: * Typo in commit message, comment on period calculation and USEC_PER_SEC. (Chris Wilson) Testcase: igt/perf_pmu/busy # snb, ivb, hsw Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180605140253.3541-1-tvrtko.ursulin@linux.intel.com
* \|	x86: Don't include linux/irq.h from asm/hardirq.h	Nicolai Stange	2018-08-05	1	-0/+1
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The next patch in this series will have to make the definition of irq_cpustat_t available to entering_irq(). Inclusion of asm/hardirq.h into asm/apic.h would cause circular header dependencies like asm/smp.h asm/apic.h asm/hardirq.h linux/irq.h linux/topology.h linux/smp.h asm/smp.h or linux/gfp.h linux/mmzone.h asm/mmzone.h asm/mmzone_64.h asm/smp.h asm/apic.h asm/hardirq.h linux/irq.h linux/irqdesc.h linux/kobject.h linux/sysfs.h linux/kernfs.h linux/idr.h linux/gfp.h and others. This causes compilation errors because of the header guards becoming effective in the second inclusion: symbols/macros that had been defined before wouldn't be available to intermediate headers in the #include chain anymore. A possible workaround would be to move the definition of irq_cpustat_t into its own header and include that from both, asm/hardirq.h and asm/apic.h. However, this wouldn't solve the real problem, namely asm/harirq.h unnecessarily pulling in all the linux/irq.h cruft: nothing in asm/hardirq.h itself requires it. Also, note that there are some other archs, like e.g. arm64, which don't have that #include in their asm/hardirq.h. Remove the linux/irq.h #include from x86' asm/hardirq.h. Fix resulting compilation errors by adding appropriate #includes to .c files as needed. Note that some of these .c files could be cleaned up a bit wrt. to their set of #includes, but that should better be done from separate patches, if at all. Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
*	drm/i915/pmu: Inspect runtime PM state more carefully while estimating RC6	Tvrtko Ursulin	2018-04-11	1	-10/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While thinking about sporadic failures of perf_pmu/rc6-runtime-pm* tests on some CI machines I have concluded that: a) the PMU readout of RC6 can race against runtime PM transitions, and b) there are other reasons than being runtime suspended which can cause intel_runtime_pm_get_if_in_use to fail. Therefore when estimating RC6 the code needs to assert we are indeed in suspended state, and if not, the best we can do is return the last known RC6 value. Without this check we can calculate the estimated value based on un- initialized or inappropriate internal state, which can result in over- estimation, or in any case incorrect value being returned. v2: * Re-arrange the code a bit to avoid second unlock and return branch. (Chris Wilson) v3: * Insert some strategic blank lines and improve commit msg. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 1fe699e30113 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105010 Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180410112704.24462-1-tvrtko.ursulin@linux.intel.com
*	drm/i915/pmu: Work around compiler warnings on some kernel configs	Tvrtko Ursulin	2018-03-14	1	-19/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Arnd Bergman reports: """ The conditional spinlock confuses gcc into thinking the 'flags' value might contain uninitialized data: drivers/gpu/drm/i915/i915_pmu.c: In function '__i915_pmu_event_read': arch/x86/include/asm/paravirt_types.h:573:3: error: 'flags' may be used uninitialized in this function [-Werror=maybe-uninitialized] The code is correct, but it's easy to see how the compiler gets confused here. This avoids the problem by pulling the lock outside of the function into its only caller. """ On deeper look it seems this is caused by paravirt spinlocks implementation when CONFIG_PARAVIRT_DEBUG is set, which by being complicated, manages to convince gcc locked parameter can be changed externally (impossible). Work around it by removing the conditional locking parameters altogether. (It was never the most elegant code anyway.) Slight penalty we now pay is an additional irqsave spin lock/unlock cycle on the event enable path. But since enable is not a fast path, that is preferrable to the alternative solution which was doing MMIO under irqsave spinlock. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reported-by: Arnd Bergmann <arnd@arndb.de> Fixes: 1fe699e30113 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout") Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: David Airlie <airlied@linux.ie> Cc: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180314080535.17490-1-tvrtko.ursulin@linux.intel.com
*	drm/i915: Make header i915_pmu.h more robust	Michal Wajdeczko	2018-03-09	1	-24/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Definitions in i915_pmu.h header depend on other types and declarations that were not explicitly included. Fix that by adding related headers and forward declarations. While here, change license text to SPDX format. v2: don't drop "intel_ringbuffer.h" (Tvrtko) Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180308095037.18264-4-michal.wajdeczko@intel.com
*	drm/i915/pmu: Fix building without CONFIG_PM	Chris Wilson	2018-02-07	1	-10/+23
\| \| \| \| \| \| \| \| \| \| \| \|	As we peek inside struct device to query members guarded by CONFIG_PM, so must be the code. Reported-by: kbuild test robot <fengguang.wu@intel.com> Fixes: 1fe699e30113 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180207160428.17015-1-chris@chris-wilson.co.uk
*	drm/i915/pmu: Fix sleep under atomic in RC6 readout	Tvrtko Ursulin	2018-02-07	1	-15/+78
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We are not allowed to call intel_runtime_pm_get from the PMU counter read callback since the former can sleep, and the latter is running under IRQ context. To workaround this, we record the last known RC6 and while runtime suspended estimate its increase by querying the runtime PM core timestamps. Downside of this approach is that we can temporarily lose a chunk of RC6 time, from the last PMU read-out to runtime suspend entry, but that will eventually catch up, once device comes back online and in the presence of PMU queries. Also, we have to be careful not to overshoot the RC6 estimate, so once resumed after a period of approximation, we only update the counter once it catches up. With the observation that RC6 is increasing while the device is suspended, this should not pose a problem and can only cause slight inaccuracies due clock base differences. v2: Simplify by estimating on top of PM core counters. (Imre) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104943 Fixes: 6060b6aec03c ("drm/i915/pmu: Add RC6 residency metrics") Testcase: igt/perf_pmu/rc6-runtime-pm Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: David Airlie <airlied@linux.ie> Cc: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180206183311.17924-1-tvrtko.ursulin@linux.intel.com