drm/i915/gen8: Add infrastructure to initialize WA batch buffers

Some of the WA are to be applied during context save but before restore, and some at the end of context save/restore but before executing the instructions in the ring. WA batch buffers are created for this purpose, as these WA cannot be applied using normal means. Each context has two registers to load the offsets of these batch buffers. If they are non-zero, HW understands that it needs to execute these batches.

v1: In this version two separate ring_buffer objects were used to load WA instructions for indirect and per context batch buffers and they were part of every context.

v2: Chris suggested including an additional page in the context and using it to load these WA instead of creating separate objects. This simplifies a lot of things as we need not explicitly pin/unpin them. Thomas Daniel further pointed out that GuC is planning to use a similar setup to share data between GuC and driver, and WA batch buffers can probably share that page. However, after discussions with Dave, who is implementing the GuC changes, he suggested using an independent page for two reasons: the GuC area might grow, and these WA are initialized only once and are not changed afterwards, so we can share them across all contexts.

The page is updated with WA during render ring init. This has the advantage of not adding more special cases to default_context.

We don't know upfront the number of WA we will be applying using these batch buffers. For this reason the size was fixed earlier, but that is not a good idea. To fix this, the functions that load instructions are modified to report the number of commands inserted, and the size is now calculated after the batch is updated. A macro is introduced to add commands to these batch buffers; it also checks for overflow and returns an error. We have a full page dedicated to these WA, which should be sufficient for a good number of WA; anything more means we have major issues. The list for Gen8 is small, and the same goes for Gen9; maybe a few more will get added going forward, but not close to filling the entire page. Chris suggested a two-pass approach but we agreed to go with the single page setup as it is a one-off routine and simpler code wins.

One additional option is the offset field, which is helpful if we would like to have multiple batches at different offsets within the page and select them based on some criteria. This is not a requirement at this point but could help in future (Dave).

Chris provided some helpful macros and suggestions which further simplified the code; they will also help in reducing code duplication when WA for other Gen are added.

Add detailed comments explaining restrictions. Use do {} while(0) for wa_ctx_emit() macro.

(Many thanks to Chris, Dave and Thomas for their reviews and inputs)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Rafael Barbalho <rafael.barbalho@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
author: Arun Siluvery <arun.siluvery@linux.intel.com> 2015-06-19 19:07:01 +0100
committer: Daniel Vetter <daniel.vetter@ffwll.ch> 2015-06-23 14:01:39 +0200
commit: 17ee950df38b649d8431e2f6f7f85282d89f5398 (patch)
tree: 4fbea48c0c7ca7e9ca4b0e801c7d8ee3feace314 /drivers/gpu/drm/i915/intel_ringbuffer.h
parent: b1330fbb870467bbb90adb2e8868672af4ca88c7 (diff)
1 files changed, 21 insertions, 0 deletions
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index e539314ae87e..64850293559c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -12,6 +12,7 @@
  * workarounds!
  */
 #define CACHELINE_BYTES 64
+#define CACHELINE_DWORDS (CACHELINE_BYTES / sizeof(uint32_t))
 
 /*
  * Gen2 BSpec "1. Programming Environment" / 1.4.4.6 "Ring Buffer Use"
@@ -120,6 +121,25 @@ struct intel_ringbuffer {
 struct	intel_context;
 struct drm_i915_reg_descriptor;
 
+/*
+ * we use a single page to load ctx workarounds so all of these
+ * values are referred in terms of dwords
+ *
+ * struct i915_wa_ctx_bb:
+ *  offset: specifies batch starting position, also helpful in case
+ *    if we want to have multiple batches at different offsets based on
+ *    some criteria. It is not a requirement at the moment but provides
+ *    an option for future use.
+ *  size: size of the batch in DWORDS
+ */
+struct  i915_ctx_workarounds {
+	struct i915_wa_ctx_bb {
+		u32 offset;
+		u32 size;
+	} indirect_ctx, per_ctx;
+	struct drm_i915_gem_object *obj;
+};
+
 struct  intel_engine_cs {
 	const char	*name;
 	enum intel_ring_id {
@@ -143,6 +163,7 @@ struct  intel_engine_cs {
 	struct i915_gem_batch_pool batch_pool;
 
 	struct intel_hw_status_page status_page;
+	struct i915_ctx_workarounds wa_ctx;
 
 	unsigned irq_refcount; /* protected by dev_priv->irq_lock */
 	u32		irq_enable_mask;	/* bitmask to enable ring interrupt */
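
The commit message describes loading commands through a macro that checks for overflow, then computing each batch's size from the number of dwords actually written. The user-space sketch below illustrates that idea only; it is not the driver code. Every `sketch_`/`SKETCH_` name, the 4096-byte page size, the placeholder command values, and the padding of each batch to a cacheline boundary are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_BYTES   4096u   /* assumed page size */
#define SKETCH_PAGE_DWORDS  (SKETCH_PAGE_BYTES / sizeof(uint32_t))
#define CACHELINE_BYTES     64
#define CACHELINE_DWORDS    (CACHELINE_BYTES / sizeof(uint32_t))
#define SKETCH_NOOP         0x0u    /* placeholder padding command */

/* Mirrors the offset/size pair from the patch; both are in dwords. */
struct sketch_wa_ctx_bb {
	uint32_t offset;
	uint32_t size;
};

/*
 * Hypothetical emit macro: store one dword and advance the index,
 * or make the calling function return -ENOSPC if the page is full.
 * The do {} while (0) wrapper lets it behave like a single statement.
 */
#define sketch_wa_emit(batch, index, cmd)              \
	do {                                            \
		if ((index) >= SKETCH_PAGE_DWORDS)      \
			return -ENOSPC;                 \
		(batch)[(index)++] = (cmd);             \
	} while (0)

/* Write ncmds placeholder commands, then pad to a cacheline (assumption). */
static int sketch_fill_batch(uint32_t *batch, uint32_t *index, uint32_t ncmds)
{
	uint32_t i = *index;
	uint32_t n;

	for (n = 0; n < ncmds; n++)
		sketch_wa_emit(batch, i, 0xdead0000u | n);

	while (i % CACHELINE_DWORDS)
		sketch_wa_emit(batch, i, SKETCH_NOOP);

	*index = i;
	return 0;
}

/*
 * Fill one batch starting at *offset and record where it begins and
 * how many dwords it occupies. The size is derived after the batch
 * has been written, so it never has to be guessed upfront.
 */
int sketch_init_bb(struct sketch_wa_ctx_bb *bb, uint32_t *batch,
		   uint32_t *offset, uint32_t ncmds)
{
	uint32_t index;
	int ret;

	assert(bb != NULL && batch != NULL && offset != NULL);

	index = *offset;
	ret = sketch_fill_batch(batch, &index, ncmds);
	if (ret)
		return ret;

	bb->offset = *offset;
	bb->size = index - *offset;
	*offset = index;
	return 0;
}
```

Calling `sketch_init_bb()` twice on the same page with a shared running offset places the second batch directly after the first, which is one way the `offset` field could support multiple batches within a single page.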