Fix a deadlock in trace.

This is actually a workaround and based on the symptom I believe there are many other hazards in the trace code. What happened here is that a task caused a page-fault in the trace code while holding the trace mutex. This prevented code like the PNOR resource provider from being able to execute traces, which deadlocked all code.

There are similar deadlock hazards in the binary and %s handling of trace. We need to revisit trace to ensure that it can never cause a page-fault while holding the global mutex. We talked recently about revamping trace entirely, so this is just one more design item to consider.

Change-Id: I28e32d2d79cf419a7a7eb680627e79a88bc6a5a7
Reviewed-on: http://gfw160.austin.ibm.com:8080/gerrit/649
Reviewed-by: Mark W. Wenning <wenning@us.ibm.com>
Reviewed-by: CAMVAN T. NGUYEN <ctnguyen@us.ibm.com>
Tested-by: Jenkins Server
Reviewed-by: Van H. Lee <vanlee@us.ibm.com>
Reviewed-by: A. Patrick Williams III <iawillia@us.ibm.com>
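
The %s hazard mentioned in the message can be illustrated with a small sketch. It is not hostboot code: `TraceBufferSketch`, `traceString`, and the 64-byte buffer size are hypothetical names and values chosen for illustration, and `std::mutex` stands in for the trace mutex. The idea is to read the caller's string, which may live on a page that is not yet loaded, into a local copy before taking the lock, so that any page fault happens while no lock is held. The sketch assumes the stack and the trace buffer itself are already resident.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <mutex>

// Hypothetical stand-in for a trace buffer; not the hostboot Trace class.
class TraceBufferSketch {
public:
    void traceString(const char* str) {
        assert(str != nullptr);

        // Read the caller's string BEFORE locking. If its page is not
        // loaded, the fault is taken here, where no lock is held.
        char local[64];
        std::strncpy(local, str, sizeof(local) - 1);
        local[sizeof(local) - 1] = '\0';
        const std::size_t len = std::strlen(local);

        // Critical region: touches only the local copy and the buffer,
        // both assumed to be resident in memory.
        std::lock_guard<std::mutex> guard(iv_mutex);
        std::memcpy(iv_buffer, local, len + 1);
        iv_length = len;
    }

    const char* last() const { return iv_buffer; }
    std::size_t length() const { return iv_length; }

private:
    std::mutex iv_mutex;
    char iv_buffer[64] = {};
    std::size_t iv_length = 0;
};
```

The cost of this approach is an extra copy and a fixed limit on the string length; whether that trade-off suits the real trace code is a design question the commit message leaves open.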
author: Patrick Williams <iawillia@us.ibm.com> 2012-02-07 16:49:39 -0600
committer: A. Patrick Williams III <iawillia@us.ibm.com> 2012-02-08 08:10:56 -0600
commit: 4c500ad53631f8a42d64a88112b30b19a0c6373b (patch)
tree: e54808328ea6a2c25647ad05283b37a52dfd42f3 /src/usr/trace
parent: 1ae3dfacaccbcb9bdb1b0e8306118844331a6e10 (diff)
download: talos-hostboot-4c500ad53631f8a42d64a88112b30b19a0c6373b.tar.gz
talos-hostboot-4c500ad53631f8a42d64a88112b30b19a0c6373b.zip
1 files changed, 8 insertions, 0 deletions
diff --git a/src/usr/trace/trace.C b/src/usr/trace/trace.C
index 1510eac78..29e7f7771 100644
--- a/src/usr/trace/trace.C
+++ b/src/usr/trace/trace.C
@@ -213,6 +213,14 @@ void Trace::initBuffer( trace_desc_t **o_td,
     // Store buffer name internally in upper case
     strupr(l_comp);
 
+    // The page containing the trace-descriptor destination might not be
+    // loaded yet, so we write to it outside of the mutex to force a page
+    // fault to bring the page in.  If we don't do this, we can end up with
+    // a dead-lock where this code is blocked due to a page-fault while
+    // holding the trace mutex, which in turn blocks the code that handles
+    // page faults.
+    *o_td = NULL;
+
     // CRITICAL REGION START
     mutex_lock(&iv_trac_mutex);
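
The pattern in the hunk above can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the actual `Trace::initBuffer`: `TraceDesc` and `TraceSketch` are hypothetical stand-ins, and `std::mutex` replaces the hostboot `mutex_lock` call. The point it shows is the ordering: the output pointer is written once before the lock is taken, so a fault on its page is resolved first, and the write inside the critical region then lands on a page that is already present.

```cpp
#include <cassert>
#include <cstring>
#include <mutex>

// Hypothetical stand-in for a trace descriptor; not the hostboot type.
struct TraceDesc {
    char name[16];
};

// Hypothetical stand-in for the trace class.
class TraceSketch {
public:
    void initBuffer(TraceDesc** o_td, const char* comp) {
        assert(o_td != nullptr && comp != nullptr);

        // Pre-touch the destination outside the lock. If the page holding
        // *o_td is not loaded, the page fault is taken here.
        *o_td = nullptr;

        // CRITICAL REGION START
        std::lock_guard<std::mutex> guard(iv_mutex);
        std::strncpy(iv_desc.name, comp, sizeof(iv_desc.name) - 1);
        iv_desc.name[sizeof(iv_desc.name) - 1] = '\0';

        // The destination page was touched above, so this store is not
        // expected to fault while the lock is held.
        *o_td = &iv_desc;
    }

private:
    std::mutex iv_mutex;
    TraceDesc iv_desc{};
};
```

A caller would use it as `TraceDesc* td; sketch.initBuffer(&td, "PNOR");`. As the commit message notes, this only covers the output pointer; any other memory read or written inside the critical region would need the same treatment.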