<feed xmlns='http://www.w3.org/2005/Atom'>
<title>talos-obmc-linux/drivers/gpu/drm/amd/scheduler, branch dev-4.13</title>
<subtitle>Talos™ II Linux sources for OpenBMC</subtitle>
<id>https://git.raptorcs.com/git/talos-obmc-linux/atom?h=dev-4.13</id>
<link rel='self' href='https://git.raptorcs.com/git/talos-obmc-linux/atom?h=dev-4.13'/>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/'/>
<updated>2017-07-14T15:06:00+00:00</updated>
<entry>
<title>drm/amd/sched: print sched job id in amd_sched_job trace</title>
<updated>2017-07-14T15:06:00+00:00</updated>
<author>
<name>Nicolai Hähnle</name>
<email>nicolai.haehnle@amd.com</email>
</author>
<published>2017-06-13T20:12:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=eabd76cef9002697979794826e625b9868e697b5'/>
<id>urn:sha1:eabd76cef9002697979794826e625b9868e697b5</id>
<content type='text'>
This makes it easier to correlate amd_sched_job with with other trace
points that don't log the job pointer.

v2: don't print the sched_job pointer (Andres)

Signed-off-by: Nicolai Hähnle &lt;nicolai.haehnle@amd.com&gt;
Reviewed-by: Andres Rodriguez &lt;andresx7@gmail.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu/SRIOV:implement guilty job TDR for(V2)</title>
<updated>2017-05-24T21:40:40+00:00</updated>
<author>
<name>Monk Liu</name>
<email>Monk.Liu@amd.com</email>
</author>
<published>2017-05-11T05:36:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=65781c78ad74e4260fbec92c0ecc05738044e177'/>
<id>urn:sha1:65781c78ad74e4260fbec92c0ecc05738044e177</id>
<content type='text'>
1,TDR will kickout guilty job if it hang exceed the threshold
of the given one from kernel paramter "job_hang_limit", that
way a bad command stream will not infinitly cause GPU hang.

by default this threshold is 1 so a job will be kicked out
after it hang.

2,if a job timeout TDR routine will not reset all sched/ring,
instead if will only reset on the givn one which is indicated
by @job of amdgpu_sriov_gpu_reset, that way we don't need to
reset and recover each sched/ring if we already know which job
cause GPU hang.

3,unblock sriov_gpu_reset for AI family.

V2:
1:put kickout guilty job after sched parked.
2:since parking scheduler prior to kickout already occupies a
while, we can do last check on the in question job before
doing hw_reset.

TODO:
1:when a job is considered as guilty, we should mark some flag
in its fence status flag, and let UMD side aware that this
fence signaling is not due to job complete but job hang.

2:if gpu reset cause all video memory lost, we need introduce
a new policy to implement TDR, like drop all jobs not yet
signaled, and all IOCTL on this device will return ERROR
DEVICE_LOST.
this will be implemented later.

Signed-off-by: Monk Liu &lt;Monk.Liu@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: fix dependency issue</title>
<updated>2017-05-10T17:23:53+00:00</updated>
<author>
<name>Chunming Zhou</name>
<email>David1.Zhou@amd.com</email>
</author>
<published>2017-05-09T05:39:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=30514decb27d45b98599612cb5d3e6a20ba733a5'/>
<id>urn:sha1:30514decb27d45b98599612cb5d3e6a20ba733a5</id>
<content type='text'>
The problem is that executing the jobs in the right order doesn't give you the right result
because consecutive jobs executed on the same engine are pipelined.
In other words job B does it buffer read before job A has written it's result.

Signed-off-by: Chunming Zhou &lt;David1.Zhou@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amd: fix init order of sched job</title>
<updated>2017-05-10T17:16:50+00:00</updated>
<author>
<name>Chunming Zhou</name>
<email>David1.Zhou@amd.com</email>
</author>
<published>2017-05-09T07:34:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=cb3696fdeca24d3a347a5225d934ce50c6edccba'/>
<id>urn:sha1:cb3696fdeca24d3a347a5225d934ce50c6edccba</id>
<content type='text'>
Need to increment after the fence check.

Signed-off-by: Chunming Zhou &lt;David1.Zhou@amd.com&gt;
Reviewed-by: Junwei Zhang &lt;Jerry.Zhang@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: fix NULL pointer error</title>
<updated>2017-04-28T21:33:09+00:00</updated>
<author>
<name>Chunming Zhou</name>
<email>David1.Zhou@amd.com</email>
</author>
<published>2017-04-24T09:39:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=a6bef67e2abae4f19de8fb4ec34acb1df8156a75'/>
<id>urn:sha1:a6bef67e2abae4f19de8fb4ec34acb1df8156a75</id>
<content type='text'>
[  141.420491] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[  141.420532] IP: [&lt;ffffffff81579ee1&gt;] fence_remove_callback+0x11/0x60
[  141.420563] PGD 20a030067
[  141.420575] PUD 2088ca067
[  141.420587] PMD 0

[  141.420599] Oops: 0000 [#1] SMP
[  141.420612] Modules linked in: amdgpu(OE) ttm(OE) drm_kms_helper(E) drm(E) i2c_algo_bit(E) fb_sys_fops(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) rpcsec_gss_krb5(E) nfsv4(E) nfs(E) fscache(E) eeepc_wmi(E) asus_wmi(E) sparse_keymap(E) snd_hda_codec_realtek(E) video(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) joydev(E) snd_hda_codec(E) snd_seq_midi(E) snd_seq_midi_event(E) snd_hda_core(E) snd_hwdep(E) snd_rawmidi(E) snd_pcm(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) snd_seq(E) crc32_pclmul(E) ghash_clmulni_intel(E) snd_seq_device(E) snd_timer(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) snd(E) soundcore(E) serio_raw(E) shpchp(E) i2c_piix4(E) i2c_designware_platform(E) 8250_dw(E) i2c_designware_core(E) mac_hid(E) binfmt_misc(E)
[  141.420948]  nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) parport_pc(E) ppdev(E) lp(E) parport(E) autofs4(E) hid_generic(E) usbhid(E) hid(E) psmouse(E) r8169(E) ahci(E) mii(E) libahci(E) wmi(E)
[  141.421042] CPU: 14 PID: 223 Comm: kworker/14:2 Tainted: G           OE   4.9.0-custom #4
[  141.421074] Hardware name: System manufacturer System Product Name/PRIME B350-PLUS, BIOS 0606 04/06/2017
[  141.421146] Workqueue: events amd_sched_job_timedout [amdgpu]
[  141.421169] task: ffff88020b03ba80 task.stack: ffffc900016f4000
[  141.421193] RIP: 0010:[&lt;ffffffff81579ee1&gt;]  [&lt;ffffffff81579ee1&gt;] fence_remove_callback+0x11/0x60
[  141.421229] RSP: 0018:ffffc900016f7d30  EFLAGS: 00010202
[  141.421250] RAX: ffff8801c049fc00 RBX: ffff8801d4d8dc00 RCX: 0000000000000000
[  141.421278] RDX: 0000000000000001 RSI: ffff8801c049fcc0 RDI: 0000000000000000
[  141.421307] RBP: ffffc900016f7d48 R08: 0000000000000000 R09: 0000000000000000
[  141.421334] R10: 00000020ed512a30 R11: 0000000000000001 R12: 0000000000000000
[  141.421362] R13: ffff880209ba4ba0 R14: ffff880209ba4c58 R15: ffff8801c055cc60
[  141.421390] FS:  0000000000000000(0000) GS:ffff88021ef80000(0000) knlGS:0000000000000000
[  141.421421] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  141.421443] CR2: 0000000000000030 CR3: 000000020b554000 CR4: 00000000003406e0
[  141.421471] Stack:
[  141.421480]  ffff8801d4d8dc00 ffff880209ba4c48 ffff880209ba4ba0 ffffc900016f7d78
[  141.421513]  ffffffffa0697920 ffff880209ba0000 0000000000000000 ffff880209ba2770
[  141.421549]  ffff880209ba4b08 ffffc900016f7df0 ffffffffa05ce2ae ffffffffa0509eb7
[  141.421583] Call Trace:
[  141.421628]  [&lt;ffffffffa0697920&gt;] amd_sched_hw_job_reset+0x50/0xb0 [amdgpu]
[  141.421676]  [&lt;ffffffffa05ce2ae&gt;] amdgpu_gpu_reset+0x8e/0x690 [amdgpu]
[  141.421712]  [&lt;ffffffffa0509eb7&gt;] ? drm_printk+0x97/0xa0 [drm]
[  141.421770]  [&lt;ffffffffa0698156&gt;] amdgpu_job_timedout+0x46/0x50 [amdgpu]
[  141.421829]  [&lt;ffffffffa0696a07&gt;] amd_sched_job_timedout+0x17/0x20 [amdgpu]
[  141.421859]  [&lt;ffffffff81095493&gt;] process_one_work+0x153/0x3f0
[  141.421884]  [&lt;ffffffff81095c5b&gt;] worker_thread+0x12b/0x4b0
[  141.421907]  [&lt;ffffffff81095b30&gt;] ? rescuer_thread+0x350/0x350
[  141.421931]  [&lt;ffffffff8109b423&gt;] kthread+0xd3/0xf0
[  141.421951]  [&lt;ffffffff8109b350&gt;] ? kthread_park+0x60/0x60
[  141.421975]  [&lt;ffffffff817e1ee5&gt;] ret_from_fork+0x25/0x30
[  141.421996] Code: ac 81 e8 a3 1f b0 ff 48 c7 c0 ea ff ff ff e9 48 ff ff ff 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 49 89 fc 53 &lt;48&gt; 8b 7f 30 48 89 f3 e8 73 7c 26 00 48 8b 13 48 39 d3 41 0f 95
[  141.422156] RIP  [&lt;ffffffff81579ee1&gt;] fence_remove_callback+0x11/0x60
[  141.422183]  RSP &lt;ffffc900016f7d30&gt;
[  141.422197] CR2: 0000000000000030
[  141.433483] ---[ end trace bc0949bf7ddd6d4b ]---

if the job is reset twice, then the parent could be NULL.

Signed-off-by: Chunming Zhou &lt;David1.Zhou@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amd/sched: revise priority number</title>
<updated>2017-03-30T03:54:03+00:00</updated>
<author>
<name>Chunming Zhou</name>
<email>David1.Zhou@amd.com</email>
</author>
<published>2017-03-16T03:44:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=153de9dff94cc559646b04f232738e22ee318705'/>
<id>urn:sha1:153de9dff94cc559646b04f232738e22ee318705</id>
<content type='text'>
big number is to high priority.

Signed-off-by: Chunming Zhou &lt;David1.Zhou@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amd/sched: add a unique job id to amd_sched_job</title>
<updated>2017-03-30T03:53:53+00:00</updated>
<author>
<name>Andres Rodriguez</name>
<email>andresx7@gmail.com</email>
</author>
<published>2017-03-10T02:25:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=93f8b367382108c2363335cfeb5cb3f3f39cfe43'/>
<id>urn:sha1:93f8b367382108c2363335cfeb5cb3f3f39cfe43</id>
<content type='text'>
A unique id is useful for debugging and tracing. Intended to replace
pointers in ftrace output.

Reviewed-by: Chunming Zhou &lt;david1.zhou@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Andres Rodriguez &lt;andresx7@gmail.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>sched/headers: Prepare for new header dependencies before moving code to &lt;uapi/linux/sched/types.h&gt;</title>
<updated>2017-03-02T07:42:27+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2017-02-01T17:07:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=ae7e81c077d60507dcec139e40a6d10cf932cf4b'/>
<id>urn:sha1:ae7e81c077d60507dcec139e40a6d10cf932cf4b</id>
<content type='text'>
We are going to move scheduler ABI details to &lt;uapi/linux/sched/types.h&gt;,
which will be used from a number of .c files.

Create empty placeholder header that maps to &lt;linux/types.h&gt;.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'drm-qemu-20161121' of git://git.kraxel.org/linux into drm-next</title>
<updated>2016-11-30T04:18:51+00:00</updated>
<author>
<name>Dave Airlie</name>
<email>airlied@redhat.com</email>
</author>
<published>2016-11-30T04:18:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=63207455963053ca212e61c75f43b3502ea69f0e'/>
<id>urn:sha1:63207455963053ca212e61c75f43b3502ea69f0e</id>
<content type='text'>
drm/virtio: fix busid in a different way, allocate more vbufs.
drm/qxl: various bugfixes and cleanups,

* tag 'drm-qemu-20161121' of git://git.kraxel.org/linux: (224 commits)
  drm/virtio: allocate some extra bufs
  qxl: Allow resolution which are not multiple of 8
  qxl: Don't notify userspace when monitors config is unchanged
  qxl: Remove qxl_bo_init() return value
  qxl: Call qxl_gem_{init, fini}
  qxl: Add missing '\n' to qxl_io_log() call
  qxl: Remove unused prototype
  qxl: Mark some internal functions as static
  Revert "drm: virtio: reinstate drm_virtio_set_busid()"
  drm/virtio: fix busid regression
  drm: re-export drm_dev_set_unique
  Linux 4.9-rc5
  gp8psk: Fix DVB frontend attach
  gp8psk: fix gp8psk_usb_in_op() logic
  dvb-usb: move data_mutex to struct dvb_usb_device
  iio: maxim_thermocouple: detect invalid storage size in read()
  aoe: fix crash in page count manipulation
  lightnvm: invalid offset calculation for lba_shift
  Kbuild: enable -Wmaybe-uninitialized warnings by default
  pcmcia: fix return value of soc_pcmcia_regulator_set
  ...
</content>
</entry>
<entry>
<title>Backmerge tag 'v4.9-rc4' into drm-next</title>
<updated>2016-11-06T23:37:09+00:00</updated>
<author>
<name>Dave Airlie</name>
<email>airlied@redhat.com</email>
</author>
<published>2016-11-06T23:37:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/talos-obmc-linux/commit/?id=7b624ad8fea1be7ff4c22643e212191aa6a2a3c2'/>
<id>urn:sha1:7b624ad8fea1be7ff4c22643e212191aa6a2a3c2</id>
<content type='text'>
Linux 4.9-rc4

This is needed for nouveau development.
</content>
</entry>
</feed>
