summaryrefslogtreecommitdiffstats
path: root/drivers/dma/fsldma.h
diff options
context:
space:
mode:
authorHongbo Zhang <hongbo.zhang@freescale.com>2014-05-21 16:03:03 +0800
committerVinod Koul <vinod.koul@intel.com>2014-07-14 21:32:18 +0530
commit43452fadd614b62b84e950838cb7d2419f3aafb1 (patch)
tree241fe333208fd23b68c99f69b5821faf14204406 /drivers/dma/fsldma.h
parent14c6a3333c8e885604fc98768d8b9a32e08110ac (diff)
downloadtalos-op-linux-43452fadd614b62b84e950838cb7d2419f3aafb1.tar.gz
talos-op-linux-43452fadd614b62b84e950838cb7d2419f3aafb1.zip
dmaengine: Freescale: change descriptor release process for supporting async_tx
Fix the potential risk when enable config NET_DMA and ASYNC_TX. Async_tx is lack of support in current release process of dma descriptor, all descriptors will be released whatever is acked or no-acked by async_tx, so there is a potential race condition when dma engine is uesd by others clients (e.g. when enable NET_DMA to offload TCP). In our case, a race condition which is raised when use both of talitos and dmaengine to offload xor is because napi scheduler will sync all pending requests in dma channels, it affects the process of raid operations due to ack_tx is not checked in fsl dma. The no-acked descriptor is freed which is submitted just now, as a dependent tx, this freed descriptor trigger BUG_ON(async_tx_test_ack(depend_tx)) in async_tx_submit(). TASK = ee1a94a0[1390] 'md0_raid5' THREAD: ecf40000 CPU: 0 GPR00: 00000001 ecf41ca0 ee44/921a94a0 0000003f 00000001 c00593e4 00000000 00000001 GPR08: 00000000 a7a7a7a7 00000001 045/920000002 42028042 100a38d4 ed576d98 00000000 GPR16: ed5a11b0 00000000 2b162000 00000200 046/920000000 2d555000 ed3015e8 c15a7aa0 GPR24: 00000000 c155fc40 00000000 ecb63220 ecf41d28 e47/92f640bb0 ef640c30 ecf41ca0 NIP [c02b048c] async_tx_submit+0x6c/0x2b4 LR [c02b068c] async_tx_submit+0x26c/0x2b4 Call Trace: [ecf41ca0] [c02b068c] async_tx_submit+0x26c/0x2b448/92 (unreliable) [ecf41cd0] [c02b0a4c] async_memcpy+0x240/0x25c [ecf41d20] [c0421064] async_copy_data+0xa0/0x17c [ecf41d70] [c0421cf4] __raid_run_ops+0x874/0xe10 [ecf41df0] [c0426ee4] handle_stripe+0x820/0x25e8 [ecf41e90] [c0429080] raid5d+0x3d4/0x5b4 [ecf41f40] [c04329b8] md_thread+0x138/0x16c [ecf41f90] [c008277c] kthread+0x8c/0x90 [ecf41ff0] [c0011630] kernel_thread+0x4c/0x68 Another modification in this patch is the change of completed descriptors, there is a potential risk which caused by exception interrupt, all descriptors in ld_running list are seemed completed when an interrupt raised, it works fine under normal condition, but if there is an exception occured, it cannot work as our excepted. Hardware should not be depend on s/w list, the right way is to read current descriptor address register to find the last completed descriptor. If an interrupt is raised by an error, all descriptors in ld_running should not be seemed finished, or these unfinished descriptors in ld_running will be released wrongly. A simple way to reproduce: Enable dmatest first, then insert some bad descriptors which can trigger Programming Error interrupts before the good descriptors. Last, the good descriptors will be freed before they are processsed because of the exception intrerrupt. Note: the bad descriptors are only for simulating an exception interrupt. This case can illustrate the potential risk in current fsl-dma very well. Signed-off-by: Hongbo Zhang <hongbo.zhang@freescale.com> Signed-off-by: Qiang Liu <qiang.liu@freescale.com> Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Diffstat (limited to 'drivers/dma/fsldma.h')
-rw-r--r--drivers/dma/fsldma.h17
1 files changed, 15 insertions, 2 deletions
diff --git a/drivers/dma/fsldma.h b/drivers/dma/fsldma.h
index f2e0c4dcf901..239c20c84382 100644
--- a/drivers/dma/fsldma.h
+++ b/drivers/dma/fsldma.h
@@ -149,8 +149,21 @@ struct fsldma_chan {
char name[8]; /* Channel name */
struct fsldma_chan_regs __iomem *regs;
spinlock_t desc_lock; /* Descriptor operation lock */
- struct list_head ld_pending; /* Link descriptors queue */
- struct list_head ld_running; /* Link descriptors queue */
+ /*
+ * Descriptors which are queued to run, but have not yet been
+ * submitted to the hardware for execution
+ */
+ struct list_head ld_pending;
+ /*
+ * Descriptors which are currently being executed by the hardware
+ */
+ struct list_head ld_running;
+ /*
+ * Descriptors which have finished execution by the hardware. These
+ * descriptors have already had their cleanup actions run. They are
+ * waiting for the ACK bit to be set by the async_tx API.
+ */
+ struct list_head ld_completed; /* Link descriptors queue */
struct dma_chan common; /* DMA common channel */
struct dma_pool *desc_pool; /* Descriptors pool */
struct device *dev; /* Channel device */
OpenPOWER on IntegriCloud