author | Faisal Latif <faisal.latif@intel.com> | 2009-04-08 14:23:55 -0700 |
---|---|---|
committer | Roland Dreier <rolandd@cisco.com> | 2009-04-08 14:23:55 -0700 |
commit | 5962c2c8036b4dcf10ec6c481be656ae4700b664 (patch) | |
tree | a20bfcbb93e52f7a1dc161d7c6333ceadc5ba046 /mm/bootmem.c | |
parent | 79fc3d7410c861c8ced5b81a5c3759f6bbf891dc (diff) | |

RDMA/nes: Fix nes_nic_cm_xmit() error handling

We see crashes or hangs when running network cable pull tests during
RDMA traffic.

In schedule_nes_timer(), we returned an error if nes_nic_cm_xmit()
failed. This is changed to return success, because the skb is still
handed to the timer routines to be processed later. In the send_syn()
case, the old behavior indicated connect failure twice: once from
nes_connect() and again when the rexmit retries expired.

The other issue is skb->users. We increment it before calling
nes_nic_cm_xmit(), which calls dev_queue_xmit(), but on failure we
decremented skb->users while also putting the skb on the rexmit path.
Even when dev_queue_xmit() fails, skb->users has already been
decremented. We are removing the decrement of skb->users on failure
from both schedule_nes_timer() and nes_cm_timer_tick().
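
The reference-count problem can be sketched with a small user-space model. This is not the driver code: `toy_skb`, `toy_xmit()`, `send_old()` and `send_new()` are hypothetical names, and the model assumes only what the message states, namely that the transmit path drops one reference whether or not it succeeds.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct sk_buff: only the reference count is modeled. */
struct toy_skb {
    int users;   /* plays the role of skb->users */
    bool freed;  /* set when the last reference is dropped */
};

static void toy_skb_get(struct toy_skb *skb)
{
    skb->users++;
}

static void toy_skb_put(struct toy_skb *skb)
{
    if (--skb->users == 0)
        skb->freed = true;
}

/* Hypothetical transmit: consumes one reference on success AND on failure. */
static int toy_xmit(struct toy_skb *skb, bool link_up)
{
    toy_skb_put(skb);
    return link_up ? 0 : -1;
}

/* Old pattern: a second put on failure, yet the skb is still queued. */
static int send_old(struct toy_skb *skb, struct toy_skb **rexmit_q, bool link_up)
{
    int ret;

    toy_skb_get(skb);
    ret = toy_xmit(skb, link_up);
    if (ret)
        toy_skb_put(skb);  /* extra decrement: the queue's reference is lost */
    *rexmit_q = skb;       /* may now point at a freed buffer */
    return ret;
}

/* New pattern: no extra put, and success is reported because the skb is queued. */
static int send_new(struct toy_skb *skb, struct toy_skb **rexmit_q, bool link_up)
{
    toy_skb_get(skb);
    (void)toy_xmit(skb, link_up);  /* failure is handled later by the retry path */
    *rexmit_q = skb;
    return 0;
}
```

Starting from `users == 1`, a failed transmit through `send_old()` leaves a freed buffer on the retransmit queue, while `send_new()` leaves the queue holding the one remaining reference.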

We are also removing the extra check in nes_cm_timer_tick() that
breaks out of the loop on rexmit failure. The break causes problems
because the remaining nodes have already had their cm_node->ref_count
incremented and are then never processed.
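
A minimal sketch of why the early break is harmful, again as a user-space model and not the driver code. `toy_node` and `toy_timer_tick()` are hypothetical, and the two-pass shape (pin every node, then process each one and release it) is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a connection node; ref_count mirrors cm_node->ref_count. */
struct toy_node {
    int ref_count;
    bool rexmit_fails;  /* simulates a failed retransmit for this node */
};

/*
 * Pin every node, then process each one and drop the pin.
 * Returns how many nodes were processed.
 */
static size_t toy_timer_tick(struct toy_node *nodes, size_t n, bool break_on_failure)
{
    size_t processed = 0;
    size_t i;

    for (i = 0; i < n; i++)
        nodes[i].ref_count++;          /* take a reference on every node */

    for (i = 0; i < n; i++) {
        nodes[i].ref_count--;          /* this node is handled and released */
        processed++;
        if (nodes[i].rexmit_fails && break_on_failure)
            break;                     /* later nodes keep their extra reference */
    }
    return processed;
}
```

With the break enabled, every node after the first failing one is skipped and keeps its extra reference. With the break removed, all nodes are processed and their counts return to the starting value.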
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>