author: Eric Dumazet <eric.dumazet@gmail.com> 2010-10-03 22:17:54 -0700
committer: David S. Miller <davem@davemloft.net> 2010-10-03 22:17:54 -0700
commit: c7d4426a98a5f6654cd0b4b33d9dab2e77192c18
tree: 0db2524e6f3f742861765dd6aa696a9271767056 /drivers/net
parent: 9a7241c21b06c3a3f8ebcf3e347bd68556369da7
net: introduce DST_NOCACHE flag
While doing stress tests with the IP route cache disabled and multiqueue devices, I noticed very high contention on one rwlock used in the neighbour code.

When many CPUs are trying to send frames (possibly using a high-performance multiqueue device) to the same neighbour, they fight for the neigh->lock rwlock in order to call neigh_hh_init(), and fight on hh->hh_refcnt (a pair of atomic_inc()/atomic_dec_and_test()).

But we don't need to call neigh_hh_init() for a dst that is used only once. It costs at least four atomic operations, on two contended cache lines, plus the high contention on the neigh->lock rwlock.

Introduce a new dst flag, DST_NOCACHE, that is set when the dst was not inserted in the route cache.

With the stress test bench, sending 160000000 frames on one neighbour, results are:

Before patch:

real 2m28.406s
user 0m11.781s
sys 36m17.964s

After patch:

real 1m26.532s
user 0m12.185s
sys 20m3.903s

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
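The diff itself is not shown in this view, so the following is only a user-space sketch of the idea described in the message, not the kernel code. The struct, the helper names (model_dst_setup, model_output) and the numeric flag value are all assumptions made for illustration; only the name DST_NOCACHE and the behaviour "skip the hardware-header cache setup for a dst that was never inserted in the route cache" come from the commit message.

```c
#include <stdbool.h>

/* Illustrative flag value only; the real definition lives in the kernel headers. */
#define DST_NOCACHE 0x0010

/* Hypothetical stand-in for a dst entry; not the kernel's struct dst_entry. */
struct model_dst {
    unsigned int flags;  /* DST_* flag bits */
    int hh_inits;        /* how many times the costly init path ran */
};

/* Models route creation: the flag is set when the entry is NOT cached. */
void model_dst_setup(struct model_dst *dst, bool inserted_in_cache)
{
    dst->flags = inserted_in_cache ? 0 : DST_NOCACHE;
    dst->hh_inits = 0;
}

/*
 * Models the output path. For a one-shot dst the expensive step (standing
 * in for neigh_hh_init() and its lock and refcount traffic) is skipped.
 * Returns true when the expensive step ran.
 */
bool model_output(struct model_dst *dst)
{
    if (dst->flags & DST_NOCACHE)
        return false;    /* used only once: nothing worth caching */

    dst->hh_inits++;     /* cached dst: pay the setup cost */
    return true;
}
```

In this model, a dst set up with inserted_in_cache == false never increments hh_inits, which mirrors how the patch avoids the contended atomic operations for entries that are used a single time.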
Diffstat (limited to 'drivers/net')
0 files changed, 0 insertions, 0 deletions