| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
String literals are constant, and usually, we can also tag the array
of pointers const too, moving it to the .rodata section.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Same as how Randy Dunlap fixed the ipv4 side of things.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This renames away a variable clash:
* ipv6_table[] is declared as a static global table;
* ipv6_sysctl_net_init() uses ipv6_table to refer/destroy dynamic memory;
* ipv6_sysctl_net_exit() also uses ipv6_table for the same purpose;
* both the two last functions call kfree() on ipv6_table.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Choose saner defaults for xfrm[4|6] gc_thresh values on init
Currently, the xfrm[4|6] code has hard-coded initial gc_thresh values
(set to 1024). Given that the ipv4 and ipv6 routing caches are sized
dynamically at boot time, the static selections can be non-sensical.
This patch dynamically selects an appropriate gc threshold based on
the corresponding main routing table size, using the assumption that
we should in the worst case be able to handle as many connections as
the routing table can.
For ipv4, the maximum route cache size is 16 * the number of hash
buckets in the route cache. Given that xfrm4 starts garbage
collection at the gc_thresh and prevents new allocations at 2 *
gc_thresh, we set gc_thresh to half the maximum route cache size.
For ipv6, its a bit trickier. there is no maximum route cache size,
but the ipv6 dst_ops gc_thresh is statically set to 1024. It seems
sane to select a simmilar gc_thresh for the xfrm6 code that is half
the number of hash buckets in the v6 route cache times 16 (like the v4
code does).
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Export garbage collector thresholds for xfrm[4|6]_dst_ops
Had a problem reported to me recently in which a high volume of ipsec
connections on a system began reporting ENOBUFS for new connections
eventually.
It seemed that after about 2000 connections we started being unable to
create more. A quick look revealed that the xfrm code used a dst_ops
structure that limited the gc_thresh value to 1024, and always
dropped route cache entries after 2x the gc_thresh.
It seems the most direct solution is to export the gc_thresh values in
the xfrm[4|6] dst_ops as sysctls, like the main routing table does, so
that higher volumes of connections can be supported. This patch has
been tested and allows the reporter to increase their ipsec connection
volume successfully.
Reported-by: Joe Nall <joe@nall.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
ipv4/xfrm4_policy.c | 18 ++++++++++++++++++
ipv6/xfrm6_policy.c | 18 ++++++++++++++++++
2 files changed, 36 insertions(+)
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\ \ \ \
| |/ / /
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
drivers/net/wireless/iwmc3200wifi/netdev.c
net/wireless/scan.c
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
When the TCP connection handshake completes on the passive
side, a variety of state must be set up in the "child" sock,
including the key if MD5 authentication is being used. Fix TCP
for both address families to label the key with the peer's
destination address, rather than the address from the listening
sock, which is usually the wildcard.
Reported-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: John Dykstra <john.dykstra1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Fix MD5 signature checking so that an IPv4 active open
to an IPv6 socket can succeed. In particular, use the
correct address family's signature generation function
for the SYN/ACK.
Reported-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: John Dykstra <john.dykstra1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The local variable 'idev' shadows the function argument 'idev' to
ip6_mc_add_src(). Fixed by removing the local declaration, as pmc->idev
should be identical with 'idev' passed as argument.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\ \ \ \
| |/ / /
| | | |
| | | |
| | | |
| | | |
| | | | |
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
drivers/net/wireless/orinoco/main.c
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The sit module makes use of skb->dst in it's xmit function, so since
93f154b594fe47 ("net: release dst entry in dev_hard_start_xmit()") sit
tunnels are broken, because the flag IFF_XMIT_DST_RELEASE is not
unset.
This patch unsets that flag for sit devices to fix this
regression.
Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
After commit 2b85a34e911bf483c27cfdd124aeb1605145dc80
(net: No more expensive sock_hold()/sock_put() on each tx)
we do not take any more references on sk->sk_refcnt on outgoing packets.
I forgot to delete two __sock_put() from ip_push_pending_frames()
and ip6_push_pending_frames().
Reported-by: Emil S Tantilov <emils.tantilov@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Emil S Tantilov <emils.tantilov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
- validate and forward GSO UDP/IPv6 packets from untrusted sources.
- do software UFO if the outgoing device doesn't support UFO.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
- move ipv6_select_ident() inline function to ipv6.h and remove the unused
skb argument
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
- fix gso_size setting for ipv6 fragment to be a multiple of 8 bytes.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
- add HW checksum support for outgoing large UDP/IPv6 packets destined for
a UFO enabled device.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The routine ipv6_rcv() uses magic number 0 for a return when it drops a
packet. This corresponds to NET_RX_SUCCESS, which is obviously
incorrect. Correct this by using NET_RX_DROP instead.
ps. It isn't exactly clear who the IPv6 maintainers are, apologies if
I've missed any.
Signed-off-by: Mark Smith <markzzzsmith@yahoo.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|/ / /
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This patch is the result of an automatic spatch transformation to convert
all ndo_start_xmit() return values of 0 to NETDEV_TX_OK.
Some occurences are missed by the automatic conversion, those will be
handled in a seperate patch.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
There's a bug in addrconf_prefix_rcv() where it won't update the
preferred lifetime of an IPv6 address if the current valid lifetime
of the address is less than 2 hours (the minimum value in the RA).
For example, If I send a router advertisement with a prefix that
has valid lifetime = preferred lifetime = 2 hours we'll build
this address:
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2001:1890:1109:a20:217:8ff:fe7d:4718/64 scope global dynamic
valid_lft 7175sec preferred_lft 7175sec
If I then send the same prefix with valid lifetime = preferred
lifetime = 0 it will be ignored since the minimum valid lifetime
is 2 hours:
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2001:1890:1109:a20:217:8ff:fe7d:4718/64 scope global dynamic
valid_lft 7161sec preferred_lft 7161sec
But according to RFC 4862 we should always reset the preferred lifetime
even if the valid lifetime is invalid, which would cause the address
to immediately get deprecated. So with this patch we'd see this:
5: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2001:1890:1109:a20:21f:29ff:fe5a:ef04/64 scope global deprecated dynamic
valid_lft 7163sec preferred_lft 0sec
The comment winds-up being 5x the size of the code to fix the problem.
Update the preferred lifetime of IPv6 addresses derived from a prefix
info option in a router advertisement even if the valid lifetime in
the option is invalid, as specified in RFC 4862 Section 5.5.3e. Fixes
an issue where an address will not immediately become deprecated.
Reported by Jens Rosenboom.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |/
|/|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The SCTP pushed the skb above the sctp chunk header, so the
check of pskb_may_pull(skb, nh + offset + 1 - skb->data) in
_decode_session6() will never return 0 and the ports decode
of sctp will always fail. (nh + offset + 1 - skb->data < 0)
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
As transparent proxying looks up the socket early and assigns
it to the skb for later processing, we must drop any existing
socket ownership prior to that in order to distinguish between
the case where tproxy is active and where it is not.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The ipv6 module uses rcu_call() thus it should use rcu_barrier() on
module unload.
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Avoid showing wrong high values when the preferred lifetime of an address
is expired.
Signed-off-by: Jens Rosenboom <me@jayr.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|/
|
|
|
|
|
|
|
| |
Change all the code that deals directly with ICMPv6 type and code
values to use u8 instead of a signed int as that's the actual data
type.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 2b85a34e911bf483c27cfdd124aeb1605145dc80
(net: No more expensive sock_hold()/sock_put() on each tx)
changed initial sk_wmem_alloc value.
We need to take into account this offset when reporting
sk_wmem_alloc to user, in PROC_FS files or various
ioctls (SIOCOUTQ/TIOCOUTQ)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| | |
master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:
Documentation/feature-removal-schedule.txt
drivers/scsi/fcoe/fcoe.c
net/core/drop_monitor.c
net/core/net-traces.c
|
| |
| |
| |
| |
| | |
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
| |
| |
| |
| |
| |
| |
| | |
.ko is normally not included in Kconfig help, make it consistent.
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
IPv4:
- make PIM register vifs netns local
- set the netns when a PIM register vif is created
- make PIM available in all network namespaces (if CONFIG_IP_PIMSM_V2)
by adding the protocol handler when multicast routing is initialized
IPv6:
- make PIM register vifs netns local
- make PIM available in all network namespaces (if CONFIG_IPV6_PIMSM_V2)
by adding the protocol handler when multicast routing is initialized
Signed-off-by: Tom Goff <thomas.goff@boeing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\ \
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
One of the problem with sock memory accounting is it uses
a pair of sock_hold()/sock_put() for each transmitted packet.
This slows down bidirectional flows because the receive path
also needs to take a refcount on socket and might use a different
cpu than transmit path or transmit completion path. So these
two atomic operations also trigger cache line bounces.
We can see this in tx or tx/rx workloads (media gateways for example),
where sock_wfree() can be in top five functions in profiles.
We use this sock_hold()/sock_put() so that sock freeing
is delayed until all tx packets are completed.
As we also update sk_wmem_alloc, we could offset sk_wmem_alloc
by one unit at init time, until sk_free() is called.
Once sk_free() is called, we atomic_dec_and_test(sk_wmem_alloc)
to decrement initial offset and atomicaly check if any packets
are in flight.
skb_set_owner_w() doesnt call sock_hold() anymore
sock_wfree() doesnt call sock_put() anymore, but check if sk_wmem_alloc
reached 0 to perform the final freeing.
Drawback is that a skb->truesize error could lead to unfreeable sockets, or
even worse, prematurely calling __sk_free() on a live socket.
Nice speedups on SMP. tbench for example, going from 2691 MB/s to 2711 MB/s
on my 8 cpu dev machine, even if tbench was not really hitting sk_refcnt
contention point. 5 % speedup on a UDP transmit workload (depends
on number of flows), lowering TX completion cpu usage.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | | |
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | | |
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Define three accessors to get/set dst attached to a skb
struct dst_entry *skb_dst(const struct sk_buff *skb)
void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;
Delete skb->dst field
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Print-out the error value when sock_alloc_send_skb() fails in
the IPv6 neighbor discovery code - can be useful for debugging.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add 'autoconf' and 'disable_ipv6' parameters to the IPv6 module.
The first controls if IPv6 addresses are autoconfigured from
prefixes received in Router Advertisements. The IPv6 loopback
(::1) and link-local addresses are still configured.
The second controls if IPv6 addresses are desired at all. No
IPv6 addresses will be added to any interfaces.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Current conntrack code kills the ICMP conntrack entry as soon as
the first reply is received. This is incorrect, as we then see only
the first ICMP echo reply out of several possible duplicates as
ESTABLISHED, while the rest will be INVALID. Also this unnecessarily
increases the conntrackd traffic on H-A firewalls.
Make all the ICMP conntrack entries (including the replied ones)
last for the default of nf_conntrack_icmp{,v6}_timeout seconds.
Signed-off-by: Jan "Yenya" Kasprzak <kas@fi.muni.cz>
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
| | |
| | |
| | |
| | |
| | | |
Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This patch simplifies the conntrack event caching system by removing
several events:
* IPCT_[*]_VOLATILE, IPCT_HELPINFO and IPCT_NATINFO has been deleted
since the have no clients.
* IPCT_COUNTER_FILLING which is a leftover of the 32-bits counter
days.
* IPCT_REFRESH which is not of any use since we always include the
timeout in the messages.
After this patch, the existing events are:
* IPCT_NEW, IPCT_RELATED and IPCT_DESTROY, that are used to identify
addition and deletion of entries.
* IPCT_STATUS, that notes that the status bits have changes,
eg. IPS_SEEN_REPLY and IPS_ASSURED.
* IPCT_PROTOINFO, that reports that internal protocol information has
changed, eg. the TCP, DCCP and SCTP protocol state.
* IPCT_HELPER, that a helper has been assigned or unassigned to this
entry.
* IPCT_MARK and IPCT_SECMARK, that reports that the mark has changed, this
covers the case when a mark is set to zero.
* IPCT_NATSEQADJ, to report that there's updates in the NAT sequence
adjustment.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|\ \ \
| |/ /
|/| | |
|
| | |
| | |
| | |
| | | |
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
| | |
| | |
| | |
| | |
| | |
| | | |
Combining two ifs, and goto is easily gone.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Cosmetic only. Transformation applied:
-if (foo) { long block; } else { short block; }
+if (!foo) { short block; continue; } long block;
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
| | |
| | |
| | |
| | |
| | |
| | | |
af is an nfproto.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
|