summaryrefslogtreecommitdiffstats
path: root/net/ipv6/ip6_output.c
Commit message (Collapse)AuthorAgeFilesLines
* inet: Decrease overhead of on-stack inet_cork.David S. Miller2011-05-061-16/+18
| | | | | | | | | | | | | | | | When we fast path datagram sends to avoid locking by putting the inet_cork on the stack we use up lots of space that isn't necessary. This is because inet_cork contains a "struct flowi" which isn't used in these code paths. Split inet_cork to two parts, "inet_cork" and "inet_cork_full". Only the latter of which has the "struct flowi" and is what is stored in inet_sock. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
* inet: constify ip headers and in6_addrEric Dumazet2011-04-221-4/+4
| | | | | | | | Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers where possible, to make code intention more obvious. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: RTA_PREFSRC support for ipv6 route source address selectionDaniel Walter2011-04-151-4/+4
| | | | | | | | | | | | [ipv6] Add support for RTA_PREFSRC This patch allows a user to select the preferred source address for a specific IPv6-Route. It can be set via a netlink message setting RTA_PREFSRC to a valid IPv6 address which must be up on the device the route will be bound to. Signed-off-by: Daniel Walter <dwalter@barracuda.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Fix common misspellingsLucas De Marchi2011-03-311-1/+1
| | | | | | Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
* ipv6: Convert to use flowi6 where applicable.David S. Miller2011-03-121-45/+45
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Put flowi_* prefix on AF independent members of struct flowiDavid S. Miller2011-03-121-5/+5
| | | | | | | | | | I intend to turn struct flowi into a union of AF specific flowi structs. There will be a common structure that each variant includes first, much like struct sock_common. This is the first step to move in that direction. Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Return dst directly from xfrm_lookup()David S. Miller2011-03-021-8/+2
| | | | | | Instead of on the stack. Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Handle blackhole route creation via afinfo.David S. Miller2011-03-011-22/+10
| | | | | | | That way we don't have to potentially do this in every xfrm_lookup() caller. Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: Normalize arguments to ip6_dst_blackhole().David S. Miller2011-03-011-2/+2
| | | | | | | | | | Return a dst pointer which is potentitally error encoded. Don't pass original dst pointer by reference, pass a struct net instead of a socket, and elide the flow argument since it is unnecessary. Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Kill XFRM_LOOKUP_WAIT flag.David S. Miller2011-03-011-2/+2
| | | | | | This can be determined from the flow flags instead. Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: Change final dst lookup arg name to "can_sleep"David S. Miller2011-03-011-6/+6
| | | | | | | Since it indicates whether we are invoked from a sleepable context or not. Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Add FLOWI_FLAG_CAN_SLEEP.David S. Miller2011-03-011-0/+2
| | | | | | And set is in contexts where the route resolution can sleep. Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: Consolidate route lookup sequences.David S. Miller2011-03-011-11/+69
| | | | | | | | | | | | | | | | | | | | | Route lookups follow a general pattern in the ipv6 code wherein we first find the non-IPSEC route, potentially override the flow destination address due to ipv6 options settings, and then finally make an IPSEC search using either xfrm_lookup() or __xfrm_lookup(). __xfrm_lookup() is used when we want to generate a blackhole route if the key manager needs to resolve the IPSEC rules (in this case -EREMOTE is returned and the original 'dst' is left unchanged). Otherwise plain xfrm_lookup() is used and when asynchronous IPSEC resolution is necessary, we simply fail the lookup completely. All of these cases are encapsulated into two routines, ip6_dst_lookup_flow and ip6_sk_dst_lookup_flow. The latter of which handles unconnected UDP datagram sockets. Signed-off-by: David S. Miller <davem@davemloft.net>
* inet: Remove unused sk_sndmsg_* from UFOHerbert Xu2011-03-011-1/+0
| | | | | | | | | | UFO doesn't really use the sk_sndmsg_* parameters so touching them is pointless. It can't use them anyway since the whole point of UFO is to use the original pages without copying. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: TX timestamps for IPv6 UDP packetsAnders Berggren2011-02-281-0/+17
| | | | | | | | | Enabling TX timestamps (SO_TIMESTAMPING) for IPv6 UDP packets, in the same fashion as for IPv4. Necessary in order for NICs such as Intel 82580 to timestamp IPv6 packets. Signed-off-by: Anders Berggren <anders@halon.se> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: totlen is declared and assigned but not usedHagen Paul Pfeifer2011-02-251-3/+0
| | | | | Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* inetpeer: Move ICMP rate limiting state into inet_peer entries.David S. Miller2011-02-041-1/+4
| | | | | | | | | | Like metrics, the ICMP rate limiting bits are cached state about a destination. So move it into the inet_peer entries. If an inet_peer cannot be bound (the reason is memory allocation failure or similar), the policy is to allow. Signed-off-by: David S. Miller <davem@davemloft.net>
* inet6: prevent network storms caused by linux IPv6 routersAlexey Kuznetsov2011-01-121-0/+3
| | | | | | | | | | | Linux IPv6 forwards unicast packets, which are link layer multicasts... The hole was present since day one. I was 100% this check is there, but it is not. The problem shows itself, f.e. when Microsoft Network Load Balancer runs on a network. This software resolves IPv6 unicast addresses to multicast MAC addresses. Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed.David Stevens2010-12-191-10/+2
| | | | | | | | | | | | This patch modifies IPsec6 to fragment IPv6 packets that are locally generated as needed. This version of the patch only fragments in tunnel mode, so that fragment headers will not be obscured by ESP in transport mode. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2010-09-271-5/+13
|\ | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/qlcnic/qlcnic_init.c net/ipv4/ip_output.c
| * ip: fix truesize mismatch in ip fragmentationEric Dumazet2010-09-211-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Special care should be taken when slow path is hit in ip_fragment() : When walking through frags, we transfert truesize ownership from skb to frags. Then if we hit a slow_path condition, we must undo this or risk uncharging frags->truesize twice, and in the end, having negative socket sk_wmem_alloc counter, or even freeing socket sooner than expected. Many thanks to Nick Bowler, who provided a very clean bug report and test program. Thanks to Jarek for reviewing my first patch and providing a V2 While Nick bisection pointed to commit 2b85a34e911 (net: No more expensive sock_hold()/sock_put() on each tx), underlying bug is older (2.6.12-rc5) A side effect is to extend work done in commit b2722b1c3a893e (ip_fragment: also adjust skb->truesize for packets not owned by a socket) to ipv6 as well. Reported-and-bisected-by: Nick Bowler <nbowler@elliptictech.com> Tested-by: Nick Bowler <nbowler@elliptictech.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Jarek Poplawski <jarkao2@gmail.com> CC: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: return operator cleanupEric Dumazet2010-09-231-2/+2
| | | | | | | | | | | | | | | | | | Change "return (EXPR);" to "return EXPR;" return is not a function, parentheses are not required. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: Rename skb_has_frags to skb_has_frag_listDavid S. Miller2010-08-231-1/+1
|/ | | | | | | | | | | SKBs can be "fragmented" in two ways, via a page array (called skb_shinfo(skb)->frags[]) and via a list of SKBs (called skb_shinfo(skb)->frag_list). Since skb_has_frags() tests the latter, it's name is confusing since it sounds more like it's testing the former. Signed-off-by: David S. Miller <davem@davemloft.net>
* net-next: remove useless union keywordChangli Gao2010-06-101-19/+19
| | | | | | | | | | remove useless union keyword in rtable, rt6_info and dn_route. Since there is only one member in a union, the union keyword isn't useful. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: Add GSO support on forwarding pathHerbert Xu2010-05-281-1/+1
| | | | | | | | | | | | | | | | | Currently we disallow GSO packets on the IPv6 forward path. This patch fixes this. Note that I discovered that our existing GSO MTU checks (e.g., IPv4 forwarding) are buggy in that they skip the check altogether, when they really should be checking gso_size + header instead. I have also been lazy here in that I haven't bothered to segment the GSO packet by hand before generating an ICMP message. Someone should add that to be 100% correct. Reported-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: ip6mr: support multiple tablesPatrick McHardy2010-05-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for multiple independant multicast routing instances, named "tables". Userspace multicast routing daemons can bind to a specific table instance by issuing a setsockopt call using a new option MRT6_TABLE. The table number is stored in the raw socket data and affects all following ip6mr setsockopt(), getsockopt() and ioctl() calls. By default, a single table (RT6_TABLE_DFLT) is created with a default routing rule pointing to it. Newly created pim6reg devices have the table number appended ("pim6regX"), with the exception of devices created in the default table, which are named just "pim6reg" for compatibility reasons. Packets are directed to a specific table instance using routing rules, similar to how regular routing rules work. Currently iif, oif and mark are supported as keys, source and destination addresses could be supported additionally. Example usage: - bind pimd/xorp/... to a specific table: uint32_t table = 123; setsockopt(fd, SOL_IPV6, MRT6_TABLE, &table, sizeof(table)); - create routing rules directing packets to the new table: # ip -6 mrule add iif eth0 lookup 123 # ip -6 mrule add oif eth0 lookup 123 Signed-off-by: Patrick McHardy <kaber@trash.net>
* Merge branch 'master' of /repos/git/net-next-2.6Patrick McHardy2010-05-101-13/+20
|\ | | | | | | | | | | | | | | Conflicts: net/bridge/br_device.c net/bridge/br_forward.c Signed-off-by: Patrick McHardy <kaber@trash.net>
| * ipv6: cleanup: remove unneeded null checkDan Carpenter2010-04-301-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | We dereference "sk" unconditionally elsewhere in the function. This was left over from: b30bd282 "ip6_xmit: remove unnecessary NULL ptr check". According to that commit message, "the sk argument to ip6_xmit is never NULL nowadays since the skb->priority assigment expects a valid socket." Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'master' of ↵David S. Miller2010-04-271-1/+1
| |\ | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/e100.c drivers/net/e1000e/netdev.c
| | * ipv6: allow to send packet after receiving ICMPv6 Too Big message with MTU ↵Shan Wei2010-04-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | field less than IPV6_MIN_MTU According to RFC2460, PMTU is set to the IPv6 Minimum Link MTU (1280) and a fragment header should always be included after a node receiving Too Big message reporting PMTU is less than the IPv6 Minimum Link MTU. After receiving a ICMPv6 Too Big message reporting PMTU is less than the IPv6 Minimum Link MTU, sctp *can't* send any data/control chunk that total length including IPv6 head and IPv6 extend head is less than IPV6_MIN_MTU(1280 bytes). The failure occured in p6_fragment(), about reason see following(take SHUTDOWN chunk for example): sctp_packet_transmit (SHUTDOWN chunk, len=16 byte) |------sctp_v6_xmit (local_df=0) |------ip6_xmit |------ip6_output (dst_allfrag is ture) |------ip6_fragment In ip6_fragment(), for local_df=0, drops the the packet and returns EMSGSIZE. The patch fixes it with adding check length of skb->len. In this case, Ipv6 not to fragment upper protocol data, just only add a fragment header before it. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | IPv6: Complete IPV6_DONTFRAG supportBrian Haley2010-04-231-8/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Finally add support to detect a local IPV6_DONTFRAG event and return the relevant data to the user if they've enabled IPV6_RECVPATHMTU on the socket. The next recvmsg() will return no data, but have an IPV6_PATHMTU as ancillary data. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | IPv6: Add dontfrag argument to relevant functionsBrian Haley2010-04-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add dontfrag argument to relevant functions for IPV6_DONTFRAG support, as well as allowing the value to be passed-in via ancillary cmsg data. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge branch 'master' of ↵David S. Miller2010-04-211-1/+1
| |\ \ | | |/ | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl-6000.c net/core/dev.c
| | * ip: Fix ip_dev_loopback_xmit()Eric Dumazet2010-04-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Eric Paris got following trace with a linux-next kernel [ 14.203970] BUG: using smp_processor_id() in preemptible [00000000] code: avahi-daemon/2093 [ 14.204025] caller is netif_rx+0xfa/0x110 [ 14.204035] Call Trace: [ 14.204064] [<ffffffff81278fe5>] debug_smp_processor_id+0x105/0x110 [ 14.204070] [<ffffffff8142163a>] netif_rx+0xfa/0x110 [ 14.204090] [<ffffffff8145b631>] ip_dev_loopback_xmit+0x71/0xa0 [ 14.204095] [<ffffffff8145b892>] ip_mc_output+0x192/0x2c0 [ 14.204099] [<ffffffff8145d610>] ip_local_out+0x20/0x30 [ 14.204105] [<ffffffff8145d8ad>] ip_push_pending_frames+0x28d/0x3d0 [ 14.204119] [<ffffffff8147f1cc>] udp_push_pending_frames+0x14c/0x400 [ 14.204125] [<ffffffff814803fc>] udp_sendmsg+0x39c/0x790 [ 14.204137] [<ffffffff814891d5>] inet_sendmsg+0x45/0x80 [ 14.204149] [<ffffffff8140af91>] sock_sendmsg+0xf1/0x110 [ 14.204189] [<ffffffff8140dc6c>] sys_sendmsg+0x20c/0x380 [ 14.204233] [<ffffffff8100ad82>] system_call_fastpath+0x16/0x1b While current linux-2.6 kernel doesnt emit this warning, bug is latent and might cause unexpected failures. ip_dev_loopback_xmit() runs in process context, preemption enabled, so must call netif_rx_ni() instead of netif_rx(), to make sure that we process pending software interrupt. Same change for ip6_dev_loopback_xmit() Reported-by: Eric Paris <eparis@redhat.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge branch 'master' of /repos/git/net-next-2.6Patrick McHardy2010-04-201-6/+3
|\ \ \ | |/ / | | | | | | | | | | | | | | | | | | | | | Conflicts: Documentation/feature-removal-schedule.txt net/ipv6/netfilter/ip6t_REJECT.c net/netfilter/xt_limit.c Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | ipv6: fix the comment of ip6_xmit()Shan Wei2010-04-151-1/+1
| | | | | | | | | | | | | | | | | | | | | ip6_xmit() is used by upper transport protocol. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: replace ipfragok with skb->local_dfShan Wei2010-04-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As Herbert Xu said: we should be able to simply replace ipfragok with skb->local_df. commit f88037(sctp: Drop ipfargok in sctp_xmit function) has droped ipfragok and set local_df value properly. The patch kills the ipfragok parameter of .queue_xmit(). Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipv6: cancel to setting local_df in ip6_xmit()Shan Wei2010-04-151-4/+0
| |/ | | | | | | | | | | | | | | | | | | | | | | commit f88037(sctp: Drop ipfargok in sctp_xmit function) has droped ipfragok and set local_df value properly. So the change of commit 77e2f1(ipv6: Fix ip6_xmit to send fragments if ipfragok is true) is not needed. So the patch remove them. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo2010-03-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* | netfilter: xt_TEE: have cloned packet travel through Xtables tooJan Engelhardt2010-04-191-1/+0
| | | | | | | | | | | | | | | | Since Xtables is now reentrant/nestable, the cloned packet can also go through Xtables and be subject to rules itself. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
* | netfilter: xtables: inclusion of xt_TEEJan Engelhardt2010-04-191-0/+1
| | | | | | | | | | | | | | | | | | | | xt_TEE can be used to clone and reroute a packet. This can for example be used to copy traffic at a router for logging purposes to another dedicated machine. References: http://www.gossamer-threads.com/lists/iptables/devel/68781 Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
* | netfilter: ipv6: add IPSKB_REROUTED exclusion to NF_HOOK/POSTROUTING invocationJan Engelhardt2010-04-131-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | Similar to how IPv4's ip_output.c works, have ip6_output also check the IPSKB_REROUTED flag. It will be set from xt_TEE for cloned packets since Xtables can currently only deal with a single packet in flight at a time. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Acked-by: David S. Miller <davem@davemloft.net> [Patrick: changed to use an IP6SKB value instead of IPSKB] Signed-off-by: Patrick McHardy <kaber@trash.net>
* | netfilter: ipv6: move POSTROUTING invocation before fragmentationJan Engelhardt2010-04-131-26/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Patrick McHardy notes: "We used to invoke IPv4 POST_ROUTING after fragmentation as well just to defragment the packets in conntrack immediately afterwards, but that got changed during the netfilter-ipsec integration. Ideally IPv6 would behave like IPv4." This patch makes it so. Sending an oversized frame (e.g. `ping6 -s64000 -c1 ::1`) will now show up in POSTROUTING as a single skb rather than multiple ones. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
* | netfilter: ipv6: use NFPROTO values for NF_HOOK invocationJan Engelhardt2010-03-251-8/+8
|/ | | | | | | | | | | | | | | | | The semantic patch that was used: // <smpl> @@ @@ (NF_HOOK |NF_HOOK_THRESH |nf_hook )( -PF_INET6, +NFPROTO_IPV6, ...) // </smpl> Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
* ipv6: Use 1280 as min MTU for ipv6 forwardingUlrich Weber2010-02-261-4/+8
| | | | | | | | | | | | | | | | | Clients will set their MTU to 1280 if they receive a ICMPV6_PKT_TOOBIG message with an MTU less than 1280. To allow encapsulating of packets over a 1280 link we should always accept packets with a size of 1280 for forwarding even if the path has a lower MTU and fragment the encapsulated packets afterwards. In case a forwarded packet is not going to be encapsulated a ICMPV6_PKT_TOOBIG msg will still be send by ip6_fragment() with the correct MTU. Signed-off-by: Ulrich Weber <uweber@astaro.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: drop unused "dev" arg of icmpv6_send()Alexey Dobriyan2010-02-181-6/+5
| | | | | | | Dunno, what was the idea, it wasn't used for a long time. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ip: fix mc_loop checks for tunnels with multicast outer addressesOctavian Purdila2010-01-061-2/+1
| | | | | | | | | | | | | | | | When we have L3 tunnels with different inner/outer families (i.e. IPV4/IPV6) which use a multicast address as the outer tunnel destination address, multicast packets will be loopbacked back to the sending socket even if IP*_MULTICAST_LOOP is set to disabled. The mc_loop flag is present in the family specific part of the socket (e.g. the IPv4 or IPv4 specific part). setsockopt sets the inner family mc_loop flag. When the packet is pushed through the L3 tunnel it will eventually be processed by the outer family which if different will check the flag in a different part of the socket then it was set. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ip: Report qdisc packet dropsEric Dumazet2009-09-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Christoph Lameter pointed out that packet drops at qdisc level where not accounted in SNMP counters. Only if application sets IP_RECVERR, drops are reported to user (-ENOBUFS errors) and SNMP counters updated. IP_RECVERR is used to enable extended reliable error message passing, but these are not needed to update system wide SNMP stats. This patch changes things a bit to allow SNMP counters to be updated, regardless of IP_RECVERR being set or not on the socket. Example after an UDP tx flood # netstat -s ... IP: 1487048 outgoing packets dropped ... Udp: ... SndbufErrors: 1487048 send() syscalls, do however still return an OK status, to not break applications. Note : send() manual page explicitly says for -ENOBUFS error : "The output queue for a network interface was full. This generally indicates that the interface has stopped sending, but may be caused by transient congestion. (Normally, this does not occur in Linux. Packets are just silently dropped when a device queue overflows.) " This is not true for IP_RECVERR enabled sockets : a send() syscall that hit a qdisc drop returns an ENOBUFS error. Many thanks to Christoph, David, and last but not least, Alexey ! Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: ip6_push_pending_frames() should increment IPSTATS_MIB_OUTDISCARDSEric Dumazet2009-09-011-0/+1
| | | | | | | | qdisc drops should be notified to IP_RECVERR enabled sockets, as done in IPV4. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* inet6: Conversion from u8 to intGerrit Renker2009-08-131-10/+5
| | | | | | | | | This replaces assignments of the type "int on LHS" = "u8 on RHS" with simpler code. The LHS can express all of the unsigned right hand side values, hence the assigned value can not be negative. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud