path: root/net/ipv4
Commit message | Author | Age | Files | Lines
...
* [IPVS]: Fix compilation | Adrian Bunk | 2006-01-05 | 2 | -0/+2
| Signed-off-by: Adrian Bunk <bunk@stusta.de>
| Acked-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP] tcp_vegas: Fix slow start | Thomas Young | 2006-01-04 | 1 | -0/+4
| Vegas' slow start was only adding one MSS per RTT rather than one for
| every ack. Slow start behavior should now match Reno.
|
| Signed-off-by: Thomas Young <tyo@ee.mu.oz.au>
| Signed-off-by: David S. Miller <davem@davemloft.net>
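The difference between growing once per RTT and once per ack can be sketched in a few lines of C. This is a minimal illustration, not the tcp_vegas code: the function names are hypothetical, and the window is counted in segments rather than bytes.

```c
#include <stdint.h>

/* Hypothetical helper (not the kernel's tcp_vegas code): Reno-style
 * slow start grows the window by one segment for every ACK received
 * while the window is below the slow-start threshold. */
static uint32_t slow_start_on_ack(uint32_t cwnd, uint32_t ssthresh)
{
    if (cwnd < ssthresh)
        cwnd++;
    return cwnd;
}

/* Window after one round trip in which each segment of the starting
 * window is acknowledged by its own ACK. */
static uint32_t cwnd_after_one_rtt(uint32_t cwnd, uint32_t ssthresh)
{
    uint32_t acks = cwnd;

    for (uint32_t i = 0; i < acks; i++)
        cwnd = slow_start_on_ack(cwnd, ssthresh);
    return cwnd;
}
```

With per-ack growth the window doubles each round trip until it reaches the threshold; adding a single segment per RTT, as the message describes the old behavior, would grow it only linearly.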
* [IPVS]: Add missing include <linux/net.h> | Arnaldo Carvalho de Melo | 2006-01-04 | 1 | -0/+1
|   CC [M]  net/ipv4/ipvs/ip_vs_conn.o
| /pub/scm/linux/kernel/git/acme/net-2.6/net/ipv4/ipvs/ip_vs_conn.c: In function 'ip_vs_conn_new':
| /pub/scm/linux/kernel/git/acme/net-2.6/net/ipv4/ipvs/ip_vs_conn.c:606: warning: implicit declaration of function 'net_ratelimit'
| /pub/scm/linux/kernel/git/acme/net-2.6/net/ipv4/ipvs/ip_vs_conn.c: In function 'ip_vs_random_dropentry':
| /pub/scm/linux/kernel/git/acme/net-2.6/net/ipv4/ipvs/ip_vs_conn.c:810: warning: implicit declaration of function 'net_random'
|
| Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [TCP]: syn_flood_warning is only needed if CONFIG_SYN_COOKIES is selected | Arnaldo Carvalho de Melo | 2006-01-04 | 1 | -0/+2
|   CC      net/ipv4/tcp_ipv4.o
| /pub/scm/linux/kernel/git/acme/net-2.6/net/ipv4/tcp_ipv4.c:665: warning: 'syn_flood_warning' defined but not used
|
| Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [TCP]: less inline's | Stephen Hemminger | 2006-01-03 | 4 | -35/+171
| TCP inline usage cleanup:
|  * get rid of inline in several places
|  * replace __inline__ with inline where possible
|  * move functions used in one file out of tcp.h
|  * let compiler decide on used once cases
|
| On x86_64:
|    text    data     bss     dec     hex filename
| 3594701  648348  567400 4810449  4966d1 vmlinux.orig
| 3593133  648580  567400 4809113  496199 vmlinux
|
| On sparc64:
|    text    data     bss     dec     hex filename
| 2538278  406152  530392 3474822  350586 vmlinux.ORIG
| 2536382  406384  530392 3473158  34ff06 vmlinux
|
| Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4] fib_trie: build fix | Stephen Hemminger | 2006-01-03 | 1 | -0/+1
| Need this to fix build of fib_trie in net-2.6.16 (rebased) tree.
| The code needs the new inet_make_mask inline.
|
| Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPVS]: Cleanup IP_VS_DBG statements. | Roberto Nibali | 2006-01-03 | 4 | -14/+17
| From: Roberto Nibali <ratz@drugphish.ch>
|
| The attached patch (against current -GIT) is a cleanup patch which does
| the following:
|
|  o lookup debug messages shifted back to 9
|  o added more informational value to flags and refcnt since those entries
|    can be in multiple referenced structures
|  o cleanup 80 char violation
|
| It's the prepatch to the session pool implementation and helps very much
| to debug and monitor important variables and structures regarding the
| threshold limitation and persistency without the thousands of lookup
| messages which no one is interested in.
|
| Signed-off-by: Horms <horms@verge.net.au>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Add a dev_ioctl() fallback to sock_ioctl() | Christoph Hellwig | 2006-01-03 | 1 | -4/+4
| Currently all network protocols need to call dev_ioctl as the default
| fallback in their ioctl implementations. This patch adds a fallback
| to dev_ioctl to sock_ioctl if the protocol returned -ENOIOCTLCMD.
| This way all the protocol ioctl handlers can be simplified and we don't
| need to export dev_ioctl.
|
| Signed-off-by: Christoph Hellwig <hch@lst.de>
| Signed-off-by: David S. Miller <davem@davemloft.net>
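The fallback described above can be sketched as a user-space model. Everything prefixed with demo_ or DEMO_ is hypothetical, including the command numbers; ENOIOCTLCMD is a kernel-internal error code that user-space headers do not provide, so the sketch defines it locally.

```c
#include <errno.h>

/* Kernel-internal "no such ioctl for this handler" code; defined here
 * only because user-space <errno.h> does not provide it. */
#define ENOIOCTLCMD 515

/* Hypothetical command numbers, used only for this sketch. */
enum { DEMO_CMD_PROTO = 1, DEMO_CMD_DEV = 2 };

typedef int (*demo_ioctl_fn)(unsigned int cmd);

/* Stand-in for the device-level handler. */
static int demo_dev_ioctl(unsigned int cmd)
{
    return cmd == DEMO_CMD_DEV ? 0 : -EINVAL;
}

/* Stand-in for a protocol handler: it serves only its own command and
 * reports everything else as unhandled instead of calling the device
 * layer itself. */
static int demo_proto_ioctl(unsigned int cmd)
{
    return cmd == DEMO_CMD_PROTO ? 0 : -ENOIOCTLCMD;
}

/* Generic socket layer: try the protocol first, then fall back. */
static int demo_sock_ioctl(demo_ioctl_fn proto, unsigned int cmd)
{
    int err = proto(cmd);

    if (err == -ENOIOCTLCMD)
        err = demo_dev_ioctl(cmd);
    return err;
}
```

Placing the fallback in one generic spot is what lets each protocol handler drop its own call to the device layer.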
* [INET_SOCK]: Move struct inet_sock & helper functions to net/inet_sock.h | Arnaldo Carvalho de Melo | 2006-01-03 | 38 | -0/+56
| To help in reducing the number of include dependencies, several files were
| touched as they were getting needed headers indirectly for stuff they use.
|
| Thanks also to Alan Menegotto for pointing out that net/dccp/proto.c had
| linux/dccp.h included twice.
|
| Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: move struct proto_ops to const | Eric Dumazet | 2006-01-03 | 1 | -3/+3
| I noticed that some of 'struct proto_ops' used in the kernel may share
| a cache line used by locks or other heavily modified data. (default
| linker alignment is 32 bytes, and L1_CACHE_LINE is 64 or 128 at
| least)
|
| This patch makes sure a 'struct proto_ops' can be declared as const,
| so that all cpus can share all parts of it without false sharing.
|
| This is not mandatory: a driver can still use a read/write structure
| if it needs to (and eventually a __read_mostly)
|
| I made a global substitute to change all existing occurrences to make
| them const.
|
| This should reduce the possibility of false sharing on SMP, and
| speed up some socket system calls.
|
| Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4] fib_trie: Add credits. | Robert Olsson | 2006-01-03 | 1 | -0/+7
| Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP] cubic: use Newton-Raphson | Stephen Hemminger | 2006-01-03 | 1 | -54/+39
| Replace cube root algorithm with a faster version using Newton-Raphson.
| Surprisingly, doing the scaled div64_64 is faster than a true 64 bit
| division on 64 bit CPUs.
|
| Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
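For readers unfamiliar with the technique, here is a minimal integer cube root using Newton-Raphson. It is a sketch only, not the code from this commit: it uses plain 64-bit division instead of the scaled div64_64 mentioned in the message, and the function name is hypothetical. The iteration is x' = (2x + a/x^2) / 3, started from a power of two that is at least as large as the true root so that the estimates decrease toward it.

```c
#include <stdint.h>

/* Hypothetical helper: returns floor(cbrt(a)) via Newton-Raphson. */
static uint32_t cube_root_newton(uint64_t a)
{
    unsigned int bits = 0;
    uint64_t x;

    if (a == 0)
        return 0;

    /* Count significant bits so the first guess 2^ceil(bits/3)
     * cannot be smaller than the real cube root. */
    for (uint64_t t = a; t != 0; t >>= 1)
        bits++;
    x = 1ULL << ((bits + 2) / 3);

    for (;;) {
        /* x is at most 2^22, so x * x fits easily in 64 bits. */
        uint64_t y = (2 * x + a / (x * x)) / 3;

        if (y >= x)     /* estimate stopped decreasing: x is the floor */
            break;
        x = y;
    }
    return (uint32_t)x;
}
```

Starting from an overestimate keeps the sequence monotonically decreasing, which makes the termination test a single comparison.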
* [TCP] cubic: precompute constants | Stephen Hemminger | 2006-01-03 | 1 | -76/+57
| Revised version of patch to pre-compute values for TCP cubic.
|  * d32,d64 replaced with descriptive names
|  * cube_factor replaces
|        srtt[scaled by count] / HZ * ((1 << (10+2*BICTCP_HZ)) / bic_scale)
|  * beta_scale replaces
|        8*(BICTCP_BETA_SCALE+beta)/3/(BICTCP_BETA_SCALE-beta);
|
| Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [IP_SOCKGLUE]: Remove most of the tcp specific calls | Arnaldo Carvalho de Melo | 2006-01-03 | 6 | -29/+30
| As DCCP needs to be called in the same spots.
|
| Now we have a member in inet_sock (is_icsk), set at sock creation time from
| struct inet_protosw->flags (if INET_PROTOSW_ICSK is set, like for TCP and
| DCCP) to see if a struct sock instance is an inet_connection_sock for places
| like the ones in ip_sockglue.c (v4 and v6) where we previously were looking if
| sk_type was SOCK_STREAM, that is insufficient because we now use the same code
| for DCCP, that has sk_type SOCK_DCCP.
|
| Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
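A small model shows why a flag copied at creation time is more robust than testing the socket type. All demo_ and DEMO_ names are hypothetical stand-ins for the kernel structures named in the message, and the flag value is chosen for illustration.

```c
#include <stdbool.h>

/* Illustrative stand-ins for SOCK_STREAM, SOCK_DCCP and
 * INET_PROTOSW_ICSK; the values are assumptions of this sketch. */
enum { DEMO_SOCK_STREAM = 1, DEMO_SOCK_DCCP = 6 };
#define DEMO_PROTOSW_ICSK 0x04u

struct demo_sock {
    int  sk_type;
    bool is_icsk;   /* set once, when the socket is created */
};

/* Creation copies the "connection sock" property out of the protocol
 * switch flags, so later code does not need to know each type. */
static struct demo_sock demo_sock_create(int type, unsigned int protosw_flags)
{
    struct demo_sock sk = {
        .sk_type = type,
        .is_icsk = (protosw_flags & DEMO_PROTOSW_ICSK) != 0,
    };
    return sk;
}

/* Old style test: misses DCCP, whose type is not SOCK_STREAM. */
static bool demo_is_icsk_by_type(const struct demo_sock *sk)
{
    return sk->sk_type == DEMO_SOCK_STREAM;
}

/* New style test: works for every protocol that sets the flag. */
static bool demo_is_icsk_by_flag(const struct demo_sock *sk)
{
    return sk->is_icsk;
}
```

For a DCCP-like socket created with the flag set, the type-based test returns false while the flag-based test returns true, which is the gap the message describes.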
* [INET]: Generalise tcp_v4_hash_connect | Arnaldo Carvalho de Melo | 2006-01-03 | 2 | -172/+179
| Renaming it to inet_hash_connect, making it possible to ditch
| dccp_v4_hash_connect and share the same code with TCP instead.
|
| Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [TWSK]: Introduce struct timewait_sock_ops | Arnaldo Carvalho de Melo | 2006-01-03 | 2 | -31/+45
| So that we can share several timewait sockets related functions and
| make the timewait mini sockets infrastructure closer to the request
| mini sockets one.
|
| Next changesets will take advantage of this, moving more code out of
| TCP and DCCP v4 and v6 to common infrastructure.
|
| Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV6]: Introduce inet6_timewait_sock | Arnaldo Carvalho de Melo | 2006-01-03 | 2 | -6/+8
| Out of tcp6_timewait_sock, that now is just an aggregation of
| inet_timewait_sock and inet6_timewait_sock, using tw_ipv6_offset in struct
| inet_timewait_sock, that is common to the IPv6 transport protocols that use
| timewait sockets, like DCCP and TCP.
|
| tw_ipv6_offset plays the struct inet_sock pinfo6 role, i.e. for the generic
| code to find the IPv6 area in a timewait sock.
|
| Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPVS]: remove dead code | Roberto Nibali | 2006-01-03 | 4 | -104/+0
| This patch removes dead code. I don't see the reason to keep this cruft
| around, besides cluttering the nice and functionally working code.
|
| Signed-off-by: Roberto Nibali <ratz@drugphish.ch>
| Signed-off-by: Horms <horms@verge.net.au>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [UDP]: udp_checksum_init return value | Stephen Hemminger | 2006-01-03 | 1 | -4/+2
| Since udp_checksum_init always returns 0 there is no point in
| having it return a value.
|
| Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [IP]: Simplify and consolidate MSG_PEEK error handling | Herbert Xu | 2006-01-03 | 1 | -14/+1
| When a packet is obtained from skb_recv_datagram with MSG_PEEK enabled
| it is left on the socket receive queue. This means that when we detect
| a checksum error we have to be careful when trying to free the packet
| as someone could have dequeued it in the time being.
|
| Currently this delicate logic is duplicated three times between UDPv4,
| UDPv6 and RAWv6. This patch moves them into one place and simplifies
| the code somewhat.
|
| This is based on a suggestion by Eric Dumazet.
|
| Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [ICSK]: Move v4_addr2sockaddr from TCP to icsk | Arnaldo Carvalho de Melo | 2006-01-03 | 2 | -11/+13
| Renaming it to inet_csk_addr2sockaddr.
|
| Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [ICSK]: Rename struct tcp_func to struct inet_connection_sock_af_ops | Arnaldo Carvalho de Melo | 2006-01-03 | 6 | -30/+29
| And move it to struct inet_connection_sock. DCCP will use it in the
| upcoming changesets.
|
| Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV6]: Introduce inet6_rsk() | Arnaldo Carvalho de Melo | 2006-01-03 | 1 | -4/+4
| And inet6_rsk_offset in inet_request_sock, for the same reasons as
| inet_sock's pinfo6 member.
|
| Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [ICSK]: make inet_csk_reqsk_queue_hash_add timeout arg unsigned long | Arnaldo Carvalho de Melo | 2006-01-03 | 1 | -1/+1
| Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV6]: Reuse inet_csk_get_port in tcp_v6_get_port | Arnaldo Carvalho de Melo | 2006-01-03 | 2 | -4/+10
| Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4]: Safer reassembly | Herbert Xu | 2006-01-03 | 4 | -1/+79
| Another spin of Herbert Xu's "safer ip reassembly" patch
| for 2.6.16.
|
| (The original patch is here:
| http://marc.theaimsgroup.com/?l=linux-netdev&m=112281936522415&w=2
| and my only contribution is to have tested it.)
|
| This patch (optionally) does additional checks before accepting IP
| fragments, which can greatly reduce the possibility of reassembling
| fragments which originated from different IP datagrams.
|
| Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| Signed-off-by: Arthur Kepner <akepner@sgi.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER] ip_tables: NUMA-aware allocation | Eric Dumazet | 2006-01-03 | 2 | -118/+256
| Part of a performance problem with ip_tables is that memory allocation
| is not NUMA aware, but 'only' SMP aware (i.e. each CPU normally touches
| separate cache lines)
|
| Even with small iptables rules, the cost of this misplacement can be
| high on common workloads. Instead of using one vmalloc() area
| (located in the node of the iptables process), we now allocate an area
| for each possible CPU, using vmalloc_node() so that memory should be
| allocated in the CPU's node if possible.
|
| Port to arp_tables and ip6_tables by Harald Welte.
|
| Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP] BIC: CUBIC window growth (2.0) | Stephen Hemminger | 2006-01-03 | 3 | -0/+454
| Replace existing BIC version 1.1 with new version 2.0.
| The main change is to replace the window growth function
| with a cubic function as described in:
| http://www.csc.ncsu.edu/faculty/rhee/export/bitcp/cubic-paper.pdf
|
| Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP] BIC: spelling and whitespace | Stephen Hemminger | 2006-01-03 | 1 | -2/+2
| Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP] BIC: remove low utilization code. | Stephen Hemminger | 2006-01-03 | 1 | -80/+1
| The latest BICTCP patch at:
| http://www.csc.ncsu.edu:8080/faculty/rhee/export/bitcp/index_files/Page546.htm
| disables the low_utilization feature of BICTCP because it doesn't work
| in some cases. This patch removes it.
|
| Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [XFRM]: Handle DCCP in xfrm{4,6}_decode_session | Patrick McHardy | 2005-12-19 | 1 | -0/+1
| Signed-off-by: Patrick McHardy <kaber@trash.net>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Fix NAT init order | Patrick McHardy | 2005-12-19 | 1 | -1/+2
| As noticed by Phil Oester, the GRE NAT protocol helper is initialized
| before the NAT core, which makes registration fail. Change the linking
| order to make NAT be initialized first.
|
| Signed-off-by: Patrick McHardy <kaber@trash.net>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [GRE]: Fix hardware checksum modification | Herbert Xu | 2005-12-14 | 1 | -1/+1
| The skb_postpull_rcsum introduced a bug to the checksum modification.
| Although the length pulled is offset bytes, the origin of the pulling
| is the GRE header, not the IP header.
|
| Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: ip_nat_tftp: Fix expectation NAT | Marcus Sundberg | 2005-12-12 | 1 | -1/+4
| When a TFTP client is SNATed so that the port is also changed, the port
| is never changed back for the expected connection.
|
| Signed-off-by: Marcus Sundberg <marcus@ingate.com>
| Signed-off-by: Patrick McHardy <kaber@trash.net>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP] Vegas: timestamp before clone | David S. Miller | 2005-12-06 | 1 | -109/+124
| We have to store the congestion control timestamp on the SKB before we
| clone it, not after. Else we get no timestamping information at all.
|
| tcp_transmit_skb() has been reworked so that we can do the timestamp
| still in one spot, instead of at all the call sites.
|
| Problem discovered, and initial fix, from Tom Young
| <tyo@ee.unimelb.edu.au>.
|
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP] Vegas: Remove extra call to tcp_vegas_rtt_calc | Thomas Young | 2005-12-06 | 1 | -8/+0
| Remove unneeded call to tcp_vegas_rtt_calc. The more accurate
| microsecond value has already been registered prior to calling
| tcp_vegas_cong_avoid.
|
| Signed-off-by: Thomas Young <tyo@ee.mu.oz.au>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP] Vegas: stop resetting rtt every ack | Thomas Young | 2005-12-06 | 1 | -4/+4
| Move the resetting of rtt measurements to inside the once per RTT
| block of code.
|
| Signed-off-by: Thomas Young <tyo@ee.mu.oz.au>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Don't use conntrack entry after dropping the reference | Patrick McHardy | 2005-12-05 | 1 | -4/+2
| Signed-off-by: Patrick McHardy <kaber@trash.net>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Fix unbalanced read_unlock_bh in ctnetlink | Patrick McHardy | 2005-12-05 | 1 | -1/+2
| NFA_NEST calls NFA_PUT which jumps to nfattr_failure if the skb has no
| room left. We call read_unlock_bh at nfattr_failure for the NFA_PUT inside
| the locked section, so move NFA_NEST inside the locked section too.
|
| Signed-off-by: Patrick McHardy <kaber@trash.net>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Mark ctnetlink as EXPERIMENTAL | Patrick McHardy | 2005-12-05 | 1 | -4/+4
| Should have been marked EXPERIMENTAL from the beginning, as the current
| bunch of fixes show.
|
| Signed-off-by: Patrick McHardy <kaber@trash.net>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Fix CTA_PROTO_NUM attribute size in ctnetlink | Patrick McHardy | 2005-12-05 | 1 | -2/+2
| CTA_PROTO_NUM is a u_int8_t.
|
| Signed-off-by: Patrick McHardy <kaber@trash.net>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Fix ip_conntrack_flush abuse in ctnetlink | Patrick McHardy | 2005-12-05 | 1 | -9/+11
| ip_conntrack_flush() used to be part of ip_conntrack_cleanup(), which needs
| to drop _all_ references on module unload. Table flushed using ctnetlink
| just needs to clean the table and doesn't need to flush the event cache or
| wait for any references attached to skbs. Move everything but pure table
| flushing back to ip_conntrack_cleanup().
|
| Signed-off-by: Patrick McHardy <kaber@trash.net>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Fix incorrect argument to ip_nat_initialized() in ctnetlink | Pablo Neira Ayuso | 2005-12-05 | 1 | -1/+1
| ip_nat_initialized() takes enum ip_nat_manip_type as its second
| argument, not a hook number.
|
| Noticed and initial patch by Marcus Sundberg <marcus@ingate.com>.
|
| Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| Signed-off-by: Patrick McHardy <kaber@trash.net>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4] Fix EPROTONOSUPPORT error in inet_create | Herbert Xu | 2005-12-02 | 1 | -4/+3
| There is a coding error in inet_create that causes it to always return
| ESOCKTNOSUPPORT. It should return EPROTONOSUPPORT when there are
| protocols registered for a given socket type but none of them match
| the requested protocol.
|
| This is based on a patch by Jayachandran C.
|
| Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| Signed-off-by: David S. Miller <davem@davemloft.net>
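The intended error selection can be sketched with a toy lookup table. This is a simplified model, not inet_create itself: the table, the demo_ names and the numeric type and protocol values are assumptions for illustration, and details such as wildcard protocol matching are left out.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical registration table: (socket type, protocol) pairs. */
struct demo_protosw {
    int type;
    int protocol;
};

static const struct demo_protosw demo_table[] = {
    { 1, 6 },   /* stream-like type, TCP-like protocol */
    { 2, 17 },  /* datagram-like type, UDP-like protocol */
};

/* Returns 0 on a match. If nothing is registered for the type the
 * result is -ESOCKTNOSUPPORT; if the type is known but no entry has
 * the requested protocol the result is -EPROTONOSUPPORT. */
static int demo_lookup(int type, int protocol)
{
    int err = -ESOCKTNOSUPPORT;

    for (size_t i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++) {
        if (demo_table[i].type != type)
            continue;
        if (demo_table[i].protocol == protocol)
            return 0;
        /* The type exists, so a failure is now a protocol mismatch. */
        err = -EPROTONOSUPPORT;
    }
    return err;
}
```

The key point is that the error code is refined only after at least one entry of the requested type has been seen.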
* [IGMP]: workaround for IGMP v1/v2 bug | David Stevens | 2005-12-02 | 1 | -1/+4
| From: David Stevens <dlstevens@us.ibm.com>
|
| As explained at:
|
| http://www.cs.ucsb.edu/~krishna/igmp_dos/
|
| With IGMP version 1 and 2 it is possible to inject a unicast report to
| a client which will make it ignore multicast reports sent later by the
| router.
|
| The fix is to only accept the report if it was sent to a multicast or
| unicast address.
|
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETLINK]: Fix processing of fib_lookup netlink messages | Thomas Graf | 2005-12-01 | 1 | -2/+6
| The receive path for fib_lookup netlink messages is lacking sanity checks
| for header and payload and is thus vulnerable to malformed netlink messages
| causing illegal memory references.
|
| Signed-off-by: Thomas Graf <tgraf@suug.ch>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Fix recent match jiffies wrap mismatches | Phil Oester | 2005-12-01 | 1 | -0/+1
| Around jiffies wrap time (i.e. within first 5 mins after boot), recent
| match rules which contain both --seconds and --hitcount arguments
| experience false matches.
|
| This is because the last_pkts array is filled with zeros on creation, and
| when comparing 'now' to 0 (+ --seconds argument), time_before_eq thinks it
| has found a hit.
|
| Below patch adds a break if the packet value is zero. This has the
| unfortunate side effect of causing mismatches if a packet was received when
| jiffies really was equal to zero. The odds of that happening are slim
| compared to the problems caused by not adding the break however. Plus, the
| author used this same method just below, so it is "good enough".
|
| This fixes netfilter bugs #383 and #395.
|
| Signed-off-by: Phil Oester <kernel@linuxace.com>
| Signed-off-by: Patrick McHardy <kaber@trash.net>
| Signed-off-by: David S. Miller <davem@davemloft.net>
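The false match is easy to reproduce in a user-space model. The sketch below assumes a wrap-safe comparison in the style of time_before_eq (signed difference of unsigned counters) and a tick rate of 1000 per second; the demo_ names and DEMO_HZ are hypothetical, and the loop is a simplification of the match logic, not the ipt_recent code.

```c
#include <stdbool.h>

#define DEMO_HZ 1000UL  /* assumed tick rate for this sketch */

/* Wrap-safe "a is at or before b": compares the signed difference of
 * two unsigned tick counters. */
static bool demo_time_before_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) <= 0;
}

/* Count entries of last_pkts that fall within `seconds` of `now`.
 * With skip_empty set, a zero entry is treated as "never filled" and
 * ends the scan, mirroring the break the message describes. */
static int demo_count_hits(const unsigned long *last_pkts, int n,
                           unsigned long now, unsigned long seconds,
                           bool skip_empty)
{
    int hits = 0;

    for (int i = 0; i < n; i++) {
        if (skip_empty && last_pkts[i] == 0)
            break;
        if (demo_time_before_eq(now, last_pkts[i] + seconds * DEMO_HZ))
            hits++;
    }
    return hits;
}
```

With `now` set to `(unsigned long)-100`, that is 100 ticks before the counter wraps, and an all-zero array, every empty slot counts as a hit unless the zero check is enabled. Once the counter has moved well past the wrap, empty slots no longer match either way.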
* [NETFILTER]: Ignore ACKs on half open connections in TCP conntrack | Jozsef Kadlecsik | 2005-12-01 | 1 | -9/+20
| Mounting NFS file systems after a (warm) reboot could take a long time if
| firewalling and connection tracking were enabled.
|
| The reason is that NFS clients tend to use the same ports (800 and
| counting down). Now on reboot, the server would still have a TCB for an
| existing TCP connection client:800 -> server:2049. The client sends a
| SYN from port 800 to server:2049, which elicits an ACK from the server.
| The firewall on the client drops the ACK because (from its point of
| view) the connection is still in half-open state, and it expects to see
| a SYNACK.
|
| The client will eventually time out after several minutes.
|
| The following patch corrects this, by accepting ACKs on half open
| connections as well.
|
| Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
| Signed-off-by: Patrick McHardy <kaber@trash.net>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER] ipv4: small cleanups | Adrian Bunk | 2005-11-29 | 3 | -4/+4
| This patch contains the following cleanups:
| - make needlessly global code static
| - ip_conntrack_core.c: ip_conntrack_flush() -> ip_conntrack_flush(void)
|
| Signed-off-by: Adrian Bunk <bunk@stusta.de>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4]: make two functions static | Adrian Bunk | 2005-11-29 | 2 | -2/+2
| This patch makes two needlessly global functions static.
|
| Signed-off-by: Adrian Bunk <bunk@stusta.de>
| Signed-off-by: David S. Miller <davem@davemloft.net>