blackbird-op-linux - Blackbird™ Linux sources for OpenPOWER

	Commit message	Author	Age	Files	Lines
*	[NET]: Add NETIF_F_GEN_CSUM and NETIF_F_ALL_CSUM	Herbert Xu	2006-06-17	2	-9/+3
\| \| \| \| \| \| \| \| \| \| \|	The current stack treats NETIF_F_HW_CSUM and NETIF_F_NO_CSUM identically so we test for them in quite a few places. For the sake of brevity, I'm adding the macro NETIF_F_GEN_CSUM for these two. We also test the disjunct of NETIF_F_IP_CSUM and the other two in various places, for that purpose I've added NETIF_F_ALL_CSUM. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
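A minimal sketch of how the two combined macros described above could be composed and tested. The numeric flag values are illustrative assumptions (the real definitions live in the kernel's netdevice header), and the two helper functions are hypothetical names used only for this example.

```c
/* Illustrative flag values only; the kernel's real values may differ. */
#define NETIF_F_IP_CSUM 0x02u /* can checksum TCP/UDP over IPv4 only */
#define NETIF_F_NO_CSUM 0x04u /* no checksum needed (e.g. loopback) */
#define NETIF_F_HW_CSUM 0x08u /* can checksum any protocol */

/* The two flags the stack treats identically, per the commit message. */
#define NETIF_F_GEN_CSUM (NETIF_F_NO_CSUM | NETIF_F_HW_CSUM)
/* The disjunction of NETIF_F_IP_CSUM and the other two. */
#define NETIF_F_ALL_CSUM (NETIF_F_IP_CSUM | NETIF_F_GEN_CSUM)

/* Hypothetical helper: nonzero if the device handles checksums generically. */
static int has_generic_csum(unsigned int features)
{
    return (features & NETIF_F_GEN_CSUM) != 0;
}

/* Hypothetical helper: nonzero if any of the three checksum flags is set. */
static int has_any_csum(unsigned int features)
{
    return (features & NETIF_F_ALL_CSUM) != 0;
}
```

A single mask test replaces the repeated two-flag and three-flag comparisons the message mentions.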
*	[TCP]: Add tcp_slow_start_after_idle sysctl.	David S. Miller	2006-06-17	2	-1/+13
\| \| \| \| \| \| \|	A lot of people have asked for a way to disable tcp_cwnd_restart(), and it seems reasonable to add a sysctl to do that. Signed-off-by: David S. Miller <davem@davemloft.net>
*	[TCP] Westwood: reset RTT min after FRTO	Luca De Cicco	2006-06-17	1	-2/+16
\| \| \| \| \| \| \| \| \|	RTT_min is updated each time a timeout event occurs in order to cope with hard handovers in wireless scenarios such as UMTS. Signed-off-by: Luca De Cicco <ldecicco@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[TCP] Westwood: bandwidth filter startup	Luca De Cicco	2006-06-17	1	-3/+9
\| \| \| \| \| \| \| \| \| \|	The bandwidth estimate filter is now initialized with the first sample in order to have better performance in the case of small file transfers. Signed-off-by: Luca De Cicco <ldecicco@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net> Signed-off-by: David S. Miller <davem@davemloft.net>
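The idea of seeding a low-pass filter with its first sample can be sketched as follows. This is not the Westwood code itself: the struct, the function name, and the 7/8 smoothing coefficient are assumptions chosen for illustration, and a zero estimate is assumed to mean "no sample seen yet".

```c
/* Hypothetical filter state; bw_est == 0 is assumed to mean "unseeded". */
struct bw_filter {
    unsigned int bw_est;
};

/* Feed one bandwidth sample into the filter. */
static void bw_filter_update(struct bw_filter *f, unsigned int sample)
{
    if (f->bw_est == 0) {
        /* First sample: adopt it directly instead of averaging with 0,
         * which would otherwise take many samples to converge. */
        f->bw_est = sample;
    } else {
        /* Later samples: exponential smoothing (assumed 7/8 weight). */
        f->bw_est = (7 * f->bw_est + sample) / 8;
    }
}
```

Without the seeding branch, a short transfer would end before the estimate climbed from zero to a useful value.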
*	[TCP] Westwood: comment fixes	Luca De Cicco	2006-06-17	1	-4/+21
\| \| \| \| \| \| \| \|	Clean up some comments and add more references. Signed-off-by: Luca De Cicco <ldecicco@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[TCP] Westwood: fix first sample	Stephen Hemminger	2006-06-17	1	-1/+12
\| \| \| \| \| \| \| \|	Need to update send sequence number tracking after first ack. Rework of patch from Luca De Cicco. Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NET]: net.ipv4.ip_autoconfig sysctl removal	Stephen Hemminger	2006-06-17	1	-8/+0
\| \| \| \| \| \| \|	The sysctl net.ipv4.ip_autoconfig is a legacy value that is not used. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NET]: Clean up skb_linearize	Herbert Xu	2006-06-17	1	-8/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The linearisation operation doesn't need to be super-optimised. So we can replace __skb_linearize with __pskb_pull_tail which does the same thing but is more general. Also, most users of skb_linearize end up testing whether the skb is linear or not so it helps to make skb_linearize do just that. Some callers of skb_linearize also use it to copy cloned data, so it's useful to have a new function skb_linearize_cow to copy the data if it's either non-linear or cloned. Last but not least, I've removed the gfp argument since nobody uses it anymore. If it's ever needed we can easily add it back. Misc bugs fixed by this patch: * via-velocity error handling (also, no SG => no frags) Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: hashlimit match: fix random initialization	Patrick McHardy	2006-06-17	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \|	hashlimit does: if (!ht->rnd) get_random_bytes(&ht->rnd, 4); ignoring that 0 is also a valid random number. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
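One way to avoid treating 0 as "not yet initialized" is to track initialization in a separate flag. The sketch below assumes a hypothetical struct layout and uses a userspace stand-in (fill_random) in place of the kernel's get_random_bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical table state: the flag, not the value, records seeding. */
struct hash_table {
    uint32_t rnd;
    int rnd_initialized;
};

/* Userspace stand-in for get_random_bytes(); not cryptographically strong. */
static void fill_random(void *buf, size_t len)
{
    unsigned char *p = buf;
    for (size_t i = 0; i < len; i++)
        p[i] = (unsigned char)rand();
}

/* Return the table's seed, generating it exactly once even if it is 0. */
static uint32_t table_seed(struct hash_table *ht)
{
    if (!ht->rnd_initialized) {
        fill_random(&ht->rnd, sizeof(ht->rnd));
        ht->rnd_initialized = 1;
    }
    return ht->rnd;
}
```

With the value-based check, a seed that happened to be 0 would be regenerated on every call, changing the hash of existing entries.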
*	[NETFILTER]: recent match: missing refcnt initialization	Patrick McHardy	2006-06-17	1	-0/+1
\| \| \| \| \|	Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: recent match: fix "sleeping function called from invalid context"	Patrick McHardy	2006-06-17	1	-5/+10
\| \| \| \| \| \| \| \|	create_proc_entry must not be called with locks held. Use a mutex instead to protect data only changed in user context. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[SECMARK]: Add secmark support to conntrack	James Morris	2006-06-17	3	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \|	Add a secmark field to IP and NF conntracks, so that security markings on packets can be copied to their associated connections, and also copied back to packets as required. This is similar to the network mark field currently used with conntrack, although it is intended for enforcement of security policy rather than network policy. Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[SECMARK]: Add secmark support to core networking.	James Morris	2006-06-17	2	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add a secmark field to the skbuff structure, to allow security subsystems to place security markings on network packets. This is similar to the nfmark field, except that it is intended for implementing security policy rather than networking policy. This patch was already acked in principle by Dave Miller. Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[IPV4] icmp: Kill local 'ip' arg in icmp_redirect().	David S. Miller	2006-06-17	1	-3/+2
\| \| \| \| \| \| \| \| \|	It is typed wrong, and it's only assigned and used once. So just pass in iph->daddr directly which fixes both problems. Based upon a patch by Alexey Dobriyan. Signed-off-by: David S. Miller <davem@davemloft.net>
*	[IPV4]: Right prototype of __raw_v4_lookup()	Alexey Dobriyan	2006-06-17	1	-1/+1
\| \| \| \| \| \| \| \| \|	All users pass 32-bit values as addresses and internally they're compared with 32-bit entities. So, change "laddr" and "raddr" types to __be32. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[IPV4] igmp: Fixup struct ip_mc_list::multiaddr type	Alexey Dobriyan	2006-06-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	All users except two expect a 32-bit big-endian value. One is of the ->multiaddr = ->multiaddr variety, and the last one is "%08lX". Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[TCP]: Fix compile warning in tcp_probe.c	David S. Miller	2006-06-17	1	-1/+3
\| \| \| \| \| \| \| \|	The suseconds_t et al. are not necessarily any particular type on every platform, so cast to unsigned long so that we can use one printf format string and avoid warnings across the board. Signed-off-by: David S. Miller <davem@davemloft.net>
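The casting approach can be illustrated in userspace C. The function name format_timeval is hypothetical, and the sketch assumes non-negative timestamps so that the cast to unsigned long does not change the value.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/time.h>

/* Format a timeval as "seconds.microseconds" with one format string.
 * time_t and suseconds_t vary by platform, so both fields are cast to
 * unsigned long to match the %lu conversions everywhere. */
static int format_timeval(char *buf, size_t len, const struct timeval *tv)
{
    return snprintf(buf, len, "%lu.%06lu",
                    (unsigned long)tv->tv_sec,
                    (unsigned long)tv->tv_usec);
}
```

The explicit casts keep the argument types in step with the format string regardless of how the platform defines the underlying types.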
*	[TCP]: Limited slow start for Highspeed TCP	Stephen Hemminger	2006-06-17	1	-3/+21
\| \| \| \| \| \| \| \|	Implementation of RFC3742 limited slow start. Added as part of the TCP highspeed congestion control module. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
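For orientation, the per-ACK window update that RFC 3742 specifies can be sketched in a few lines. This follows the RFC's byte-based formula rather than the module's actual code, and the function name and the treatment of max_ssthresh == 0 as "disabled" are assumptions.

```c
/* One slow-start step on receipt of an ACK, window measured in bytes.
 * Below max_ssthresh this is standard slow start (one MSS per ACK).
 * Above it, RFC 3742 divides the increase by
 * K = int(cwnd / (0.5 * max_ssthresh)), so growth per round trip is
 * capped at roughly max_ssthresh / 2. */
static unsigned int limited_slow_start(unsigned int cwnd,
                                       unsigned int mss,
                                       unsigned int max_ssthresh)
{
    unsigned int k;

    if (max_ssthresh == 0 || cwnd <= max_ssthresh)
        return cwnd + mss;

    k = (2 * cwnd) / max_ssthresh; /* k >= 2 because cwnd > max_ssthresh */
    return cwnd + mss / k;
}
```

The larger the window grows past max_ssthresh, the larger K becomes and the smaller each per-ACK increment is.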
*	[TCP]: TCP Probe congestion window tracing	Stephen Hemminger	2006-06-17	2	-0/+180
\| \| \| \| \| \| \| \| \| \|	This adds a new module for tracking TCP state variables non-intrusively using kprobes. It has a simple /proc interface that outputs one line for each packet received. A sample usage is to collect congestion window and ssthresh over time graphs. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[TCP]: Minimum congestion window consolidation.	Stephen Hemminger	2006-06-17	8	-46/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Many of the TCP congestion methods all just use ssthresh as the minimum congestion window on decrease. Rather than duplicating the code, just have that be the default if that handle in the ops structure is not set. Minor behaviour change to TCP compound. It probably wants to use this (ssthresh) as lower bound, rather than ssthresh/2 because the latter causes undershoot on loss. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
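The "default if the handler is not set" pattern described above can be sketched with an optional function pointer. The struct and function names here are simplified, hypothetical stand-ins for the kernel's congestion-control ops structure.

```c
#include <stddef.h>

/* Hypothetical, reduced connection state. */
struct conn {
    unsigned int ssthresh;
};

/* Hypothetical ops table: min_cwnd is optional and may be NULL. */
struct cong_ops {
    unsigned int (*min_cwnd)(const struct conn *c);
};

/* Use the algorithm's own lower bound if it supplies one,
 * otherwise fall back to ssthresh as the shared default. */
static unsigned int effective_min_cwnd(const struct cong_ops *ops,
                                       const struct conn *c)
{
    if (ops->min_cwnd != NULL)
        return ops->min_cwnd(c);
    return c->ssthresh;
}

/* Example override: an algorithm that wants ssthresh / 2 instead. */
static unsigned int half_ssthresh(const struct conn *c)
{
    return c->ssthresh / 2;
}
```

Algorithms that are happy with ssthresh simply leave the pointer unset, which removes the duplicated one-line handlers.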
*	[TCP]: TCP Compound quad root function	Stephen Hemminger	2006-06-17	1	-24/+66
\| \| \| \| \| \| \| \| \|	The original code did a 64 bit divide directly, which won't work on 32 bit platforms. Rather than doing a 64 bit square root twice, just implement a 4th root function in one pass using Newton's method. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
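A one-pass integer fourth root via Newton's method might look like the sketch below. It is a generic userspace illustration, not the routine from this commit: it uses an ordinary 64-bit division, which is fine in userspace but is exactly what kernel code on 32-bit platforms has to route through a helper or avoid.

```c
#include <stdint.h>

/* Return floor(a^(1/4)) using the Newton iteration
 *   x_next = (3*x + a / x^3) / 4
 * started from an upper bound and run until it stops decreasing. */
static uint32_t fourth_root(uint64_t a)
{
    uint64_t x, next;

    if (a == 0)
        return 0;

    /* 65536^4 = 2^64 exceeds any uint64_t, so this bounds the root.
     * x never drops below 1 here, so x*x*x is nonzero and at most 2^48. */
    x = 65536;
    for (;;) {
        next = (3 * x + a / (x * x * x)) / 4;
        if (next >= x)
            return (uint32_t)x;
        x = next;
    }
}
```

Starting from an upper bound makes the sequence decrease monotonically, so the first non-decreasing step marks the floor of the root.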
*	[TCP]: TCP Compound congestion control	Angelo P. Castellani	2006-06-17	3	-0/+418
\| \| \| \| \| \| \| \| \| \| \| \|	TCP Compound is a sender-side only change to TCP that uses a mixed Reno/Vegas approach to calculate the cwnd. For further details look here: ftp://ftp.research.microsoft.com/pub/tr/TR-2005-86.pdf Signed-off-by: Angelo P. Castellani <angelo.castellani@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[TCP]: TCP Veno congestion control	Bin Zhou	2006-06-17	3	-0/+251
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	TCP Veno module is a new congestion control module to improve TCP performance over wireless networks. The key innovation in TCP Veno is the enhancement of the TCP Reno/Sack congestion control algorithm by using the estimated state of a connection based on TCP Vegas. This scheme significantly reduces "blind" reduction of the TCP window regardless of the cause of packet loss. This work is based on the research paper "TCP Veno: TCP Enhancement for Transmission over Wireless Access Networks." C. P. Fu, S. C. Liew, IEEE Journal on Selected Areas in Communication, Feb. 2003. The original paper and more recent research on Veno: http://www.ntu.edu.sg/home/ascpfu/veno/veno.html Signed-off-by: Bin Zhou <zhou0022@ntu.edu.sg> Cheng Peng Fu <ascpfu@ntu.edu.sg> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[TCP]: TCP Low Priority congestion control	Wong Hoi Sing Edison	2006-06-17	3	-0/+349
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	TCP Low Priority is a distributed algorithm whose goal is to utilize only the excess network bandwidth as compared to the "fair share" of bandwidth as targeted by TCP. Available from: http://www.ece.rice.edu/~akuzma/Doc/akuzma/TCP-LP.pdf Original Author: Aleksandar Kuzmanovic <akuzma@northwestern.edu> See http://www-ece.rice.edu/networks/TCP-LP/ for their implementation. As of 2.6.13, Linux supports pluggable congestion control algorithms. Due to the limitations of the API, we make the following changes from the original TCP-LP implementation: o We use newReno in most core CA handling. Only add some checking within cong_avoid. o Error correction in remote HZ, therefore remote HZ is continually checked and updated. o Handle calculation of One-Way-Delay (OWD) within rtt_sample, since OWD has a similar meaning to RTT. Also correct the buggy formula. o Handle reaction to Early Congestion Indication (ECI) within pkts_acked, as mentioned in the pseudo code. o OWD is handled in relative format, where the local time stamp will be in tcp_time_stamp format. Port from 2.4.19 to 2.6.16 as module by: Wong Hoi Sing Edison <hswong3i@gmail.com> Hung Hing Lun <hlhung3i@gmail.com> Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: PPTP helper: fixup gre_keymap_lookup() return type	Alexey Dobriyan	2006-06-17	1	-3/+3
\| \| \| \| \| \| \| \|	GRE keys are 16-bit wide. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: Add SIP connection tracking helper	Patrick McHardy	2006-06-17	4	-0/+740
\| \| \| \| \| \| \| \| \|	Add SIP connection tracking helper. Originally written by Christian Hentschel <chentschel@arnet.com.ar>, some cleanup, minor fixes and bidirectional SIP support added by myself. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: H.323 helper: replace internal_net_addr parameter by routing-based heuristic	Patrick McHardy	2006-06-17	1	-30/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Call Forwarding doesn't need to create an expectation if both peers can reach each other without our help. The internal_net_addr parameter lets the user explicitly specify a single network where this is true, but it is not very flexible and even fails in the common case that calls will be forwarded to both outside parties and inside parties. Use an optional heuristic based on routing instead: the assumption is that if both the outgoing device and the gateway are equal, both peers can reach each other directly. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: H.323 helper: Add support for Call Forwarding	Jing Min Zhao	2006-06-17	4	-7/+196
\| \| \| \| \| \|	Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: amanda helper: convert to textsearch infrastructure	Patrick McHardy	2006-06-17	2	-49/+96
\| \| \| \| \| \| \| \| \| \| \| \| \|	When a port number within a packet is replaced by a differently sized number only the packet is resized, but not the copy of the data. Following port numbers are rewritten based on their offsets within the copy, leading to packet corruption. Convert the amanda helper to the textsearch infrastructure to avoid the copy entirely. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: FTP helper: search optimization	Patrick McHardy	2006-06-17	1	-34/+43
\| \| \| \| \| \| \| \| \| \|	Instead of skipping search entries for the wrong direction simply index them by direction. Based on patch by Pablo Neira <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: SNMP helper: fix debug module param type	Patrick McHardy	2006-06-17	1	-1/+1
\| \| \| \| \| \| \|	debug is the debug level, not a bool. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: ctnetlink: change table dumping not to require an unique ID	Patrick McHardy	2006-06-17	1	-8/+24
\| \| \| \| \| \| \| \|	Instead of using the ID to find out where to continue dumping, take a reference to the last entry dumped and try to continue there. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: ctnetlink: fix NAT configuration	Patrick McHardy	2006-06-17	1	-31/+22
\| \| \| \| \| \| \| \|	The current configuration allows only one manip to be configured and overloads conntrack status flags with netlink semantics. Signed-off-by: Patrick Mchardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: conntrack: add fixed timeout flag in connection tracking	Eric Leblond	2006-06-17	1	-0/+6
\| \| \| \| \| \| \| \| \| \|	Add a connection status flag that prevents the timeout from being updated. This permits connections that automatically die at a given time. Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: conntrack: add sysctl to disable checksumming	Patrick McHardy	2006-06-17	5	-4/+15
\| \| \| \| \|	Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: conntrack: don't call helpers for related ICMP messages	Patrick McHardy	2006-06-17	2	-2/+2
\| \| \| \| \| \| \| \|	None of the existing helpers expects to get called for related ICMP packets and some even drop them if they can't parse them. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: recent match: replace by rewritten version	Patrick McHardy	2006-06-17	1	-891/+377
\| \| \| \| \| \| \| \|	Replace the unmaintainable ipt_recent match by a rewritten version that should be fully compatible. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: x_tables: add SCTP/DCCP support where missing	Patrick McHardy	2006-06-17	2	-62/+22
\| \| \| \| \|	Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: x_tables: remove some unnecessary casts	Patrick McHardy	2006-06-17	1	-1/+1
\| \| \| \| \|	Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[IPSEC] proto: Move transport mode input path into xfrm_mode_transport	Herbert Xu	2006-06-17	4	-38/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we have xfrm_mode objects we can move the transport mode specific input decapsulation code into xfrm_mode_transport. This removes duplicate code as well as unnecessary header movement in case of tunnel mode SAs since we will discard the original IP header immediately. This also fixes a minor bug for transport-mode ESP where the IP payload length is set to the correct value minus the header length (with extension headers for IPv6). Of course the other neat thing is that we no longer have to allocate temporary buffers to hold the IP headers for ESP and IPComp. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[IPSEC] xfrm: Abstract out encapsulation modes	Herbert Xu	2006-06-17	6	-84/+219
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds the structure xfrm_mode. It is meant to represent the operations carried out by transport/tunnel modes. By doing this we allow additional encapsulation modes to be added without clogging up the xfrm_input/xfrm_output paths. Candidate modes include 4-to-6 tunnel mode, 6-to-4 tunnel mode, and BEET modes. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[IPSEC] xfrm: Undo afinfo lock proliferation	Herbert Xu	2006-06-17	2	-7/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The number of locks used to manage afinfo structures can easily be reduced down to one each for policy and state respectively. This is based on the observation that the write locks are only held by module insertion/removal which are very rare events so there is no need to further differentiate between the insertion of modules like ipv6 versus esp6. The removal of the read locks in xfrm4_policy.c/xfrm6_policy.c might look suspicious at first. However, after you realise that nobody ever takes the corresponding write lock you'll feel better :) As far as I can gather it's an attempt to guard against the removal of the corresponding modules. Since neither module can be unloaded at all we can leave it to whoever fixes up IPv6 unloading :) Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[TCP]: tcp_rcv_rtt_measure_ts() call in pure-ACK path is superfluous	David S. Miller	2006-06-17	1	-2/+0
\| \| \| \| \| \| \| \| \|	We only want to take receive RTT measurements for data bearing frames. Here in the header prediction fast path for a pure-sender, we know that we have a pure-ACK and thus the checks in tcp_rcv_rtt_measure_ts() will not pass. Signed-off-by: David S. Miller <davem@davemloft.net>
*	[I/OAT]: TCP recv offload to I/OAT	Chris Leech	2006-06-17	3	-20/+175
\| \| \| \| \| \| \| \|	Locks down user pages and sets up for DMA in tcp_recvmsg, then calls dma_async_try_early_copy in tcp_v4_do_rcv Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[I/OAT]: Add a sysctl for tuning the I/OAT offloaded I/O threshold	Chris Leech	2006-06-17	1	-0/+10
\| \| \| \| \| \| \|	Any socket recv of less than this amount will not be offloaded. Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[I/OAT]: Make sk_eat_skb I/OAT aware.	Chris Leech	2006-06-17	1	-4/+4
\| \| \| \| \| \| \| \|	Add an extra argument to sk_eat_skb, and make it move early copied packets to the async_wait_queue instead of freeing them. Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[I/OAT]: Rename cleanup_rbuf to tcp_cleanup_rbuf and make non-static	Chris Leech	2006-06-17	1	-5/+5
\| \| \| \| \| \| \|	Needed to be able to call tcp_cleanup_rbuf in tcp_input.c for I/OAT Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[IPV4]: Increment ipInHdrErrors when TTL expires.	Weidong	2006-06-12	1	-0/+1
\| \| \| \| \| \|	Signed-off-by: Weidong <weid@nanjing-fnst.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[TCP]: continued: reno sacked_out count fix	Aki M Nyrhinen	2006-06-11	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	From: Aki M Nyrhinen <anyrhine@cs.helsinki.fi> IMHO the current fix to the problem (in_flight underflow in reno) is incorrect. it treats the symptoms but ignores the problem. the problem is timing out packets other than the head packet when we don't have sack. i try to explain (sorry if explaining the obvious). with sack, scanning the retransmit queue for timed out packets is fine because we know which packets in our retransmit queue have been acked by the receiver. without sack, we know only how many packets in our retransmit queue the receiver has acknowledged, but have no idea which packets. think of a "typical" slow-start overshoot case, where for example every third packet in a window gets lost because a router buffer gets full. with sack, we check for timeouts on those every third packet (as the rest have been sacked). the packet counting works out and if there is no reordering, we'll retransmit exactly the packets that were lost. without sack, however, we check for timeout on every packet and end up retransmitting consecutive packets in the retransmit queue. in our slow-start example, 2/3 of those retransmissions are unnecessary. these unnecessary retransmissions eat the congestion window and eventually prevent fast recovery from continuing, if enough packets were lost. Signed-off-by: David S. Miller <davem@davemloft.net>
*	[TCP]: Avoid skb_pull if possible when trimming head	Herbert Xu	2006-06-05	1	-7/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Trimming the head of an skb by calling skb_pull can cause the packet to become unaligned if the length pulled is odd. Since the length is entirely arbitrary for a FIN packet carrying data, this is actually quite common. Unaligned data is not the end of the world, but we should avoid it if it's easily done. In this case it is trivial. Since we're discarding all of the head data it doesn't matter whether we move skb->data forward or back. However, it is still possible to have unaligned skb->data in general. So network drivers should be prepared to handle it instead of crashing. This patch also adds an unlikely marking on len < headlen since partial ACKs on head data are extremely rare in the wild. As the return value of __pskb_trim_head is no longer ever NULL that has been removed. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>