summaryrefslogtreecommitdiffstats
path: root/include/linux/inetdevice.h
Commit message (Collapse)AuthorAgeFilesLines
* net: ipv4: provide __rcu annotation for ifa_listFlorian Westphal2019-06-021-15/+6
| | | | | | | | | | | | | | ifa_list is protected by rcu, yet code doesn't reflect this. Add the __rcu annotations and fix up all places that are now reported by sparse. I've done this in the same commit to not add intermediate patches that result in new warnings. Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: inetdevice: provide replacement iterators for in_ifaddr walkFlorian Westphal2019-06-021-1/+9
| | | | | | | | | | | | | | The ifa_list is protected either by rcu or rtnl lock, but the current iterators do not account for this. This adds two iterators as replacement, a later patch in the series will update them with the needed rcu/rtnl_dereference calls. Its not done in this patch yet to avoid sparse warnings -- the fields lack the proper __rcu annotation. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Move IN_DEV_IGNORE_ROUTES_WITH_LINKDOWN to helperDavid Ahern2019-03-291-0/+14
| | | | | | | | | | | | | | in_dev lookup followed by IN_DEV_IGNORE_ROUTES_WITH_LINKDOWN check is called in several places, some with the rcu lock and others with the rtnl held. Move the check to a helper similar to what IPv6 has. Since the helper can be invoked from either context use rcu_dereference_rtnl to dereference ip_ptr. Signed-off-by: David Ahern <dsahern@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4/igmp: fix v1/v2 switchback timeout based on rfc3376, 8.12Hangbin Liu2018-10-291-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | Similiar with ipv6 mcast commit 89225d1ce6af3 ("net: ipv6: mld: fix v1/v2 switchback timeout to rfc3810, 9.12.") i) RFC3376 8.12. Older Version Querier Present Timeout says: The Older Version Querier Interval is the time-out for transitioning a host back to IGMPv3 mode once an older version query is heard. When an older version query is received, hosts set their Older Version Querier Present Timer to Older Version Querier Interval. This value MUST be ((the Robustness Variable) times (the Query Interval in the last Query received)) plus (one Query Response Interval). Currently we only use a hardcode value IGMP_V1/v2_ROUTER_PRESENT_TIMEOUT. Fix it by adding two new items mr_qi(Query Interval) and mr_qri(Query Response Interval) in struct in_device. Now we can calculate the switchback time via (mr_qrv * mr_qi) + mr_qri. We need update these values when receive IGMPv3 queries. Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* route: add support for directed broadcast forwardingXin Long2018-07-291-0/+1
| | | | | | | | | | | | | | | | | | | | | | | This patch implements the feature described in rfc1812#section-5.3.5.2 and rfc2644. It allows the router to forward directed broadcast when sysctl bc_forwarding is enabled. Note that this feature could be done by iptables -j TEE, but it would cause some problems: - target TEE's gateway param has to be set with a specific address, and it's not flexible especially when the route wants forward all directed broadcasts. - this duplicates the directed broadcasts so this may cause side effects to applications. Besides, to keep consistent with other os router like BSD, it's also necessary to implement it in the route rx path. Note that route cache needs to be flushed when bc_forwarding is changed. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/ipv4: Add support for specifying metric of connected routesDavid Ahern2018-05-291-0/+1
| | | | | | | | | | | Add support for IFA_RT_PRIORITY to ipv4 addresses. If the metric is changed on an existing address then the new route is inserted before removing the old one. Since the metric is one of the route keys, the prefix route can not be replaced. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* devinet_ioctl(): take copyin/copyout to callerAl Viro2018-01-241-1/+1
| | | | | Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2017-11-041-0/+1
|\ | | | | | | | | | | | | Files removed in 'net-next' had their license header updated in 'net'. We take the remove from 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
| * License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman2017-11-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | net: Add extack to validator_info structs used for address notifierDavid Ahern2017-10-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | Add extack to in_validator_info and in6_validator_info. Update the one user of each, ipvlan, to return an error message for failures. Only manual configuration of an address is plumbed in the IPv6 code path. Signed-off-by: David Ahern <dsahern@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: avoid a full fib lookup when rp_filter is disabled.Paolo Abeni2017-09-211-0/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 1dced6a85482 ("ipv4: Restore accept_local behaviour in fib_validate_source()") a full fib lookup is needed even if the rp_filter is disabled, if accept_local is false - which is the default. What we really need in the above scenario is just checking that the source IP address is not local, and in most case we can do that is a cheaper way looking up the ifaddr hash table. This commit adds a helper for such lookup, and uses it to validate the src address when rp_filter is disabled and no 'local' routes are created by the user space in the relevant namespace. A new ipv4 netns flag is added to account for such routes. We need that to preserve the same behavior we had before this patch. It also drops the checks to bail early from __fib_validate_source, added by the commit 1dced6a85482 ("ipv4: Restore accept_local behaviour in fib_validate_source()") they do not give any measurable performance improvement: if we do the lookup with are on a slower path. This improves UDP performances for unconnected sockets when rp_filter is disabled by 5% and also gives small but measurable performance improvement for TCP flood scenarios. v1 -> v2: - use the ifaddr lookup helper in __ip_dev_find(), as suggested by Eric - fall-back to full lookup if custom local routes are present Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: convert in_device.refcnt from atomic_t to refcount_tReshetova, Elena2017-07-011-5/+6
| | | | | | | | | | | | | | refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Ipvlan should return an error when an address is already in use.Krister Johansen2017-06-091-0/+7
| | | | | | | | | | | | | | | | | | | The ipvlan code already knows how to detect when a duplicate address is about to be assigned to an ipvlan device. However, that failure is not propogated outward and leads to a silent failure. Introduce a validation step at ip address creation time and allow device drivers to register to validate the incoming ip addresses. The ipvlan code is the first consumer. If it detects an address in use, we can return an error to the user before beginning to commit the new ifa in the networking code. This can be especially useful if it is necessary to provision many ipvlans in containers. The provisioning software (or operator) can use this to detect situations where an ip address is unexpectedly in use. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: devinet: Refactor inet_netconf_notify_devconf to take eventDavid Ahern2017-03-281-2/+2
| | | | | | | Refactor inet_netconf_notify_devconf to take the event as an input arg. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/inetdevice: bad_mask can be booleanYaowei Bai2015-10-091-4/+4
| | | | | | | | | | This patch makes bad_mask return bool due to this particular function only using either one or zero as its return value. No functional change. Signed-off-by: Yaowei Bai <bywxiaobai@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/inetdevice: inet_ifa_match can be booleanYaowei Bai2015-10-091-1/+1
| | | | | | | | | | | This patch makes inet_ifa_match return bool due to this particular function only using either one or zero as its return value. No functional change. Signed-off-by: Yaowei Bai <bywxiaobai@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ipv4 sysctl option to ignore routes when nexthop link is downAndy Gospodarek2015-06-241-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This feature is only enabled with the new per-interface or ipv4 global sysctls called 'ignore_routes_with_linkdown'. net.ipv4.conf.all.ignore_routes_with_linkdown = 0 net.ipv4.conf.default.ignore_routes_with_linkdown = 0 net.ipv4.conf.lo.ignore_routes_with_linkdown = 0 ... When the above sysctls are set, will report to userspace that a route is dead and will no longer resolve to this nexthop when performing a fib lookup. This will signal to userspace that the route will not be selected. The signalling of a RTNH_F_DEAD is only passed to userspace if the sysctl is enabled and link is down. This was done as without it the netlink listeners would have no idea whether or not a nexthop would be selected. The kernel only sets RTNH_F_DEAD internally if the interface has IFF_UP cleared. With the new sysctl set, the following behavior can be observed (interface p8p1 is link-down): default via 10.0.5.2 dev p9p1 10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15 70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1 80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 dead linkdown 90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1 dead linkdown 90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2 90.0.0.1 via 70.0.0.2 dev p7p1 src 70.0.0.1 cache local 80.0.0.1 dev lo src 80.0.0.1 cache <local> 80.0.0.2 via 10.0.5.2 dev p9p1 src 10.0.5.15 cache While the route does remain in the table (so it can be modified if needed rather than being wiped away as it would be if IFF_UP was cleared), the proper next-hop is chosen automatically when the link is down. Now interface p8p1 is linked-up: default via 10.0.5.2 dev p9p1 10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15 70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1 80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1 90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2 192.168.56.0/24 dev p2p1 proto kernel scope link src 192.168.56.2 90.0.0.1 via 80.0.0.2 dev p8p1 src 80.0.0.1 cache local 80.0.0.1 dev lo src 80.0.0.1 cache <local> 80.0.0.2 dev p8p1 src 80.0.0.1 cache and the output changes to what one would expect. If the sysctl is not set, the following output would be expected when p8p1 is down: default via 10.0.5.2 dev p9p1 10.0.5.0/24 dev p9p1 proto kernel scope link src 10.0.5.15 70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1 80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 linkdown 90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 1 linkdown 90.0.0.0/24 via 70.0.0.2 dev p7p1 metric 2 Since the dead flag does not appear, there should be no expectation that the kernel would skip using this route due to link being down. v2: Split kernel changes into 2 patches, this actually makes a behavioral change if the sysctl is set. Also took suggestion from Alex to simplify code by only checking sysctl during fib lookup and suggestion from Scott to add a per-interface sysctl. v3: Code clean-ups to make it more readable and efficient as well as a reverse path check fix. v4: Drop binary sysctl v5: Whitespace fixups from Dave v6: Style changes from Dave and checkpatch suggestions v7: One more checkpatch fixup Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: Dinesh Dutt <ddutt@cumulusnetworks.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* inetdevice: fixed signed integer overflowVincent BENAYOUN2014-11-141-1/+1
| | | | | | | | | | | | | There could be a signed overflow in the following code. The expression, (32-logmask) is comprised between 0 and 31 included. It may be equal to 31. In such a case the left shift will produce a signed integer overflow. According to the C99 Standard, this is an undefined behavior. A simple fix is to replace the signed int 1 with the unsigned int 1U. Signed-off-by: Vincent BENAYOUN <vincent.benayoun@trust-in-soft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: fix wildcard search with inet_confirm_addr()Nicolas Dichtel2013-12-111-3/+2
| | | | | | | | | | | | | | | | Help of this function says: "in_dev: only on this interface, 0=any interface", but since commit 39a6d0630012 ("[NETNS]: Process inet_confirm_addr in the correct namespace."), the code supposes that it will never be NULL. This function is never called with in_dev == NULL, but it's exported and may be used by an external module. Because this patch restore the ability to call inet_confirm_addr() with in_dev == NULL, I partially revert the above commit, as suggested by Julian. CC: Julian Anastasov <ja@ssi.bg> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: add support for IFA_FLAGS nl attributeJiri Pirko2013-12-101-1/+1
| | | | | Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
* neigh: restore old behaviour of default parms valuesJiri Pirko2013-12-091-0/+7
| | | | | | | | | | | | | | | | | | | | | | | Previously inet devices were only constructed when addresses are added. Therefore the default neigh parms values they get are the ones at the time of these operations. Now that we're creating inet devices earlier, this changes the behaviour of default neigh parms values in an incompatible way (see bug #8519). This patch creates a compromise by setting the default values at the same point as before but only for those that have not been explicitly set by the user since the inet device's creation. Introduced by: commit 8030f54499925d073a88c09f30d5d844fb1b3190 Author: Herbert Xu <herbert@gondor.apana.org.au> Date: Thu Feb 22 01:53:47 2007 +0900 [IPV4] devinet: Register inetdev earlier. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
* [networking]device.h: Remove extern from function prototypesJoe Perches2013-09-261-13/+15
| | | | | | | | | | | | There are a mix of function prototypes with and without extern in the kernel sources. Standardize on not using extern for function prototypes. Function prototypes don't need to be written with extern. extern is assumed by the compiler. Its use is as unnecessary as using auto to declare automatic/local variables in a block. Signed-off-by: Joe Perches <joe@perches.com>
* ipv4: expose IPV4_DEVCONFstephen hemminger2013-08-221-33/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | IP sends device configuration (see inet_fill_link_af) as an array in the netlink information, but the indices in that array are not exposed to userspace through any current santized header file. It was available back in 2.6.32 (in /usr/include/linux/sysctl.h) but was broken by: commit 02291680ffba92e5b5865bc0c5e7d1f3056b80ec Author: Eric W. Biederman <ebiederm@xmission.com> Date: Sun Feb 14 03:25:51 2010 +0000 net ipv4: Decouple ipv4 interface parameters from binary sysctl numbers Eric was solving the sysctl problem but then the indices were re-exposed by a later addition of devconf support for IPV4 commit 9f0f7272ac9506f4c8c05cc597b7e376b0b9f3e4 Author: Thomas Graf <tgraf@infradead.org> Date: Tue Nov 16 04:32:48 2010 +0000 ipv4: AF_INET link address family Putting them in /usr/include/linux/ip.h seemed the logical match for the DEVCONF_ definitions for IPV6 in /usr/include/linux/ip6.h Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* igmp: hash a hash table to speedup ip_check_mc_rcu()Eric Dumazet2013-06-121-0/+5
| | | | | | | | | | | | | | | | After IP route cache removal, multicast applications using a lot of multicast addresses hit a O(N) behavior in ip_check_mc_rcu() Add a per in_device hash table to get faster lookup. This hash table is created only if the number of items in mc_list is above 4. Reported-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: introduce address lifetimeJiri Pirko2013-01-291-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | There are some usecase when lifetime of ipv4 addresses might be helpful. For example: 1) initramfs networkmanager uses a DHCP daemon to learn network configuration parameters 2) initramfs networkmanager addresses, routes and DNS configuration 3) initramfs networkmanager is requested to stop 4) initramfs networkmanager stops all daemons including dhclient 5) there are addresses and routes configured but no daemon running. If the system doesn't start networkmanager for some reason, addresses and routes will be used forever, which violates RFC 2131. This patch is essentially a backport of ivp6 address lifetime mechanism for ipv4 addresses. Current "ip" tool supports this without any patch (since it does not distinguish between ipv4 and ipv6 addresses in this perspective. Also, this should be back-compatible with all current netlink users. Reported-by: Pavel Šimerda <psimerda@redhat.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
* netconf: advertise mc_forwarding statusNicolas Dichtel2012-12-041-0/+3
| | | | | | | | This patch advertise the MC_FORWARDING status for IPv4 and IPv6. This field is readonly, only multicast engine in the kernel updates it. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Introduce IN_DEV_NET_ROUTE_LOCALNETEric Dumazet2012-08-041-2/+9
| | | | | | | | | | | | | | | | performance profiles show a high cost in the IN_DEV_ROUTE_LOCALNET() call done in ip_route_input_slow(), because of multiple dereferences, even if cache lines are clean and available in cpu caches. Since we already have the 'net' pointer, introduce IN_DEV_NET_ROUTE_LOCALNET() macro avoiding two dereferences (dev_net(in_dev->dev)) Also change the tests to use IN_DEV_NET_ROUTE_LOCALNET() only if saddr or/and daddr are loopback addresse. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Add interface option to enable routing of 127.0.0.0/8Thomas Graf2012-06-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Routing of 127/8 is tradtionally forbidden, we consider packets from that address block martian when routing and do not process corresponding ARP requests. This is a sane default but renders a huge address space practically unuseable. The RFC states that no address within the 127/8 block should ever appear on any network anywhere but it does not forbid the use of such addresses outside of the loopback device in particular. For example to address a pool of virtual guests behind a load balancer. This patch adds a new interface option 'route_localnet' enabling routing of the 127/8 address block and processing of ARP requests on a specific interface. Note that for the feature to work, the default local route covering 127/8 dev lo needs to be removed. Example: $ sysctl -w net.ipv4.conf.eth0.route_localnet=1 $ ip route del 127.0.0.0/8 dev lo table local $ ip addr add 127.1.0.1/16 dev eth0 $ ip route flush cache V2: Fix invalid check to auto flush cache (thanks davem) Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* arp: allow arp processing to honor per interface arp_accept sysctlNeil Horman2012-03-161-0/+1
| | | | | | | | | | | | | | | | I found recently that the arp_process function which handles all of our received arp frames, is using IPV4_DEVCONF_ALL macro to check the state of the arp_process flag. This seems wrong, as it implies that either none or all of the network interfaces accept gratuitous arps. This patch corrects that, allowing per-interface arp_accept configuration to deviate from the all setting. Note this also brings us into line with the way the arp_filter setting is handled during arp_process execution. Tested this myself on my home network, and confirmed it works as expected. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Add hash table of interface addresses.David S. Miller2011-02-181-0/+1
| | | | | | | | This will be used to optimize __ip_dev_find() and friends. With help from Eric Dumazet. Signed-off-by: David S. Miller <davem@davemloft.net>
* __in_dev_get_rtnl() can use rtnl_dereference()Eric Dumazet2010-12-061-1/+1
| | | | | | | | | | If caller holds RTNL, we dont need a memory barrier (smp_read_barrier_depends) included in rcu_dereference(). Just use rtnl_dereference() to properly document the assertions. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* inet: Define IPV4_DEVCONF_MAXThomas Graf2010-11-171-3/+5
| | | | | | | Define IPV4_DEVCONF_MAX to get rid of MAX - 1 notation. Signed-off-by: Thomas Graf <tgraf@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* igmp: RCU conversion of in_dev->mc_listEric Dumazet2010-11-121-3/+2
| | | | | | | | | | | | | | in_dev->mc_list is protected by one rwlock (in_dev->mc_list_lock). This can easily be converted to a RCU protection. Writers hold RTNL, so mc_list_lock is removed, not replaced by a spinlock. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Cypher Wu <cypher.w@gmail.com> Cc: Américo Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: introduce __ip_dev_find()Eric Dumazet2010-09-301-1/+6
| | | | | | | | | | | | ip_dev_find(net, addr) finds a device given an IPv4 source address and takes a reference on it. Introduce __ip_dev_find(), taking a third argument, to optionally take the device reference. Callers not asking the reference to be taken should be in an rcu_read_lock() protected section. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: ip_ptr cleanupsEric Dumazet2010-09-151-9/+5
| | | | | | | | | dev->ip_ptr is protected by rtnl and rcu. Yet some places dont use appropriate primitives and/or locking rules. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net ipv4: Decouple ipv4 interface parameters from binary sysctl numbersEric W. Biederman2010-02-161-6/+36
| | | | | | | | | | Stop using the binary sysctl enumeartion in sysctl.h as an index into a per interface array. This leads to unnecessary binary sysctl number allocation, and a fragility in data structure and implementation because of unnecessary coupling. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2010-01-101-0/+1
|\ | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/benet/be_cmds.h include/linux/sysctl.h
| * net: restore ip source validationJamal Hadi Salim2009-12-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | when using policy routing and the skb mark: there are cases where a back path validation requires us to use a different routing table for src ip validation than the one used for mapping ingress dst ip. One such a case is transparent proxying where we pretend to be the destination system and therefore the local table is used for incoming packets but possibly a main table would be used on outbound. Make the default behavior to allow the above and if users need to turn on the symmetry via sysctl src_valid_mark Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: RFC3069, private VLAN proxy arp supportJesper Dangaard Brouer2010-01-071-0/+1
|/ | | | | | | | | | | | | | | | | | | | | | | This is to be used together with switch technologies, like RFC3069, that where the individual ports are not allowed to communicate with each other, but they are allowed to talk to the upstream router. As described in RFC 3069, it is possible to allow these hosts to communicate through the upstream router by proxy_arp'ing. This patch basically allow proxy arp replies back to the same interface (from which the ARP request/solicitation was received). Tunable per device via proc "proxy_arp_pvlan": /proc/sys/net/ipv4/conf/*/proxy_arp_pvlan This switch technology is known by different vendor names: - In RFC 3069 it is called VLAN Aggregation. - Cisco and Allied Telesyn call it Private VLAN. - Hewlett-Packard call it Source-Port filtering or port-isolation. - Ericsson call it MAC-Forced Forwarding (RFC Draft). Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4 05/05: add sysctl to accept packets with local source addressesPatrick McHardy2009-12-031-0/+1
| | | | | | | | | | | | | | | | | commit 8ec1e0ebe26087bfc5c0394ada5feb5758014fc8 Author: Patrick McHardy <kaber@trash.net> Date: Thu Dec 3 12:16:35 2009 +0100 ipv4: add sysctl to accept packets with local source addresses Change fib_validate_source() to accept packets with a local source address when the "accept_local" sysctl is set for the incoming inet device. Combined with the previous patches, this allows to communicate between multiple local interfaces over the wire. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: cleanup include/linuxEric Dumazet2009-11-041-6/+3
| | | | | | | | | | | | | | | This cleanup patch puts struct/union/enum opening braces, in first line to ease grep games. struct something { becomes : struct something { Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ip: fix logic of reverse path filter sysctlStephen Hemminger2009-07-271-1/+1
| | | | | | | | | Even though reverse path filter was changed from simple boolean to trinary control, the loose mode only works if both all and device are configured because of this logic error. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: add ARP notify option for devicesStephen Hemminger2009-02-011-0/+1
| | | | | | | | | | This adds another inet device option to enable gratuitous ARP when device is brought up or address change. This is handy for clusters or virtualization. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: add mc_count to in_device.Rami Rosen2008-10-071-0/+1
| | | | | | | | | | | This patch add mc_count to struct in_device and updates increment/decrement/initilaize of this field in IPv4 and in IPv6. - Also printing the vfs /proc entry (/proc/net/igmp) is adjusted to use the new mc_count. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* route: Remove unused ifa_anycast fieldThomas Graf2008-06-031-1/+0
| | | | | | | | | The field was supposed to allow the creation of an anycast route by assigning an anycast address to an address prefix. It was never implemented so this field is unused and serves no purpose. Remove it. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET] NETNS: Omit net_device->nd_net without CONFIG_NET_NS.YOSHIFUJI Hideaki2008-03-261-3/+3
| | | | | | | | Introduce per-net_device inlines: dev_net(), dev_net_set(). Without CONFIG_NET_NS, no namespace other than &init_net exists. Let's explicitly define them to help compiler optimizations. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
* [NETNS]: Process devinet ioctl in the correct namespace.Denis V. Lunev2008-02-281-1/+1
| | | | | | | | Add namespace parameter to devinet_ioctl and locate device inside it for state changes. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4]: Make struct ipv4_devconf static.Adrian Bunk2008-01-311-2/+0
| | | | | | | | struct ipv4_devconf can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Add namespace parameter to ip_dev_find.Denis V. Lunev2008-01-281-1/+1
| | | | | | | | in_dev_find() need a namespace to pass it to fib_get_table(), so add an argument. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Add netns parameter to inetdev_by_index.Denis V. Lunev2008-01-281-1/+1
| | | | | Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud