summaryrefslogtreecommitdiffstats
path: root/net
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | net/smc: fix error return code in smc_setsockopt()Wei Yongjun2018-06-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix to return error code -EINVAL instead of 0 if optlen is invalid. Fixes: 01d2f7e2cdd3 ("net/smc: sockopts TCP_NODELAY and TCP_CORK") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2018-06-0322-67/+100
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Filling in the padding slot in the bpf structure as a bug fix in 'ne' overlapped with actually using that padding area for something in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | netfilter: nf_tables: handle chain name lookups via rhltableFlorian Westphal2018-06-031-15/+98
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If there is a significant amount of chains list search is too slow, so add an rhlist table for this. This speeds up ruleset loading: for every new rule we have to check if the name already exists in current generation. We need to be able to cope with duplicate chain names in case a transaction drops the nfnl mutex (for request_module) and the abort of this old transaction is still pending. The list is kept -- we need a way to iterate chains even if hash resize is in progress without missing an entry. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: nf_tables: add connlimit supportPablo Neira Ayuso2018-06-033-0/+307
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This features which allows you to limit the maximum number of connections per arbitrary key. The connlimit expression is stateful, therefore it can be used from meters to dynamically populate a set, this provides a mapping to the iptables' connlimit match. This patch also comes that allows you define static connlimit policies. This extension depends on the nf_conncount infrastructure. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: nf_tables: add destroy_clone expressionPablo Neira Ayuso2018-06-032-2/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this patch, cloned expressions are released via ->destroy. This is a problem for the new connlimit expression since the ->destroy path drop a reference on the conntrack modules and it unregisters hooks. The new ->destroy_clone provides context that this expression is being released from the packet path, so it is mirroring ->clone(), where neither module reference is dropped nor hooks need to be unregistered - because this done from the control plane path from the ->init() path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: nf_tables: garbage collection for stateful expressionsPablo Neira Ayuso2018-06-032-2/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use garbage collector to schedule removal of elements based of feedback from expression that this element comes with. Therefore, the garbage collector is not guided by timeout expirations in this new mode. The new connlimit expression sets on the NFT_EXPR_GC flag to enable this behaviour, the dynset expression needs to explicitly enable the garbage collector via set->ops->gc_init call. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: nf_tables: pass ctx to nf_tables_expr_destroy()Pablo Neira Ayuso2018-06-031-4/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nft_set_elem_destroy() can be called from call_rcu context. Annotate netns and table in set object so we can populate the context object. Moreover, pass context object to nf_tables_set_elem_destroy() from the commit phase, since it is already available from there. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: nf_conncount: expose connection list interfacePablo Neira Ayuso2018-06-031-13/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch provides an interface to maintain the list of connections and the lookup function to obtain the number of connections in the list. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: nf_tables: pass context to object destroy indirectionPablo Neira Ayuso2018-06-033-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new connlimit object needs this to properly deal with conntrack dependencies. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: Libify xt_TPROXYMáté Eckl2018-06-038-339/+323
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The extracted functions will likely be usefull to implement tproxy support in nf_tables. Extrancted functions: - nf_tproxy_sk_is_transparent - nf_tproxy_laddr4 - nf_tproxy_handle_time_wait4 - nf_tproxy_get_sock_v4 - nf_tproxy_laddr6 - nf_tproxy_handle_time_wait6 - nf_tproxy_get_sock_v6 (nf_)tproxy_handle_time_wait6 also needed some refactor as its current implementation was xtables-specific. Signed-off-by: Máté Eckl <ecklm94@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | netfilter: Decrease code duplication regarding transparent socket optionMáté Eckl2018-06-033-16/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a function in include/net/netfilter/nf_socket.h to decide if a socket has IP(V6)_TRANSPARENT socket option set or not. However this does the same as inet_sk_transparent() in include/net/tcp.h include/net/tcp.h:1733 /* This helper checks if socket has IP_TRANSPARENT set */ static inline bool inet_sk_transparent(const struct sock *sk) { switch (sk->sk_state) { case TCP_TIME_WAIT: return inet_twsk(sk)->tw_transparent; case TCP_NEW_SYN_RECV: return inet_rsk(inet_reqsk(sk))->no_srccheck; } return inet_sk(sk)->transparent; } tproxy_sk_is_transparent has also been refactored to use this function instead of reimplementing it. Signed-off-by: Máté Eckl <ecklm94@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller2018-06-0228-397/+1313
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter/IPVS updates for your net-next tree, the most relevant things in this batch are: 1) Compile masquerade infrastructure into NAT module, from Florian Westphal. Same thing with the redirection support. 2) Abort transaction if early initialization of the commit phase fails. Also from Florian. 3) Get rid of synchronize_rcu() by using rule array in nf_tables, from Florian. 4) Abort nf_tables batch if fatal signal is pending, from Florian. 5) Use .call_rcu nfnetlink from nf_tables to make dumps fully lockless. From Florian Westphal. 6) Support to match transparent sockets from nf_tables, from Máté Eckl. 7) Audit support for nf_tables, from Phil Sutter. 8) Validate chain dependencies from commit phase, fall back to fine grain validation only in case of errors. 9) Attach dst to skbuff from netfilter flowtable packet path, from Jason A. Donenfeld. 10) Use artificial maximum attribute cap to remove VLA from nfnetlink. Patch from Kees Cook. 11) Add extension to allow to forward packets through neighbour layer. 12) Add IPv6 conntrack helper support to IPVS, from Julian Anastasov. 13) Add IPv6 FTP conntrack support to IPVS, from Julian Anastasov. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | ipvs: add ipv6 support to ftpJulian Anastasov2018-06-015-178/+325
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for FTP commands with extended format (RFC 2428): - FTP EPRT: IPv4 and IPv6, active mode, similar to PORT - FTP EPSV: IPv4 and IPv6, passive mode, similar to PASV. EPSV response usually contains only port but we allow real server to provide different address We restrict control and data connection to be from same address family. Allow the "(" and ")" to be optional in PASV response. Also, add ipvsh argument to the pkt_in/pkt_out handlers to better access the payload after transport header. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | ipvs: add full ipv6 support to nfctJulian Anastasov2018-06-011-52/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prepare NFCT to support IPv6 for FTP: - Do not restrict the expectation callback to PF_INET - Split the debug messages, so that the 160-byte limitation in IP_VS_DBG_BUF is not exceeded when printing many IPv6 addresses. This means no more than 3 addresses in one message, i.e. 1 tuple with 2 addresses or 1 connection with 3 addresses. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: nft_fwd_netdev: allow to forward packets via neighbour layerPablo Neira Ayuso2018-06-011-1/+145
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows us to forward packets from the netdev family via neighbour layer, so you don't need an explicit link-layer destination when using this expression from rules. The ttl/hop_limit field is decremented. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: nfnetlink: Remove VLA usageKees Cook2018-06-011-2/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the quest to remove all stack VLA usage from the kernel[1], this allocates the maximum size expected for all possible attrs and adds sanity-checks at both registration and usage to make sure nothing gets out of sync. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: nf_flow_table: attach dst to skbsJason A. Donenfeld2018-06-011-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some drivers, such as vxlan and wireguard, use the skb's dst in order to determine things like PMTU. They therefore loose functionality when flow offloading is enabled. So, we ensure the skb has it before xmit'ing it in the offloading path. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: nf_tables: fix chain dependency validationPablo Neira Ayuso2018-06-014-32/+196
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following ruleset: add table ip filter add chain ip filter input { type filter hook input priority 4; } add chain ip filter ap add rule ip filter input jump ap add rule ip filter ap masquerade results in a panic, because the masquerade extension should be rejected from the filter chain. The existing validation is missing a chain dependency check when the rule is added to the non-base chain. This patch fixes the problem by walking down the rules from the basechains, searching for either immediate or lookup expressions, then jumping to non-base chains and again walking down the rules to perform the expression validation, so we make sure the full ruleset graph is validated. This is done only once from the commit phase, in case of problem, we abort the transaction and perform fine grain validation for error reporting. This patch requires 003087911af2 ("netfilter: nfnetlink: allow commit to fail") to achieve this behaviour. This patch also adds a cleanup callback to nfnl batch interface to reset the validate state from the exit path. As a result of this patch, nf_tables_check_loops() doesn't use ->validate to check for loops, instead it just checks for immediate expressions. Reported-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: nf_tables: Add audit support to log statementPhil Sutter2018-06-011-1/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This extends log statement to support the behaviour achieved with AUDIT target in iptables. Audit logging is enabled via a pseudo log level 8. In this case any other settings like log prefix are ignored since audit log format is fixed. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: nf_tables: add support for native socket matchingMáté Eckl2018-06-013-0/+153
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now it can only match the transparent flag of an ip/ipv6 socket. Signed-off-by: Máté Eckl <ecklm94@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: fix ptr_ret.cocci warningskbuild test robot2018-06-012-12/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | net/netfilter/nft_numgen.c:117:1-3: WARNING: PTR_ERR_OR_ZERO can be used net/netfilter/nft_hash.c:180:1-3: WARNING: PTR_ERR_OR_ZERO can be used net/netfilter/nft_hash.c:223:1-3: WARNING: PTR_ERR_OR_ZERO can be used Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR Generated by: scripts/coccinelle/api/ptr_ret.cocci Fixes: b9ccc07e3f31 ("netfilter: nft_hash: add map lookups for hashing operations") Fixes: d734a2888922 ("netfilter: nft_numgen: add map lookups for numgen statements") CC: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: kbuild test robot <fengguang.wu@intel.com> Acked-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: nf_tables: remove unused variablesTaehee Yoo2018-05-291-16/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The comment and trace_loginfo are not used anymore. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: nf_tables: use call_rcu in netlink dumpsFlorian Westphal2018-05-291-39/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can make all dumps and lookups lockless. Dumps currently only hold the nfnl mutex on the dump request itself. Dumps can span multiple syscalls, dump continuation doesn't acquire the nfnl mutex anywhere, i.e. the dump callbacks in nf_tables already use rcu and never rely on nfnl mutex being held. So, just switch all dumpers to rcu. This requires taking a module reference before dropping the rcu lock so rmmod is blocked, we also need to hold module reference over the entire dump operation sequence. netlink already supports this via the .module member in the netlink_dump_control struct. For the non-dump case (i.e. lookup of a specific tables, chains, etc), we need to swtich to _rcu list iteration primitive and make sure we use GFP_ATOMIC. This patch also adds the new nft_netlink_dump_start_rcu() helper that takes care of the get_ref, drop-rcu-lock,start dump, get-rcu-lock,put-ref sequence. The helper will be reused for all dumps. Rationale in all dump requests is: - use the nft_netlink_dump_start_rcu helper added in first patch - use GFP_ATOMIC and rcu list iteration - switch to .call_rcu ... thus making all dumps in nf_tables not depend on the nfnl mutex anymore. In the nf_tables_getgen: This callback just fetches the current base sequence, there is no need to serialize this with nfnl nft mutex. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: nf_tables: fail batch if fatal signal is pendingFlorian Westphal2018-05-291-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | abort batch processing and return so task can exit faster. Otherwise even SIGKILL has no immediate effect. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: nf_tables: fix endian mismatch in return typeFlorian Westphal2018-05-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | harmless, but it avoids sparse warnings: nf_tables_api.c:2813:16: warning: incorrect type in return expression (different base types) nf_tables_api.c:2863:47: warning: incorrect type in argument 3 (different base types) nf_tables_api.c:3524:47: warning: incorrect type in argument 3 (different base types) nf_tables_api.c:3538:55: warning: incorrect type in argument 3 (different base types) Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: nft_compat: use call_rcu for nfnl_compat_getFlorian Westphal2018-05-291-11/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just use .call_rcu instead. We can drop the rcu read lock after obtaining a reference and re-acquire on return. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: nat: make symbol nat_hook staticWei Yongjun2018-05-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes the following sparse warning: net/netfilter/nf_nat_core.c:1039:20: warning: symbol 'nat_hook' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: nf_tables: remove synchronize_rcu in commit phaseFlorian Westphal2018-05-292-18/+210
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | synchronize_rcu() is expensive. The commit phase currently enforces an unconditional synchronize_rcu() after incrementing the generation counter. This is to make sure that a packet always sees a consistent chain, either nft_do_chain is still using old generation (it will skip the newly added rules), or the new one (it will skip old ones that might still be linked into the list). We could just remove the synchronize_rcu(), it would not cause a crash but it could cause us to evaluate a rule that was removed and new rule for the same packet, instead of either-or. To resolve this, add rule pointer array holding two generations, the current one and the future generation. In commit phase, allocate the rule blob and populate it with the rules that will be active in the new generation. Then, make this rule blob public, replacing the old generation pointer. Then the generation counter can be incremented. nft_do_chain() will either continue to use the current generation (in case loop was invoked right before increment), or the new one. Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: nfnetlink: allow commit to failFlorian Westphal2018-05-291-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ->commit() cannot fail at the moment. Followup-patch adds kmalloc calls in the commit phase, so we'll need to be able to handle errors. Make it so that -EGAIN causes a full replay, and make other errors cause the transaction to fail. Failing is ok from a consistency point of view as long as we perform all actions that could return an error before we increment the generation counter and the base seq. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: nat: merge nf_nat_redirect into nf_natFlorian Westphal2018-05-293-10/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to previous patch, this time, merge redirect+nat. The redirect module is just 2k in size, get rid of it and make redirect part available from the nat core. before: text data bss dec hex filename 19461 1484 4138 25083 61fb net/netfilter/nf_nat.ko 1236 792 0 2028 7ec net/netfilter/nf_nat_redirect.ko after: 20340 1508 4138 25986 6582 net/netfilter/nf_nat.ko Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | | | netfilter: nat: merge ipv4/ipv6 masquerade code into main nat moduleFlorian Westphal2018-05-296-20/+4
| | | |_|/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using extra modules for these, turn the config options into an implicit dependency that adds masq feature to the protocol specific nf_nat module. before: text data bss dec hex filename 2001 860 4 2865 b31 net/ipv4/netfilter/nf_nat_masquerade_ipv4.ko 5579 780 2 6361 18d9 net/ipv4/netfilter/nf_nat_ipv4.ko 2860 836 8 3704 e78 net/ipv6/netfilter/nf_nat_masquerade_ipv6.ko 6648 780 2 7430 1d06 net/ipv6/netfilter/nf_nat_ipv6.ko after: text data bss dec hex filename 7245 872 8 8125 1fbd net/ipv4/netfilter/nf_nat_ipv4.ko 9165 848 12 10025 2729 net/ipv6/netfilter/nf_nat_ipv6.ko Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | | net: sched: split tc_ctl_tfilter into three handlersVlad Buslov2018-06-011-145/+293
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tc_ctl_tfilter handles three netlink message types: RTM_NEWTFILTER, RTM_DELTFILTER, RTM_GETTFILTER. However, implementation of this function involves a lot of branching on specific message type because most of the code is message-specific. This significantly complicates adding new functionality and doesn't provide much benefit of code reuse. Split tc_ctl_tfilter to three standalone functions that handle filter new, delete and get requests. The only truly protocol independent part of tc_ctl_tfilter is code that looks up queue, class, and block. Refactor this code to standalone tcf_block_find function that is used by all three new handlers. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | rtnetlink: Fix null-ptr-deref in rtnl_newlinkPrashant Bhole2018-06-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In rtnl_newlink(), NULL check is performed on m_ops however member of ops is accessed. Fixed by accessing member of m_ops instead of ops. [ 345.432629] BUG: KASAN: null-ptr-deref in rtnl_newlink+0x400/0x1110 [ 345.432629] Read of size 4 at addr 0000000000000088 by task ip/986 [ 345.432629] [ 345.432629] CPU: 1 PID: 986 Comm: ip Not tainted 4.17.0-rc6+ #9 [ 345.432629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 345.432629] Call Trace: [ 345.432629] dump_stack+0xc6/0x150 [ 345.432629] ? dump_stack_print_info.cold.0+0x1b/0x1b [ 345.432629] ? kasan_report+0xb4/0x410 [ 345.432629] kasan_report.cold.4+0x8f/0x91 [ 345.432629] ? rtnl_newlink+0x400/0x1110 [ 345.432629] rtnl_newlink+0x400/0x1110 [...] Fixes: ccf8dbcd062a ("rtnetlink: Remove VLA usage") Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Tested-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | rtnetlink: Remove VLA usageKees Cook2018-05-311-2/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the quest to remove all stack VLA usage from the kernel[1], this allocates the maximum size expected for all possible types and adds sanity-checks at both registration and usage to make sure nothing gets out of sync. This matches the proposed VLA solution for nfnetlink[2]. The values chosen here were based on finding assignments for .maxtype and .slave_maxtype and manually counting the enums: slave_maxtype (max 33): IFLA_BRPORT_MAX 33 IFLA_BOND_SLAVE_MAX 9 maxtype (max 45): IFLA_BOND_MAX 28 IFLA_BR_MAX 45 __IFLA_CAIF_HSI_MAX 8 IFLA_CAIF_MAX 4 IFLA_CAN_MAX 16 IFLA_GENEVE_MAX 12 IFLA_GRE_MAX 25 IFLA_GTP_MAX 5 IFLA_HSR_MAX 7 IFLA_IPOIB_MAX 4 IFLA_IPTUN_MAX 21 IFLA_IPVLAN_MAX 3 IFLA_MACSEC_MAX 15 IFLA_MACVLAN_MAX 7 IFLA_PPP_MAX 2 __IFLA_RMNET_MAX 4 IFLA_VLAN_MAX 6 IFLA_VRF_MAX 2 IFLA_VTI_MAX 7 IFLA_VXLAN_MAX 28 VETH_INFO_MAX 2 VXCAN_INFO_MAX 2 This additionally changes maxtype and slave_maxtype fields to unsigned, since they're only ever using positive values. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com [2] https://patchwork.kernel.org/patch/10439647/ Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: bridge: Notify about bridge VLANsPetr Machata2018-05-311-3/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A driver might need to react to changes in settings of brentry VLANs. Therefore send switchdev port notifications for these as well. Reuse SWITCHDEV_OBJ_ID_PORT_VLAN for this purpose. Listeners should use netif_is_bridge_master() on orig_dev to determine whether the notification is about a bridge port or a bridge. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | dsa: port: Ignore bridge VLAN eventsPetr Machata2018-05-311-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A follow-up patch enables emitting VLAN notifications for the bridge CPU port in addition to the existing slave port notifications. These notifications have orig_dev set to the bridge in question. Because there's no specific support for these VLANs, just ignore the notifications to maintain the current behavior. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: bridge: Extract br_vlan_add_existing()Petr Machata2018-05-311-22/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extract the code that deals with adding a preexisting VLAN to bridge CPU port to a separate function. A follow-up patch introduces a need to roll back operations in this block due to an error, and this split will make the error-handling code clearer. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: bridge: Extract boilerplate around switchdev_port_obj_*()Petr Machata2018-05-313-23/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A call to switchdev_port_obj_add() or switchdev_port_obj_del() involves initializing a struct switchdev_obj_port_vlan, a piece of code that repeats on each call site almost verbatim. While in the current codebase there is just one duplicated add call, the follow-up patches add more of both add and del calls. Thus to remove the duplication, extract the repetition into named functions and reuse. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: remove bypassed check in sch_direct_xmit()Song Liu2018-05-311-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Checking netif_xmit_frozen_or_stopped() at the end of sch_direct_xmit() is being bypassed. This is because "ret" from sch_direct_xmit() will be either NETDEV_TX_OK or NETDEV_TX_BUSY, and only ret == NETDEV_TX_OK == 0 will reach the condition: if (ret && netif_xmit_frozen_or_stopped(txq)) return false; This patch cleans up the code by removing the whole condition. For more discussion about this, please refer to https://marc.info/?t=152727195700008 Signed-off-by: Song Liu <songliubraving@fb.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | tcp: minor optimization around tcp_hdr() usage in receive pathYafang Shao2018-05-313-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is additional to the commit ea1627c20c34 ("tcp: minor optimizations around tcp_hdr() usage"). At this point, skb->data is same with tcp_hdr() as tcp header has not been pulled yet. So use the less expensive one to get the tcp header. Remove the third parameter of tcp_rcv_established() and put it into the function body. Furthermore, the local variables are listed as a reverse christmas tree :) Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | bpfilter: fix a build errYueHaibing2018-05-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gcc-7.3.0 report following err: HOSTCC net/bpfilter/main.o In file included from net/bpfilter/main.c:9:0: ./include/uapi/linux/bpf.h:12:10: fatal error: linux/bpf_common.h: No such file or directory #include <linux/bpf_common.h> remove it by adding a include path. Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net/ipv6: Add support for specifying metric of connected routesDavid Ahern2018-05-291-18/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for IFA_RT_PRIORITY to ipv6 addresses. If the metric is changed on an existing address then the new route is inserted before removing the old one. Since the metric is one of the route keys, the prefix route can not be atomically replaced. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net/ipv4: Add support for specifying metric of connected routesDavid Ahern2018-05-292-11/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for IFA_RT_PRIORITY to ipv4 addresses. If the metric is changed on an existing address then the new route is inserted before removing the old one. Since the metric is one of the route keys, the prefix route can not be replaced. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net/ipv6: Pass ifa6_config struct to inet6_addr_modifyDavid Ahern2018-05-291-21/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update inet6_addr_modify to take ifa6_config argument versus a parameter list. This is an argument move only; no functional change intended. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net/ipv6: Pass ifa6_config struct to inet6_addr_addDavid Ahern2018-05-291-55/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the creation of struct ifa6_config up to callers of inet6_addr_add. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net/ipv6: Convert ipv6_add_addr to struct ifa6_configDavid Ahern2018-05-291-59/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move config parameters for adding an ipv6 address to a struct. struct names stem from inet6_rtm_newaddr which is the modern handler for adding an address. Start the conversion to ifa6_config with ipv6_add_addr. This is an argument move only; no functional change intended. Mapping of variable changes: addr --> cfg->pfx peer_addr --> cfg->peer_pfx pfxlen --> cfg->plen flags --> cfg->ifa_flags scope, valid_lft, prefered_lft have the same names within cfg (with corrected spelling). Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: remove unnecessary genlmsg_cancel() callsYueHaibing2018-05-294-23/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the message be freed immediately, no need to trim it back to the previous size. Inspired by commit 7a9b3ec1e19f ("nl80211: remove unnecessary genlmsg_cancel() calls") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: bpfilter: make function bpfilter_mbox_request() staticWei Yongjun2018-05-291-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes the following sparse warnings: net/ipv4/bpfilter/sockopt.c:13:5: warning: symbol 'bpfilter_mbox_request' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: sched: mq: request stats from offloadsJakub Kicinski2018-05-291-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MQ doesn't hold any statistics on its own, however, statistic from offloads are requested starting from the root, hence MQ will read the old values for its sums. Call into the drivers, because of the additive nature of the stats drivers are aware of how much "pending updates" they have to children of the MQ. Since MQ reset its stats on every dump we can simply offset the stats, predicting how stats of offloaded children will change. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: sched: mq: add simple offload notificationJakub Kicinski2018-05-291-0/+19
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mq offload is trivial, we just need to let the device know that the root qdisc is mq. Alternative approach would be to export qdisc_lookup() and make drivers check the root type themselves, but notification via ndo_setup_tc is more in line with other qdiscs. Note that mq doesn't hold any stats on it's own, it just adds up stats of its children. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud