| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Fix 5 warnings and 14 checks issued by checkpatch.pl:
CHECK: Logical continuations should be on the previous line
+ if ((q->vars.qdelay < q->params.target / 2)
+ && (q->vars.prob < MAX_PROB / 5))
WARNING: line over 80 characters
+ q->params.tupdate = usecs_to_jiffies(nla_get_u32(tb[TCA_PIE_TUPDATE]));
CHECK: Blank lines aren't necessary after an open brace '{'
+{
+
CHECK: braces {} should be used on all arms of this statement
+ if (qlen < QUEUE_THRESHOLD)
[...]
+ else {
[...]
CHECK: Unbalanced braces around else statement
+ else {
CHECK: No space is necessary after a cast
+ if (delta > (s32) (MAX_PROB / (100 / 2)) &&
CHECK: Unnecessary parentheses around 'qdelay == 0'
+ if ((qdelay == 0) && (qdelay_old == 0) && update_prob)
CHECK: Unnecessary parentheses around 'qdelay_old == 0'
+ if ((qdelay == 0) && (qdelay_old == 0) && update_prob)
CHECK: Unnecessary parentheses around 'q->vars.prob == 0'
+ if ((q->vars.qdelay < q->params.target / 2) &&
+ (q->vars.qdelay_old < q->params.target / 2) &&
+ (q->vars.prob == 0) &&
+ (q->vars.avg_dq_rate > 0))
CHECK: Unnecessary parentheses around 'q->vars.avg_dq_rate > 0'
+ if ((q->vars.qdelay < q->params.target / 2) &&
+ (q->vars.qdelay_old < q->params.target / 2) &&
+ (q->vars.prob == 0) &&
+ (q->vars.avg_dq_rate > 0))
CHECK: Blank lines aren't necessary before a close brace '}'
+
+}
CHECK: Comparison to NULL could be written "!opts"
+ if (opts == NULL)
CHECK: No space is necessary after a cast
+ ((u32) PSCHED_TICKS2NS(q->params.target)) /
WARNING: line over 80 characters
+ nla_put_u32(skb, TCA_PIE_TUPDATE, jiffies_to_usecs(q->params.tupdate)) ||
CHECK: Blank lines aren't necessary before a close brace '}'
+
+}
CHECK: No space is necessary after a cast
+ .delay = ((u32) PSCHED_TICKS2NS(q->vars.qdelay)) /
WARNING: Missing a blank line after declarations
+ struct sk_buff *skb;
+ skb = qdisc_dequeue_head(sch);
WARNING: Missing a blank line after declarations
+ struct pie_sched_data *q = qdisc_priv(sch);
+ qdisc_reset_queue(sch);
WARNING: Missing a blank line after declarations
+ struct pie_sched_data *q = qdisc_priv(sch);
+ q->params.tupdate = 0;
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\| | |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
A number of TC attributes are processed without proper validation
(e.g., length checks). Add a tca policy for all input attributes and use
when invoking nlmsg_parse.
The 2 Fixes tags below cover the latest additions. The other attributes
are a string (KIND), nested attribute (OPTIONS which does seem to have
validation in most cases), for dumps only or a flag.
Fixes: 5bc1701881e39 ("net: sched: introduce multichain support for filters")
Fixes: d47a6b0e7c492 ("net: sched: introduce ingress/egress block index attributes for qdisc")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In commit ec3ed293e766 ("net_sched: change tcf_del_walker() to take idrinfo->lock")
we move fl_hw_destroy_tmplt() to a workqueue to avoid blocking
with the spinlock held. Unfortunately, this causes a lot of
troubles here:
1. tcf_chain_destroy() could be called right after we queue the work
but before the work runs. This is a use-after-free.
2. The chain refcnt is already 0, we can't even just hold it again.
We can check refcnt==1 but it is ugly.
3. The chain with refcnt 0 is still visible in its block, which means
it could be still found and used!
4. The block has a refcnt too, we can't hold it without introducing a
proper API either.
We can make it working but the end result is ugly. Instead of wasting
time on reviewing it, let's just convert the troubling spinlock to
a mutex, which allows us to use non-atomic allocations too.
Fixes: ec3ed293e766 ("net_sched: change tcf_del_walker() to take idrinfo->lock")
Reported-by: Ido Schimmel <idosch@idosch.org>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Vlad Buslov <vladbu@mellanox.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This traffic scheduler allows traffic classes states (transmission
allowed/not allowed, in the simplest case) to be scheduled, according
to a pre-generated time sequence. This is the basis of the IEEE
802.1Qbv specification.
Example configuration:
tc qdisc replace dev enp3s0 parent root handle 100 taprio \
num_tc 3 \
map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
queues 1@0 1@1 2@2 \
base-time 1528743495910289987 \
sched-entry S 01 300000 \
sched-entry S 02 300000 \
sched-entry S 04 300000 \
clockid CLOCK_TAI
The configuration format is similar to mqprio. The main difference is
the presence of a schedule, built by multiple "sched-entry"
definitions, each entry has the following format:
sched-entry <CMD> <GATE MASK> <INTERVAL>
The only supported <CMD> is "S", which means "SetGateStates",
following the IEEE 802.1Qbv-2015 definition (Table 8-6). <GATE MASK>
is a bitmask where each bit is a associated with a traffic class, so
bit 0 (the least significant bit) being "on" means that traffic class
0 is "active" for that schedule entry. <INTERVAL> is a time duration
in nanoseconds that specifies for how long that state defined by <CMD>
and <GATE MASK> should be held before moving to the next entry.
This schedule is circular, that is, after the last entry is executed
it starts from the first one, indefinitely.
The other parameters can be defined as follows:
- base-time: specifies the instant when the schedule starts, if
'base-time' is a time in the past, the schedule will start at
base-time + (N * cycle-time)
where N is the smallest integer so the resulting time is greater
than "now", and "cycle-time" is the sum of all the intervals of the
entries in the schedule;
- clockid: specifies the reference clock to be used;
The parameters should be similar to what the IEEE 802.1Q family of
specification defines.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\| |
| | |
| | |
| | |
| | |
| | |
| | | |
Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net'
overlapped the renaming of a netlink attribute in net-next.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
If "td->u.target_size" is larger than sizeof(struct xt_entry_target) we
return -EINVAL. But we don't check whether it's smaller than
sizeof(struct xt_entry_target) and that could lead to an out of bounds
read.
Fixes: 7ba699c604ab ("[NET_SCHED]: Convert actions from rtnetlink to new netlink API")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In the recent TCP/EDT patch series, I switched TCP and sch_fq
clocks from MONOTONIC to TAI, in order to meet the choice done
earlier for sch_etf packet scheduler.
But sure enough, this broke some setups were the TAI clock
jumps forward (by almost 50 year...), as reported
by Leonard Crestez.
If we want to converge later, we'll probably need to add
an skb field to differentiate the clock bases, or a socket option.
In the meantime, an UDP application will need to use CLOCK_MONOTONIC
base for its SCM_TXTIME timestamps if using fq packet scheduler.
Fixes: 72b0094f9182 ("tcp: switch tcp_clock_ns() to CLOCK_TAI base")
Fixes: 142537e41923 ("net_sched: sch_fq: switch to CLOCK_TAI")
Fixes: fd2bca2aa789 ("tcp: switch internal pacing timer to CLOCK_TAI")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Leonard Crestez <leonard.crestez@nxp.com>
Tested-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When tcf_block_find() fails, it already rollbacks the qdisc refcnt,
so its caller doesn't need to clean up this again. Avoid calling
qdisc_put() again by resetting qdisc to NULL for callers.
Reported-by: syzbot+37b8770e6d5a8220a039@syzkaller.appspotmail.com
Fixes: e368fdb61d8e ("net: sched: use Qdisc rcu API instead of relying on rtnl lock")
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Fixes the following sparse warning:
net/sched/sch_generic.c:944:6: warning:
symbol 'qdisc_free_cb' was not declared. Should it be static?
Fixes: 3a7d0d07a386 ("net: sched: extend Qdisc with rcu")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In order to remove dependency on rtnl lock on rules update path, always
take reference to block while using it on rules update path. Change
tcf_block_get() error handling to properly release block with reference
counting, instead of just destroying it, in order to accommodate potential
concurrent users.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Implement get/put function for blocks that only take/release the reference
and perform deallocation. These functions are intended to be used by
unlocked rules update path to always hold reference to block while working
with it. They use on new fine-grained locking mechanisms introduced in
previous patches in this set, instead of relying on global protection
provided by rtnl lock.
Extract code that is common with tcf_block_detach_ext() into common
function __tcf_block_put().
Extend tcf_block with rcu to allow safe deallocation when it is accessed
concurrently.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Protect block idr access with spinlock, instead of relying on rtnl lock.
Take tn->idr_lock spinlock during block insertion and removal.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Extract code that flushes and puts all chains on tcf block to two
standalone function to be shared with functions that locklessly get/put
reference to block.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
As a preparation for removing rtnl lock dependency from rules update path,
change tcf block reference counter type to refcount_t to allow modification
by concurrent users.
In block put function perform decrement and check reference counter once to
accommodate concurrent modification by unlocked users. After this change
tcf_chain_put at the end of block put function is called with
block->refcnt==0 and will deallocate block after the last chain is
released, so there is no need to manually deallocate block in this case.
However, if block reference counter reached 0 and there are no chains to
release, block must still be deallocated manually.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
As a preparation from removing rtnl lock dependency from rules update path,
use Qdisc rcu and reference counting capabilities instead of relying on
rtnl lock while working with Qdiscs. Create new tcf_block_release()
function, and use it to free resources taken by tcf_block_find().
Currently, this function only releases Qdisc and it is extended in next
patches in this series.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Currently, Qdisc API functions assume that users have rtnl lock taken. To
implement rtnl unlocked classifiers update interface, Qdisc API must be
extended with functions that do not require rtnl lock.
Extend Qdisc structure with rcu. Implement special version of put function
qdisc_put_unlocked() that is called without rtnl lock taken. This function
only takes rtnl lock if Qdisc reference counter reached zero and is
intended to be used as optimization.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Current implementation of qdisc_destroy() decrements Qdisc reference
counter and only actually destroy Qdisc if reference counter value reached
zero. Rename qdisc_destroy() to qdisc_put() in order for it to better
describe the way in which this function currently implemented and used.
Extract code that deallocates Qdisc into new private qdisc_destroy()
function. It is intended to be shared between regular qdisc_put() and its
unlocked version that is introduced in next patch in this series.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add additional counters that will store the bytes/packets processed by
hardware. These will be exported through the netlink interface for
displaying by the iproute2 tc tool
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
With the earliest departure time model, we no longer plan
special casing TCP retransmits. We therefore remove dead
code (since most compilers understood skb_is_retransmit()
was false)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
TCP keeps track of tcp_wstamp_ns by itself, meaning sch_fq
no longer has to do it.
Thanks to this model, TCP can get more accurate RTT samples,
since pacing no longer inflates them.
This has the nice effect of removing some delays caused by FQ
quantum mechanism, causing inflated max/P99 latencies.
Also we might relax TCP Small Queue tight limits in the future,
since this new model allow TCP to build bigger batches, since
sch_fq (or a device with earliest departure time offload) ensure
these packets will be delivered on time.
Note that other protocols are not converted (they will probably
never be) so sch_fq has still support for SO_MAX_PACING_RATE
Tested:
Test showing FQ pacing quantum artifact for low-rate flows,
adding unexpected throttles for RPC flows, inflating max and P99 latencies.
The parameters chosen here are to show what happens typically when
a TCP flow has a reduced pacing rate (this can be caused by a reduced
cwin after few losses, or/and rtt above few ms)
MIBS="MIN_LATENCY,MEAN_LATENCY,MAX_LATENCY,P99_LATENCY,STDDEV_LATENCY"
Before :
$ netperf -H 10.246.7.133 -t TCP_RR -Cc -T6,6 -- -q 2000000 -r 100,100 -o $MIBS
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.133 () port 0 AF_INET : first burst 0 : cpu bind
Minimum Latency Microseconds,Mean Latency Microseconds,Maximum Latency Microseconds,99th Percentile Latency Microseconds,Stddev Latency Microseconds
19,82.78,5279,3825,482.02
After :
$ netperf -H 10.246.7.133 -t TCP_RR -Cc -T6,6 -- -q 2000000 -r 100,100 -o $MIBS
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.133 () port 0 AF_INET : first burst 0 : cpu bind
Minimum Latency Microseconds,Mean Latency Microseconds,Maximum Latency Microseconds,99th Percentile Latency Microseconds,Stddev Latency Microseconds
20,49.94,128,63,3.18
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
TCP will soon provide per skb->tstamp with earliest departure time,
so that sch_fq does not have to determine departure time by looking
at socket sk_pacing_rate.
We chose in linux-4.19 CLOCK_TAI as the clock base for transports,
qdiscs, and NIC offloads.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Action API was changed to work with actions and action_idr in concurrency
safe manner, however tcf_del_walker() still uses actions without taking a
reference or idrinfo->lock first, and deletes them directly, disregarding
possible concurrent delete.
Change tcf_del_walker() to take idrinfo->lock while iterating over actions
and use new tcf_idr_release_unsafe() to release them while holding the
lock.
And the blocking function fl_hw_destroy_tmplt() could be called when we
put a filter chain, so defer it to a work queue.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
[xiyou.wangcong@gmail.com: heavily modify the code and changelog]
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
FIELD_SIZEOF is defined as a macro to calculate the specified value. Therefore,
We prefer to use the macro rather than calculating its value.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\| |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Two new tls tests added in parallel in both net and net-next.
Used Stephen Rothwell's linux-next resolution.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Matteo reported the following splat, testing the datapath of TC 'sample':
BUG: KASAN: null-ptr-deref in tcf_sample_act+0xc4/0x310
Read of size 8 at addr 0000000000000000 by task nc/433
CPU: 0 PID: 433 Comm: nc Not tainted 4.19.0-rc3-kvm #17
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ?-20180531_142017-buildhw-08.phx2.fedoraproject.org-1.fc28 04/01/2014
Call Trace:
kasan_report.cold.6+0x6c/0x2fa
tcf_sample_act+0xc4/0x310
? dev_hard_start_xmit+0x117/0x180
tcf_action_exec+0xa3/0x160
tcf_classify+0xdd/0x1d0
htb_enqueue+0x18e/0x6b0
? deref_stack_reg+0x7a/0xb0
? htb_delete+0x4b0/0x4b0
? unwind_next_frame+0x819/0x8f0
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
__dev_queue_xmit+0x722/0xca0
? unwind_get_return_address_ptr+0x50/0x50
? netdev_pick_tx+0xe0/0xe0
? save_stack+0x8c/0xb0
? kasan_kmalloc+0xbe/0xd0
? __kmalloc_track_caller+0xe4/0x1c0
? __kmalloc_reserve.isra.45+0x24/0x70
? __alloc_skb+0xdd/0x2e0
? sk_stream_alloc_skb+0x91/0x3b0
? tcp_sendmsg_locked+0x71b/0x15a0
? tcp_sendmsg+0x22/0x40
? __sys_sendto+0x1b0/0x250
? __x64_sys_sendto+0x6f/0x80
? do_syscall_64+0x5d/0x150
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
? __sys_sendto+0x1b0/0x250
? __x64_sys_sendto+0x6f/0x80
? do_syscall_64+0x5d/0x150
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
ip_finish_output2+0x495/0x590
? ip_copy_metadata+0x2e0/0x2e0
? skb_gso_validate_network_len+0x6f/0x110
? ip_finish_output+0x174/0x280
__tcp_transmit_skb+0xb17/0x12b0
? __tcp_select_window+0x380/0x380
tcp_write_xmit+0x913/0x1de0
? __sk_mem_schedule+0x50/0x80
tcp_sendmsg_locked+0x49d/0x15a0
? tcp_rcv_established+0x8da/0xa30
? tcp_set_state+0x220/0x220
? clear_user+0x1f/0x50
? iov_iter_zero+0x1ae/0x590
? __fget_light+0xa0/0xe0
tcp_sendmsg+0x22/0x40
__sys_sendto+0x1b0/0x250
? __ia32_sys_getpeername+0x40/0x40
? _copy_to_user+0x58/0x70
? poll_select_copy_remaining+0x176/0x200
? __pollwait+0x1c0/0x1c0
? ktime_get_ts64+0x11f/0x140
? kern_select+0x108/0x150
? core_sys_select+0x360/0x360
? vfs_read+0x127/0x150
? kernel_write+0x90/0x90
__x64_sys_sendto+0x6f/0x80
do_syscall_64+0x5d/0x150
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fefef2b129d
Code: ff ff ff ff eb b6 0f 1f 80 00 00 00 00 48 8d 05 51 37 0c 00 41 89 ca 8b 00 85 c0 75 20 45 31 c9 45 31 c0 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 6b f3 c3 66 0f 1f 84 00 00 00 00 00 41 56 41
RSP: 002b:00007fff2f5350c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 000056118d60c120 RCX: 00007fefef2b129d
RDX: 0000000000002000 RSI: 000056118d629320 RDI: 0000000000000003
RBP: 000056118d530370 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000002000
R13: 000056118d5c2a10 R14: 000056118d5c2a10 R15: 000056118d5303b8
tcf_sample_act() tried to update its per-cpu stats, but tcf_sample_init()
forgot to allocate them, because tcf_idr_create() was called with a wrong
value of 'cpustats'. Setting it to true proved to fix the reported crash.
Reported-by: Matteo Croce <mcroce@redhat.com>
Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
Fixes: 5c5670fae430 ("net/sched: Introduce sample tc action")
Tested-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When we delete a chain of filters, we need to notify
user-space we are deleting each filters in this chain
too.
Fixes: 32a4f5ecd738 ("net: sched: introduce chain object to uapi")
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
use RCU instead of spinlocks, to protect concurrent read/write on
act_police configuration. This reduces the effects of contention in the
data path, in case multiple readers are present.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
use per-CPU counters, instead of sharing a single set of stats with all
cores. This removes the need of using spinlock when statistics are read
or updated.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\| | |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When nla_put*() fails after nla_nest_start(), we need
to call nla_nest_cancel() to cancel the message, otherwise
we end up calling nla_nest_end() like a success.
Fixes: 0ed5269f9e41 ("net/sched: add tunnel option support to act_tunnel_key")
Cc: Davide Caratti <dcaratti@redhat.com>
Cc: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
If users try to install act_tunnel_key 'set' rules with duplicate values
of 'index', the tunnel metadata are allocated, but never released. Then,
kmemleak complains as follows:
# tc a a a tunnel_key set src_ip 1.1.1.1 dst_ip 2.2.2.2 id 42 index 111
# echo clear > /sys/kernel/debug/kmemleak
# tc a a a tunnel_key set src_ip 1.1.1.1 dst_ip 2.2.2.2 id 42 index 111
Error: TC IDR already exists.
We have an error talking to the kernel
# echo scan > /sys/kernel/debug/kmemleak
# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff8800574e6c80 (size 256):
comm "tc", pid 5617, jiffies 4298118009 (age 57.990s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 1c e8 b0 ff ff ff ff ................
81 24 c2 ad ff ff ff ff 00 00 00 00 00 00 00 00 .$..............
backtrace:
[<00000000b7afbf4e>] tunnel_key_init+0x8a5/0x1800 [act_tunnel_key]
[<000000007d98fccd>] tcf_action_init_1+0x698/0xac0
[<0000000099b8f7cc>] tcf_action_init+0x15c/0x590
[<00000000dc60eebe>] tc_ctl_action+0x336/0x5c2
[<000000002f5a2f7d>] rtnetlink_rcv_msg+0x357/0x8e0
[<000000000bfe7575>] netlink_rcv_skb+0x124/0x350
[<00000000edab656f>] netlink_unicast+0x40f/0x5d0
[<00000000b322cdcb>] netlink_sendmsg+0x6e8/0xba0
[<0000000063d9d490>] sock_sendmsg+0xb3/0xf0
[<00000000f0d3315a>] ___sys_sendmsg+0x654/0x960
[<00000000c06cbd42>] __sys_sendmsg+0xd3/0x170
[<00000000ce72e4b0>] do_syscall_64+0xa5/0x470
[<000000005caa2d97>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[<00000000fac1b476>] 0xffffffffffffffff
This problem theoretically happens also in case users attempt to setup a
geneve rule having wrong configuration data, or when the kernel fails to
allocate 'params_new'. Ensure that tunnel_key_init() releases the tunnel
metadata also in the above conditions.
Addresses-Coverity-ID: 1373974 ("Resource leak")
Fixes: d0f6dd8a914f4 ("net/sched: Introduce act_tunnel_key")
Fixes: 0ed5269f9e41f ("net/sched: add tunnel option support to act_tunnel_key")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
cl->leaf.q is slightly more readable than cl->un.leaf.q.
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We no longer take any spinlock on RX path for ingress qdisc,
so this lockdep annotation is no longer needed.
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Change flower in_hw_count type to fixed-size u32 and dump it as
TCA_FLOWER_IN_HW_COUNT. This change is necessary to properly test shared
blocks and re-offload functionality.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
An SKB is not on a list if skb->next is NULL.
Codify this convention into a helper function and use it
where we are dequeueing an SKB and need to mark it as such.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
By hand copies of SKB list handlers do not belong in individual packet
schedulers.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Instead, adjust __qdisc_enqueue_tail() such that HTB can use it
instead.
The only other caller of __qdisc_enqueue_tail() is
qdisc_enqueue_tail() so we can move the backlog and return value
handling (which HTB doesn't need/want) to the latter.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
According to the new locking rule, we have to take tcf_lock for both
->init() and ->dump(), as RTNL will be removed.
Use tcf spinlock to protect private nat action data from concurrent
modification during dump. (nat init already uses tcf spinlock when changing
action state)
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
According to the new locking rule, we have to take tcf_lock for both
->init() and ->dump(), as RTNL will be removed.
Use tcf lock to protect skbedit action struct private data from concurrent
modification in init and dump. Use rcu swap operation to reassign params
pointer under protection of tcf lock. (old params value is not used by
init, so there is no need of standalone rcu dereference step)
Remove rtnl lock assertion that is no longer required.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\| | |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Recent refactoring of add_metainfo() caused use_all_metadata() to add
metainfo to ife action metalist without taking reference to module. This
causes warning in module_put called from ife action cleanup function.
Implement add_metainfo_and_get_ops() function that returns with reference
to module taken if metainfo was added successfully, and call it from
use_all_metadata(), instead of calling __add_metainfo() directly.
Example warning:
[ 646.344393] WARNING: CPU: 1 PID: 2278 at kernel/module.c:1139 module_put+0x1cb/0x230
[ 646.352437] Modules linked in: act_meta_skbtcindex act_meta_mark act_meta_skbprio act_ife ife veth nfsv3 nfs fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c tun ebtable_filter ebtables ip6table_filter ip6_tables bridge stp llc mlx5_ib ib_uverbs ib_core intel_rapl sb_edac x86_pkg_temp_thermal mlx5_core coretemp kvm_intel kvm nfsd igb irqbypass crct10dif_pclmul devlink crc32_pclmul mei_me joydev ses crc32c_intel enclosure auth_rpcgss i2c_algo_bit ioatdma ptp mei pps_core ghash_clmulni_intel iTCO_wdt iTCO_vendor_support pcspkr dca ipmi_ssif lpc_ich target_core_mod i2c_i801 ipmi_si ipmi_devintf pcc_cpufreq wmi ipmi_msghandler nfs_acl lockd acpi_pad acpi_power_meter grace sunrpc mpt3sas raid_class scsi_transport_sas
[ 646.425631] CPU: 1 PID: 2278 Comm: tc Not tainted 4.19.0-rc1+ #799
[ 646.432187] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[ 646.440595] RIP: 0010:module_put+0x1cb/0x230
[ 646.445238] Code: f3 66 94 02 e8 26 ff fa ff 85 c0 74 11 0f b6 1d 51 30 94 02 80 fb 01 77 60 83 e3 01 74 13 65 ff 0d 3a 83 db 73 e9 2b ff ff ff <0f> 0b e9 00 ff ff ff e8 59 01 fb ff 85 c0 75 e4 48 c7 c2 20 62 6b
[ 646.464997] RSP: 0018:ffff880354d37068 EFLAGS: 00010286
[ 646.470599] RAX: 0000000000000000 RBX: ffffffffc0a52518 RCX: ffffffff8c2668db
[ 646.478118] RDX: 0000000000000003 RSI: dffffc0000000000 RDI: ffffffffc0a52518
[ 646.485641] RBP: ffffffffc0a52180 R08: fffffbfff814a4a4 R09: fffffbfff814a4a3
[ 646.493164] R10: ffffffffc0a5251b R11: fffffbfff814a4a4 R12: 1ffff1006a9a6e0d
[ 646.500687] R13: 00000000ffffffff R14: ffff880362bab890 R15: dead000000000100
[ 646.508213] FS: 00007f4164c99800(0000) GS:ffff88036fe40000(0000) knlGS:0000000000000000
[ 646.516961] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 646.523080] CR2: 00007f41638b8420 CR3: 0000000351df0004 CR4: 00000000001606e0
[ 646.530595] Call Trace:
[ 646.533408] ? find_symbol_in_section+0x260/0x260
[ 646.538509] tcf_ife_cleanup+0x11b/0x200 [act_ife]
[ 646.543695] tcf_action_cleanup+0x29/0xa0
[ 646.548078] __tcf_action_put+0x5a/0xb0
[ 646.552289] ? nla_put+0x65/0xe0
[ 646.555889] __tcf_idr_release+0x48/0x60
[ 646.560187] tcf_generic_walker+0x448/0x6b0
[ 646.564764] ? tcf_action_dump_1+0x450/0x450
[ 646.569411] ? __lock_is_held+0x84/0x110
[ 646.573720] ? tcf_ife_walker+0x10c/0x20f [act_ife]
[ 646.578982] tca_action_gd+0x972/0xc40
[ 646.583129] ? tca_get_fill.constprop.17+0x250/0x250
[ 646.588471] ? mark_lock+0xcf/0x980
[ 646.592324] ? check_chain_key+0x140/0x1f0
[ 646.596832] ? debug_show_all_locks+0x240/0x240
[ 646.601839] ? memset+0x1f/0x40
[ 646.605350] ? nla_parse+0xca/0x1a0
[ 646.609217] tc_ctl_action+0x215/0x230
[ 646.613339] ? tcf_action_add+0x220/0x220
[ 646.617748] rtnetlink_rcv_msg+0x56a/0x6d0
[ 646.622227] ? rtnl_fdb_del+0x3f0/0x3f0
[ 646.626466] netlink_rcv_skb+0x18d/0x200
[ 646.630752] ? rtnl_fdb_del+0x3f0/0x3f0
[ 646.634959] ? netlink_ack+0x500/0x500
[ 646.639106] netlink_unicast+0x2d0/0x370
[ 646.643409] ? netlink_attachskb+0x340/0x340
[ 646.648050] ? _copy_from_iter_full+0xe9/0x3e0
[ 646.652870] ? import_iovec+0x11e/0x1c0
[ 646.657083] netlink_sendmsg+0x3b9/0x6a0
[ 646.661388] ? netlink_unicast+0x370/0x370
[ 646.665877] ? netlink_unicast+0x370/0x370
[ 646.670351] sock_sendmsg+0x6b/0x80
[ 646.674212] ___sys_sendmsg+0x4a1/0x520
[ 646.678443] ? copy_msghdr_from_user+0x210/0x210
[ 646.683463] ? lock_downgrade+0x320/0x320
[ 646.687849] ? debug_show_all_locks+0x240/0x240
[ 646.692760] ? do_raw_spin_unlock+0xa2/0x130
[ 646.697418] ? _raw_spin_unlock+0x24/0x30
[ 646.701798] ? __handle_mm_fault+0x1819/0x1c10
[ 646.706619] ? __pmd_alloc+0x320/0x320
[ 646.710738] ? debug_show_all_locks+0x240/0x240
[ 646.715649] ? restore_nameidata+0x7b/0xa0
[ 646.720117] ? check_chain_key+0x140/0x1f0
[ 646.724590] ? check_chain_key+0x140/0x1f0
[ 646.729070] ? __fget_light+0xbc/0xd0
[ 646.733121] ? __sys_sendmsg+0xd7/0x150
[ 646.737329] __sys_sendmsg+0xd7/0x150
[ 646.741359] ? __ia32_sys_shutdown+0x30/0x30
[ 646.746003] ? up_read+0x53/0x90
[ 646.749601] ? __do_page_fault+0x484/0x780
[ 646.754105] ? do_syscall_64+0x1e/0x2c0
[ 646.758320] do_syscall_64+0x72/0x2c0
[ 646.762353] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 646.767776] RIP: 0033:0x7f4163872150
[ 646.771713] Code: 8b 15 3c 7d 2b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 83 3d b9 d5 2b 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be cd 00 00 48 89 04 24
[ 646.791474] RSP: 002b:00007ffdef7d6b58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 646.799721] RAX: ffffffffffffffda RBX: 0000000000000024 RCX: 00007f4163872150
[ 646.807240] RDX: 0000000000000000 RSI: 00007ffdef7d6bd0 RDI: 0000000000000003
[ 646.814760] RBP: 000000005b8b9482 R08: 0000000000000001 R09: 0000000000000000
[ 646.822286] R10: 00000000000005e7 R11: 0000000000000246 R12: 00007ffdef7dad20
[ 646.829807] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000679bc0
[ 646.837360] irq event stamp: 6083
[ 646.841043] hardirqs last enabled at (6081): [<ffffffff8c220a7d>] __call_rcu+0x17d/0x500
[ 646.849882] hardirqs last disabled at (6083): [<ffffffff8c004f06>] trace_hardirqs_off_thunk+0x1a/0x1c
[ 646.859775] softirqs last enabled at (5968): [<ffffffff8d4004a1>] __do_softirq+0x4a1/0x6ee
[ 646.868784] softirqs last disabled at (6082): [<ffffffffc0a78759>] tcf_ife_cleanup+0x39/0x200 [act_ife]
[ 646.878845] ---[ end trace b1b8c12ffe51e657 ]---
Fixes: 5ffe57da29b3 ("act_ife: fix a potential deadlock")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Immediately after module_put(), user could delete this
module, so e->ops could be already freed before we call
e->ops->release().
Fix this by moving module_put() after ops->release().
Fixes: ef6980b6becb ("introduce IFE action")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Currently, tcf_action_delete() nulls actions array pointer after putting
and deleting it. However, if tcf_idr_delete_index() returns an error,
pointer to action is not set to null. That results it being released second
time in error handling code of tca_action_gd().
Kasan error:
[ 807.367755] ==================================================================
[ 807.375844] BUG: KASAN: use-after-free in tc_setup_cb_call+0x14e/0x250
[ 807.382763] Read of size 8 at addr ffff88033e636000 by task tc/2732
[ 807.391289] CPU: 0 PID: 2732 Comm: tc Tainted: G W 4.19.0-rc1+ #799
[ 807.399542] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[ 807.407948] Call Trace:
[ 807.410763] dump_stack+0x92/0xeb
[ 807.414456] print_address_description+0x70/0x360
[ 807.419549] kasan_report+0x14d/0x300
[ 807.423582] ? tc_setup_cb_call+0x14e/0x250
[ 807.428150] tc_setup_cb_call+0x14e/0x250
[ 807.432539] ? nla_put+0x65/0xe0
[ 807.436146] fl_dump+0x394/0x3f0 [cls_flower]
[ 807.440890] ? fl_tmplt_dump+0x140/0x140 [cls_flower]
[ 807.446327] ? lock_downgrade+0x320/0x320
[ 807.450702] ? lock_acquire+0xe2/0x220
[ 807.454819] ? is_bpf_text_address+0x5/0x140
[ 807.459475] ? memcpy+0x34/0x50
[ 807.462980] ? nla_put+0x65/0xe0
[ 807.466582] tcf_fill_node+0x341/0x430
[ 807.470717] ? tcf_block_put+0xe0/0xe0
[ 807.474859] tcf_node_dump+0xdb/0xf0
[ 807.478821] fl_walk+0x8e/0x170 [cls_flower]
[ 807.483474] tcf_chain_dump+0x35a/0x4d0
[ 807.487703] ? tfilter_notify+0x170/0x170
[ 807.492091] ? tcf_fill_node+0x430/0x430
[ 807.496411] tc_dump_tfilter+0x362/0x3f0
[ 807.500712] ? tc_del_tfilter+0x850/0x850
[ 807.505104] ? kasan_unpoison_shadow+0x30/0x40
[ 807.509940] ? __mutex_unlock_slowpath+0xcf/0x410
[ 807.515031] netlink_dump+0x263/0x4f0
[ 807.519077] __netlink_dump_start+0x2a0/0x300
[ 807.523817] ? tc_del_tfilter+0x850/0x850
[ 807.528198] rtnetlink_rcv_msg+0x46a/0x6d0
[ 807.532671] ? rtnl_fdb_del+0x3f0/0x3f0
[ 807.536878] ? tc_del_tfilter+0x850/0x850
[ 807.541280] netlink_rcv_skb+0x18d/0x200
[ 807.545570] ? rtnl_fdb_del+0x3f0/0x3f0
[ 807.549773] ? netlink_ack+0x500/0x500
[ 807.553913] netlink_unicast+0x2d0/0x370
[ 807.558212] ? netlink_attachskb+0x340/0x340
[ 807.562855] ? _copy_from_iter_full+0xe9/0x3e0
[ 807.567677] ? import_iovec+0x11e/0x1c0
[ 807.571890] netlink_sendmsg+0x3b9/0x6a0
[ 807.576192] ? netlink_unicast+0x370/0x370
[ 807.580684] ? netlink_unicast+0x370/0x370
[ 807.585154] sock_sendmsg+0x6b/0x80
[ 807.589015] ___sys_sendmsg+0x4a1/0x520
[ 807.593230] ? copy_msghdr_from_user+0x210/0x210
[ 807.598232] ? do_wp_page+0x174/0x880
[ 807.602276] ? __handle_mm_fault+0x749/0x1c10
[ 807.607021] ? __handle_mm_fault+0x1046/0x1c10
[ 807.611849] ? __pmd_alloc+0x320/0x320
[ 807.615973] ? check_chain_key+0x140/0x1f0
[ 807.620450] ? check_chain_key+0x140/0x1f0
[ 807.624929] ? __fget_light+0xbc/0xd0
[ 807.628970] ? __sys_sendmsg+0xd7/0x150
[ 807.633172] __sys_sendmsg+0xd7/0x150
[ 807.637201] ? __ia32_sys_shutdown+0x30/0x30
[ 807.641846] ? up_read+0x53/0x90
[ 807.645442] ? __do_page_fault+0x484/0x780
[ 807.649949] ? do_syscall_64+0x1e/0x2c0
[ 807.654164] do_syscall_64+0x72/0x2c0
[ 807.658198] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 807.663625] RIP: 0033:0x7f42e9870150
[ 807.667568] Code: 8b 15 3c 7d 2b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 83 3d b9 d5 2b 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be cd 00 00 48 89 04 24
[ 807.687328] RSP: 002b:00007ffdbf595b58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 807.695564] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f42e9870150
[ 807.703083] RDX: 0000000000000000 RSI: 00007ffdbf595b80 RDI: 0000000000000003
[ 807.710605] RBP: 00007ffdbf599d90 R08: 0000000000679bc0 R09: 000000000000000f
[ 807.718127] R10: 00000000000005e7 R11: 0000000000000246 R12: 00007ffdbf599d88
[ 807.725651] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 807.735048] Allocated by task 2687:
[ 807.738902] kasan_kmalloc+0xa0/0xd0
[ 807.742852] __kmalloc+0x118/0x2d0
[ 807.746615] tcf_idr_create+0x44/0x320
[ 807.750738] tcf_nat_init+0x41e/0x530 [act_nat]
[ 807.755638] tcf_action_init_1+0x4e0/0x650
[ 807.760104] tcf_action_init+0x1ce/0x2d0
[ 807.764395] tcf_exts_validate+0x1d8/0x200
[ 807.768861] fl_change+0x55a/0x26b4 [cls_flower]
[ 807.773845] tc_new_tfilter+0x748/0xa20
[ 807.778051] rtnetlink_rcv_msg+0x56a/0x6d0
[ 807.782517] netlink_rcv_skb+0x18d/0x200
[ 807.786804] netlink_unicast+0x2d0/0x370
[ 807.791095] netlink_sendmsg+0x3b9/0x6a0
[ 807.795387] sock_sendmsg+0x6b/0x80
[ 807.799240] ___sys_sendmsg+0x4a1/0x520
[ 807.803445] __sys_sendmsg+0xd7/0x150
[ 807.807473] do_syscall_64+0x72/0x2c0
[ 807.811506] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 807.818776] Freed by task 2728:
[ 807.822283] __kasan_slab_free+0x122/0x180
[ 807.826752] kfree+0xf4/0x2f0
[ 807.830080] __tcf_action_put+0x5a/0xb0
[ 807.834281] tcf_action_put_many+0x46/0x70
[ 807.838747] tca_action_gd+0x232/0xc40
[ 807.842862] tc_ctl_action+0x215/0x230
[ 807.846977] rtnetlink_rcv_msg+0x56a/0x6d0
[ 807.851444] netlink_rcv_skb+0x18d/0x200
[ 807.855731] netlink_unicast+0x2d0/0x370
[ 807.860021] netlink_sendmsg+0x3b9/0x6a0
[ 807.864312] sock_sendmsg+0x6b/0x80
[ 807.868166] ___sys_sendmsg+0x4a1/0x520
[ 807.872372] __sys_sendmsg+0xd7/0x150
[ 807.876401] do_syscall_64+0x72/0x2c0
[ 807.880431] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 807.887704] The buggy address belongs to the object at ffff88033e636000
which belongs to the cache kmalloc-256 of size 256
[ 807.900909] The buggy address is located 0 bytes inside of
256-byte region [ffff88033e636000, ffff88033e636100)
[ 807.913155] The buggy address belongs to the page:
[ 807.918322] page:ffffea000cf98d80 count:1 mapcount:0 mapping:ffff88036f80ee00 index:0x0 compound_mapcount: 0
[ 807.928831] flags: 0x5fff8000008100(slab|head)
[ 807.933647] raw: 005fff8000008100 ffffea000db44f00 0000000400000004 ffff88036f80ee00
[ 807.942050] raw: 0000000000000000 0000000080190019 00000001ffffffff 0000000000000000
[ 807.950456] page dumped because: kasan: bad access detected
[ 807.958240] Memory state around the buggy address:
[ 807.963405] ffff88033e635f00: fc fc fc fc fb fb fb fb fb fb fb fc fc fc fc fb
[ 807.971288] ffff88033e635f80: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
[ 807.979166] >ffff88033e636000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 807.994882] ^
[ 807.998477] ffff88033e636080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 808.006352] ffff88033e636100: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
[ 808.014230] ==================================================================
[ 808.022108] Disabling lock debugging due to kernel taint
Fixes: edfaf94fa705 ("net_sched: improve and refactor tcf_action_put_many()")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
After the commit 802bfb19152c ("net/sched: user-space can't set
unknown tcfa_action values"), unknown tcfa_action values are
converted to TC_ACT_UNSPEC, but the common agreement is instead
rejecting such configurations.
This change also introduces a helper to simplify the destruction
of a single action, avoiding code duplication.
v1 -> v2:
- helper is now static and renamed according to act_* convention
- updated extack message, according to the new behavior
Fixes: 802bfb19152c ("net/sched: user-space can't set unknown tcfa_action values")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
in the (rare) case of failure in nla_nest_start(), missing NULL checks in
tcf_pedit_key_ex_dump() can make the following command
# tc action add action pedit ex munge ip ttl set 64
dereference a NULL pointer:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 800000007d1cd067 P4D 800000007d1cd067 PUD 7acd3067 PMD 0
Oops: 0002 [#1] SMP PTI
CPU: 0 PID: 3336 Comm: tc Tainted: G E 4.18.0.pedit+ #425
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:tcf_pedit_dump+0x19d/0x358 [act_pedit]
Code: be 02 00 00 00 48 89 df 66 89 44 24 20 e8 9b b1 fd e0 85 c0 75 46 8b 83 c8 00 00 00 49 83 c5 08 48 03 83 d0 00 00 00 4d 39 f5 <66> 89 04 25 00 00 00 00 0f 84 81 01 00 00 41 8b 45 00 48 8d 4c 24
RSP: 0018:ffffb5d4004478a8 EFLAGS: 00010246
RAX: ffff8880fcda2070 RBX: ffff8880fadd2900 RCX: 0000000000000000
RDX: 0000000000000002 RSI: ffffb5d4004478ca RDI: ffff8880fcda206e
RBP: ffff8880fb9cb900 R08: 0000000000000008 R09: ffff8880fcda206e
R10: ffff8880fadd2900 R11: 0000000000000000 R12: ffff8880fd26cf40
R13: ffff8880fc957430 R14: ffff8880fc957430 R15: ffff8880fb9cb988
FS: 00007f75a537a740(0000) GS:ffff8880fda00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000007a2fa005 CR4: 00000000001606f0
Call Trace:
? __nla_reserve+0x38/0x50
tcf_action_dump_1+0xd2/0x130
tcf_action_dump+0x6a/0xf0
tca_get_fill.constprop.31+0xa3/0x120
tcf_action_add+0xd1/0x170
tc_ctl_action+0x137/0x150
rtnetlink_rcv_msg+0x263/0x2d0
? _cond_resched+0x15/0x40
? rtnl_calcit.isra.30+0x110/0x110
netlink_rcv_skb+0x4d/0x130
netlink_unicast+0x1a3/0x250
netlink_sendmsg+0x2ae/0x3a0
sock_sendmsg+0x36/0x40
___sys_sendmsg+0x26f/0x2d0
? do_wp_page+0x8e/0x5f0
? handle_pte_fault+0x6c3/0xf50
? __handle_mm_fault+0x38e/0x520
? __sys_sendmsg+0x5e/0xa0
__sys_sendmsg+0x5e/0xa0
do_syscall_64+0x5b/0x180
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f75a4583ba0
Code: c3 48 8b 05 f2 62 2c 00 f7 db 64 89 18 48 83 cb ff eb dd 0f 1f 80 00 00 00 00 83 3d fd c3 2c 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ae cc 00 00 48 89 04 24
RSP: 002b:00007fff60ee7418 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fff60ee7540 RCX: 00007f75a4583ba0
RDX: 0000000000000000 RSI: 00007fff60ee7490 RDI: 0000000000000003
RBP: 000000005b842d3e R08: 0000000000000002 R09: 0000000000000000
R10: 00007fff60ee6ea0 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff60ee7554 R14: 0000000000000001 R15: 000000000066c100
Modules linked in: act_pedit(E) ip6table_filter ip6_tables iptable_filter binfmt_misc crct10dif_pclmul ext4 crc32_pclmul mbcache ghash_clmulni_intel jbd2 pcbc snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd snd_timer cryptd glue_helper snd joydev pcspkr soundcore virtio_balloon i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_net net_failover virtio_blk virtio_console failover qxl crc32c_intel drm_kms_helper syscopyarea serio_raw sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix virtio_pci libata virtio_ring i2c_core virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_pedit]
CR2: 0000000000000000
Like it's done for other TC actions, give up dumping pedit rules and return
an error if nla_nest_start() returns NULL.
Fixes: 71d0ed7079df ("net/act_pedit: Support using offset relative to the conventional network headers")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When chain 0 was implicitly created, removal of non-existent filter from
chain 0 gave -ENOENT. Once chain 0 became non-implicit, the same call is
giving -EINVAL. Fix this by returning -ENOENT in that case.
Reported-by: Roman Mashak <mrv@mojatatu.com>
Fixes: f71e0ca4db18 ("net: sched: Avoid implicit chain 0 creation")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |/
| |
| |
| |
| |
| |
| |
| | |
Instead "Cannot find" say "Cannot create".
Fixes: c35a4acc2985 ("net: sched: cls_api: handle generic cls errors")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
According to the new locking rule, we have to take tcf_lock
for both ->init() and ->dump(), as RTNL will be removed.
However, it is missing for act_connmark.
Cc: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|/
|
|
|
|
|
|
|
|
|
| |
This reverts commit 331a9295de23 ("net: sched: act: add extack for lookup callback").
This extack is never used after 6 months... In fact, it can be just
set in the caller, right after ->lookup().
Cc: Alexander Aring <aring@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|