History log of /linux-master/net/sched/act_mpls.c
Revision Date Author Comments
# 241a94ab 01-Feb-2024 Michal Koutný <mkoutny@suse.com>

net/sched: Add module aliases for cls_,sch_,act_ modules

No functional change intended, aliases will be used in followup commits.
Note for backporters: you may need to add aliases also for modules that
are already removed in mainline kernel but still in your version.

Patches were generated with the help of Coccinelle scripts like:

cat >scripts/coccinelle/misc/tcf_alias.cocci <<EOD
virtual patch
virtual report

@ haskernel @
@@

@ tcf_has_kind depends on report && haskernel @
identifier ops;
constant K;
@@

static struct tcf_proto_ops ops = {
.kind = K,
...
};
+char module_alias = K;
EOD

/usr/bin/spatch -D report --cocci-file scripts/coccinelle/misc/tcf_alias.cocci \
--dir . \
-I ./arch/x86/include -I ./arch/x86/include/generated -I ./include \
-I ./arch/x86/include/uapi -I ./arch/x86/include/generated/uapi \
-I ./include/uapi -I ./include/generated/uapi \
--include ./include/linux/compiler-version.h --include ./include/linux/kconfig.h \
--jobs 8 --chunksize 1 2>/dev/null | \
sed 's/char module_alias = "\([^"]*\)";/MODULE_ALIAS_NET_CLS("\1");/'

And analogously for:

static struct tc_action_ops ops = {
.kind = K,

static struct Qdisc_ops ops = {
.id = K,

(Someone familiar would be able to fit those into one .cocci file
without sed post processing.)

Signed-off-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240201130943.19536-3-mkoutny@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# c2a67de9 29-Dec-2023 Pedro Tammela <pctammela@mojatatu.com>

net/sched: introduce ACT_P_BOUND return code

Bound actions always return '0' and as of today we rely on '0'
being returned in order to properly skip bound actions in
tcf_idr_insert_many. In order to further improve maintainability,
introduce the ACT_P_BOUND return code.

Actions are updated to return 'ACT_P_BOUND' instead of plain '0'.
tcf_idr_insert_many is then updated to check for 'ACT_P_BOUND'.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231229132642.1489088-1-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# b3be9488 21-Mar-2023 Eric Dumazet <edumazet@google.com>

net/sched: remove two skb_mac_header() uses

tcf_mirred_act() and tcf_mpls_act() can use skb_network_offset()
instead of relying on skb_mac_header().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# e88d78a7 23-Feb-2023 Pedro Tammela <pctammela@mojatatu.com>

net/sched: act_mpls: fix action bind logic

The TC architecture allows filters and actions to be created independently.
In filters the user can reference action objects using:
tc action add action mpls ... index 1
tc filter add ... action mpls index 1

In the current code for act_mpls this is broken as it checks netlink
attributes for create/update before actually checking if we are binding to an
existing action.

tdc results:
1..53
ok 1 a933 - Add MPLS dec_ttl action with pipe opcode
ok 2 08d1 - Add mpls dec_ttl action with pass opcode
ok 3 d786 - Add mpls dec_ttl action with drop opcode
ok 4 f334 - Add mpls dec_ttl action with reclassify opcode
ok 5 29bd - Add mpls dec_ttl action with continue opcode
ok 6 48df - Add mpls dec_ttl action with jump opcode
ok 7 62eb - Add mpls dec_ttl action with trap opcode
ok 8 09d2 - Add mpls dec_ttl action with opcode and cookie
ok 9 c170 - Add mpls dec_ttl action with opcode and cookie of max length
ok 10 9118 - Add mpls dec_ttl action with invalid opcode
ok 11 6ce1 - Add mpls dec_ttl action with label (invalid)
ok 12 352f - Add mpls dec_ttl action with tc (invalid)
ok 13 fa1c - Add mpls dec_ttl action with ttl (invalid)
ok 14 6b79 - Add mpls dec_ttl action with bos (invalid)
ok 15 d4c4 - Add mpls pop action with ip proto
ok 16 91fb - Add mpls pop action with ip proto and cookie
ok 17 92fe - Add mpls pop action with mpls proto
ok 18 7e23 - Add mpls pop action with no protocol (invalid)
ok 19 6182 - Add mpls pop action with label (invalid)
ok 20 6475 - Add mpls pop action with tc (invalid)
ok 21 067b - Add mpls pop action with ttl (invalid)
ok 22 7316 - Add mpls pop action with bos (invalid)
ok 23 38cc - Add mpls push action with label
ok 24 c281 - Add mpls push action with mpls_mc protocol
ok 25 5db4 - Add mpls push action with label, tc and ttl
ok 26 7c34 - Add mpls push action with label, tc ttl and cookie of max length
ok 27 16eb - Add mpls push action with label and bos
ok 28 d69d - Add mpls push action with no label (invalid)
ok 29 e8e4 - Add mpls push action with ipv4 protocol (invalid)
ok 30 ecd0 - Add mpls push action with out of range label (invalid)
ok 31 d303 - Add mpls push action with out of range tc (invalid)
ok 32 fd6e - Add mpls push action with ttl of 0 (invalid)
ok 33 19e9 - Add mpls mod action with mpls label
ok 34 1fde - Add mpls mod action with max mpls label
ok 35 0c50 - Add mpls mod action with mpls label exceeding max (invalid)
ok 36 10b6 - Add mpls mod action with mpls label of MPLS_LABEL_IMPLNULL (invalid)
ok 37 57c9 - Add mpls mod action with mpls min tc
ok 38 6872 - Add mpls mod action with mpls max tc
ok 39 a70a - Add mpls mod action with mpls tc exceeding max (invalid)
ok 40 6ed5 - Add mpls mod action with mpls ttl
ok 41 77c1 - Add mpls mod action with mpls ttl and cookie
ok 42 b80f - Add mpls mod action with mpls max ttl
ok 43 8864 - Add mpls mod action with mpls min ttl
ok 44 6c06 - Add mpls mod action with mpls ttl of 0 (invalid)
ok 45 b5d8 - Add mpls mod action with mpls ttl exceeding max (invalid)
ok 46 451f - Add mpls mod action with mpls max bos
ok 47 a1ed - Add mpls mod action with mpls min bos
ok 48 3dcf - Add mpls mod action with mpls bos exceeding max (invalid)
ok 49 db7c - Add mpls mod action with protocol (invalid)
ok 50 b070 - Replace existing mpls push action with new ID
ok 51 95a9 - Replace existing mpls push action with new label, tc, ttl and cookie
ok 52 6cce - Delete mpls pop action
ok 53 d138 - Flush mpls actions

Fixes: 2a2ea50870ba ("net: sched: add mpls manipulation actions to TC")
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 9e17f992 07-Jan-2023 Ido Schimmel <idosch@nvidia.com>

net/sched: act_mpls: Fix warning during failed attribute validation

The 'TCA_MPLS_LABEL' attribute is of 'NLA_U32' type, but has a
validation type of 'NLA_VALIDATE_FUNCTION'. This is an invalid
combination according to the comment above 'struct nla_policy':

"
Meaning of `validate' field, use via NLA_POLICY_VALIDATE_FN:
NLA_BINARY Validation function called for the attribute.
All other Unused - but note that it's a union
"

This can trigger the warning [1] in nla_get_range_unsigned() when
validation of the attribute fails. Despite being of 'NLA_U32' type, the
associated 'min'/'max' fields in the policy are negative as they are
aliased by the 'validate' field.

Fix by changing the attribute type to 'NLA_BINARY' which is consistent
with the above comment and all other users of NLA_POLICY_VALIDATE_FN().
As a result, move the length validation to the validation function.

No regressions in MPLS tests:

# ./tdc.py -f tc-tests/actions/mpls.json
[...]
# echo $?
0

[1]
WARNING: CPU: 0 PID: 17743 at lib/nlattr.c:118
nla_get_range_unsigned+0x1d8/0x1e0 lib/nlattr.c:117
Modules linked in:
CPU: 0 PID: 17743 Comm: syz-executor.0 Not tainted 6.1.0-rc8 #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
RIP: 0010:nla_get_range_unsigned+0x1d8/0x1e0 lib/nlattr.c:117
[...]
Call Trace:
<TASK>
__netlink_policy_dump_write_attr+0x23d/0x990 net/netlink/policy.c:310
netlink_policy_dump_write_attr+0x22/0x30 net/netlink/policy.c:411
netlink_ack_tlv_fill net/netlink/af_netlink.c:2454 [inline]
netlink_ack+0x546/0x760 net/netlink/af_netlink.c:2506
netlink_rcv_skb+0x1b7/0x240 net/netlink/af_netlink.c:2546
rtnetlink_rcv+0x18/0x20 net/core/rtnetlink.c:6109
netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
netlink_unicast+0x5e9/0x6b0 net/netlink/af_netlink.c:1345
netlink_sendmsg+0x739/0x860 net/netlink/af_netlink.c:1921
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg net/socket.c:734 [inline]
____sys_sendmsg+0x38f/0x500 net/socket.c:2482
___sys_sendmsg net/socket.c:2536 [inline]
__sys_sendmsg+0x197/0x230 net/socket.c:2565
__do_sys_sendmsg net/socket.c:2574 [inline]
__se_sys_sendmsg net/socket.c:2572 [inline]
__x64_sys_sendmsg+0x42/0x50 net/socket.c:2572
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

Link: https://lore.kernel.org/netdev/CAO4mrfdmjvRUNbDyP0R03_DrD_eFCLCguz6OxZ2TYRSv0K9gxA@mail.gmail.com/
Fixes: 2a2ea50870ba ("net: sched: add mpls manipulation actions to TC")
Reported-by: Wei Chen <harperchen1110@gmail.com>
Tested-by: Wei Chen <harperchen1110@gmail.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/20230107171004.608436-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 871cf386 06-Dec-2022 Pedro Tammela <pctammela@mojatatu.com>

net/sched: avoid indirect act functions on retpoline kernels

Expose the necessary tc act functions and wire up act_api to use
direct calls in retpoline kernels.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 7fadae53 07-Sep-2022 Zhengchao Shao <shaozhengchao@huawei.com>

net: sched: act_mpls: get rid of tcf_mpls_walker and tcf_mpls_search

tcf_mpls_walker() and tcf_mpls_search() do the same thing as generic
walk/search function, so remove them.

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# acd0a7ab 07-Sep-2022 Zhengchao Shao <shaozhengchao@huawei.com>

net: sched: act: move global static variable net_id to tc_action_ops

Each tc action module has a corresponding net_id, so put net_id directly
into the structure tc_action_ops.

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# bca3821d 07-Apr-2022 Ido Schimmel <idosch@nvidia.com>

net/sched: act_mpls: Add extack messages for offload failure

For better error reporting to user space, add extack messages when mpls
action offload fails.

Example:

# echo 1 > /sys/kernel/tracing/events/netlink/netlink_extack/enable

# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action mpls dec_ttl
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel

# cat /sys/kernel/tracing/trace_pipe
tc-182 [000] b..1. 18.693915: netlink_extack: msg=act_mpls: Offload not supported when "dec_ttl" option is used
tc-182 [000] ..... 18.693921: netlink_extack: msg=cls_matchall: Failed to setup flow action

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# c2ccf84e 07-Apr-2022 Ido Schimmel <idosch@nvidia.com>

net/sched: act_api: Add extack to offload_act_setup() callback

The callback is used by various actions to populate the flow action
structure prior to offload. Pass extack to this callback so that the
various actions will be able to report accurate error messages to user
space.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 8cbfe939 17-Dec-2021 Baowen Zheng <baowen.zheng@corigine.com>

flow_offload: allow user to offload tc action to net device

Use flow_indr_dev_register/flow_indr_dev_setup_offload to
offload tc action.

We need to call tc_cleanup_flow_action to clean up tc action entry since
in tc_setup_action, some actions may hold dev refcnt, especially the mirror
action.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# c54e1d92 17-Dec-2021 Baowen Zheng <baowen.zheng@corigine.com>

flow_offload: add ops to tc_action_ops for flow action setup

Add a new ops to tc_action_ops for flow action setup.

Refactor function tc_setup_flow_action to use this new ops.

We make this change to facilitate to add standalone action module.

We will also use this ops to offload action independent of filter
in following patch.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 40bd094d 17-Dec-2021 Baowen Zheng <baowen.zheng@corigine.com>

flow_offload: fill flags to action structure

Fill flags to action structure to allow user control if
the action should be offloaded to hardware or not.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 50dc9a85 16-Oct-2021 Ahmed S. Darwish <a.darwish@linutronix.de>

net: sched: Merge Qdisc::bstats and Qdisc::cpu_bstats data types

The only factor differentiating per-CPU bstats data type (struct
gnet_stats_basic_cpu) from the packed non-per-CPU one (struct
gnet_stats_basic_packed) was a u64_stats sync point inside the former.
The two data types are now equivalent: earlier commits added a u64_stats
sync point to the latter.

Combine both data types into "struct gnet_stats_basic_sync". This
eliminates redundancy and simplifies the bstats read/write APIs.

Use u64_stats_t for bstats "packets" and "bytes" data types. On 64-bit
architectures, u64_stats sync points do not use sequence counter
protection.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 695176bf 29-Jul-2021 Cong Wang <cong.wang@bytedance.com>

net_sched: refactor TC action init API

TC action ->init() API has 10 parameters, it becomes harder
to read. Some of them are just boolean and can be replaced
by flags. Similarly for the internal API tcf_action_init()
and tcf_exts_validate().

This patch converts them to flags and fold them into
the upper 16 bits of "flags", whose lower 16 bits are still
reserved for user-space. More specifically, the following
kernel flags are introduced:

TCA_ACT_FLAGS_POLICE replace 'name' in a few contexts, to
distinguish whether it is compatible with policer.

TCA_ACT_FLAGS_BIND replaces 'bind', to indicate whether
this action is bound to a filter.

TCA_ACT_FLAGS_REPLACE replaces 'ovr' in most contexts,
means we are replacing an existing action.

TCA_ACT_FLAGS_NO_RTNL replaces 'rtnl_held' but has the
opposite meaning, because we still hold RTNL in most
cases.

The only user-space flag TCA_ACT_FLAGS_NO_PERCPU_STATS is
untouched and still stored as before.

I have tested this patch with tdc and I do not see any
failure related to this patch.

Tested-by: Vlad Buslov <vladbu@nvidia.com>
Acked-by: Jamal Hadi Salim<jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 9608fa65 03-Dec-2020 Davide Caratti <dcaratti@redhat.com>

net/sched: act_mpls: ensure LSE is pullable before reading it

when 'act_mpls' is used to mangle the LSE, the current value is read from
the packet dereferencing 4 bytes at mpls_hdr(): ensure that the label is
contained in the skb "linear" area.

Found by code inspection.

v2:
- use MPLS_HLEN instead of sizeof(new_lse), thanks to Jakub Kicinski

Fixes: 2a2ea50870ba ("net: sched: add mpls manipulation actions to TC")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Link: https://lore.kernel.org/r/3243506cba43d14858f3bd21ee0994160e44d64a.1606987058.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 501b72ae 26-Oct-2020 Guillaume Nault <gnault@redhat.com>

net/sched: act_mpls: Add softdep on mpls_gso.ko

TCA_MPLS_ACT_PUSH and TCA_MPLS_ACT_MAC_PUSH might be used on gso
packets. Such packets will thus require mpls_gso.ko for segmentation.

v2: Drop dependency on CONFIG_NET_MPLS_GSO in Kconfig (from Jakub and
David).

Fixes: 2a2ea50870ba ("net: sched: add mpls manipulation actions to TC")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Link: https://lore.kernel.org/r/1f6cab15bbd15666795061c55563aaf6a386e90e.1603708007.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# a45294af 02-Oct-2020 Guillaume Nault <gnault@redhat.com>

net/sched: act_mpls: Add action to push MPLS LSE before Ethernet header

Define the MAC_PUSH action which pushes an MPLS LSE before the mac
header (instead of between the mac and the network headers as the
plain PUSH action does).

The only special case is when the skb has an offloaded VLAN. In that
case, it has to be inlined before pushing the MPLS header.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# e49d8c22 22-Sep-2020 Cong Wang <xiyou.wangcong@gmail.com>

net_sched: defer tcf_idr_insert() in tcf_action_init_1()

All TC actions call tcf_idr_insert() for new action at the end
of their ->init(), so we can actually move it to a central place
in tcf_action_init_1().

And once the action is inserted into the global IDR, other parallel
process could free it immediately as its refcnt is still 1, so we can
not fail after this, we need to move it after the goto action
validation to avoid handling the failure case after insertion.

This is found during code review, is not directly triggered by syzbot.
And this prepares for the next patch.

Cc: Vlad Buslov <vladbu@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# d7bf2ebe 03-Jul-2020 Toke Høiland-Jørgensen <toke@redhat.com>

sched: consistently handle layer3 header accesses in the presence of VLANs

There are a couple of places in net/sched/ that check skb->protocol and act
on the value there. However, in the presence of VLAN tags, the value stored
in skb->protocol can be inconsistent based on whether VLAN acceleration is
enabled. The commit quoted in the Fixes tag below fixed the users of
skb->protocol to use a helper that will always see the VLAN ethertype.

However, most of the callers don't actually handle the VLAN ethertype, but
expect to find the IP header type in the protocol field. This means that
things like changing the ECN field, or parsing diffserv values, stops
working if there's a VLAN tag, or if there are multiple nested VLAN
tags (QinQ).

To fix this, change the helper to take an argument that indicates whether
the caller wants to skip the VLAN tags or not. When skipping VLAN tags, we
make sure to skip all of them, so behaviour is consistent even in QinQ
mode.

To make the helper usable from the ECN code, move it to if_vlan.h instead
of pkt_sched.h.

v3:
- Remove empty lines
- Move vlan variable definitions inside loop in skb_protocol()
- Also use skb_protocol() helper in IP{,6}_ECN_decapsulate() and
bpf_skb_ecn_set_ce()

v2:
- Use eth_type_vlan() helper in skb_protocol()
- Also fix code that reads skb->protocol directly
- Change a couple of 'if/else if' statements to switch constructs to avoid
calling the helper twice

Reported-by: Ilya Ponetayev <i.ponetaev@ndmsystems.com>
Fixes: d8b9605d2697 ("net: sched: fix skb->protocol use in case of accelerated vlan path")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# d04ac224 04-Dec-2019 Martin Varghese <martin.varghese@nokia.com>

net: Fixed updating of ethertype in skb_mpls_push()

The skb_mpls_push was not updating ethertype of an ethernet packet if
the packet was originally received from a non ARPHRD_ETHER device.

In the below OVS data path flow, since the device corresponding to
port 7 is an l3 device (ARPHRD_NONE) the skb_mpls_push function does
not update the ethertype of the packet even though the previous
push_eth action had added an ethernet header to the packet.

recirc_id(0),in_port(7),eth_type(0x0800),ipv4(tos=0/0xfc,ttl=64,frag=no),
actions:push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),
push_mpls(label=13,tc=0,ttl=64,bos=1,eth_type=0x8847),4

Fixes: 8822e270d697 ("net: core: move push MPLS functionality from OvS to core helper")
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 040b5cfb 01-Dec-2019 Martin Varghese <martin.varghese@nokia.com>

Fixed updating of ethertype in function skb_mpls_pop

The skb_mpls_pop was not updating ethertype of an ethernet packet if the
packet was originally received from a non ARPHRD_ETHER device.

In the below OVS data path flow, since the device corresponding to port 7
is an l3 device (ARPHRD_NONE) the skb_mpls_pop function does not update
the ethertype of the packet even though the previous push_eth action had
added an ethernet header to the packet.

recirc_id(0),in_port(7),eth_type(0x8847),
mpls(label=12/0xfffff,tc=0/0,ttl=0/0x0,bos=1/1),
actions:push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),
pop_mpls(eth_type=0x800),4

Fixes: ed246cee09b9 ("net: core: move pop MPLS functionality from OvS to core helper")
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# f3bed7f8 21-Nov-2019 Xin Long <lucien.xin@gmail.com>

net: remove the unnecessary strict_start_type in some policies

ct_policy and mpls_policy are parsed with nla_parse_nested(), which
does NL_VALIDATE_STRICT validation, strict_start_type is not needed
to set as it is actually trying to make some attributes parsed with
NL_VALIDATE_STRICT.

This patch is to remove it, and do the same on rtm_nh_policy which
is parsed by nlmsg_parse().

Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# e3822678 30-Oct-2019 Vlad Buslov <vladbu@mellanox.com>

net: sched: update action implementations to support flags

Extend struct tc_action with new "tcfa_flags" field. Set the field in
tcf_idr_create() function and provide new helper
tcf_idr_create_from_flags() that derives 'cpustats' boolean from flags
value. Update individual hardware-offloaded actions init() to pass their
"flags" argument to new helper in order to skip percpu stats allocation
when user requested it through flags.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# abbb0d33 30-Oct-2019 Vlad Buslov <vladbu@mellanox.com>

net: sched: extend TCA_ACT space with TCA_ACT_FLAGS

Extend TCA_ACT space with nla_bitfield32 flags. Add
TCA_ACT_FLAGS_NO_PERCPU_STATS as the only allowed flag. Parse the flags in
tcf_action_init_1() and pass resulting value as additional argument to
a_o->init().

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 445d3749 23-Sep-2019 Paul E. McKenney <paulmck@kernel.org>

net/sched: Replace rcu_swap_protected() with rcu_replace_pointer()

This commit replaces the use of rcu_swap_protected() with the more
intuitively appealing rcu_replace_pointer() as a step towards removing
rcu_swap_protected().

Link: https://lore.kernel.org/lkml/CAHk-=wiAsJLw1egFEE=Z7-GGtM6wcvtyytXZA1+BHqta4gg6Hw@mail.gmail.com/
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
[ paulmck: From rcu_replace() to rcu_replace_pointer() per Ingo Molnar. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: <netdev@vger.kernel.org>
Cc: <linux-kernel@vger.kernel.org>


# fa4e0f88 12-Oct-2019 Davide Caratti <dcaratti@redhat.com>

net/sched: fix corrupted L2 header with MPLS 'push' and 'pop' actions

the following script:

# tc qdisc add dev eth0 clsact
# tc filter add dev eth0 egress protocol ip matchall \
> action mpls push protocol mpls_uc label 0x355aa bos 1

causes corruption of all IP packets transmitted by eth0. On TC egress, we
can't rely on the value of skb->mac_len, because it's 0 and a MPLS 'push'
operation will result in an overwrite of the first 4 octets in the packet
L2 header (e.g. the Destination Address if eth0 is an Ethernet); the same
error pattern is present also in the MPLS 'pop' operation. Fix this error
in act_mpls data plane, computing 'mac_len' as the difference between the
network header and the mac header (when not at TC ingress), and use it in
MPLS 'push'/'pop' core functions.

v2: unbreak 'make htmldocs' because of missing documentation of 'mac_len'
in skb_mpls_pop(), reported by kbuild test robot

CC: Lorenzo Bianconi <lorenzo@kernel.org>
Fixes: 2a2ea50870ba ("net: sched: add mpls manipulation actions to TC")
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 981471bd 25-Aug-2019 Cong Wang <xiyou.wangcong@gmail.com>

net_sched: fix a NULL pointer deref in ipt action

The net pointer in struct xt_tgdtor_param is not explicitly
initialized therefore is still NULL when dereferencing it.
So we have to find a way to pass the correct net pointer to
ipt_destroy_target().

The best way I find is just saving the net pointer inside the per
netns struct tcf_idrinfo, which could make this patch smaller.

Fixes: 0c66dc1ea3f0 ("netfilter: conntrack: register hooks in netns when needed by ruleset")
Reported-and-tested-by: itugrok@yahoo.com
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 7be8ef2c 01-Aug-2019 Dmytro Linkin <dmitrolin@mellanox.com>

net: sched: use temporary variable for actions indexes

Currently init call of all actions (except ipt) init their 'parm'
structure as a direct pointer to nla data in skb. This leads to race
condition when some of the filter actions were initialized successfully
(and were assigned with idr action index that was written directly
into nla data), but then were deleted and retried (due to following
action module missing or classifier-initiated retry), in which case
action init code tries to insert action to idr with index that was
assigned on previous iteration. During retry the index can be reused
by another action that was inserted concurrently, which causes
unintended action sharing between filters.
To fix described race condition, save action idr index to temporary
stack-allocated variable instead on nla data.

Fixes: 0190c1d452a9 ("net: sched: atomically check-allocate action")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 2a2ea508 07-Jul-2019 John Hurley <john.hurley@netronome.com>

net: sched: add mpls manipulation actions to TC

Currently, TC offers the ability to match on the MPLS fields of a packet
through the use of the flow_dissector_key_mpls struct. However, as yet, TC
actions do not allow the modification or manipulation of such fields.

Add a new module that registers TC action ops to allow manipulation of
MPLS. This includes the ability to push and pop headers as well as modify
the contents of new or existing headers. A further action to decrement the
TTL field of an MPLS header is also provided with a new helper added to
support this.

Examples of the usage of the new action with flower rules to push and pop
MPLS labels are:

tc filter add dev eth0 protocol ip parent ffff: flower \
action mpls push protocol mpls_uc label 123 \
action mirred egress redirect dev eth1

tc filter add dev eth0 protocol mpls_uc parent ffff: flower \
action mpls pop protocol ipv4 \
action mirred egress redirect dev eth1

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>