History log of /linux-master/tools/bpf/bpftool/struct_ops.c
Revision Date Author Comments
# 6bd5e167 18-Oct-2023 Manu Bretelle <chantr4@gmail.com>

bpftool: Wrap struct_ops dump in an array

When dumping a struct_ops, 2 dictionaries are emitted.

When using `name`, they were already wrapped in an array, but not when
using `id`. Causing `jq` to fail at parsing the payload as it reached
the comma following the first dict.

This change wraps those dictionaries in an array so valid json is emitted.

Before, jq fails to parse the output:
```
$ sudo bpftool struct_ops dump id 1523612 | jq . > /dev/null
parse error: Expected value before ',' at line 19, column 2
```

After, no error parsing the output:
```
sudo ./bpftool struct_ops dump id 1523612 | jq . > /dev/null
```

Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20231018230133.1593152-3-chantr4@gmail.com


# 2a36c26f 05-May-2023 Pengcheng Yang <yangpc@wangsu.com>

bpftool: Support bpffs mountpoint as pin path for prog loadall

Currently, when using prog loadall and the pin path is a bpffs mountpoint,
bpffs will be repeatedly mounted to the parent directory of the bpffs
mountpoint path. For example, a `bpftool prog loadall test.o /sys/fs/bpf`
will trigger this.

Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/1683342439-3677-1-git-send-email-yangpc@wangsu.com


# 0232b788 19-Apr-2023 Kui-Feng Lee <thinker.li@gmail.com>

bpftool: Register struct_ops with a link.

You can include an optional path after specifying the object name for the
'struct_ops register' subcommand.

Since the commit 226bc6ae6405 ("Merge branch 'Transit between BPF TCP
congestion controls.'") has been accepted, it is now possible to create a
link for a struct_ops. This can be done by defining a struct_ops in
SEC(".struct_ops.link") to make libbpf returns a real link. If we don't pin
the links before leaving bpftool, they will disappear. To instruct bpftool
to pin the links in a directory with the names of the maps, we need to
provide the path of that directory.

Signed-off-by: Kui-Feng Lee <kuifeng@meta.com>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20230420002822.345222-1-kuifeng@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 38f0408e 14-Feb-2023 Ilya Leoshkevich <iii@linux.ibm.com>

bpftool: Use bpf_{btf,link,map,prog}_get_info_by_fd()

Use the new type-safe wrappers around bpf_obj_get_info_by_fd().

Split the bpf_obj_get_info_by_fd() call in build_btf_type_table() in
two, since knowing the type helps with the Memory Sanitizer.

Improve map_parse_fd_and_info() type safety by using
struct bpf_map_info * instead of void * for info.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20230214231221.249277-4-iii@linux.ibm.com


# d1313e01 20-Nov-2022 Sahid Orentino Ferdjaoui <sahid.ferdjaoui@industrialdiscipline.com>

bpftool: clean-up usage of libbpf_get_error()

bpftool is now totally compliant with libbpf 1.0 mode and is not
expected to be compiled with pre-1.0, let's clean-up the usage of
libbpf_get_error().

The changes stay aligned with returned errors always negative.

- In tools/bpf/bpftool/btf.c This fixes an uninitialized local
variable `err` in function do_dump() because it may now be returned
without having been set.
- This also removes the checks on NULL pointers before calling
btf__free() because that function already does the check.

Signed-off-by: Sahid Orentino Ferdjaoui <sahid.ferdjaoui@industrialdiscipline.com>
Link: https://lore.kernel.org/r/20221120112515.38165-5-sahid.ferdjaoui@industrialdiscipline.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# d2973ffd 20-Nov-2022 Sahid Orentino Ferdjaoui <sahid.ferdjaoui@industrialdiscipline.com>

bpftool: fix error message when function can't register struct_ops

It is expected that errno be passed to strerror(). This also cleans
this part of code from using libbpf_get_error().

Signed-off-by: Sahid Orentino Ferdjaoui <sahid.ferdjaoui@industrialdiscipline.com>
Acked-by: Yonghong Song <yhs@fb.com>
Suggested-by: Quentin Monnet <quentin@isovalent.com>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20221120112515.38165-4-sahid.ferdjaoui@industrialdiscipline.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 989f2851 20-Nov-2022 Sahid Orentino Ferdjaoui <sahid.ferdjaoui@industrialdiscipline.com>

bpftool: replace return value PTR_ERR(NULL) with 0

There is no reasons to keep PTR_ERR() when kern_btf=NULL, let's just
return 0.
This also cleans this part of code from using libbpf_get_error().

Signed-off-by: Sahid Orentino Ferdjaoui <sahid.ferdjaoui@industrialdiscipline.com>
Acked-by: Yonghong Song <yhs@fb.com>
Suggested-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20221120112515.38165-3-sahid.ferdjaoui@industrialdiscipline.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 6b4384ff 09-Jun-2022 Quentin Monnet <quentin@isovalent.com>

Revert "bpftool: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK"

This reverts commit a777e18f1bcd32528ff5dfd10a6629b655b05eb8.

In commit a777e18f1bcd ("bpftool: Use libbpf 1.0 API mode instead of
RLIMIT_MEMLOCK"), we removed the rlimit bump in bpftool, because the
kernel has switched to memcg-based memory accounting. Thanks to the
LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK, we attempted to keep compatibility
with other systems and ask libbpf to raise the limit for us if
necessary.

How do we know if memcg-based accounting is supported? There is a probe
in libbpf to check this. But this probe currently relies on the
availability of a given BPF helper, bpf_ktime_get_coarse_ns(), which
landed in the same kernel version as the memory accounting change. This
works in the generic case, but it may fail, for example, if the helper
function has been backported to an older kernel. This has been observed
for Google Cloud's Container-Optimized OS (COS), where the helper is
available but rlimit is still in use. The probe succeeds, the rlimit is
not raised, and probing features with bpftool, for example, fails.

A patch was submitted [0] to update this probe in libbpf, based on what
the cilium/ebpf Go library does [1]. It would lower the soft rlimit to
0, attempt to load a BPF object, and reset the rlimit. But it may induce
some hard-to-debug flakiness if another process starts, or the current
application is killed, while the rlimit is reduced, and the approach was
discarded.

As a workaround to ensure that the rlimit bump does not depend on the
availability of a given helper, we restore the unconditional rlimit bump
in bpftool for now.

[0] https://lore.kernel.org/bpf/20220609143614.97837-1-quentin@isovalent.com/
[1] https://github.com/cilium/ebpf/blob/v0.9.0/rlimit/rlimit.go#L39

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Yafang Shao <laoar.shao@gmail.com>
Cc: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20220610112648.29695-2-quentin@isovalent.com


# a777e18f 08-Apr-2022 Yafang Shao <laoar.shao@gmail.com>

bpftool: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK

We have switched to memcg-based memory accouting and thus the rlimit is
not needed any more. LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK was introduced in
libbpf for backward compatibility, so we can use it instead now.

libbpf_set_strict_mode always return 0, so we don't need to check whether
the return value is 0 or not.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409125958.92629-4-laoar.shao@gmail.com


# 3c28919f 07-Jan-2022 Christy Lee <christylee@fb.com>

bpftool: Stop using bpf_map__def() API

libbpf bpf_map__def() API is being deprecated, replace bpftool's
usage with the appropriate getters and setters

Signed-off-by: Christy Lee <christylee@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220108004218.355761-3-christylee@fb.com


# b59e4ce8 09-Dec-2021 Andrii Nakryiko <andrii@kernel.org>

bpftool: Switch bpf_object__load_xattr() to bpf_object__load()

Switch all the uses of to-be-deprecated bpf_object__load_xattr() into
a simple bpf_object__load() calls with optional log_level passed through
open_opts.kernel_log_level, if -d option is specified.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-13-andrii@kernel.org


# e5043894 14-Nov-2021 Hengqi Chen <hengqi.chen@gmail.com>

bpftool: Use libbpf_get_error() to check error

Currently, LIBBPF_STRICT_ALL mode is enabled by default for
bpftool which means on error cases, some libbpf APIs would
return NULL pointers. This makes IS_ERR check failed to detect
such cases and result in segfault error. Use libbpf_get_error()
instead like we do in libbpf itself.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211115012436.3143318-1-hengqi.chen@gmail.com


# 60f27075 01-Nov-2021 Dave Marchevsky <davemarchevsky@fb.com>

bpftool: Migrate -1 err checks of libbpf fn calls

Per [0], callers of libbpf functions with LIBBPF_STRICT_DIRECT_ERRS set
should handle negative error codes of various values (e.g. -EINVAL).
Migrate two callsites which were explicitly checking for -1 only to
handle the new scheme.

[0]: https://github.com/libbpf/libbpf/wiki/Libbpf-1.0-migration-guide#direct-error-code-returning-libbpf_strict_direct_errs

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20211101224357.2651181-2-davemarchevsky@fb.com


# c07ba629 30-Jul-2021 Quentin Monnet <quentin@isovalent.com>

tools: bpftool: Update and synchronise option list in doc and help msg

All bpftool commands support the options for JSON output and debug from
libbpf. In addition, some commands support additional options
corresponding to specific use cases.

The list of options described in the man pages for the different
commands are not always accurate. The messages for interactive help are
mostly limited to HELP_SPEC_OPTIONS, and are even less representative of
the actual set of options supported for the commands.

Let's update the lists:

- HELP_SPEC_OPTIONS is modified to contain the "default" options (JSON
and debug), and to be extensible (no ending curly bracket).
- All commands use HELP_SPEC_OPTIONS in their help message, and then
complete the list with their specific options.
- The lists of options in the man pages are updated.
- The formatting of the list for bpftool.rst is adjusted to match
formatting for the other man pages. This is for consistency, and also
because it will be helpful in a future patch to automatically check
that the files are synchronised.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210730215435.7095-5-quentin@isovalent.com


# 90040351 22-May-2020 Quentin Monnet <quentin@isovalent.com>

tools, bpftool: Clean subcommand help messages

This is a clean-up for the formatting of the do_help functions for
bpftool's subcommands. The following fixes are included:

- Do not use argv[-2] for "iter" help message, as the help is shown by
default if no "iter" action is selected, resulting in messages looking
like "./bpftool bpftool pin...".

- Do not print unused HELP_SPEC_PROGRAM in help message for "bpftool
link".

- Andrii used argument indexing to avoid having multiple occurrences of
bin_name and argv[-2] in the fprintf() for the help message, for
"bpftool gen" and "bpftool link". Let's reuse this for all other help
functions. We can remove up to thirty arguments for the "bpftool map"
help message.

- Harmonise all functions, e.g. use ending quotes-comma on a separate
line.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200523010751.23465-1-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 32e4c6f4 24-Apr-2020 Martin KaFai Lau <kafai@fb.com>

bpftool: Respect the -d option in struct_ops cmd

In the prog cmd, the "-d" option turns on the verifier log.
This is missed in the "struct_ops" cmd and this patch fixes it.

Fixes: 65c93628599d ("bpftool: Add struct_ops support")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200424182911.1259355-1-kafai@fb.com


# 96b2eb6e 09-Apr-2020 Daniel T. Lee <danieltimlee@gmail.com>

tools, bpftool: Fix struct_ops command invalid pointer free

In commit 65c93628599d ("bpftool: Add struct_ops support") a new
type of command named struct_ops has been added. This command requires
a kernel with CONFIG_DEBUG_INFO_BTF=y set and for retrieving BTF info
in bpftool, the helper get_btf_vmlinux() is used.

When running this command on kernel without BTF debug info, this will
lead to 'btf_vmlinux' variable being an invalid(error) pointer. And by
this, btf_free() causes a segfault when executing 'bpftool struct_ops'.

This commit adds pointer validation with IS_ERR not to free invalid
pointer, and this will fix the segfault issue.

Fixes: 65c93628599d ("bpftool: Add struct_ops support")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200410020612.2930667-1-danieltimlee@gmail.com


# 65c93628 18-Mar-2020 Martin KaFai Lau <kafai@fb.com>

bpftool: Add struct_ops support

This patch adds struct_ops support to the bpftool.

To recap a bit on the recent bpf_struct_ops feature on the kernel side:
It currently supports "struct tcp_congestion_ops" to be implemented
in bpf. At a high level, bpf_struct_ops is struct_ops map populated
with a number of bpf progs. bpf_struct_ops currently supports the
"struct tcp_congestion_ops". However, the bpf_struct_ops design is
generic enough that other kernel struct ops can be supported in
the future.

Although struct_ops is map+progs at a high lever, there are differences
in details. For example,
1) After registering a struct_ops, the struct_ops is held by the kernel
subsystem (e.g. tcp-cc). Thus, there is no need to pin a
struct_ops map or its progs in order to keep them around.
2) To iterate all struct_ops in a system, it iterates all maps
in type BPF_MAP_TYPE_STRUCT_OPS. BPF_MAP_TYPE_STRUCT_OPS is
the current usual filter. In the future, it may need to
filter by other struct_ops specific properties. e.g. filter by
tcp_congestion_ops or other kernel subsystem ops in the future.
3) struct_ops requires the running kernel having BTF info. That allows
more flexibility in handling other kernel structs. e.g. it can
always dump the latest bpf_map_info.
4) Also, "struct_ops" command is not intended to repeat all features
already provided by "map" or "prog". For example, if there really
is a need to pin the struct_ops map, the user can use the "map" cmd
to do that.

While the first attempt was to reuse parts from map/prog.c, it ended up
not a lot to share. The only obvious item is the map_parse_fds() but
that still requires modifications to accommodate struct_ops map specific
filtering (for the immediate and the future needs). Together with the
earlier mentioned differences, it is better to part away from map/prog.c.

The initial set of subcmds are, register, unregister, show, and dump.

For register, it registers all struct_ops maps that can be found in an
obj file. Option can be added in the future to specify a particular
struct_ops map. Also, the common bpf_tcp_cc is stateless (e.g.
bpf_cubic.c and bpf_dctcp.c). The "reuse map" feature is not
implemented in this patch and it can be considered later also.

For other subcmds, please see the man doc for details.

A sample output of dump:
[root@arch-fb-vm1 bpf]# bpftool struct_ops dump name cubic
[{
"bpf_map_info": {
"type": 26,
"id": 64,
"key_size": 4,
"value_size": 256,
"max_entries": 1,
"map_flags": 0,
"name": "cubic",
"ifindex": 0,
"btf_vmlinux_value_type_id": 18452,
"netns_dev": 0,
"netns_ino": 0,
"btf_id": 52,
"btf_key_type_id": 0,
"btf_value_type_id": 0
}
},{
"bpf_struct_ops_tcp_congestion_ops": {
"refcnt": {
"refs": {
"counter": 1
}
},
"state": "BPF_STRUCT_OPS_STATE_INUSE",
"data": {
"list": {
"next": 0,
"prev": 0
},
"key": 0,
"flags": 0,
"init": "void (struct sock *) bictcp_init/prog_id:138",
"release": "void (struct sock *) 0",
"ssthresh": "u32 (struct sock *) bictcp_recalc_ssthresh/prog_id:141",
"cong_avoid": "void (struct sock *, u32, u32) bictcp_cong_avoid/prog_id:140",
"set_state": "void (struct sock *, u8) bictcp_state/prog_id:142",
"cwnd_event": "void (struct sock *, enum tcp_ca_event) bictcp_cwnd_event/prog_id:139",
"in_ack_event": "void (struct sock *, u32) 0",
"undo_cwnd": "u32 (struct sock *) tcp_reno_undo_cwnd/prog_id:144",
"pkts_acked": "void (struct sock *, const struct ack_sample *) bictcp_acked/prog_id:143",
"min_tso_segs": "u32 (struct sock *) 0",
"sndbuf_expand": "u32 (struct sock *) 0",
"cong_control": "void (struct sock *, const struct rate_sample *) 0",
"get_info": "size_t (struct sock *, u32, int *, union tcp_cc_info *) 0",
"name": "bpf_cubic",
"owner": 0
}
}
}
]

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200318171656.129650-1-kafai@fb.com