History log of /linux-master/samples/bpf/Makefile
Revision Date Author Comments
# 37db10bc 25-Oct-2023 Viktor Malik <vmalik@redhat.com>

samples/bpf: Allow building with custom bpftool

samples/bpf build its own bpftool boostrap to generate vmlinux.h as well
as some BPF objects. This is a redundant step if bpftool has been
already built, so update samples/bpf/Makefile such that it accepts a
path to bpftool passed via the BPFTOOL variable. The approach is
practically the same as tools/testing/selftests/bpf/Makefile uses.

Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/bd746954ac271b02468d8d951ff9f11e655d485b.1698213811.git.vmalik@redhat.com


# f56bcfad 25-Oct-2023 Viktor Malik <vmalik@redhat.com>

samples/bpf: Fix passing LDFLAGS to libbpf

samples/bpf/Makefile passes LDFLAGS=$(TPROGS_LDFLAGS) to libbpf build
without surrounding quotes, which may cause compilation errors when
passing custom TPROGS_USER_LDFLAGS.

For example:

$ make -C samples/bpf/ TPROGS_USER_LDFLAGS="-Wl,--as-needed -specs=/usr/lib/gcc/x86_64-redhat-linux/13/libsanitizer.spec"
make: Entering directory './samples/bpf'
make -C ../../ M=./samples/bpf BPF_SAMPLES_PATH=./samples/bpf
make[1]: Entering directory '.'
make -C ./samples/bpf/../../tools/lib/bpf RM='rm -rf' EXTRA_CFLAGS="-Wall -O2 -Wmissing-prototypes -Wstrict-prototypes -I./usr/include -I./tools/testing/selftests/bpf/ -I./samples/bpf/libbpf/include -I./tools/include -I./tools/perf -I./tools/lib -DHAVE_ATTR_TEST=0" \
LDFLAGS=-Wl,--as-needed -specs=/usr/lib/gcc/x86_64-redhat-linux/13/libsanitizer.spec srctree=./samples/bpf/../../ \
O= OUTPUT=./samples/bpf/libbpf/ DESTDIR=./samples/bpf/libbpf prefix= \
./samples/bpf/libbpf/libbpf.a install_headers
make: invalid option -- 'c'
make: invalid option -- '='
make: invalid option -- '/'
make: invalid option -- 'u'
make: invalid option -- '/'
[...]

Fix the error by properly quoting $(TPROGS_LDFLAGS).

Suggested-by: Donald Zickus <dzickus@redhat.com>
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/c690de6671cc6c983d32a566d33fd7eabd18b526.1698213811.git.vmalik@redhat.com


# 870f09f1 25-Oct-2023 Viktor Malik <vmalik@redhat.com>

samples/bpf: Allow building with custom CFLAGS/LDFLAGS

Currently, it is not possible to specify custom flags when building
samples/bpf. The flags are defined in TPROGS_CFLAGS/TPROGS_LDFLAGS
variables, however, when trying to override those from the make command,
compilation fails.

For example, when trying to build with PIE:

$ make -C samples/bpf TPROGS_CFLAGS="-fpie" TPROGS_LDFLAGS="-pie"

This is because samples/bpf/Makefile updates these variables, especially
appends include paths to TPROGS_CFLAGS and these updates are overridden
by setting the variables from the make command.

This patch introduces variables TPROGS_USER_CFLAGS/TPROGS_USER_LDFLAGS
for this purpose, which can be set from the make command and their
values are propagated to TPROGS_CFLAGS/TPROGS_LDFLAGS.

Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/2d81100b830a71f0e72329cc7781edaefab75f62.1698213811.git.vmalik@redhat.com


# 9e09b750 26-Sep-2023 Ruowen Qin <ruowenq2@illinois.edu>

samples/bpf: Add -fsanitize=bounds to userspace programs

The sanitizer flag, which is supported by both clang and gcc, would make
it easier to debug array index out-of-bounds problems in these programs.

Make the Makfile smarter to detect ubsan support from the compiler and
add the '-fsanitize=bounds' accordingly.

Suggested-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Jinghao Jia <jinghao@linux.ibm.com>
Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>
Signed-off-by: Ruowen Qin <ruowenq2@illinois.edu>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230927045030.224548-2-ruowenq2@illinois.edu


# c698eaeb 06-Sep-2023 Rong Tao <rongtao@cestc.cn>

selftests/bpf: trace_helpers.c: Optimize kallsyms cache

Static ksyms often have problems because the number of symbols exceeds the
MAX_SYMS limit. Like changing the MAX_SYMS from 300000 to 400000 in
commit e76a014334a6("selftests/bpf: Bump and validate MAX_SYMS") solves
the problem somewhat, but it's not the perfect way.

This commit uses dynamic memory allocation, which completely solves the
problem caused by the limitation of the number of kallsyms. At the same
time, add APIs:

load_kallsyms_local()
ksym_search_local()
ksym_get_addr_local()
free_kallsyms_local()

There are used to solve the problem of selftests/bpf updating kallsyms
after attach new symbols during testmod testing.

Signed-off-by: Rong Tao <rongtao@cestc.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/tencent_C9BDA68F9221F21BE4081566A55D66A9700A@qq.com


# cced0699 23-Aug-2023 Toke Høiland-Jørgensen <toke@redhat.com>

samples/bpf: Remove the xdp_sample_pkts utility

The functionality of this utility is covered by the xdpdump utility in
xdp-tools.

There's a slight difference in usage as the xdpdump utility's main focus is
to dump packets before or after they are processed by an existing XDP
program. However, xdpdump also has the --load-xdp-program switch, which
will make it attach its own program if no existing program is loaded. With
this, xdp_sample_pkts usage can be converted as:

xdp_sample_pkts eth0
--> xdpdump --load-xdp-program eth0

To get roughly equivalent behaviour.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-6-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# eaca21d6 23-Aug-2023 Toke Høiland-Jørgensen <toke@redhat.com>

samples/bpf: Remove the xdp1 and xdp2 utilities

The functionality of these utilities have been incorporated into the
xdp-bench utility in xdp-tools.

Equivalent functionality is:

xdp1 eth0
--> xdp-bench drop -p parse-ip -l load-bytes eth0

xdp2 eth0
--> xdp-bench drop -p swap-macs eth0

Note that there's a slight difference in behaviour of those examples: the
swap-macs operation of xdp-bench doesn't use the bpf_xdp_load_bytes()
helper to load the packet data, whereas the xdp2 utility did so
unconditionally. For the parse-ip action the use of bpf_xdp_load_bytes()
can be selected by the '-l load-bytes' switch, with the difference that the
xdp-bench utility will perform two separate calls to the helper, one to
load the ethernet header and another to load the IP header; where the xdp1
utility only performed one call always loading 60 bytes of data.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-5-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 0e445e11 23-Aug-2023 Toke Høiland-Jørgensen <toke@redhat.com>

samples/bpf: Remove the xdp_rxq_info utility

The functionality of this utility has been incorporated into the xdp-bench
utility in xdp-tools, by way of the --rxq-stats argument to the 'drop',
'pass' and 'tx' commands of xdp-bench.

Some examples of how to convert xdp_rxq_info invocations into equivalent
xdp-bench commands:

xdp_rxq_info -d eth0
--> xdp-bench pass --rxq-stats eth0

xdp_rxq_info -d eth0 -a XDP_DROP -m
--> xdp-bench drop --rxq-stats -p swap-macs eth0

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-4-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 91dda69b 23-Aug-2023 Toke Høiland-Jørgensen <toke@redhat.com>

samples/bpf: Remove the xdp_redirect* utilities

These utilities have all been ported to xdp-tools as functions of the
xdp-bench utility. The four different utilities in samples are incorporated
as separate subcommands to xdp-bench, with most of the command line
parameters left intact, except that mandatory arguments are always
positional in xdp-bench. For full usage details see the --help output of
each command, or the xdp-bench man page.

Some examples of how to convert usage to xdp-bench are:

xdp_redirect eth0 eth1
--> xdp-bench redirect eth0 eth1

xdp_redirect_map eth0 eth1
--> xdp-bench redirect-map eth0 eth1

xdp_redirect_map_multi eth0 eth1 eth2 eth3
--> xdp-bench redirect-multi eth0 eth1 eth2 eth3

xdp_redirect_cpu -d eth0 -c 0 -c 1
--> xdp-bench redirect-cpu -c 0 -c 1 eth0

xdp_redirect_cpu -d eth0 -c 0 -c 1 -r eth1
--> xdp-bench redirect-cpu -c 0 -c 1 eth0 -r redirect -D eth1

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-3-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# e7c9e73d 23-Aug-2023 Toke Høiland-Jørgensen <toke@redhat.com>

samples/bpf: Remove the xdp_monitor utility

This utility has been ported as-is to xdp-tools as 'xdp-monitor'. The only
difference in usage between the samples and xdp-tools versions is that the
'-v' command line parameter has been changed to '-e' in the xdp-tools
version for consistency with the other utilities.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-2-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 4a0ee788 18-Aug-2023 Daniel T. Lee <danieltimlee@gmail.com>

samples/bpf: unify bpf program suffix to .bpf with tracing programs

Currently, BPF programs typically have a suffix of .bpf.c. However,
some programs still utilize a mixture of _kern.c suffix alongside the
naming convention. In order to achieve consistency in the naming of
these programs, this commit unifies the inconsistency in the naming
convention of BPF kernel programs.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-4-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 34f6e38f 18-Aug-2023 Daniel T. Lee <danieltimlee@gmail.com>

samples/bpf: fix warning with ignored-attributes

Currently, compiling the bpf programs will result the warning with the
ignored attribute as follows. This commit fixes the warning by adding
cf-protection option.

In file included from ./arch/x86/include/asm/linkage.h:6:
./arch/x86/include/asm/ibt.h:77:8: warning: 'nocf_check' attribute ignored; use -fcf-protection to enable the attribute [-Wignored-attributes]
extern __noendbr u64 ibt_save(bool disable);
^
./arch/x86/include/asm/ibt.h:32:34: note: expanded from macro '__noendbr'
^

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-2-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# bbaf1ff0 23-Jun-2023 Fangrui Song <maskray@google.com>

bpf: Replace deprecated -target with --target= for Clang

The -target option has been deprecated since clang 3.4 in 2013. Therefore, use
the preferred --target=bpf form instead. This also matches how we use --target=
in scripts/Makefile.clang.

Signed-off-by: Fangrui Song <maskray@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://github.com/llvm/llvm-project/commit/274b6f0c87a6a1798de0a68135afc7f95def6277
Link: https://lore.kernel.org/bpf/20230624001856.1903733-1-maskray@google.com


# e04946f5 15-Jan-2023 Daniel T. Lee <danieltimlee@gmail.com>

samples/bpf: change _kern suffix to .bpf with BPF test programs

This commit changes the _kern suffix to .bpf with the BPF test programs.
With this modification, test programs will inherit the benefit of the
new CLANG-BPF compile target.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-11-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# d4fffba4 24-Dec-2022 Daniel T. Lee <danieltimlee@gmail.com>

samples/bpf: Change _kern suffix to .bpf with syscall tracing program

Currently old compile rule (CLANG-bpf) doesn't contains VMLINUX_H define
flag which is essential for the bpf program that includes "vmlinux.h".
Also old compile rule doesn't directly specify the compile target as bpf,
instead it uses bunch of extra options with clang followed by long chain
of commands. (e.g. clang | opt | llvm-dis | llc)

In Makefile, there is already new compile rule which is more simple and
neat. And it also has -D__VMLINUX_H__ option. By just changing the _kern
suffix to .bpf will inherit the benefit of the new CLANG-BPF compile
target.

Also, this commit adds dummy gnu/stub.h to the samples/bpf directory.
As commit 1c2dd16add7e ("selftests/bpf: get rid of -D__x86_64__") noted,
compiling with 'clang -target bpf' will raise an error with stubs.h
unless workaround (-D__x86_64) is used. This commit solves this problem
by adding dummy stub.h to make /usr/include/features.h to follow the
expected path as the same way selftests/bpf dealt with.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221224071527.2292-4-danieltimlee@gmail.com


# 2e496628 13-Jul-2022 Pu Lehui <pulehui@huawei.com>

samples: bpf: Fix cross-compiling error by using bootstrap bpftool

Currently, when cross compiling bpf samples, the host side cannot
use arch-specific bpftool to generate vmlinux.h or skeleton. Since
samples/bpf use bpftool for vmlinux.h, skeleton, and static linking
only, we can use lightweight bootstrap version of bpftool to handle
these, and it's always host-native.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220714024612.944071-2-pulehui@huawei.com


# cfb5a2db 30-Jun-2022 Magnus Karlsson <magnus.karlsson@intel.com>

bpf, samples: Remove AF_XDP samples

Remove the AF_XDP samples from samples/bpf/ as they are dependent on
the AF_XDP support in libbpf. This support has now been removed in the
1.0 release, so these samples cannot be compiled anymore. Please start
to use libxdp instead. It is backwards compatible with the AF_XDP
support that was offered in libbpf. New samples can be found in the
various xdp-project repositories connected to libxdp and by googling.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/20220630093717.8664-1-magnus.karlsson@gmail.com


# ec247044 07-May-2022 Jerome Marchand <jmarchan@redhat.com>

samples: bpf: Don't fail for a missing VMLINUX_BTF when VMLINUX_H is provided

samples/bpf build currently always fails if it can't generate
vmlinux.h from vmlinux, even when vmlinux.h is directly provided by
VMLINUX_H variable, which makes VMLINUX_H pointless.
Only fails when neither method works.

Fixes: 384b6b3bbf0d ("samples: bpf: Add vmlinux.h generation support")
Reported-by: CKI Project <cki-project@redhat.com>
Reported-by: Veronika Kabatova <vkabatov@redhat.com>
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220507161635.2219052-1-jmarchan@redhat.com


# 587323cf 05-Apr-2022 Lorenzo Bianconi <lorenzo@kernel.org>

samples, bpf: Move routes monitor in xdp_router_ipv4 in a dedicated thread

In order to not miss any netlink message from the kernel, move routes
monitor to a dedicated thread.

Fixes: 85bf1f51691c ("samples: bpf: Convert xdp_router_ipv4 to XDP samples helper")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/e364b817c69ded73be24b677ab47a157f7c21b64.1649167911.git.lorenzo@kernel.org


# fc843ccd 04-Apr-2022 Alexander Lobakin <alexandr.lobakin@intel.com>

samples: bpf: Fix linking xdp_router_ipv4 after migration

Users of the xdp_sample_user infra should be explicitly linked
with the standard math library (`-lm`). Otherwise, the following
happens:

/usr/bin/ld: xdp_sample_user.c:(.text+0x59fc): undefined reference to `ceil'
/usr/bin/ld: xdp_sample_user.c:(.text+0x5a0d): undefined reference to `ceil'
/usr/bin/ld: xdp_sample_user.c:(.text+0x5adc): undefined reference to `floor'
/usr/bin/ld: xdp_sample_user.c:(.text+0x5b01): undefined reference to `ceil'
/usr/bin/ld: xdp_sample_user.c:(.text+0x5c1e): undefined reference to `floor'
/usr/bin/ld: xdp_sample_user.c:(.text+0x5c43): undefined reference to `ceil
[...]

That happened previously, so there's a block of linkage flags in the
Makefile. xdp_router_ipv4 has been transferred to this infra quite
recently, but hasn't been added to it. Fix.

Fixes: 85bf1f51691c ("samples: bpf: Convert xdp_router_ipv4 to XDP samples helper")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220404115451.1116478-1-alexandr.lobakin@intel.com


# 85bf1f51 16-Mar-2022 Lorenzo Bianconi <lorenzo@kernel.org>

samples: bpf: Convert xdp_router_ipv4 to XDP samples helper

Rely on the libbpf skeleton facility and other utilities provided by XDP
sample helpers in xdp_router_ipv4 sample.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/7f4d98ee2c13c04d5eb924eebf79ced32fee8418.1647414711.git.lorenzo@kernel.org


# e64fbcaa 03-Dec-2021 Alexander Lobakin <alexandr.lobakin@intel.com>

samples: bpf: Fix xdp_sample_user.o linking with Clang

Clang (13) doesn't get the jokes about specifying libraries to link in
cclags of individual .o objects:

clang-13: warning: -lm: 'linker' input unused [-Wunused-command-line-argument]
[ ... ]
LD samples/bpf/xdp_redirect_cpu
LD samples/bpf/xdp_redirect_map_multi
LD samples/bpf/xdp_redirect_map
LD samples/bpf/xdp_redirect
LD samples/bpf/xdp_monitor
/usr/bin/ld: samples/bpf/xdp_sample_user.o: in function `sample_summary_print':
xdp_sample_user.c:(.text+0x84c): undefined reference to `floor'
/usr/bin/ld: xdp_sample_user.c:(.text+0x870): undefined reference to `ceil'
/usr/bin/ld: xdp_sample_user.c:(.text+0x8cf): undefined reference to `floor'
/usr/bin/ld: xdp_sample_user.c:(.text+0x8f3): undefined reference to `ceil'
[ more ]

Specify '-lm' as ldflags for all xdp_sample_user.o users in the main
Makefile and remove it from ccflags of ^ in Makefile.target -- just
like it's done for all other samples. This works with all compilers.

Fixes: 6e1051a54e31 ("samples: bpf: Convert xdp_monitor to XDP samples helper")
Fixes: b926c55d856c ("samples: bpf: Convert xdp_redirect to XDP samples helper")
Fixes: e531a220cc59 ("samples: bpf: Convert xdp_redirect_cpu to XDP samples helper")
Fixes: bbe65865aa05 ("samples: bpf: Convert xdp_redirect_map to XDP samples helper")
Fixes: 594a116b2aa1 ("samples: bpf: Convert xdp_redirect_map_multi to XDP samples helper")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20211203195004.5803-2-alexandr.lobakin@intel.com


# 527024f7 01-Dec-2021 Andrii Nakryiko <andrii@kernel.org>

samples/bpf: Clean up samples/bpf build failes

Remove xdp_samples_user.o rule redefinition which generates Makefile
warning and instead override TPROGS_CFLAGS. This seems to work fine when
building inside selftests/bpf.

That was one big head-scratcher before I found that generic
Makefile.target hid this surprising specialization for for xdp_samples_user.o.

Main change is to use actual locally installed libbpf headers.

Also drop printk macro re-definition (not even used!).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211201232824.3166325-8-andrii@kernel.org


# 44ce0ac1 21-Oct-2021 Pu Lehui <pulehui@huawei.com>

samples: bpf: Suppress readelf stderr when probing for BTF support

When compiling bpf samples, the following warning appears:

readelf: Error: Missing knowledge of 32-bit reloc types used in DWARF
sections of machine number 247
readelf: Warning: unable to apply unsupported reloc type 10 to section
.debug_info
readelf: Warning: unable to apply unsupported reloc type 1 to section
.debug_info
readelf: Warning: unable to apply unsupported reloc type 10 to section
.debug_info

Same problem was mentioned in commit 2f0921262ba9 ("selftests/bpf:
suppress readelf stderr when probing for BTF support"), let's use
readelf that supports btf.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211021123913.48833-1-pulehui@huawei.com


# a60d24e7 07-Oct-2021 Quentin Monnet <quentin@isovalent.com>

samples/bpf: Do not FORCE-recompile libbpf

In samples/bpf/Makefile, libbpf has a FORCE dependency that force it to
be rebuilt. I read this as a way to keep the library up-to-date, given
that we do not have, in samples/bpf, a list of the source files for
libbpf itself. However, a better approach would be to use the
"$(wildcard ...)" function from make, and to have libbpf depend on all
the .c and .h files in its directory. This is what samples/bpf/Makefile
does for bpftool, and also what the BPF selftests' Makefile does for
libbpf.

Let's update the Makefile to avoid rebuilding libbpf all the time (and
bpftool on top of it).

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-11-quentin@isovalent.com


# 3f7a3318 07-Oct-2021 Quentin Monnet <quentin@isovalent.com>

samples/bpf: Install libbpf headers when building

API headers from libbpf should not be accessed directly from the source
directory. Instead, they should be exported with "make install_headers".
Make sure that samples/bpf/Makefile installs the headers properly when
building.

The object compiled from and exported by libbpf are now placed into a
subdirectory of sample/bpf/ instead of remaining in tools/lib/bpf/. We
attempt to remove this directory on "make clean". However, the "clean"
target re-enters the samples/bpf/ directory from the root of the
repository ("$(MAKE) -C ../../ M=$(CURDIR) clean"), in such a way that
$(srctree) and $(src) are not defined, making it impossible to use
$(LIBBPF_OUTPUT) and $(LIBBPF_DESTDIR) in the recipe. So we only attempt
to clean $(CURDIR)/libbpf, which is the default value.

Add a dependency on libbpf's headers for the $(TRACE_HELPERS).

We also change the output directory for bpftool, to place the generated
objects under samples/bpf/bpftool/ instead of building in bpftool's
directory directly. Doing so, we make sure bpftool reuses the libbpf
library previously compiled and installed.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-10-quentin@isovalent.com


# 929bef46 05-Oct-2021 Quentin Monnet <quentin@isovalent.com>

bpf: Use $(pound) instead of \# in Makefiles

Recent-ish versions of make do no longer consider number signs ("#") as
comment symbols when they are inserted inside of a macro reference or in
a function invocation. In such cases, the symbols should not be escaped.

There are a few occurrences of "\#" in libbpf's and samples' Makefiles.
In the former, the backslash is harmless, because grep associates no
particular meaning to the escaped symbol and reads it as a regular "#".
In samples' Makefile, recent versions of make will pass the backslash
down to the compiler, making the probe fail all the time and resulting
in the display of a warning about "make headers_install" being required,
even after headers have been installed.

A similar issue has been addressed at some other locations by commit
9564a8cf422d ("Kbuild: fix # escaping in .cmd files for future Make").
Let's address it for libbpf's and samples' Makefiles in the same
fashion, by using a "$(pound)" variable (pulled from
tools/scripts/Makefile.include for libbpf, or re-defined for the
samples).

Reference for the change in make:
https://git.savannah.gnu.org/cgit/make.git/commit/?id=c6966b323811c37acedff05b57

Fixes: 2f3830412786 ("libbpf: Make libbpf_version.h non-auto-generated")
Fixes: 07c3bbdb1a9b ("samples: bpf: print a warning about headers_install")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006111049.20708-1-quentin@isovalent.com


# 571fa247 27-Sep-2021 Kumar Kartikeya Dwivedi <memxor@gmail.com>

samples: bpf: Fix vmlinux.h generation for XDP samples

Generate vmlinux.h only from the in-tree vmlinux, and remove enum
declarations that would cause a build failure in case of version
mismatches.

There are now two options when building the samples:
1. Compile the kernel to use in-tree vmlinux for vmlinux.h
2. Override VMLINUX_BTF for samples using something like this:
make VMLINUX_BTF=/sys/kernel/btf/vmlinux -C samples/bpf

This change was tested with relative builds, e.g. cases like:
* make O=build -C samples/bpf
* make KBUILD_OUTPUT=build -C samples/bpf
* make -C samples/bpf
* cd samples/bpf && make

When a suitable VMLINUX_BTF is not found, the following message is
printed:
/home/kkd/src/linux/samples/bpf/Makefile:333: *** Cannot find a vmlinux
for VMLINUX_BTF at any of " ./vmlinux", build the kernel or set
VMLINUX_BTF variable. Stop.

Fixes: 384b6b3bbf0d (samples: bpf: Add vmlinux.h generation support)
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210928054608.1799021-1-memxor@gmail.com


# 594a116b 20-Aug-2021 Kumar Kartikeya Dwivedi <memxor@gmail.com>

samples: bpf: Convert xdp_redirect_map_multi to XDP samples helper

Use the libbpf skeleton facility and other utilities provided by XDP
samples helper. Also adapt to change of type of mac address map, so that
no resizing is required.

Add a new flag for sample mask that skips priting the
from_device->to_device heading for each line, as xdp_redirect_map_multi
may have two devices but the flow of data may be bidirectional, so the
output would be confusing.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210821002010.845777-23-memxor@gmail.com


# a29b3ca1 20-Aug-2021 Kumar Kartikeya Dwivedi <memxor@gmail.com>

samples: bpf: Convert xdp_redirect_map_multi_kern.o to XDP samples helper

One of the notable changes is using a BPF_MAP_TYPE_HASH instead of array
map to store mac addresses of devices, as the resizing behavior was
based on max_ifindex, which unecessarily maximized the capacity of map
beyond what was needed.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210821002010.845777-22-memxor@gmail.com


# bbe65865 20-Aug-2021 Kumar Kartikeya Dwivedi <memxor@gmail.com>

samples: bpf: Convert xdp_redirect_map to XDP samples helper

Use the libbpf skeleton facility and other utilities provided by XDP
samples helper.

Since get_mac_addr is already provided by XDP samples helper, we drop
it. Also convert to XDP samples helper similar to prior samples to
minimize duplication of code.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210821002010.845777-21-memxor@gmail.com


# 54af769d 20-Aug-2021 Kumar Kartikeya Dwivedi <memxor@gmail.com>

samples: bpf: Convert xdp_redirect_map_kern.o to XDP samples helper

Also update it to use consistent SEC("xdp") and SEC("xdp_devmap")
naming, and use global variable instead of BPF map for copying the mac
address.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210821002010.845777-20-memxor@gmail.com


# e531a220 20-Aug-2021 Kumar Kartikeya Dwivedi <memxor@gmail.com>

samples: bpf: Convert xdp_redirect_cpu to XDP samples helper

Use the libbpf skeleton facility and other utilities provided by XDP
samples helper.

Similar to xdp_monitor, xdp_redirect_cpu was quite featureful except a
few minor omissions (e.g. redirect errno reporting). All of these have
been moved to XDP samples helper, hence drop the unneeded code and
convert to usage of helpers provided by it.

One of the important changes here is dropping of mprog-disable option,
as we make that the default. Also, we support built-in programs for some
common actions on the packet when it reaches kthread (pass, drop,
redirect to device). If the user still needs to install a custom
program, they can still supply a BPF object, however the program should
be suitably tagged with SEC("xdp_cpumap") annotation so that the
expected attach type is correct when updating our cpumap map element.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210821002010.845777-19-memxor@gmail.com


# 79ccf452 20-Aug-2021 Kumar Kartikeya Dwivedi <memxor@gmail.com>

samples: bpf: Convert xdp_redirect_cpu_kern.o to XDP samples helper

Similar to xdp_monitor_kern, a lot of these BPF programs have been
reimplemented properly consolidating missing features from other XDP
samples. Hence, drop the unneeded code and rename to .bpf.c suffix.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210821002010.845777-18-memxor@gmail.com


# b926c55d 20-Aug-2021 Kumar Kartikeya Dwivedi <memxor@gmail.com>

samples: bpf: Convert xdp_redirect to XDP samples helper

Use the libbpf skeleton facility and other utilities provided by XDP
samples helper.

One important note:
The XDP samples helper handles ownership of installed XDP programs on
devices, including responding to SIGINT and SIGTERM, so drop the code
here and use the helpers we provide going forward for all xdp_redirect*
conversions.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210821002010.845777-17-memxor@gmail.com


# 66fc4ca8 20-Aug-2021 Kumar Kartikeya Dwivedi <memxor@gmail.com>

samples: bpf: Convert xdp_redirect_kern.o to XDP samples helper

We moved swap_src_dst_mac to xdp_sample.bpf.h to be shared with other
potential users, so drop it while moving code to the new file.
Also, consistently use SEC("xdp") naming instead.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210821002010.845777-16-memxor@gmail.com


# 6e1051a5 20-Aug-2021 Kumar Kartikeya Dwivedi <memxor@gmail.com>

samples: bpf: Convert xdp_monitor to XDP samples helper

Use the libbpf skeleton facility and other utilities provided by XDP
samples helper.

A lot of the code in xdp_monitor and xdp_redirect_cpu has been moved to
the xdp_sample_user.o helper, so we remove the duplicate functions here
that are no longer needed.

Thanks to BPF skeleton, we no longer depend on order of tracepoints to
uninstall them on startup. Instead, the sample mask is used to install
the needed tracepoints.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210821002010.845777-15-memxor@gmail.com


# 3f199560 20-Aug-2021 Kumar Kartikeya Dwivedi <memxor@gmail.com>

samples: bpf: Convert xdp_monitor_kern.o to XDP samples helper

We already moved all the functionality it provided in XDP samples helper
userspace and kernel BPF object, so just delete the unneeded code.

We also add generation of BPF skeleton and compilation using clang
-target bpf for files ending with .bpf.c suffix (to denote that they use
vmlinux.h).

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210821002010.845777-14-memxor@gmail.com


# 384b6b3b 20-Aug-2021 Kumar Kartikeya Dwivedi <memxor@gmail.com>

samples: bpf: Add vmlinux.h generation support

Also, take this opportunity to depend on in-tree bpftool, so that we can
use static linking support in subsequent commits for XDP samples BPF
helper object.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210821002010.845777-13-memxor@gmail.com


# 5a0ae987 04-Jul-2021 Toke Høiland-Jørgensen <toke@redhat.com>

bpf, samples: Add -fno-asynchronous-unwind-tables to BPF Clang invocation

The samples/bpf Makefile currently compiles BPF files in a way that will
produce an .eh_frame section, which will in turn confuse libbpf and produce
errors when loading BPF programs, like:

libbpf: elf: skipping unrecognized data section(32) .eh_frame
libbpf: elf: skipping relo section(33) .rel.eh_frame for section(32) .eh_frame

Fix this by instruction Clang not to produce this section, as it's useless
for BPF anyway.

Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210705103841.180260-1-toke@redhat.com


# e48cfe4b 19-May-2021 Hangbin Liu <liuhangbin@gmail.com>

sample/bpf: Add xdp_redirect_map_multi for redirect_map broadcast test

This is a sample for xdp redirect broadcast. In the sample we could forward
all packets between given interfaces. There is also an option -X that could
enable 2nd xdp_prog on egress interface.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210519090747.1655268-4-liuhangbin@gmail.com


# 058107ab 26-Jan-2021 Tiezhu Yang <yangtiezhu@loongson.cn>

samples/bpf: Add include dir for MIPS Loongson64 to fix build errors

There exists many build errors when make M=samples/bpf on the Loongson
platform. This issue is MIPS related, x86 compiles just fine.

Here are some errors:

CLANG-bpf samples/bpf/sockex2_kern.o
In file included from samples/bpf/sockex2_kern.c:2:
In file included from ./include/uapi/linux/in.h:24:
In file included from ./include/linux/socket.h:8:
In file included from ./include/linux/uio.h:8:
In file included from ./include/linux/kernel.h:11:
In file included from ./include/linux/bitops.h:32:
In file included from ./arch/mips/include/asm/bitops.h:19:
In file included from ./arch/mips/include/asm/barrier.h:11:
./arch/mips/include/asm/addrspace.h:13:10: fatal error: 'spaces.h' file not found
^~~~~~~~~~
1 error generated.

CLANG-bpf samples/bpf/sockex2_kern.o
In file included from samples/bpf/sockex2_kern.c:2:
In file included from ./include/uapi/linux/in.h:24:
In file included from ./include/linux/socket.h:8:
In file included from ./include/linux/uio.h:8:
In file included from ./include/linux/kernel.h:11:
In file included from ./include/linux/bitops.h:32:
In file included from ./arch/mips/include/asm/bitops.h:22:
In file included from ./arch/mips/include/asm/cpu-features.h:13:
In file included from ./arch/mips/include/asm/cpu-info.h:15:
In file included from ./include/linux/cache.h:6:
./arch/mips/include/asm/cache.h:12:10: fatal error: 'kmalloc.h' file not found
^~~~~~~~~~~
1 error generated.

CLANG-bpf samples/bpf/sockex2_kern.o
In file included from samples/bpf/sockex2_kern.c:2:
In file included from ./include/uapi/linux/in.h:24:
In file included from ./include/linux/socket.h:8:
In file included from ./include/linux/uio.h:8:
In file included from ./include/linux/kernel.h:11:
In file included from ./include/linux/bitops.h:32:
In file included from ./arch/mips/include/asm/bitops.h:22:
./arch/mips/include/asm/cpu-features.h:15:10: fatal error: 'cpu-feature-overrides.h' file not found
^~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

$ find arch/mips/include/asm -name spaces.h | sort
arch/mips/include/asm/mach-ar7/spaces.h
...
arch/mips/include/asm/mach-generic/spaces.h
...
arch/mips/include/asm/mach-loongson64/spaces.h
...
arch/mips/include/asm/mach-tx49xx/spaces.h

$ find arch/mips/include/asm -name kmalloc.h | sort
arch/mips/include/asm/mach-generic/kmalloc.h
arch/mips/include/asm/mach-ip32/kmalloc.h
arch/mips/include/asm/mach-tx49xx/kmalloc.h

$ find arch/mips/include/asm -name cpu-feature-overrides.h | sort
arch/mips/include/asm/mach-ath25/cpu-feature-overrides.h
...
arch/mips/include/asm/mach-generic/cpu-feature-overrides.h
...
arch/mips/include/asm/mach-loongson64/cpu-feature-overrides.h
...
arch/mips/include/asm/mach-tx49xx/cpu-feature-overrides.h

In the arch/mips/Makefile, there exists the following board-dependent
options:

include arch/mips/Kbuild.platforms
cflags-y += -I$(srctree)/arch/mips/include/asm/mach-generic

So we can do the similar things in samples/bpf/Makefile, just add
platform specific and generic include dir for MIPS Loongson64 to
fix the build errors.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1611669925-25315-1-git-send-email-yangtiezhu@loongson.cn


# 190d1c92 24-Jan-2021 Tiezhu Yang <yangtiezhu@loongson.cn>

samples/bpf: Set flag __SANE_USERSPACE_TYPES__ for MIPS to fix build warnings

There exists many build warnings when make M=samples/bpf on the Loongson
platform, this issue is MIPS related, x86 compiles just fine.

Here are some warnings:

CC samples/bpf/ibumad_user.o
samples/bpf/ibumad_user.c: In function ‘dump_counts’:
samples/bpf/ibumad_user.c:46:24: warning: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=]
printf("0x%02x : %llu\n", key, value);
~~~^ ~~~~~
%lu
CC samples/bpf/offwaketime_user.o
samples/bpf/offwaketime_user.c: In function ‘print_ksym’:
samples/bpf/offwaketime_user.c:34:17: warning: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=]
printf("%s/%llx;", sym->name, addr);
~~~^ ~~~~
%lx
samples/bpf/offwaketime_user.c: In function ‘print_stack’:
samples/bpf/offwaketime_user.c:68:17: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=]
printf(";%s %lld\n", key->waker, count);
~~~^ ~~~~~
%ld

MIPS needs __SANE_USERSPACE_TYPES__ before <linux/types.h> to select
'int-ll64.h' in arch/mips/include/uapi/asm/types.h, then it can avoid
build warnings when printing __u64 with %llu, %llx or %lld.

The header tools/include/linux/types.h defines __SANE_USERSPACE_TYPES__,
it seems that we can include <linux/types.h> in the source files which
have build warnings, but it has no effect due to actually it includes
usr/include/linux/types.h instead of tools/include/linux/types.h, the
problem is that "usr/include" is preferred first than "tools/include"
in samples/bpf/Makefile, that sounds like a ugly hack to -Itools/include
before -Iusr/include.

So define __SANE_USERSPACE_TYPES__ for MIPS in samples/bpf/Makefile
is proper, if add "TPROGS_CFLAGS += -D__SANE_USERSPACE_TYPES__" in
samples/bpf/Makefile, it appears the following error:

Auto-detecting system features:
... libelf: [ on ]
... zlib: [ on ]
... bpf: [ OFF ]

BPF API too old
make[3]: *** [Makefile:293: bpfdep] Error 1
make[2]: *** [Makefile:156: all] Error 2

With #ifndef __SANE_USERSPACE_TYPES__ in tools/include/linux/types.h,
the above error has gone and this ifndef change does not hurt other
compilations.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/1611551146-14052-1-git-send-email-yangtiezhu@loongson.cn


# 628add78 21-Jan-2021 Tiezhu Yang <yangtiezhu@loongson.cn>

bpf, docs: Update build procedure for manually compiling LLVM and Clang

The current LLVM and Clang build procedure in samples/bpf/README.rst is
out of date. See below that the links are not accessible any more.

$ git clone http://llvm.org/git/llvm.git
Cloning into 'llvm'...
fatal: unable to access 'http://llvm.org/git/llvm.git/': Maximum (20) redirects followed
$ git clone --depth 1 http://llvm.org/git/clang.git
Cloning into 'clang'...
fatal: unable to access 'http://llvm.org/git/clang.git/': Maximum (20) redirects followed

The LLVM community has adopted new ways to build the compiler. There are
different ways to build LLVM and Clang, the Clang Getting Started page [1]
has one way. As Yonghong said, it is better to copy the build procedure
in Documentation/bpf/bpf_devel_QA.rst to keep consistent.

I verified the procedure and it is proved to be feasible, so we should
update README.rst to reflect the reality. At the same time, update the
related comment in Makefile.

Additionally, as Fangrui said, the dir llvm-project/llvm/build/install is
not used, BUILD_SHARED_LIBS=OFF is the default option [2], so also change
Documentation/bpf/bpf_devel_QA.rst together.

At last, we recommend that developers who want the fastest incremental
builds use the Ninja build system [1], you can find it in your system's
package manager, usually the package is ninja or ninja-build [3], so add
ninja to build dependencies suggested by Nathan.

[1] https://clang.llvm.org/get_started.html
[2] https://www.llvm.org/docs/CMake.html
[3] https://github.com/ninja-build/ninja/wiki/Pre-built-Ninja-packages

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Cc: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/bpf/1611279584-26047-1-git-send-email-yangtiezhu@loongson.cn


# 3627d970 03-Dec-2020 Mariusz Dudek <mariuszx.dudek@intel.com>

samples/bpf: Sample application for eBPF load and socket creation split

Introduce a sample program to demonstrate the control and data
plane split. For the control plane part a new program called
xdpsock_ctrl_proc is introduced. For the data plane part, some code
was added to xdpsock_user.c to act as the data plane entity.

Application xdpsock_ctrl_proc works as control entity with sudo
privileges (CAP_SYS_ADMIN and CAP_NET_ADMIN are sufficient) and the
extended xdpsock as data plane entity with CAP_NET_RAW capability
only.

Usage example:

sudo ./samples/bpf/xdpsock_ctrl_proc -i <interface>

sudo ./samples/bpf/xdpsock -i <interface> -q <queue_id>
-n <interval> -N -l -R

Signed-off-by: Mariusz Dudek <mariuszx.dudek@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20201203090546.11976-3-mariuszx.dudek@intel.com


# ceb5dea5 24-Nov-2020 Daniel T. Lee <danieltimlee@gmail.com>

samples: bpf: Remove bpf_load loader completely

Numerous refactoring that rewrites BPF programs written with bpf_load
to use the libbpf loader was finally completed, resulting in BPF
programs using bpf_load within the kernel being completely no longer
present.

This commit removes bpf_load, an outdated bpf loader that is difficult
to keep up with the latest kernel BPF and causes confusion.

Also, this commit removes the unused trace_helper and bpf_load from
samples/bpf target objects from Makefile.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201124090310.24374-8-danieltimlee@gmail.com


# c6497df0 24-Nov-2020 Daniel T. Lee <danieltimlee@gmail.com>

samples: bpf: Refactor test_overhead program with libbpf

This commit refactors the existing program with libbpf bpf loader.
Since the kprobe, tracepoint and raw_tracepoint bpf program can be
attached with single bpf_program__attach() interface, so the
corresponding function of libbpf is used here.

Rather than specifying the number of cpus inside the code, this commit
uses the number of available cpus with _SC_NPROCESSORS_ONLN.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201124090310.24374-6-danieltimlee@gmail.com


# 763af200 24-Nov-2020 Daniel T. Lee <danieltimlee@gmail.com>

samples: bpf: Refactor ibumad program with libbpf

This commit refactors the existing ibumad program with libbpf bpf
loader. Attach/detach of Tracepoint bpf programs has been managed
with the generic bpf_program__attach() and bpf_link__destroy() from
the libbpf.

Also, instead of using the previous BPF MAP definition, this commit
refactors ibumad MAP definition with the new BTF-defined MAP format.

To verify that this bpf program works without an infiniband device,
try loading ib_umad kernel module and test the program as follows:

# modprobe ib_umad
# ./ibumad

Moreover, TRACE_HELPERS has been removed from the Makefile since it is
not used on this program.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201124090310.24374-5-danieltimlee@gmail.com


# 4fe66415 24-Nov-2020 Daniel T. Lee <danieltimlee@gmail.com>

samples: bpf: Refactor task_fd_query program with libbpf

This commit refactors the existing kprobe program with libbpf bpf
loader. To attach bpf program, this uses generic bpf_program__attach()
approach rather than using bpf_load's load_bpf_file().

To attach bpf to perf_event, instead of using previous ioctl method,
this commit uses bpf_program__attach_perf_event since it manages the
enable of perf_event and attach of BPF programs to it, which is much
more intuitive way to achieve.

Also, explicit close(fd) has been removed since event will be closed
inside bpf_link__destroy() automatically.

Furthermore, to prevent conflict of same named uprobe events, O_TRUNC
flag has been used to clear 'uprobe_events' interface.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201124090310.24374-4-danieltimlee@gmail.com


# d89af13c 24-Nov-2020 Daniel T. Lee <danieltimlee@gmail.com>

samples: bpf: Refactor test_cgrp2_sock2 program with libbpf

This commit refactors the existing cgroup program with libbpf bpf
loader. The original test_cgrp2_sock2 has keeped the bpf program
attached to the cgroup hierarchy even after the exit of user program.
To implement the same functionality with libbpf, this commit uses the
BPF_LINK_PINNING to pin the link attachment even after it is closed.

Since this uses LINK instead of ATTACH, detach of bpf program from
cgroup with 'test_cgrp2_sock' is not used anymore.

The code to mount the bpf was added to the .sh file in case the bpff
was not mounted on /sys/fs/bpf. Additionally, to fix the problem that
shell script cannot find the binary object from the current path,
relative path './' has been added in front of binary.

Fixes: 554ae6e792ef3 ("samples/bpf: add userspace example for prohibiting sockets")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201124090310.24374-3-danieltimlee@gmail.com


# c5815ac7 24-Nov-2020 Daniel T. Lee <danieltimlee@gmail.com>

samples: bpf: Refactor hbm program with libbpf

This commit refactors the existing cgroup programs with libbpf
bpf loader. Since bpf_program__attach doesn't support cgroup program
attachment, this explicitly attaches cgroup bpf program with
bpf_program__attach_cgroup(bpf_prog, cg1).

Also, to change attach_type of bpf program, this uses libbpf's
bpf_program__set_expected_attach_type helper to switch EGRESS to
INGRESS. To keep bpf program attached to the cgroup hierarchy even
after the exit, this commit uses the BPF_LINK_PINNING to pin the link
attachment even after it is closed.

Besides, this program was broken due to the typo of BPF MAP definition.
But this commit solves the problem by fixing this from 'queue_stats' map
struct hvm_queue_stats -> hbm_queue_stats.

Fixes: 36b5d471135c ("selftests/bpf: samples/bpf: Split off legacy stuff from bpf_helpers.h")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201124090310.24374-2-danieltimlee@gmail.com


# 151936bf 10-Oct-2020 Daniel T. Lee <danieltimlee@gmail.com>

samples: bpf: Replace attach_tracepoint() to attach() in xdp_redirect_cpu

>From commit d7a18ea7e8b6 ("libbpf: Add generic bpf_program__attach()"),
for some BPF programs, it is now possible to attach BPF programs
with __attach() instead of explicitly calling __attach_<type>().

This commit refactors the __attach_tracepoint() with libbpf's generic
__attach() method. In addition, this refactors the logic of setting
the map FD to simplify the code. Also, the missing removal of
bpf_load.o in Makefile has been fixed.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201010181734.1109-3-danieltimlee@gmail.com


# 8ac91df6 10-Oct-2020 Daniel T. Lee <danieltimlee@gmail.com>

samples: bpf: Refactor xdp_monitor with libbpf

To avoid confusion caused by the increasing fragmentation of the BPF
Loader program, this commit would like to change to the libbpf loader
instead of using the bpf_load.

Thanks to libbpf's bpf_link interface, managing the tracepoint BPF
program is much easier. bpf_program__attach_tracepoint manages the
enable of tracepoint event and attach of BPF programs to it with a
single interface bpf_link, so there is no need to manage event_fd and
prog_fd separately.

This commit refactors xdp_monitor with using this libbpf API, and the
bpf_load is removed and migrated to libbpf.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201010181734.1109-2-danieltimlee@gmail.com


# 9618bde4 05-Oct-2020 Yonghong Song <yhs@fb.com>

samples/bpf: Change Makefile to cope with latest llvm

With latest llvm trunk, bpf programs under samples/bpf
directory, if using CORE, may experience the following
errors:

LLVM ERROR: Cannot select: intrinsic %llvm.preserve.struct.access.index
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0. Program arguments: llc -march=bpf -filetype=obj -o samples/bpf/test_probe_write_user_kern.o
1. Running pass 'Function Pass Manager' on module '<stdin>'.
2. Running pass 'BPF DAG->DAG Pattern Instruction Selection' on function '@bpf_prog1'
#0 0x000000000183c26c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int)
(/data/users/yhs/work/llvm-project/llvm/build.cur/install/bin/llc+0x183c26c)
...
#7 0x00000000017c375e (/data/users/yhs/work/llvm-project/llvm/build.cur/install/bin/llc+0x17c375e)
#8 0x00000000016a75c5 llvm::SelectionDAGISel::CannotYetSelect(llvm::SDNode*)
(/data/users/yhs/work/llvm-project/llvm/build.cur/install/bin/llc+0x16a75c5)
#9 0x00000000016ab4f8 llvm::SelectionDAGISel::SelectCodeCommon(llvm::SDNode*, unsigned char const*,
unsigned int) (/data/users/yhs/work/llvm-project/llvm/build.cur/install/bin/llc+0x16ab4f8)
...
Aborted (core dumped) | llc -march=bpf -filetype=obj -o samples/bpf/test_probe_write_user_kern.o

The reason is due to llvm change https://reviews.llvm.org/D87153
where the CORE relocation global generation is moved from the beginning
of target dependent optimization (llc) to the beginning
of target independent optimization (opt).

Since samples/bpf programs did not use vmlinux.h and its clang compilation
uses native architecture, we need to adjust arch triple at opt level
to do CORE relocation global generation properly. Otherwise, the above
error will appear.

This patch fixed the issue by introduce opt and llvm-dis to compilation chain,
which will do proper CORE relocation global generation as well as O2 level
optimization. Tested with llvm10, llvm11 and trunk/llvm12.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20201006043427.1891742-1-yhs@fb.com


# 35149b2c 28-Aug-2020 Cristian Dumitrescu <cristian.dumitrescu@intel.com>

samples/bpf: Add new sample xsk_fwd.c

This sample code illustrates the packet forwarding between multiple
AF_XDP sockets in multi-threading environment. All the threads and
sockets are sharing a common buffer pool, with each socket having
its own private buffer cache. The sockets are created with the
xsk_socket__create_shared() function, which allows multiple AF_XDP
sockets to share the same UMEM object.

Example 1: Single thread handling two sockets. Packets received
from socket A (on top of interface IFA, queue QA) are forwarded
to socket B (on top of interface IFB, queue QB) and vice-versa.
The thread is affinitized to CPU core C:

./xsk_fwd -i IFA -q QA -i IFB -q QB -c C

Example 2: Two threads, each handling two sockets. Packets from
socket A are sent to socket B (by thread X), packets
from socket B are sent to socket A (by thread X); packets from
socket C are sent to socket D (by thread Y), packets from socket
D are sent to socket C (by thread Y). The two threads are bound
to CPU cores CX and CY:

./xdp_fwd -i IFA -q QA -i IFB -q QB -i IFC -q QC -i IFD -q QD -c CX -c CY

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/1598603189-32145-15-git-send-email-magnus.karlsson@intel.com


# f0c328f8 23-Aug-2020 Daniel T. Lee <danieltimlee@gmail.com>

samples: bpf: Refactor tracepoint tracing programs with libbpf

For the problem of increasing fragmentation of the bpf loader programs,
instead of using bpf_loader.o, which is used in samples/bpf, this
commit refactors the existing tracepoint tracing programs with libbbpf
bpf loader.

- Adding a tracepoint event and attaching a bpf program to it was done
through bpf_program_attach().
- Instead of using the existing BPF MAP definition, MAP definition
has been refactored with the new BTF-defined MAP format.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200823085334.9413-4-danieltimlee@gmail.com


# 3677d0a1 23-Aug-2020 Daniel T. Lee <danieltimlee@gmail.com>

samples: bpf: Refactor kprobe tracing programs with libbpf

For the problem of increasing fragmentation of the bpf loader programs,
instead of using bpf_loader.o, which is used in samples/bpf, this
commit refactors the existing kprobe tracing programs with libbbpf
bpf loader.

- For kprobe events pointing to system calls, the SYSCALL() macro in
trace_common.h was used.
- Adding a kprobe event and attaching a bpf program to it was done
through bpf_program_attach().
- Instead of using the existing BPF MAP definition, MAP definition
has been refactored with the new BTF-defined MAP format.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200823085334.9413-3-danieltimlee@gmail.com


# 35a8b6dd 23-Aug-2020 Daniel T. Lee <danieltimlee@gmail.com>

samples: bpf: Cleanup bpf_load.o from Makefile

Since commit cc7f641d637b ("samples: bpf: Refactor BPF map performance
test with libbpf") has ommited the removal of bpf_load.o from Makefile,
this commit removes the bpf_load.o rule for targets where bpf_load.o is
not used.

Fixes: cc7f641d637b ("samples: bpf: Refactor BPF map performance test with libbpf")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200823085334.9413-2-danieltimlee@gmail.com


# 88795b4a 07-Jul-2020 Daniel T. Lee <danieltimlee@gmail.com>

samples: bpf: Refactor BPF map in map test with libbpf

From commit 646f02ffdd49 ("libbpf: Add BTF-defined map-in-map
support"), a way to define internal map in BTF-defined map has been
added.

Instead of using previous 'inner_map_idx' definition, the structure to
be used for the inner map can be directly defined using array directive.

__array(values, struct inner_map)

This commit refactors map in map test program with libbpf by explicitly
defining inner map with BTF-defined format.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200707184855.30968-3-danieltimlee@gmail.com


# bc1a8597 15-May-2020 Daniel T. Lee <danieltimlee@gmail.com>

samples, bpf: Refactor tail call user progs with libbpf

BPF tail call uses the BPF_MAP_TYPE_PROG_ARRAY type map for calling
into other BPF programs and this PROG_ARRAY should be filled prior to
use. Currently, samples with the PROG_ARRAY type MAP fill this program
array with bpf_load. For bpf_load to fill this map, kernel BPF program
must specify the section with specific format of <prog_type>/<array_idx>
(e.g. SEC("socket/0"))

But by using libbpf instead of bpf_load, user program can specify which
programs should be added to PROG_ARRAY. The advantage of this approach
is that you can selectively add only the programs you want, rather than
adding all of them to PROG_ARRAY, and it's much more intuitive than the
traditional approach.

This commit refactors user programs with the PROG_ARRAY type MAP with
libbpf instead of using bpf_load.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200516040608.1377876-4-danieltimlee@gmail.com


# 63841bc0 15-May-2020 Daniel T. Lee <danieltimlee@gmail.com>

samples, bpf: Refactor kprobe tracing user progs with libbpf

Currently, the kprobe BPF program attachment method for bpf_load is
quite old. The implementation of bpf_load "directly" controls and
manages(create, delete) the kprobe events of DEBUGFS. On the other hand,
using using the libbpf automatically manages the kprobe event.
(under bpf_link interface)

By calling bpf_program__attach(_kprobe) in libbpf, the corresponding
kprobe is created and the BPF program will be attached to this kprobe.
To remove this, by simply invoking bpf_link__destroy will clean up the
event.

This commit refactors kprobe tracing programs (tracex{1~7}_user.c) with
libbpf using bpf_link interface and bpf_program__attach.

tracex2_kern.c, which tracks system calls (sys_*), has been modified to
append prefix depending on architecture.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200516040608.1377876-3-danieltimlee@gmail.com


# aa5e2af6 21-Mar-2020 Daniel T. Lee <danieltimlee@gmail.com>

samples, bpf: Refactor perf_event user program with libbpf bpf_link

The bpf_program__attach of libbpf(using bpf_link) is much more intuitive
than the previous method using ioctl.

bpf_program__attach_perf_event manages the enable of perf_event and
attach of BPF programs to it, so there's no neeed to do this
directly with ioctl.

In addition, bpf_link provides consistency in the use of API because it
allows disable (detach, destroy) for multiple events to be treated as
one bpf_link__destroy. Also, bpf_link__destroy manages the close() of
perf_event fd.

This commit refactors samples that attach the bpf program to perf_event
by using libbbpf instead of ioctl. Also the bpf_load in the samples were
removed and migrated to use libbbpf API.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200321100424.1593964-3-danieltimlee@gmail.com


# 24a6034a 21-Mar-2020 Daniel T. Lee <danieltimlee@gmail.com>

samples, bpf: Move read_trace_pipe to trace_helpers

To reduce the reliance of trace samples (trace*_user) on bpf_load,
move read_trace_pipe to trace_helpers. By moving this bpf_loader helper
elsewhere, trace functions can be easily migrated to libbbpf.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200321100424.1593964-2-danieltimlee@gmail.com


# 5f2fb52f 01-Feb-2020 Masahiro Yamada <masahiroy@kernel.org>

kbuild: rename hostprogs-y/always to hostprogs/always-y

In old days, the "host-progs" syntax was used for specifying host
programs. It was renamed to the current "hostprogs-y" in 2004.

It is typically useful in scripts/Makefile because it allows Kbuild to
selectively compile host programs based on the kernel configuration.

This commit renames like follows:

always -> always-y
hostprogs-y -> hostprogs

So, scripts/Makefile will look like this:

always-$(CONFIG_BUILD_BIN2C) += ...
always-$(CONFIG_KALLSYMS) += ...
...
hostprogs := $(always-y) $(always-m)

I think this makes more sense because a host program is always a host
program, irrespective of the kernel configuration. We want to specify
which ones to compile by CONFIG options, so always-y will be handier.

The "always", "hostprogs-y", "hostprogs-m" will be kept for backward
compatibility for a while.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>


# 7cf245a3 20-Jan-2020 Toke Høiland-Jørgensen <toke@redhat.com>

samples/bpf: Use consistent include paths for libbpf

Fix all files in samples/bpf to include libbpf header files with the bpf/
prefix, to be consistent with external users of the library. Also ensure
that all includes of exported libbpf header files (those that are exported
on 'make install' of the library) use bracketed includes instead of quoted.

To make sure no new files are introduced that doesn't include the bpf/
prefix in its include, remove tools/lib/bpf from the include path entirely,
and use tools/lib instead.

Fixes: 6910d7d3867a ("selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157952560911.1683545.8795966751309534150.stgit@toke.dk


# b2e5e93a 20-Jan-2020 Toke Høiland-Jørgensen <toke@redhat.com>

samples/bpf: Don't try to remove user's homedir on clean

The 'clean' rule in the samples/bpf Makefile tries to remove backup
files (ending in ~). However, if no such files exist, it will instead try
to remove the user's home directory. While the attempt is mostly harmless,
it does lead to a somewhat scary warning like this:

rm: cannot remove '~': Is a directory

Fix this by using find instead of shell expansion to locate any actual
backup files that need to be removed.

Fixes: b62a796c109c ("samples/bpf: allow make to be run from samples/bpf/ directory")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/157952560126.1683545.7273054725976032511.stgit@toke.dk


# 45027897 16-Dec-2019 Toke Høiland-Jørgensen <toke@redhat.com>

samples/bpf: Set -fno-stack-protector when building BPF programs

It seems Clang can in some cases turn on stack protection by default, which
doesn't work with BPF. This was reported once before[0], but it seems the
flag to explicitly turn off the stack protector wasn't added to the
Makefile, so do that now.

The symptom of this is compile errors like the following:

error: <unknown>:0:0: in function bpf_prog1 i32 (%struct.__sk_buff*): A call to built-in function '__stack_chk_fail' is not supported.

[0] https://www.spinics.net/lists/netdev/msg556400.html

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191216103819.359535-1-toke@redhat.com


# 5615ed47 16-Dec-2019 Toke Høiland-Jørgensen <toke@redhat.com>

samples/bpf: Add missing -lz to TPROGS_LDLIBS

Since libbpf now links against zlib, this needs to be included in the
linker invocation for the userspace programs in samples/bpf that link
statically against libbpf.

Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/20191216102405.353834-1-toke@redhat.com


# 5984dc6c 16-Dec-2019 Prashant Bhole <prashantbhole.linux@gmail.com>

samples/bpf: Reintroduce missed build targets

Add xdp_redirect and per_socket_stats_example in build targets.
They got removed from build targets in Makefile reorganization.

Fixes: 1d97c6c2511f ("samples/bpf: Base target programs rules on Makefile.target")
Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191216071619.25479-1-prashantbhole.linux@gmail.com


# 6cc2c876 10-Nov-2019 Jesper Dangaard Brouer <brouer@redhat.com>

samples/bpf: adjust Makefile and README.rst

Side effect of some kbuild changes resulted in breaking the
documented way to build samples/bpf/.

This patch change the samples/bpf/Makefile to work again, when
invoking make from the subdir samples/bpf/. Also update the
documentation in README.rst, to reflect the new way to build.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 2e5d72c1 07-Nov-2019 Magnus Karlsson <magnus.karlsson@intel.com>

samples/bpf: Add XDP_SHARED_UMEM support to xdpsock

Add support for the XDP_SHARED_UMEM mode to the xdpsock sample
application. As libbpf does not have a built in XDP program for this
mode, we use an explicitly loaded XDP program. This also serves as an
example on how to write your own XDP program that can route to an
AF_XDP socket.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: William Tu <u9012063@gmail.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Link: https://lore.kernel.org/bpf/1573148860-30254-3-git-send-email-magnus.karlsson@intel.com


# 04ec044b 01-Oct-2019 Björn Töpel <bjorn@kernel.org>

samples/bpf: fix build by setting HAVE_ATTR_TEST to zero

To remove that test_attr__{enabled/open} are used by perf-sys.h, we
set HAVE_ATTR_TEST to zero.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: KP Singh <kpsingh@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: http://lore.kernel.org/bpf/20191001113307.27796-3-bjorn.topel@gmail.com


# b2327c10 10-Oct-2019 Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>

samples/bpf: Add sysroot support

Basically it only enables that was added by previous couple fixes.
Sysroot contains correct libs installed and its headers. Useful when
working with NFC or virtual machine.

Usage example:

clean (on demand)
make ARCH=arm -C samples/bpf clean
make ARCH=arm -C tools clean
make ARCH=arm clean

configure and install headers:

make ARCH=arm defconfig
make ARCH=arm headers_install

build samples/bpf:
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- samples/bpf/ \
SYSROOT="path/to/sysroot"

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191011002808.28206-15-ivan.khoronzhuk@linaro.org


# d8ceae91 10-Oct-2019 Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>

samples/bpf: Provide C/LDFLAGS to libbpf

In order to build lib using C/LD flags of target arch, provide them
to libbpf make.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191011002808.28206-14-ivan.khoronzhuk@linaro.org


# a833effa 10-Oct-2019 Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>

samples/bpf: Use target CC environment for HDR_PROBE

No need in hacking HOSTCC to be cross-compiler any more, so drop
this trick and use target CC for HDR_PROBE.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191011002808.28206-11-ivan.khoronzhuk@linaro.org


# 10cb3d87 10-Oct-2019 Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>

samples/bpf: Use own flags but not HOSTCFLAGS

While compiling natively, the host's cflags and ldflags are equal to
ones used from HOSTCFLAGS and HOSTLDFLAGS. When cross compiling it
should have own, used for target arch. While verification, for arm,
arm64 and x86_64 the following flags were used always:

-Wall -O2
-fomit-frame-pointer
-Wmissing-prototypes
-Wstrict-prototypes

So, add them as they were verified and used before adding
Makefile.target and lets omit "-fomit-frame-pointer" as were proposed
while review, as no sense in such optimization for samples.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191011002808.28206-10-ivan.khoronzhuk@linaro.org


# 1d97c6c2 10-Oct-2019 Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>

samples/bpf: Base target programs rules on Makefile.target

The main reason for that - HOSTCC and CC have different aims.
HOSTCC is used to build programs running on host, that can
cross-comple target programs with CC. It was tested for arm and arm64
cross compilation, based on linaro toolchain, but should work for
others.

So, in order to split cross compilation (CC) with host build (HOSTCC),
lets base samples on Makefile.target. It allows to cross-compile
samples/bpf programs with CC while auxialry tools running on host
built with HOSTCC.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191011002808.28206-9-ivan.khoronzhuk@linaro.org


# 54b7fbd4 10-Oct-2019 Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>

samples/bpf: Drop unnecessarily inclusion for bpf_load

Drop inclusion for bpf_load -I$(objtree)/usr/include as it is
included for all objects anyway, with above line:
KBUILD_HOSTCFLAGS += -I$(objtree)/usr/include

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191011002808.28206-7-ivan.khoronzhuk@linaro.org


# 0e865aed 10-Oct-2019 Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>

samples/bpf: Use __LINUX_ARM_ARCH__ selector for arm

For arm, -D__LINUX_ARM_ARCH__=X is min version used as instruction
set selector and is absolutely required while parsing some parts of
headers. It's present in KBUILD_CFLAGS but not in autoconf.h, so let's
retrieve it from and add to programs cflags. In another case errors
like "SMP is not supported" for armv7 and bunch of other errors are
issued resulting to incorrect final object.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191011002808.28206-6-ivan.khoronzhuk@linaro.org


# 2a560df7 10-Oct-2019 Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>

samples/bpf: Use own EXTRA_CFLAGS for clang commands

It can overlap with CFLAGS used for libraries built with gcc if
not now then in next patches. Correct it here for simplicity.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191011002808.28206-5-ivan.khoronzhuk@linaro.org


# 518c1340 10-Oct-2019 Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>

samples/bpf: Use --target from cross-compile

For cross compiling the target triple can be inherited from
cross-compile prefix as it's done in CLANG_FLAGS from kernel makefile.
So copy-paste this decision from kernel Makefile.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191011002808.28206-4-ivan.khoronzhuk@linaro.org


# 39e0c364 10-Oct-2019 Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>

samples/bpf: Fix cookie_uid_helper_example obj build

Don't list userspace "cookie_uid_helper_example" object in list for
bpf objects.

'always' target is used for listing bpf programs, but
'cookie_uid_helper_example.o' is a user space ELF file, and covered
by rule `per_socket_stats_example`, so shouldn't be in 'always'.
Let us remove `always += cookie_uid_helper_example.o`, which avoids
breaking cross compilation due to mismatched includes.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191011002808.28206-3-ivan.khoronzhuk@linaro.org


# cdd5b2d1 10-Oct-2019 Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>

samples/bpf: Fix HDR_PROBE "echo"

echo should be replaced with echo -e to handle '\n' correctly, but
instead, replace it with printf as some systems can't handle echo -e.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191011002808.28206-2-ivan.khoronzhuk@linaro.org


# e01a75c1 08-Oct-2019 Andrii Nakryiko <andriin@fb.com>

libbpf: Move bpf_{helpers, helper_defs, endian, tracing}.h into libbpf

Move bpf_helpers.h, bpf_tracing.h, and bpf_endian.h into libbpf. Move
bpf_helper_defs.h generation into libbpf's Makefile. Ensure all those
headers are installed along the other libbpf headers. Also, adjust
selftests and samples include path to include libbpf now.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191008175942.1769476-6-andriin@fb.com


# fce9501a 01-Oct-2019 Björn Töpel <bjorn@kernel.org>

samples/bpf: fix build by setting HAVE_ATTR_TEST to zero

To remove that test_attr__{enabled/open} are used by perf-sys.h, we
set HAVE_ATTR_TEST to zero.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: KP Singh <kpsingh@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20191001113307.27796-3-bjorn.topel@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 81f522f9 15-Jul-2019 Ilya Leoshkevich <iii@linux.ibm.com>

samples/bpf: build with -D__TARGET_ARCH_$(SRCARCH)

While $ARCH can be relatively flexible (see Makefile and
tools/scripts/Makefile.arch), $SRCARCH always corresponds to a directory
name under arch/.

Therefore, build samples with -D__TARGET_ARCH_$(SRCARCH), since that
matches the expectations of bpf_helpers.h.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 39533884 02-Jul-2019 Stanislav Fomichev <sdf@google.com>

samples/bpf: add sample program that periodically dumps TCP stats

Uses new RTT callback to dump stats every second.

$ mkdir -p /tmp/cgroupv2
$ mount -t cgroup2 none /tmp/cgroupv2
$ mkdir -p /tmp/cgroupv2/foo
$ echo $$ >> /tmp/cgroupv2/foo/cgroup.procs
$ bpftool prog load ./tcp_dumpstats_kern.o /sys/fs/bpf/tcp_prog
$ bpftool cgroup attach /tmp/cgroupv2/foo sock_ops pinned /sys/fs/bpf/tcp_prog
$ bpftool prog tracelog
$ # run neper/netperf/etc

Used neper to compare performance with and without this program attached
and didn't see any noticeable performance impact.

Sample output:
<idle>-0 [015] ..s. 2074.128800: 0: dsack_dups=0 delivered=242526
<idle>-0 [015] ..s. 2074.128808: 0: delivered_ce=0 icsk_retransmits=0
<idle>-0 [015] ..s. 2075.130133: 0: dsack_dups=0 delivered=323599
<idle>-0 [015] ..s. 2075.130138: 0: delivered_ce=0 icsk_retransmits=0
<idle>-0 [005] .Ns. 2076.131440: 0: dsack_dups=0 delivered=404648
<idle>-0 [005] .Ns. 2076.131447: 0: delivered_ce=0 icsk_retransmits=0

Cc: Eric Dumazet <edumazet@google.com>
Cc: Priyaranjan Jha <priyarjha@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 71634d7f 02-Jul-2019 brakmo <brakmo@fb.com>

bpf: Add support for fq's EDT to HBM

Adds support for fq's Earliest Departure Time to HBM (Host Bandwidth
Manager). Includes a new BPF program supporting EDT, and also updates
corresponding programs.

It will drop packets with an EDT of more than 500us in the future
unless the packet belongs to a flow with less than 2 packets in flight.
This is done so each flow has at least 2 packets in flight, so they
will not starve, and also to help prevent delayed ACK timeouts.

It will also work with ECN enabled traffic, where the packets will be
CE marked if their EDT is more than 50us in the future.

The table below shows some performance numbers. The flows are back to
back RPCS. One server sending to another, either 2 or 4 flows.
One flow is a 10KB RPC, the rest are 1MB RPCs. When there are more
than one flow of a given RPC size, the numbers represent averages.

The rate limit applies to all flows (they are in the same cgroup).
Tests ending with "-edt" ran with the new BPF program supporting EDT.
Tests ending with "-hbt" ran on top HBT qdisc with the specified rate
(i.e. no HBM). The other tests ran with the HBM BPF program included
in the HBM patch-set.

EDT has limited value when using DCTCP, but it helps in many cases when
using Cubic. It usually achieves larger link utilization and lower
99% latencies for the 1MB RPCs.
HBM ends up queueing a lot of packets with its default parameter values,
reducing the goodput of the 10KB RPCs and increasing their latency. Also,
the RTTs seen by the flows are quite large.

Aggr 10K 10K 10K 1MB 1MB 1MB
Limit rate drops RTT rate P90 P99 rate P90 P99
Test rate Flows Mbps % us Mbps us us Mbps ms ms
-------- ---- ----- ---- ----- --- ---- ---- ---- ---- ---- ----
cubic 1G 2 904 0.02 108 257 511 539 647 13.4 24.5
cubic-edt 1G 2 982 0.01 156 239 656 967 743 14.0 17.2
dctcp 1G 2 977 0.00 105 324 408 744 653 14.5 15.9
dctcp-edt 1G 2 981 0.01 142 321 417 811 660 15.7 17.0
cubic-htb 1G 2 919 0.00 1825 40 2822 4140 879 9.7 9.9

cubic 200M 2 155 0.30 220 81 532 655 74 283 450
cubic-edt 200M 2 188 0.02 222 87 1035 1095 101 84 85
dctcp 200M 2 188 0.03 111 77 912 939 111 76 325
dctcp-edt 200M 2 188 0.03 217 74 1416 1738 114 76 79
cubic-htb 200M 2 188 0.00 5015 8 14ms 15ms 180 48 50

cubic 1G 4 952 0.03 110 165 516 546 262 38 154
cubic-edt 1G 4 973 0.01 190 111 1034 1314 287 65 79
dctcp 1G 4 951 0.00 103 180 617 905 257 37 38
dctcp-edt 1G 4 967 0.00 163 151 732 1126 272 43 55
cubic-htb 1G 4 914 0.00 3249 13 7ms 8ms 300 29 34

cubic 5G 4 4236 0.00 134 305 490 624 1310 10 17
cubic-edt 5G 4 4865 0.00 156 306 425 759 1520 10 16
dctcp 5G 4 4936 0.00 128 485 221 409 1484 7 9
dctcp-edt 5G 4 4924 0.00 148 390 392 623 1508 11 26

v1 -> v2: Incorporated Andrii's suggestions
v2 -> v3: Incorporated Yonghong's suggestions
v3 -> v4: Removed credit update that is not needed

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# fa206dcc 15-Jun-2019 Daniel T. Lee <danieltimlee@gmail.com>

samples: bpf: remove unnecessary include options in Makefile

Due to recent change of include path at commit b552d33c80a9
("samples/bpf: fix include path in Makefile"), some of the
previous include options became unnecessary.

This commit removes duplicated include options in Makefile.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# b552d33c 14-Jun-2019 Prashant Bhole <prashantbhole.linux@gmail.com>

samples/bpf: fix include path in Makefile

Recent commit included libbpf.h in selftests/bpf/bpf_util.h.
Since some samples use bpf_util.h and samples/bpf/Makefile doesn't
have libbpf.h path included, build was failing. Let's add the path
in samples/bpf/Makefile.

Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 0ed3cc4a 07-Jun-2019 Jakub Kicinski <kuba@kernel.org>

samples: bpf: don't run probes at the local make stage

Quentin reports that commit 07c3bbdb1a9b ("samples: bpf: print
a warning about headers_install") is producing the false
positive when make is invoked locally, from the samples/bpf/
directory.

When make is run locally it hits the "all" target, which
will recursively invoke make through the full build system.

Speed up the "local" run which doesn't actually build anything,
and avoid false positives by skipping all the probes if not in
kbuild environment (cover both the new warning and the BTF
probes).

Reported-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 07c3bbdb 05-Jun-2019 Jakub Kicinski <kuba@kernel.org>

samples: bpf: print a warning about headers_install

It seems like periodically someone posts patches to "fix"
header includes. The issue is that samples expect the
include path to have the uAPI headers (from usr/) first,
and then tools/ headers, so that locally installed uAPI
headers take precedence. This means that if users didn't
run headers_install they will see all sort of strange
compilation errors, e.g.:

HOSTCC samples/bpf/test_lru_dist
samples/bpf/test_lru_dist.c:39:8: error: redefinition of ‘struct list_head’
struct list_head {
^~~~~~~~~
In file included from samples/bpf/test_lru_dist.c:9:0:
../tools/include/linux/types.h:69:8: note: originally defined here
struct list_head {
^~~~~~~~~

Try to detect this situation, and print a helpful warning.

v2: just use HOSTCC (Jiong).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# ba0c0cc0 25-May-2019 Roman Gushchin <guro@fb.com>

selftests/bpf: convert test_cgrp2_attach2 example into kselftest

Convert test_cgrp2_attach2 example into a proper test_cgroup_attach
kselftest. It's better because we do run kselftest on a constant
basis, so there are better chances to spot a potential regression.

Also make it slightly less verbose to conform kselftests output style.

Output example:
$ ./test_cgroup_attach
#override:PASS
#multi:PASS
test_cgroup_attach:PASS

Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 0ac01feb 19-Mar-2019 Ira Weiny <ira.weiny@intel.com>

BPF: Add sample code for new ib_umad tracepoint

Provide a count of class types for a summary of MAD packets. The example
shows one way to filter the trace data based on management class.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>


# a1270fe9 01-Mar-2019 brakmo <brakmo@fb.com>

bpf: User program for testing HBM

The program nrm creates a cgroup and attaches a BPF program to the
cgroup for testing HBM (Host Bandwidth Manager) for egress traffic.
One still needs to create network traffic. This can be done through
netesto, netperf or iperf3.
A follow-up patch contains a script to create traffic.

USAGE: hbm [-d] [-l] [-n <id>] [-r <rate>] [-s] [-t <secs>]
[-w] [-h] [prog]
Where:
-d Print BPF trace debug buffer
-l Also limit flows doing loopback
-n <#> To create cgroup "/hbm#" and attach prog. Default is /nrm1
This is convenient when testing HBM in more than 1 cgroup
-r <rate> Rate limit in Mbps
-s Get HBM stats (marked, dropped, etc.)
-t <time> Exit after specified seconds (deault is 0)
-w Work conserving flag. cgroup can increase its bandwidth
beyond the rate limit specified while there is available
bandwidth. Current implementation assumes there is only
NIC (eth0), but can be extended to support multiple NICs.
Currrently only supported for egress. Note, this is just
a proof of concept.
-h Print this info
prog BPF program file name. Name defaults to hbm_out_kern.o

More information about HBM can be found in the paper "BPF Host Resource
Management" presented at the 2018 Linux Plumbers Conference, Networking Track
(http://vger.kernel.org/lpc_net2018_talks/LPC%20BPF%20Network%20Resource%20Paper.pdf)

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 187d0738 01-Mar-2019 brakmo <brakmo@fb.com>

bpf: Sample HBM BPF program to limit egress bw

A cgroup skb BPF program to limit cgroup output bandwidth.
It uses a modified virtual token bucket queue to limit average
egress bandwidth. The implementation uses credits instead of tokens.
Negative credits imply that queueing would have happened (this is
a virtual queue, so no queueing is done by it. However, queueing may
occur at the actual qdisc (which is not used for rate limiting).

This implementation uses 3 thresholds, one to start marking packets and
the other two to drop packets:
CREDIT
- <--------------------------|------------------------> +
| | | 0
| Large pkt |
| drop thresh |
Small pkt drop Mark threshold
thresh

The effect of marking depends on the type of packet:
a) If the packet is ECN enabled, then the packet is ECN ce marked.
The current mark threshold is tuned for DCTCP.
c) Else, it is dropped if it is a large packet.

If the credit is below the drop threshold, the packet is dropped.
Note that dropping a packet through the BPF program does not trigger CWR
(Congestion Window Reduction) in TCP packets. A future patch will add
support for triggering CWR.

This BPF program actually uses 2 drop thresholds, one threshold
for larger packets (>= 120 bytes) and another for smaller packets. This
protects smaller packets such as SYNs, ACKs, etc.

The default bandwidth limit is set at 1Gbps but this can be changed by
a user program through a shared BPF map. In addition, by default this BPF
program does not limit connections using loopback. This behavior can be
overwritten by the user program. There is also an option to calculate
some statistics, such as percent of packets marked or dropped, which
the user program can access.

A latter patch provides such a program (hbm.c)

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 1a9b268c 27-Feb-2019 Jakub Kicinski <kuba@kernel.org>

samples: bpf: use libbpf where easy

Some samples don't really need the magic of bpf_load,
switch them to libbpf.

v2: - specify program types.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# ea9b6362 27-Feb-2019 Jakub Kicinski <kuba@kernel.org>

samples: bpf: remove load_sock_ops in favour of bpftool

bpftool can do all the things load_sock_ops used to do, and more.
Point users to bpftool instead of maintaining this sample utility.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 248c7f9c 21-Feb-2019 Magnus Karlsson <magnus.karlsson@intel.com>

samples/bpf: convert xdpsock to use libbpf for AF_XDP access

This commit converts the xdpsock sample application to use the AF_XDP
functions present in libbpf. This cuts down the size of it by nearly
300 lines of code.

The default ring sizes plus the batch size has been increased and the
size of the umem area has decreased. This so that the sample application
will provide higher throughput. Note also that the shared umem code
has been removed from the sample as this is not supported by libbpf
at this point in time.

Tested-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# bbaf6029 01-Feb-2019 Maciej Fijalkowski <maciejromanfijalkowski@gmail.com>

samples/bpf: Convert XDP samples to libbpf usage

Some of XDP samples that are attaching the bpf program to the interface
via libbpf's bpf_set_link_xdp_fd are still using the bpf_load.c for
loading and manipulating the ebpf program and maps. Convert them to do
this through libbpf usage and remove bpf_load from the picture.

While at it remove what looks like debug leftover in
xdp_redirect_map_user.c

In xdp_redirect_cpu, change the way that the program to be loaded onto
interface is chosen - user now needs to pass the program's section name
instead of the relative number. In case of typo print out the section
names to choose from.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 6bf3bbe1 12-Jan-2019 Yonghong Song <yhs@fb.com>

samples/bpf: workaround clang asm goto compilation errors

x86 compilation has required asm goto support since 4.17.
Since clang does not support asm goto, at 4.17,
Commit b1ae32dbab50 ("x86/cpufeature: Guard asm_volatile_goto usage
for BPF compilation") worked around the issue by permitting an
alternative implementation without asm goto for clang.

At 5.0, more asm goto usages appeared.
[yhs@148 x86]$ egrep -r asm_volatile_goto
include/asm/cpufeature.h: asm_volatile_goto("1: jmp 6f\n"
include/asm/jump_label.h: asm_volatile_goto("1:"
include/asm/jump_label.h: asm_volatile_goto("1:"
include/asm/rmwcc.h: asm_volatile_goto (fullop "; j" #cc " %l[cc_label]" \
include/asm/uaccess.h: asm_volatile_goto("\n" \
include/asm/uaccess.h: asm_volatile_goto("\n" \
[yhs@148 x86]$

Compiling samples/bpf directories, most bpf programs failed
compilation with error messages like:
In file included from /home/yhs/work/bpf-next/samples/bpf/xdp_sample_pkts_kern.c:2:
In file included from /home/yhs/work/bpf-next/include/linux/ptrace.h:6:
In file included from /home/yhs/work/bpf-next/include/linux/sched.h:15:
In file included from /home/yhs/work/bpf-next/include/linux/sem.h:5:
In file included from /home/yhs/work/bpf-next/include/uapi/linux/sem.h:5:
In file included from /home/yhs/work/bpf-next/include/linux/ipc.h:9:
In file included from /home/yhs/work/bpf-next/include/linux/refcount.h:72:
/home/yhs/work/bpf-next/arch/x86/include/asm/refcount.h:70:9: error: 'asm goto' constructs are not supported yet
return GEN_BINARY_SUFFIXED_RMWcc(LOCK_PREFIX "subl",
^
/home/yhs/work/bpf-next/arch/x86/include/asm/rmwcc.h:67:2: note: expanded from macro 'GEN_BINARY_SUFFIXED_RMWcc'
__GEN_RMWcc(op " %[val], %[var]\n\t" suffix, var, cc, \
^
/home/yhs/work/bpf-next/arch/x86/include/asm/rmwcc.h:21:2: note: expanded from macro '__GEN_RMWcc'
asm_volatile_goto (fullop "; j" #cc " %l[cc_label]" \
^
/home/yhs/work/bpf-next/include/linux/compiler_types.h:188:37: note: expanded from macro 'asm_volatile_goto'
#define asm_volatile_goto(x...) asm goto(x)

Most implementation does not even provide an alternative
implementation. And it is also not practical to make changes
for each call site.

This patch workarounded the asm goto issue by redefining the macro like below:
#define asm_volatile_goto(x...) asm volatile("invalid use of asm_volatile_goto")

If asm_volatile_goto is not used by bpf programs, which is typically the case, nothing bad
will happen. If asm_volatile_goto is used by bpf programs, which is incorrect, the compiler
will issue an error since "invalid use of asm_volatile_goto" is not valid assembly codes.

With this patch, all bpf programs under samples/bpf can pass compilation.

Note that bpf programs under tools/testing/selftests/bpf/ compiled fine as
they do not access kernel internal headers.

Fixes: e769742d3584 ("Revert "x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs"")
Fixes: 18fe58229d80 ("x86, asm: change the GEN_*_RMWcc() macros to not quote the condition")
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 2c667d77 22-Dec-2018 Masahiro Yamada <yamada.masahiro@socionext.com>

treewide: add intermediate .s files to targets

Avoid unneeded recreation of these in the incremental build.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>


# 4d4b5c2e 22-Dec-2018 Masahiro Yamada <yamada.masahiro@socionext.com>

treewide: remove explicit rules for *offsets.s

These explicit rules are unneeded because scripts/Makefile.build
provides a pattern rule to create %.s from %.c

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>


# 9ce6ae22 19-Nov-2018 Yonghong Song <yhs@fb.com>

tools/bpf: do not use pahole if clang/llvm can generate BTF sections

Add additional checks in tools/testing/selftests/bpf and
samples/bpf such that if clang/llvm compiler can generate
BTF sections, do not use pahole.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# acb4ea95 30-Aug-2018 Nikita V. Shirokov <tehnerd@fb.com>

bpf: add TCP_SAVE_SYN/TCP_SAVED_SYN sample program

Sample program which shows TCP_SAVE_SYN/TCP_SAVED_SYN usage example:
bpf program which is doing TOS/TCLASS reflection (server would reply
with a same TOS/TCLASS as client).

Signed-off-by: Nikita V. Shirokov <tehnerd@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 6748182c 26-Jul-2018 Jakub Kicinski <kuba@kernel.org>

samples: bpf: convert xdpsock_user.c to libbpf

Convert xdpsock_user.c to use libbpf instead of bpf_load.o.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# e1a40ef4 26-Jul-2018 Jakub Kicinski <kuba@kernel.org>

samples: bpf: convert xdp_fwd_user.c to libbpf

Convert xdp_fwd_user.c to use libbpf instead of bpf_load.o.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 9778cfdf 26-Jul-2018 Taeung Song <treeze.taeung@gmail.com>

samples/bpf: Add BTF build flags to Makefile

To smoothly test BTF supported binary on samples/bpf,
let samples/bpf/Makefile probe llc, pahole and
llvm-objcopy for BPF support and use them
like tools/testing/selftests/bpf/Makefile
changed from the commit c0fa1b6c3efc ("bpf: btf:
Add BTF tests").

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 8377bd2b 09-Jul-2018 Laura Abbott <labbott@redhat.com>

kbuild: Rename HOST_LOADLIBES to KBUILD_HOSTLDLIBS

In preparation for enabling command line LDLIBS, re-name HOST_LOADLIBES
to KBUILD_HOSTLDLIBS as the internal use only flags. Also rename
existing usage to HOSTLDLIBS for consistency. This should not have any
visible effects.

Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>


# 96f14fe7 09-Jul-2018 Laura Abbott <labbott@redhat.com>

kbuild: Rename HOSTCFLAGS to KBUILD_HOSTCFLAGS

In preparation for enabling command line CFLAGS, re-name HOSTCFLAGS to
KBUILD_HOSTCFLAGS as the internal use only flags. This should not have
any visible effects.

Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>


# 1e54ad25 25-Jun-2018 Toke Høiland-Jørgensen <toke@toke.dk>

samples/bpf: Add xdp_sample_pkts example

Add an example program showing how to sample packets from XDP using the
perf event buffer. The example userspace program just prints the ethernet
header for every packet sampled.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# ecb96f7f 24-May-2018 Yonghong Song <yhs@fb.com>

samples/bpf: add a samples/bpf test for BPF_TASK_FD_QUERY

This is mostly to test kprobe/uprobe which needs kernel headers.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 768759ed 14-May-2018 Jakub Kicinski <kuba@kernel.org>

samples: bpf: make the build less noisy

Building samples with clang ignores the $(Q) setting, always
printing full command to the output. Make it less verbose.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 0cc54db1 14-May-2018 Jakub Kicinski <kuba@kernel.org>

samples: bpf: move libbpf from object dependencies to libs

Make complains that it doesn't know how to make libbpf.a:

scripts/Makefile.host:106: target 'samples/bpf/../../tools/lib/bpf/libbpf.a' doesn't match the target pattern

Now that we have it as a dependency of the sources simply add libbpf.a
to libraries not objects.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 787360f8 14-May-2018 Jakub Kicinski <kuba@kernel.org>

samples: bpf: fix build after move to compiling full libbpf.a

There are many ways users may compile samples, some of them got
broken by commit 5f9380572b4b ("samples: bpf: compile and link
against full libbpf"). Improve path resolution and make libbpf
building a dependency of source files to force its build.

Samples should now again build with any of:
cd samples/bpf; make
make samples/bpf/
make -C samples/bpf
cd samples/bpf; make O=builddir
make samples/bpf/ O=builddir
make -C samples/bpf O=builddir
export KBUILD_OUTPUT=builddir
make samples/bpf/
make -C samples/bpf

Fixes: 5f9380572b4b ("samples: bpf: compile and link against full libbpf")
Reported-by: Björn Töpel <bjorn.topel@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# b1ae32db 13-May-2018 Alexei Starovoitov <ast@kernel.org>

x86/cpufeature: Guard asm_volatile_goto usage for BPF compilation

Workaround for the sake of BPF compilation which utilizes kernel
headers, but clang does not support ASM GOTO and fails the build.

Fixes: d0266046ad54 ("x86: Remove FAST_FEATURE_TESTS")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: daniel@iogearbox.net
Cc: peterz@infradead.org
Cc: netdev@vger.kernel.org
Cc: bp@alien8.de
Cc: yhs@fb.com
Cc: kernel-team@fb.com
Cc: torvalds@linux-foundation.org
Cc: davem@davemloft.net
Link: https://lkml.kernel.org/r/20180513193222.1997938-1-ast@kernel.org


# be5bca44 10-May-2018 Jakub Kicinski <kuba@kernel.org>

samples: bpf: convert some XDP samples from bpf_load to libbpf

Now that we can use full powers of libbpf in BPF samples, we
should perhaps make the simplest XDP programs not depend on
bpf_load helpers. This way newcomers will be exposed to the
recommended library from the start.

Use of bpf_prog_load_xattr() will also make it trivial to later
on request offload of the programs by simply adding ifindex to
the xattr.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# d0cabbb0 10-May-2018 Jakub Kicinski <kuba@kernel.org>

tools: bpf: move the event reading loop to libbpf

There are two copies of event reading loop - in bpftool and
trace_helpers "library". Consolidate them and move the code
to libbpf. Return codes from trace_helpers are kept, but
renamed to include LIBBPF prefix.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 5f938057 10-May-2018 Jakub Kicinski <kuba@kernel.org>

samples: bpf: compile and link against full libbpf

samples/bpf currently cherry-picks object files from tools/lib/bpf
to link against. Just compile the full library and link statically
against it.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# fe616055 09-May-2018 David Ahern <dsahern@gmail.com>

samples/bpf: Add example of ipv4 and ipv6 forwarding in XDP

Simple example of fast-path forwarding. It has a serious flaw
in not verifying the egress device index supports XDP forwarding.
If the egress device does not packets are dropped.

Take this only as a simple example of fast-path forwarding.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# b4b8faa1 02-May-2018 Magnus Karlsson <magnus.karlsson@intel.com>

samples/bpf: sample application and documentation for AF_XDP sockets

This is a sample application for AF_XDP sockets. The application
supports three different modes of operation: rxdrop, txonly and l2fwd.

To show-case a simple round-robin load-balancing between a set of
sockets in an xskmap, set the RR_LB compile time define option to 1 in
"xdpsock.h".

v2: The entries variable was calculated twice in {umem,xq}_nb_avail.

Co-authored-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 28dbf861 28-Apr-2018 Yonghong Song <yhs@fb.com>

samples/bpf: move common-purpose trace functions to selftests

There is no functionality change in this patch. The common-purpose
trace functions, including perf_event polling and ksym lookup,
are moved from trace_output_user.c and bpf_load.c to
selftests/bpf/trace_helpers.c so that these function can
be reused later in selftests.

Acked-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# b05cd740 26-Apr-2018 William Tu <u9012063@gmail.com>

samples/bpf: remove the bpf tunnel testsuite.

Move the testsuite to
selftests/bpf/{test_tunnel_kern.c, test_tunnel.sh}

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# c6ffd1ff 17-Apr-2018 Nikita V. Shirokov <tehnerd@tehnerd.com>

bpf: add bpf_xdp_adjust_tail sample prog

adding bpf's sample program which is using bpf_xdp_adjust_tail helper
by generating ICMPv4 "packet to big" message if ingress packet's size is
bigger then 600 bytes

Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 4662a4e5 28-Mar-2018 Alexei Starovoitov <ast@kernel.org>

samples/bpf: raw tracepoint test

add empty raw_tracepoint bpf program to test overhead similar
to kprobe and traditional tracepoint tests

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# c5350777 25-Feb-2018 Leo Yan <leo.yan@linaro.org>

samples/bpf: Add program for CPU state statistics

CPU is active when have running tasks on it and CPUFreq governor can
select different operating points (OPP) according to different workload;
we use 'pstate' to present CPU state which have running tasks with one
specific OPP. On the other hand, CPU is idle which only idle task on
it, CPUIdle governor can select one specific idle state to power off
hardware logics; we use 'cstate' to present CPU idle state.

Based on trace events 'cpu_idle' and 'cpu_frequency' we can accomplish
the duration statistics for every state. Every time when CPU enters
into or exits from idle states, the trace event 'cpu_idle' is recorded;
trace event 'cpu_frequency' records the event for CPU OPP changing, so
it's easily to know how long time the CPU stays in the specified OPP,
and the CPU must be not in any idle state.

This patch is to utilize the mentioned trace events for pstate and
cstate statistics. To achieve more accurate profiling data, the program
uses below sequence to insure CPU running/idle time aren't missed:

- Before profiling the user space program wakes up all CPUs for once, so
can avoid to missing account time for CPU staying in idle state for
long time; the program forces to set 'scaling_max_freq' to lowest
frequency and then restore 'scaling_max_freq' to highest frequency,
this can ensure the frequency to be set to lowest frequency and later
after start to run workload the frequency can be easily to be changed
to higher frequency;

- User space program reads map data and update statistics for every 5s,
so this is same with other sample bpf programs for avoiding big
overload introduced by bpf program self;

- When send signal to terminate program, the signal handler wakes up
all CPUs, set lowest frequency and restore highest frequency to
'scaling_max_freq'; this is exactly same with the first step so
avoid to missing account CPU pstate and cstate time during last
stage. Finally it reports the latest statistics.

The program has been tested on Hikey board with octa CA53 CPUs, below
is one example for statistics result, the format mainly follows up
Jesper Dangaard Brouer suggestion.

Jesper reminds to 'get printf to pretty print with thousands separators
use %' and setlocale(LC_NUMERIC, "en_US")', tried three different arm64
GCC toolchains (5.4.0 20160609, 6.2.1 20161016, 6.3.0 20170516) but all
of them cannot support printf flag character %' on arm64 platform, so go
back print number without grouping mode.

CPU states statistics:
state(ms) cstate-0 cstate-1 cstate-2 pstate-0 pstate-1 pstate-2 pstate-3 pstate-4
CPU-0 767 6111 111863 561 31 756 853 190
CPU-1 241 10606 107956 484 125 646 990 85
CPU-2 413 19721 98735 636 84 696 757 89
CPU-3 84 11711 79989 17516 909 4811 5773 341
CPU-4 152 19610 98229 444 53 649 708 1283
CPU-5 185 8781 108697 666 91 671 677 1365
CPU-6 157 21964 95825 581 67 566 684 1284
CPU-7 125 15238 102704 398 20 665 786 1197

Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# bbf48c18 30-Jan-2018 Eric Leblond <eric@regit.org>

libbpf: add error reporting in XDP

Parse netlink ext attribute to get the error message returned by
the card. Code is partially take from libnl.

We add netlink.h to the uapi include of tools. And we need to
avoid include of userspace netlink header to have a successful
build of sample so nlattr.h has a define to avoid
the inclusion. Using a direct define could have been an issue
as NLMSGERR_ATTR_MAX can change in the future.

We also define SOL_NETLINK if not defined to avoid to have to
copy socket.h for a fixed value.

Signed-off-by: Eric Leblond <eric@regit.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# c25ef6a5 25-Jan-2018 Mickaël Salaün <mic@digikod.net>

samples/bpf: Partially fixes the bpf.o build

Do not build lib/bpf/bpf.o with this Makefile but use the one from the
library directory. This avoid making a buggy bpf.o file (e.g. missing
symbols).

This patch is useful if some code (e.g. Landlock tests) needs both the
bpf.o (from tools/lib/bpf) and the bpf_load.o (from samples/bpf).

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 36e04a2d 10-Jan-2018 Jesper Dangaard Brouer <brouer@redhat.com>

samples/bpf: xdp2skb_meta shows transferring info from XDP to SKB

Creating a bpf sample that shows howto use the XDP 'data_meta'
infrastructure, created by Daniel Borkmann. Very few drivers support
this feature, but I wanted a functional sample to begin with, when
working on adding driver support.

XDP data_meta is about creating a communication channel between BPF
programs. This can be XDP tail-progs, but also other SKB based BPF
hooks, like in this case the TC clsact hook. In this sample I show
that XDP can store info named "mark", and TC/clsact chooses to use
this info and store it into the skb->mark.

It is a bit annoying that XDP and TC samples uses different tools/libs
when attaching their BPF hooks. As the XDP and TC programs need to
cooperate and agree on a struct-layout, it is best/easiest if the two
programs can be contained within the same BPF restricted-C file.

As the bpf-loader, I choose to not use bpf_load.c (or libbpf), but
instead wrote a bash shell scripted named xdp2skb_meta.sh, which
demonstrate howto use the iproute cmdline tools 'tc' and 'ip' for
loading BPF programs. To make it easy for first time users, the shell
script have command line parsing, and support --verbose and --dry-run
mode, if you just want to see/learn the tc+ip command syntax:

# ./xdp2skb_meta.sh --dev ixgbe2 --dry-run
# Dry-run mode: enable VERBOSE and don't call TC+IP
tc qdisc del dev ixgbe2 clsact
tc qdisc add dev ixgbe2 clsact
tc filter add dev ixgbe2 ingress prio 1 handle 1 bpf da obj ./xdp2skb_meta_kern.o sec tc_mark
# Flush XDP on device: ixgbe2
ip link set dev ixgbe2 xdp off
ip link set dev ixgbe2 xdp obj ./xdp2skb_meta_kern.o sec xdp_mark

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 0fca931a 03-Jan-2018 Jesper Dangaard Brouer <brouer@redhat.com>

samples/bpf: program demonstrating access to xdp_rxq_info

This sample program can be used for monitoring and reporting how many
packets per sec (pps) are received per NIC RX queue index and which
CPU processed the packet. In itself it is a useful tool for quickly
identifying RSS imbalance issues, see below.

The default XDP action is XDP_PASS in-order to provide a monitor
mode. For benchmarking purposes it is possible to specify other XDP
actions on the cmdline --action.

Output below shows an imbalance RSS case where most RXQ's deliver to
CPU-0 while CPU-2 only get packets from a single RXQ. Looking at
things from a CPU level the two CPUs are processing approx the same
amount, BUT looking at the rx_queue_index levels it is clear that
RXQ-2 receive much better service, than other RXQs which all share CPU-0.

Running XDP on dev:i40e1 (ifindex:3) action:XDP_PASS
XDP stats CPU pps issue-pps
XDP-RX CPU 0 900,473 0
XDP-RX CPU 2 906,921 0
XDP-RX CPU total 1,807,395

RXQ stats RXQ:CPU pps issue-pps
rx_queue_index 0:0 180,098 0
rx_queue_index 0:sum 180,098
rx_queue_index 1:0 180,098 0
rx_queue_index 1:sum 180,098
rx_queue_index 2:2 906,921 0
rx_queue_index 2:sum 906,921
rx_queue_index 3:0 180,098 0
rx_queue_index 3:sum 180,098
rx_queue_index 4:0 180,082 0
rx_queue_index 4:sum 180,082
rx_queue_index 5:0 180,093 0
rx_queue_index 5:sum 180,093

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 965de87e 11-Dec-2017 Josef Bacik <jbacik@fb.com>

samples/bpf: add a test for bpf_override_return

This adds a basic test for bpf_override_return to verify it works. We
override the main function for mounting a btrfs fs so it'll return
-ENOMEM and then make sure that trying to mount a btrfs fs will fail.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# bf070bb0 07-Nov-2017 Masahiro Yamada <yamada.masahiro@socionext.com>

kbuild: remove all dummy assignments to obj-

Now kbuild core scripts create empty built-in.o where necessary.
Remove "obj- := dummy.o" tricks.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>


# f3edacbd 11-Nov-2017 David S. Miller <davem@davemloft.net>

bpf: Revert bpf_overrid_function() helper changes.

NACK'd by x86 maintainer.

Signed-off-by: David S. Miller <davem@davemloft.net>


# eafb3401 07-Nov-2017 Josef Bacik <jbacik@fb.com>

samples/bpf: add a test for bpf_override_return

This adds a basic test for bpf_override_return to verify it works. We
override the main function for mounting a btrfs fs so it'll return
-ENOMEM and then make sure that trying to mount a btrfs fs will fail.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 3e29cd0e 04-Nov-2017 Christina Jacob <Christina.Jacob@cavium.com>

xdp: Sample xdp program implementing ip forward

Implements port to port forwarding with route table and arp table
lookup for ipv4 packets using bpf_redirect helper function and
lpm_trie map.

Signed-off-by: Christina Jacob <Christina.Jacob@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 9d1f1594 05-Nov-2017 Roman Gushchin <guro@fb.com>

bpf: move cgroup_helpers from samples/bpf/ to tools/testing/selftesting/bpf/

The purpose of this move is to use these files in bpf tests.

Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


# b2441318 01-Nov-2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.

For non */uapi/* files that summary was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139

and resulted in the first patch in this series.

If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930

and resulted in the second patch in this series.

- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:

SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1

and that resulted in the third patch in this series.

- when the two scanners agreed on the detected license(s), that became
the concluded license(s).

- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.

- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).

- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.

- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct

This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# c890063e 20-Oct-2017 Lawrence Brakmo <brakmo@fb.com>

bpf: sample BPF_SOCKET_OPS_BASE_RTT program

Sample socket_ops BPF program to test the BPF helper function
bpf_getsocketops and the new socket_ops op BPF_SOCKET_OPS_BASE_RTT.

The program provides a base RTT of 80us when the calling flow is
within a DC (as determined by the IPV6 prefix) and the congestion
algorithm is "nv".

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked_by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# fad3917e 15-Oct-2017 Jesper Dangaard Brouer <brouer@redhat.com>

samples/bpf: add cpumap sample program xdp_redirect_cpu

This sample program show how to use cpumap and the associated
tracepoints.

It provides command line stats, which shows how the XDP-RX process,
cpumap-enqueue and cpumap kthread dequeue is cooperating on a per CPU
basis. It also utilize the xdp_exception and xdp_redirect_err
transpoints to allow users quickly to identify setup issues.

One issue with ixgbe driver is that the driver reset the link when
loading XDP. This reset the procfs smp_affinity settings. Thus,
after loading the program, these must be reconfigured. The easiest
workaround it to reduce the RX-queue to e.g. two via:

# ethtool --set-channels ixgbe1 combined 2

And then add CPUs above 0 and 1, like:

# xdp_redirect_cpu --dev ixgbe1 --prog 2 --cpu 2 --cpu 3 --cpu 4

Another issue with ixgbe is that the page recycle mechanism is tied to
the RX-ring size. And the default setting of 512 elements is too
small. This is the same issue with regular devmap XDP_REDIRECT.
To overcome this I've been using 1024 rx-ring size:

# ethtool -G ixgbe1 rx 1024 tx 1024

V3:
- whitespace cleanups
- bpf tracepoint cannot access top part of struct

V4:
- report on kthread sched events, according to tracepoint change
- report average bulk enqueue size

V5:
- bpf_map_lookup_elem on cpumap not allowed from bpf_prog
use separate map to mark CPUs not available

V6:
- correct kthread sched summary output

V7:
- Added a --stress-mode for concurrently changing underlying cpumap

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 9db95838 13-Oct-2017 Abhijit Ayarekar <abhijit.ayarekar@caviumnetworks.com>

bpf: Add -target to clang switch while cross compiling.

Update to llvm excludes assembly instructions.
llvm git revision is below

commit 65fad7c26569 ("bpf: add inline-asm support")

This change will be part of llvm release 6.0

__ASM_SYSREG_H define is not required for native compile.
-target switch includes appropriate target specific files
while cross compiling

Tested on x86 and arm64.

Signed-off-by: Abhijit Ayarekar <abhijit.ayarekar@caviumnetworks.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


# b655fc1c 20-Sep-2017 Joel Fernandes <joelaf@google.com>

samples/bpf: Fix pt_regs issues when cross-compiling

BPF samples fail to build when cross-compiling for ARM64 because of incorrect
pt_regs param selection. This is because clang defines __x86_64__ and
bpf_headers thinks we're building for x86. Since clang is building for the BPF
target, it shouldn't make assumptions about what target the BPF program is
going to run on. To fix this, lets pass ARCH so the header knows which target
the BPF program is being compiled for and can use the correct pt_regs code.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 876e88e3 20-Sep-2017 Joel Fernandes <joelaf@google.com>

samples/bpf: Enable cross compiler support

When cross compiling, bpf samples use HOSTCC for compiling the non-BPF part of
the sample, however what we really want is to use the cross compiler to build
for the cross target since that is what will load and run the BPF sample.
Detect this and compile samples correctly.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 3ffab546 29-Aug-2017 Jesper Dangaard Brouer <brouer@redhat.com>

samples/bpf: xdp_monitor tool based on tracepoints

This tool xdp_monitor demonstrate how to use the different xdp_redirect
tracepoints xdp_redirect{,_map}{,_err} from a BPF program.

The default mode is to only monitor the error counters, to avoid
affecting the per packet performance. Tracepoints comes with a base
overhead of 25 nanosec for an attached bpf_prog, and 48 nanosec for
using a full perf record (with non-matching filter). Thus, default
loading the --stats mode could affect the maximum performance.

This version of the tool is very simple and count all types of errors
as one. It will be natural to extend this later with the different
types of errors that can occur, which should help users quickly
identify common mistakes.

Because the TP_STRUCT was kept in sync all the tracepoints loads the
same BPF code. It would also be natural to extend the map version to
demonstrate how the map information could be used.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 1da236b6b 04-Aug-2017 Yonghong Song <yhs@fb.com>

bpf: add a test case for syscalls/sys_{enter|exit}_* tracepoints

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 9d6e0052 17-Jul-2017 John Fastabend <john.fastabend@gmail.com>

xdp: bpf redirect with map sample program

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Tested-by: Andy Gospodarek <andy@greyhouse.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 832622e6 17-Jul-2017 John Fastabend <john.fastabend@gmail.com>

xdp: sample program for new bpf_redirect helper

This implements a sample program for testing bpf_redirect. It reports
the number of packets redirected per second and as input takes the
ifindex of the device to run the xdp program on and the ifindex of the
interface to redirect packets to.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Tested-by: Andy Gospodarek <andy@greyhouse.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 53335022 10-Jul-2017 Yonghong Song <yhs@fb.com>

samples/bpf: fix a build issue

With latest net-next:

====
clang -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/6.3.1/include -I./arch/x86/include -I./arch/x86/include/generated/uapi -I./arch/x86/include/generated -I./include -I./arch/x86/include/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -Isamples/bpf \
-D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
-Wno-compare-distinct-pointer-types \
-Wno-gnu-variable-sized-type-not-at-end \
-Wno-address-of-packed-member -Wno-tautological-compare \
-Wno-unknown-warning-option \
-O2 -emit-llvm -c samples/bpf/tcp_synrto_kern.c -o -| llc -march=bpf -filetype=obj -o samples/bpf/tcp_synrto_kern.o
samples/bpf/tcp_synrto_kern.c:20:10: fatal error: 'bpf_endian.h' file not found
^~~~~~~~~~~~~~
1 error generated.
====

net has the same issue.

Add support for ntohl and htonl in tools/testing/selftests/bpf/bpf_endian.h.
Also move bpf_helpers.h from samples/bpf to selftests/bpf and change
compiler include logic so that programs in samples/bpf can access the headers
in selftests/bpf, but not the other way around.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 6c4a01b2 30-Jun-2017 Lawrence Brakmo <brakmo@fb.com>

bpf: Sample bpf program to set sndcwnd clamp

Sample BPF program, tcp_clamp_kern.c, to demostrate the use
of setting the sndcwnd clamp. This program assumes that if the
first 5.5 bytes of the host's IPv6 addresses are the same, then
the hosts are in the same datacenter and sets sndcwnd clamp to
100 packets, SYN and SYN-ACK RTOs to 10ms and send/receive buffer
sizes to 150KB.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 7bc62e28 30-Jun-2017 Lawrence Brakmo <brakmo@fb.com>

bpf: Sample BPF program to set initial cwnd

Sample BPF program that assumes hosts are far away (i.e. large RTTs)
and sets initial cwnd and initial receive window to 40 packets,
send and receive buffers to 1.5MB.

In practice there would be a test to insure the hosts are actually
far enough away.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# bb56d444 30-Jun-2017 Lawrence Brakmo <brakmo@fb.com>

bpf: Sample BPF program to set congestion control

Sample BPF program that sets congestion control to dctcp when both hosts
are within the same datacenter. In this example that is assumed to be
when they have the first 5.5 bytes of their IPv6 address are the same.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# d9925368 30-Jun-2017 Lawrence Brakmo <brakmo@fb.com>

bpf: Sample BPF program to set buffer sizes

This patch contains a BPF program to set initial receive window to
40 packets and send and receive buffers to 1.5MB. This would usually
be done after doing appropriate checks that indicate the hosts are
far enough away (i.e. large RTT).

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# c400296b 30-Jun-2017 Lawrence Brakmo <brakmo@fb.com>

bpf: Sample bpf program to set initial window

The sample bpf program, tcp_rwnd_kern.c, sets the initial
advertized window to 40 packets in an environment where
distinct IPv6 prefixes indicate that both hosts are not
in the same data center.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 61bc4d8d 30-Jun-2017 Lawrence Brakmo <brakmo@fb.com>

bpf: Sample bpf program to set SYN/SYN-ACK RTOs

The sample BPF program, tcp_synrto_kern.c, sets the SYN and SYN-ACK
RTOs to 10ms when both hosts are within the same datacenter (i.e.
small RTTs) in an environment where common IPv6 prefixes indicate
both hosts are in the same data center.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


# ae16189e 30-Jun-2017 Lawrence Brakmo <brakmo@fb.com>

bpf: program to load and attach sock_ops BPF progs

The program load_sock_ops can be used to load sock_ops bpf programs and
to attach it to an existing (v2) cgroup. It can also be used to detach
sock_ops programs.

Examples:
load_sock_ops [-l] <cg-path> <prog filename>
Load and attaches a sock_ops program at the specified cgroup.
If "-l" is used, the program will continue to run to output the
BPF log buffer.
If the specified filename does not end in ".o", it appends
"_kern.o" to the name.

load_sock_ops -r <cg-path>
Detaches the currently attached sock_ops program from the
specified cgroup.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 00a3855d 21-Jun-2017 Yonghong Song <yhs@fb.com>

samples/bpf: fix a build problem

tracex5_kern.c build failed with the following error message:
../samples/bpf/tracex5_kern.c:12:10: fatal error: 'syscall_nrs.h' file not found
#include "syscall_nrs.h"
The generated file syscall_nrs.h is put in build/samples/bpf directory,
but this directory is not in include path, hence build failed.

The fix is to add $(obj) into the clang compilation path.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 4b7190e8 13-Jun-2017 David Daney <david.daney@cavium.com>

samples/bpf: Fix tracex5 to work with MIPS syscalls.

There are two problems:

1) In MIPS the __NR_* macros expand to an expression, this causes the
sections of the object file to be named like:

.
.
.
[ 5] kprobe/(5000 + 1) PROGBITS 0000000000000000 000160 ...
[ 6] kprobe/(5000 + 0) PROGBITS 0000000000000000 000258 ...
[ 7] kprobe/(5000 + 9) PROGBITS 0000000000000000 000348 ...
.
.
.

The fix here is to use the "asm_offsets" trick to evaluate the macros
in the C compiler and generate a header file with a usable form of the
macros.

2) MIPS syscall numbers start at 5000, so we need a bigger map to hold
the sub-programs.

Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 69b6a7f7 24-Apr-2017 Alexander Alemayhu <alexander@alemayhu.com>

samples/bpf: add -Wno-unknown-warning-option to clang

I was initially going to remove '-Wno-address-of-packed-member' because I
thought it was not supposed to be there but Daniel suggested using
'-Wno-unknown-warning-option'.

This silences several warnings similiar to the one below

warning: unknown warning option '-Wno-address-of-packed-member' [-Wunknown-warning-option]
1 warning generated.
clang -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/6.3.1/include -I./arch/x86/include -I./arch/x86/include/generated/uapi -I./arch/x86/include/generated -I./include
-I./arch/x86/include/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h \
-D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
-Wno-compare-distinct-pointer-types \
-Wno-gnu-variable-sized-type-not-at-end \
-Wno-address-of-packed-member -Wno-tautological-compare \
-O2 -emit-llvm -c samples/bpf/xdp_tx_iptunnel_kern.c -o -| llc -march=bpf -filetype=obj -o samples/bpf/xdp_tx_iptunnel_kern.o

$ clang --version

clang version 3.9.1 (tags/RELEASE_391/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

Signed-off-by: Alexander Alemayhu <alexander@alemayhu.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 51570a5a 22-Mar-2017 Chenbo Feng <fengc@google.com>

A Sample of using socket cookie and uid for traffic monitoring

Add a sample program to demostrate the possible usage of
get_socket_cookie and get_socket_uid helper function. The program will
store bytes and packets counting of in/out traffic monitored by iptables
and store the stats in a bpf map in per socket base. The owner uid of
the socket will be stored as part of the data entry. A shell script for
running the program is also included.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Chenbo Feng <fengc@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# fb30d4b7 22-Mar-2017 Martin KaFai Lau <kafai@fb.com>

bpf: Add tests for map-in-map

Test cases for array of maps and hash of maps.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 9899694a 08-Dec-2016 Joe Stringer <joe@ovn.org>

samples/bpf: Move open_raw_sock to separate header

This function was declared in libbpf.c and was the only remaining
function in this library, but has nothing to do with BPF. Shift it out
into a new header, sock_example.h, and include it from the relevant
samples.

Signed-off-by: Joe Stringer <joe@ovn.org>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20161209024620.31660-8-joe@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 205c8ada 08-Dec-2016 Joe Stringer <joe@ovn.org>

samples/bpf: Remove perf_event_open() declaration

This declaration was made in samples/bpf/libbpf.c for convenience, but
there's already one in tools/perf/perf-sys.h. Reuse that one.

Committer notes:

Testing it:

$ make -j4 O=../build/v4.9.0-rc8+ samples/bpf/
make[1]: Entering directory '/home/build/v4.9.0-rc8+'
CHK include/config/kernel.release
GEN ./Makefile
CHK include/generated/uapi/linux/version.h
Using /home/acme/git/linux as source for kernel
CHK include/generated/utsrelease.h
CHK include/generated/timeconst.h
CHK include/generated/bounds.h
CHK include/generated/asm-offsets.h
CALL /home/acme/git/linux/scripts/checksyscalls.sh
HOSTCC samples/bpf/test_verifier.o
HOSTCC samples/bpf/libbpf.o
HOSTCC samples/bpf/../../tools/lib/bpf/bpf.o
HOSTCC samples/bpf/test_maps.o
HOSTCC samples/bpf/sock_example.o
HOSTCC samples/bpf/bpf_load.o
<SNIP>
HOSTLD samples/bpf/trace_event
HOSTLD samples/bpf/sampleip
HOSTLD samples/bpf/tc_l2_redirect
make[1]: Leaving directory '/home/build/v4.9.0-rc8+'
$

Also tested the offwaketime resulting from the rebuild, seems to work as
before.

Signed-off-by: Joe Stringer <joe@ovn.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20161209024620.31660-7-joe@ovn.org
[ Use -I$(srctree)/tools/lib/ to support out of source code tree builds ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# 43371c83 14-Dec-2016 Joe Stringer <joe@ovn.org>

samples/bpf: Switch over to libbpf

Now that libbpf under tools/lib/bpf/* is synced with the version from
samples/bpf, we can get rid most of the libbpf library here.

Committer notes:

Built it in a docker fedora rawhide container and ran it in the f25 host, seems
to work just like it did before this patch, i.e. the switch to tools/lib/bpf/
doesn't seem to have introduced problems and Joe said he tested it with
all the entries in samples/bpf/ and other code he found:

[root@f5065a7d6272 linux]# make -j4 O=/tmp/build/linux headers_install
<SNIP>
[root@f5065a7d6272 linux]# rm -rf /tmp/build/linux/samples/bpf/
[root@f5065a7d6272 linux]# make -j4 O=/tmp/build/linux samples/bpf/
make[1]: Entering directory '/tmp/build/linux'
CHK include/config/kernel.release
HOSTCC scripts/basic/fixdep
GEN ./Makefile
CHK include/generated/uapi/linux/version.h
Using /git/linux as source for kernel
CHK include/generated/utsrelease.h
HOSTCC scripts/basic/bin2c
HOSTCC arch/x86/tools/relocs_32.o
HOSTCC arch/x86/tools/relocs_64.o
LD samples/bpf/built-in.o
<SNIP>
HOSTCC samples/bpf/fds_example.o
HOSTCC samples/bpf/sockex1_user.o
/git/linux/samples/bpf/fds_example.c: In function 'bpf_prog_create':
/git/linux/samples/bpf/fds_example.c:63:6: warning: passing argument 2 of 'bpf_load_program' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
insns, insns_cnt, "GPL", 0,
^~~~~
In file included from /git/linux/samples/bpf/libbpf.h:5:0,
from /git/linux/samples/bpf/bpf_load.h:4,
from /git/linux/samples/bpf/fds_example.c:15:
/git/linux/tools/lib/bpf/bpf.h:31:5: note: expected 'struct bpf_insn *' but argument is of type 'const struct bpf_insn *'
int bpf_load_program(enum bpf_prog_type type, struct bpf_insn *insns,
^~~~~~~~~~~~~~~~
HOSTCC samples/bpf/sockex2_user.o
<SNIP>
HOSTCC samples/bpf/xdp_tx_iptunnel_user.o
clang -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/6.2.1/include -I/git/linux/arch/x86/include -I./arch/x86/include/generated/uapi -I./arch/x86/include/generated -I/git/linux/include -I./include -I/git/linux/arch/x86/include/uapi -I/git/linux/include/uapi -I./include/generated/uapi -include /git/linux/include/linux/kconfig.h \
-D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
-Wno-compare-distinct-pointer-types \
-Wno-gnu-variable-sized-type-not-at-end \
-Wno-address-of-packed-member -Wno-tautological-compare \
-O2 -emit-llvm -c /git/linux/samples/bpf/sockex1_kern.c -o -| llc -march=bpf -filetype=obj -o samples/bpf/sockex1_kern.o
HOSTLD samples/bpf/tc_l2_redirect
<SNIP>
HOSTLD samples/bpf/lwt_len_hist
HOSTLD samples/bpf/xdp_tx_iptunnel
make[1]: Leaving directory '/tmp/build/linux'
[root@f5065a7d6272 linux]#

And then, in the host:

[root@jouet bpf]# mount | grep "docker.*devicemapper\/"
/dev/mapper/docker-253:0-1705076-9bd8aa1e0af33adce89ff42090847868ca676932878942be53941a06ec5923f9 on /var/lib/docker/devicemapper/mnt/9bd8aa1e0af33adce89ff42090847868ca676932878942be53941a06ec5923f9 type xfs (rw,relatime,context="system_u:object_r:container_file_t:s0:c73,c276",nouuid,attr2,inode64,sunit=1024,swidth=1024,noquota)
[root@jouet bpf]# cd /var/lib/docker/devicemapper/mnt/9bd8aa1e0af33adce89ff42090847868ca676932878942be53941a06ec5923f9/rootfs/tmp/build/linux/samples/bpf/
[root@jouet bpf]# file offwaketime
offwaketime: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=f423d171e0487b2f802b6a792657f0f3c8f6d155, not stripped
[root@jouet bpf]# readelf -SW offwaketime
offwaketime offwaketime_kern.o offwaketime_user.o
[root@jouet bpf]# readelf -SW offwaketime_kern.o
There are 11 section headers, starting at offset 0x700:

Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .strtab STRTAB 0000000000000000 000658 0000a8 00 0 0 1
[ 2] .text PROGBITS 0000000000000000 000040 000000 00 AX 0 0 4
[ 3] kprobe/try_to_wake_up PROGBITS 0000000000000000 000040 0000d8 00 AX 0 0 8
[ 4] .relkprobe/try_to_wake_up REL 0000000000000000 0005a8 000020 10 10 3 8
[ 5] tracepoint/sched/sched_switch PROGBITS 0000000000000000 000118 000318 00 AX 0 0 8
[ 6] .reltracepoint/sched/sched_switch REL 0000000000000000 0005c8 000090 10 10 5 8
[ 7] maps PROGBITS 0000000000000000 000430 000050 00 WA 0 0 4
[ 8] license PROGBITS 0000000000000000 000480 000004 00 WA 0 0 1
[ 9] version PROGBITS 0000000000000000 000484 000004 00 WA 0 0 4
[10] .symtab SYMTAB 0000000000000000 000488 000120 18 1 4 8
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
[root@jouet bpf]# ./offwaketime | head -3
qemu-system-x86;entry_SYSCALL_64_fastpath;sys_ppoll;do_sys_poll;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule;-;try_to_wake_up;hrtimer_wakeup;__hrtimer_run_queues;hrtimer_interrupt;local_apic_timer_interrupt;smp_apic_timer_interrupt;__irqentry_text_start;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel;start_cpu;;swapper/0 4
firefox;entry_SYSCALL_64_fastpath;sys_poll;do_sys_poll;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule;-;try_to_wake_up;pollwake;__wake_up_common;__wake_up_sync_key;pipe_write;__vfs_write;vfs_write;sys_write;entry_SYSCALL_64_fastpath;;Timer 1
swapper/2;start_cpu;start_secondary;cpu_startup_entry;schedule_preempt_disabled;schedule;__schedule;-;---;; 61
[root@jouet bpf]#

Signed-off-by: Joe Stringer <joe@ovn.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: netdev@vger.kernel.org
Link: https://github.com/joestringer/linux/commit/5c40f54a52b1f437123c81e21873f4b4b1f9bd55.patch
Link: http://lkml.kernel.org/n/tip-xr8twtx7sjh5821g8qw47yxk@git.kernel.org
[ Use -I$(srctree)/tools/lib/ to support out of source code tree builds, as noticed by Wang Nan ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# e19b7cee 22-Nov-2016 Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

make use of make variable CURDIR instead of calling pwd

make already provides the current working directory in a variable, so make
use of it instead of forking a shell. Also replace usage of PWD by
CURDIR. PWD is provided by most shells, but not all, so this makes the
build system more robust.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Michal Marek <mmarek@suse.com>


# 12d8bb64 07-Dec-2016 Martin KaFai Lau <kafai@fb.com>

bpf: xdp: Add XDP example for head adjustment

The XDP prog checks if the incoming packet matches any VIP:PORT
combination in the BPF hashmap. If it is, it will encapsulate
the packet with a IPv4/v6 header as instructed by the value of
the BPF hashmap and then XDP_TX it out.

The VIP:PORT -> IP-Encap-Info can be specified by the cmd args
of the user prog.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 9b474ece 02-Dec-2016 Sargun Dhillon <sargun@sargun.me>

samples, bpf: Add automated test for cgroup filter attachments

This patch adds the sample program test_cgrp2_attach2. This program is
similar to test_cgrp2_attach, but it performs automated testing of the
cgroupv2 BPF attached filters. It runs the following checks:
* Simple filter attachment
* Application of filters to child cgroups
* Overriding filters on child cgroups
* Checking that this still works when the parent filter is removed

The filters that are used here are simply allow all / deny all filters, so
it isn't checking the actual functionality of the filters, but rather
the behaviour around detachment / attachment. If net_cls is enabled,
this test will fail.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 1a922fee 02-Dec-2016 Sargun Dhillon <sargun@sargun.me>

samples, bpf: Refactor test_current_task_under_cgroup - separate out helpers

This patch modifies test_current_task_under_cgroup_user. The test has
several helpers around creating a temporary environment for cgroup
testing, and moving the current task around cgroups. This set of
helpers can then be used in other tests.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 69a9d09b 01-Dec-2016 Alexei Starovoitov <ast@kernel.org>

samples/bpf: silence compiler warnings

silence some of the clang compiler warnings like:
include/linux/fs.h:2693:9: warning: comparison of unsigned enum expression < 0 is always false
arch/x86/include/asm/processor.h:491:30: warning: taking address of packed member 'sp0' of class or structure 'x86_hw_tss' may result in an unaligned pointer value
include/linux/cgroup-defs.h:326:16: warning: field 'cgrp' with variable sized type 'struct cgroup' not at the end of a struct or class is a GNU extension
since they add too much noise to samples/bpf/ build.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 554ae6e7 01-Dec-2016 David Ahern <dsa@cumulusnetworks.com>

samples/bpf: add userspace example for prohibiting sockets

Add examples preventing a process in a cgroup from opening a socket
based family, protocol and type.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# ad2805dc 01-Dec-2016 David Ahern <dsa@cumulusnetworks.com>

samples: bpf: add userspace example for modifying sk_bound_dev_if

Add a simple program to demonstrate the ability to attach a bpf program
to a cgroup that sets sk_bound_dev_if for AF_INET{6} sockets when they
are created.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# f74599f7 30-Nov-2016 Thomas Graf <tgraf@suug.ch>

bpf: Add tests and samples for LWT-BPF

Adds a series of tests to verify the functionality of attaching
BPF programs at LWT hooks.

Also adds a sample which collects a histogram of packet sizes which
pass through an LWT hook.

$ ./lwt_len_hist.sh
Starting netserver with host 'IN(6)ADDR_ANY' port '12865' and family AF_UNSPEC
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.253.2 () port 0 AF_INET : demo
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec

87380 16384 16384 10.00 39857.69
1 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 22 | |
64 -> 127 : 98 | |
128 -> 255 : 213 | |
256 -> 511 : 1444251 |******** |
512 -> 1023 : 660610 |*** |
1024 -> 2047 : 535241 |** |
2048 -> 4095 : 19 | |
4096 -> 8191 : 180 | |
8192 -> 16383 : 5578023 |************************************* |
16384 -> 32767 : 632099 |*** |
32768 -> 65535 : 6575 | |

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 9bee294f 28-Nov-2016 Alexei Starovoitov <ast@kernel.org>

samples/bpf: fix include path

Fix the following build error:
HOSTCC samples/bpf/test_lru_dist.o
../samples/bpf/test_lru_dist.c:25:22: fatal error: bpf_util.h: No such file or directory

This is due to objtree != srctree.
Use srctree, since that's where bpf_util.h is located.

Fixes: e00c7b216f34 ("bpf: fix multiple issues in selftest suite and samples")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# e00c7b21 25-Nov-2016 Daniel Borkmann <daniel@iogearbox.net>

bpf: fix multiple issues in selftest suite and samples

1) The test_lru_map and test_lru_dist fails building on my machine since
the sys/resource.h header is not included.

2) test_verifier fails in one test case where we try to call an invalid
function, since the verifier log output changed wrt printing function
names.

3) Current selftest suite code relies on sysconf(_SC_NPROCESSORS_CONF) for
retrieving the number of possible CPUs. This is broken at least in our
scenario and really just doesn't work.

glibc tries a number of things for retrieving _SC_NPROCESSORS_CONF.
First it tries equivalent of /sys/devices/system/cpu/cpu[0-9]* | wc -l,
if that fails, depending on the config, it either tries to count CPUs
in /proc/cpuinfo, or returns the _SC_NPROCESSORS_ONLN value instead.
If /proc/cpuinfo has some issue, it returns just 1 worst case. This
oddity is nothing new [1], but semantics/behaviour seems to be settled.
_SC_NPROCESSORS_ONLN will parse /sys/devices/system/cpu/online, if
that fails it looks into /proc/stat for cpuX entries, and if also that
fails for some reason, /proc/cpuinfo is consulted (and returning 1 if
unlikely all breaks down).

While that might match num_possible_cpus() from the kernel in some
cases, it's really not guaranteed with CPU hotplugging, and can result
in a buffer overflow since the array in user space could have too few
number of slots, and on perpcu map lookup, the kernel will write beyond
that memory of the value buffer.

William Tu reported such mismatches:

[...] The fact that sysconf(_SC_NPROCESSORS_CONF) != num_possible_cpu()
happens when CPU hotadd is enabled. For example, in Fusion when
setting vcpu.hotadd = "TRUE" or in KVM, setting ./qemu-system-x86_64
-smp 2, maxcpus=4 ... the num_possible_cpu() will be 4 and sysconf()
will be 2 [2]. [...]

Documentation/cputopology.txt says /sys/devices/system/cpu/possible
outputs cpu_possible_mask. That is the same as in num_possible_cpus(),
so first step would be to fix the _SC_NPROCESSORS_CONF calls with our
own implementation. Later, we could add support to bpf(2) for passing
a mask via CPU_SET(3), for example, to just select a subset of CPUs.

BPF samples code needs this fix as well (at least so that people stop
copying this). Thus, define bpf_num_possible_cpus() once in selftests
and import it from there for the sample code to avoid duplicating it.
The remaining sysconf(_SC_NPROCESSORS_CONF) in samples are unrelated.

After all three issues are fixed, the test suite runs fine again:

# make run_tests | grep self
selftests: test_verifier [PASS]
selftests: test_maps [PASS]
selftests: test_lru_map [PASS]
selftests: test_kmod.sh [PASS]

[1] https://www.sourceware.org/ml/libc-alpha/2011-06/msg00079.html
[2] https://www.mail-archive.com/netdev@vger.kernel.org/msg121183.html

Fixes: 3059303f59cf ("samples/bpf: update tracex[23] examples to use per-cpu maps")
Fixes: 86af8b4191d2 ("Add sample for adding simple drop program to link")
Fixes: df570f577231 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_ARRAY")
Fixes: e15596717948 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_HASH")
Fixes: ebb676daa1a3 ("bpf: Print function name in addition to function id")
Fixes: 5db58faf989f ("bpf: Add tests for the LRU bpf_htab")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: William Tu <u9012063@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# d8c5b17f 23-Nov-2016 Daniel Mack <daniel@zonque.org>

samples: bpf: add userspace example for attaching eBPF programs to cgroups

Add a simple userpace program to demonstrate the new API to attach eBPF
programs to cgroups. This is what it does:

* Create arraymap in kernel with 4 byte keys and 8 byte values

* Load eBPF program

The eBPF program accesses the map passed in to store two pieces of
information. The number of invocations of the program, which maps
to the number of packets received, is stored to key 0. Key 1 is
incremented on each iteration by the number of bytes stored in
the skb.

* Detach any eBPF program previously attached to the cgroup

* Attach the new program to the cgroup using BPF_PROG_ATTACH

* Once a second, read map[0] and map[1] to see how many bytes and
packets were seen on any socket of tasks in the given cgroup.

The program takes a cgroup path as 1st argument, and either "ingress"
or "egress" as 2nd. Optionally, "drop" can be passed as 3rd argument,
which will make the generated eBPF program return 0 instead of 1, so
the kernel will drop the packet.

libbpf gained two new wrappers for the new syscall commands.

Signed-off-by: Daniel Mack <daniel@zonque.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 5db58faf 11-Nov-2016 Martin KaFai Lau <kafai@fb.com>

bpf: Add tests for the LRU bpf_htab

This patch has some unit tests and a test_lru_dist.

The test_lru_dist reads in the numeric keys from a file.
The files used here are generated by a modified fio-genzipf tool
originated from the fio test suit. The sample data file can be
found here: https://github.com/iamkafai/bpf-lru

The zipf.* data files have 100k numeric keys and the key is also
ranged from 1 to 100k.

The test_lru_dist outputs the number of unique keys (nr_unique).
F.e. The following means, 61239 of them is unique out of 100k keys.
nr_misses means it cannot be found in the LRU map, so nr_misses
must be >= nr_unique. test_lru_dist also simulates a perfect LRU
map as a comparison:

[root@arch-fb-vm1 ~]# ~/devshare/fb-kernel/linux/samples/bpf/test_lru_dist \
/root/zipf.100k.a1_01.out 4000 1
...
test_parallel_lru_dist (map_type:9 map_flags:0x0):
task:0 BPF LRU: nr_unique:23093(/100000) nr_misses:31603(/100000)
task:0 Perfect LRU: nr_unique:23093(/100000 nr_misses:34328(/100000)
....
test_parallel_lru_dist (map_type:9 map_flags:0x2):
task:0 BPF LRU: nr_unique:23093(/100000) nr_misses:31710(/100000)
task:0 Perfect LRU: nr_unique:23093(/100000 nr_misses:34328(/100000)

[root@arch-fb-vm1 ~]# ~/devshare/fb-kernel/linux/samples/bpf/test_lru_dist \
/root/zipf.100k.a0_01.out 40000 1
...
test_parallel_lru_dist (map_type:9 map_flags:0x0):
task:0 BPF LRU: nr_unique:61239(/100000) nr_misses:67054(/100000)
task:0 Perfect LRU: nr_unique:61239(/100000 nr_misses:66993(/100000)
...
test_parallel_lru_dist (map_type:9 map_flags:0x2):
task:0 BPF LRU: nr_unique:61239(/100000) nr_misses:67068(/100000)
task:0 Perfect LRU: nr_unique:61239(/100000 nr_misses:66993(/100000)

LRU map has also been added to map_perf_test:
/* Global LRU */
[root@kerneltest003.31.prn1 ~]# for i in 1 4 8; do echo -n "$i cpus: "; \
./map_perf_test 16 $i | awk '{r += $3}END{print r " updates"}'; done
1 cpus: 2934082 updates
4 cpus: 7391434 updates
8 cpus: 6500576 updates

/* Percpu LRU */
[root@kerneltest003.31.prn1 ~]# for i in 1 4 8; do echo -n "$i cpus: "; \
./map_perf_test 32 $i | awk '{r += $3}END{print r " updates"}'; done
1 cpus: 2896553 updates
4 cpus: 9766395 updates
8 cpus: 17460553 updates

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 90e02896 09-Nov-2016 Martin KaFai Lau <kafai@fb.com>

bpf: Add test for bpf_redirect to ipip/ip6tnl

The test creates two netns, ns1 and ns2. The host (the default netns)
has an ipip or ip6tnl dev configured for tunneling traffic to the ns2.

ping VIPS from ns1 <----> host <--tunnel--> ns2 (VIPs at loopback)

The test is to have ns1 pinging VIPs configured at the loopback
interface in ns2.

The VIPs are 10.10.1.102 and 2401:face::66 (which are configured
at lo@ns2). [Note: 0x66 => 102].

At ns1, the VIPs are routed _via_ the host.

At the host, bpf programs are installed at the veth to redirect packets
from a veth to the ipip/ip6tnl. The test is configured in a way so
that both ingress and egress can be tested.

At ns2, the ipip/ip6tnl dev is configured with the local and remote address
specified. The return path is routed to the dev ipip/ip6tnl.

During egress test, the host also locally tests pinging the VIPs to ensure
that bpf_redirect at egress also works for the direct egress (i.e. not
forwarding from dev ve1 to ve2).

Acked-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 5aa5bd14 17-Oct-2016 Daniel Borkmann <daniel@iogearbox.net>

bpf: add initial suite for selftests

Add a start of a test suite for kernel selftests. This moves test_verifier
and test_maps over to tools/testing/selftests/bpf/ along with various
code improvements and also adds a script for invoking test_bpf module.
The test suite can simply be run via selftest framework, f.e.:

# cd tools/testing/selftests/bpf/
# make
# make run_tests

Both test_verifier and test_maps were kind of misplaced in samples/bpf/
directory and we were looking into adding them to selftests for a while
now, so it can be picked up by kbuild bot et al and hopefully also get
more exposure and thus new test case additions.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 72874418 01-Sep-2016 Brendan Gregg <bgregg@netflix.com>

samples/bpf: add sampleip example

sample instruction pointer and frequency count in a BPF map

Signed-off-by: Brendan Gregg <bgregg@netflix.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 1c47910e 01-Sep-2016 Alexei Starovoitov <ast@kernel.org>

samples/bpf: add perf_event+bpf example

The bpf program is called 50 times a second and does hashmap[kern&user_stackid]++
It's primary purpose to check that key bpf helpers like map lookup, update,
get_stackid, trace_printk and ctx access are all working.
It checks:
- PERF_COUNT_HW_CPU_CYCLES on all cpus
- PERF_COUNT_HW_CPU_CYCLES for current process and inherited perf_events to children
- PERF_COUNT_SW_CPU_CLOCK on all cpus
- PERF_COUNT_SW_CPU_CLOCK for current process

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 6afb1e28 19-Aug-2016 William Tu <u9012063@gmail.com>

samples/bpf: Add tunnel set/get tests.

The patch creates sample code exercising bpf_skb_{set,get}_tunnel_key,
and bpf_skb_{set,get}_tunnel_opt for GRE, VXLAN, and GENEVE. A native
tunnel device is created in a namespace to interact with a lwtunnel
device out of the namespace, with metadata enabled. The bpf_skb_set_*
program is attached to tc egress and bpf_skb_get_* is attached to egress
qdisc. A ping between two tunnels is used to verify correctness and
the result of bpf_skb_get_* printed by bpf_trace_printk.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 9e6e60ec 12-Aug-2016 Sargun Dhillon <sargun@sargun.me>

samples/bpf: Add test_current_task_under_cgroup test

This test has a BPF program which writes the last known pid to call the
sync syscall within a given cgroup to a map.

The user mode program creates its own mount namespace, and mounts the
cgroupsv2 hierarchy in there, as on all current test systems
(Ubuntu 16.04, Debian), the cgroupsv2 vfs is unmounted by default.
Once it does this, it proceeds to test.

The test checks for positive and negative condition. It ensures that
when it's part of a given cgroup, its pid is captured in the map,
and that when it leaves the cgroup, this doesn't happen.

It populate a cgroups arraymap prior to execution in userspace. This means
that the program must be run in the same cgroups namespace as the programs
that are being traced.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# cf9b1199 25-Jul-2016 Sargun Dhillon <sargun@sargun.me>

samples/bpf: Add test/example of using bpf_probe_write_user bpf helper

This example shows using a kprobe to act as a dnat mechanism to divert
traffic for arbitrary endpoints. It rewrite the arguments to a syscall
while they're still in userspace, and before the syscall has a chance
to copy the argument into kernel space.

Although this is an example, it also acts as a test because the mapped
address is 255.255.255.255:555 -> real address, and that's not a legal
address to connect to. If the helper is broken, the example will fail
on the intermediate steps, as well as the final step to verify the
rewrite of userspace memory succeeded.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 764cbcce 19-Jul-2016 Brenden Blanco <bblanco@plumgrid.com>

bpf: add sample for xdp forwarding and rewrite

Add a sample that rewrites and forwards packets out on the same
interface. Observed single core forwarding performance of ~10Mpps.

Since the mlx4 driver under test recycles every single packet page, the
perf output shows almost exclusively just the ring management and bpf
program work. Slowdowns are likely occurring due to cache misses.

Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 86af8b41 19-Jul-2016 Brenden Blanco <bblanco@plumgrid.com>

Add sample for adding simple drop program to link

Add a sample program that only drops packets at the BPF_PROG_TYPE_XDP_RX
hook of a link. With the drop-only program, observed single core rate is
~20Mpps.

Other tests were run, for instance without the dropcnt increment or
without reading from the packet header, the packet rate was mostly
unchanged.

$ perf record -a samples/bpf/xdp1 $(</sys/class/net/eth0/ifindex)
proto 17: 20403027 drops/s

./pktgen_sample03_burst_single_flow.sh -i $DEV -d $IP -m $MAC -t 4
Running... ctrl^C to stop
Device: eth4@0
Result: OK: 11791017(c11788327+d2689) usec, 59622913 (60byte,0frags)
5056638pps 2427Mb/sec (2427186240bps) errors: 0
Device: eth4@1
Result: OK: 11791012(c11787906+d3106) usec, 60526944 (60byte,0frags)
5133311pps 2463Mb/sec (2463989280bps) errors: 0
Device: eth4@2
Result: OK: 11791019(c11788249+d2769) usec, 59868091 (60byte,0frags)
5077431pps 2437Mb/sec (2437166880bps) errors: 0
Device: eth4@3
Result: OK: 11795039(c11792403+d2636) usec, 59483181 (60byte,0frags)
5043067pps 2420Mb/sec (2420672160bps) errors: 0

perf report --no-children:
26.05% ksoftirqd/0 [mlx4_en] [k] mlx4_en_process_rx_cq
17.84% ksoftirqd/0 [mlx4_en] [k] mlx4_en_alloc_frags
5.52% ksoftirqd/0 [mlx4_en] [k] mlx4_en_free_frag
4.90% swapper [kernel.vmlinux] [k] poll_idle
4.14% ksoftirqd/0 [kernel.vmlinux] [k] get_page_from_freelist
2.78% ksoftirqd/0 [kernel.vmlinux] [k] __free_pages_ok
2.57% ksoftirqd/0 [kernel.vmlinux] [k] bpf_map_lookup_elem
2.51% swapper [mlx4_en] [k] mlx4_en_process_rx_cq
1.94% ksoftirqd/0 [kernel.vmlinux] [k] percpu_array_map_lookup_elem
1.45% swapper [mlx4_en] [k] mlx4_en_alloc_frags
1.35% ksoftirqd/0 [kernel.vmlinux] [k] free_one_page
1.33% swapper [kernel.vmlinux] [k] intel_idle
1.04% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c5c5
0.96% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c58d
0.93% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c6ee
0.92% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c6b9
0.89% ksoftirqd/0 [kernel.vmlinux] [k] __alloc_pages_nodemask
0.83% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c686
0.83% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c5d5
0.78% ksoftirqd/0 [mlx4_en] [k] mlx4_alloc_pages.isra.23
0.77% ksoftirqd/0 [mlx4_en] [k] 0x000000000001c5b4
0.77% ksoftirqd/0 [kernel.vmlinux] [k] net_rx_action

machine specs:
receiver - Intel E5-1630 v3 @ 3.70GHz
sender - Intel E5645 @ 2.40GHz
Mellanox ConnectX-3 @40G

Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a3f74617 30-Jun-2016 Martin KaFai Lau <kafai@fb.com>

cgroup: bpf: Add an example to do cgroup checking in BPF

test_cgrp2_array_pin.c:
A userland program that creates a bpf_map (BPF_MAP_TYPE_GROUP_ARRAY),
pouplates/updates it with a cgroup2's backed fd and pins it to a
bpf-fs's file. The pinned file can be loaded by tc and then used
by the bpf prog later. This program can also update an existing pinned
array and it could be useful for debugging/testing purpose.

test_cgrp2_tc_kern.c:
A bpf prog which should be loaded by tc. It is to demonstrate
the usage of bpf_skb_in_cgroup.

test_cgrp2_tc.sh:
A script that glues the test_cgrp2_array_pin.c and
test_cgrp2_tc_kern.c together. The idea is like:
1. Load the test_cgrp2_tc_kern.o by tc
2. Use test_cgrp2_array_pin.c to populate a BPF_MAP_TYPE_CGROUP_ARRAY
with a cgroup fd
3. Do a 'ping -6 ff02::1%ve' to ensure the packet has been
dropped because of a match on the cgroup

Most of the lines in test_cgrp2_tc.sh is the boilerplate
to setup the cgroup/bpf-fs/net-devices/netns...etc. It is
not bulletproof on errors but should work well enough and
give enough debug info if things did not go well.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 65d472fb 05-May-2016 Alexei Starovoitov <ast@kernel.org>

samples/bpf: add 'pointer to packet' tests

parse_simple.c - packet parser exapmle with single length check that
filters out udp packets for port 9

parse_varlen.c - variable length parser that understand multiple vlan headers,
ipip, ipip6 and ip options to filter out udp or tcp packets on port 9.
The packet is parsed layer by layer with multitple length checks.

parse_ldabs.c - classic style of packet parsing using LD_ABS instruction.
Same functionality as parse_simple.

simple = 24.1Mpps per core
varlen = 22.7Mpps
ldabs = 21.4Mpps

Parser with LD_ABS instructions is slower than full direct access parser
which does more packet accesses and checks.

These examples demonstrate the choice bpf program authors can make between
flexibility of the parser vs speed.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# bdefbbf2 28-Apr-2016 Jesper Dangaard Brouer <brouer@redhat.com>

samples/bpf: like LLC also verify and allow redefining CLANG command

Users are likely to manually compile both LLVM 'llc' and 'clang'
tools. Thus, also allow redefining CLANG and verify command exist.

Makefile implementation wise, the target that verify the command have
been generalized.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# b62a796c 28-Apr-2016 Jesper Dangaard Brouer <brouer@redhat.com>

samples/bpf: allow make to be run from samples/bpf/ directory

It is not intuitive that 'make' must be run from the top level
directory with argument "samples/bpf/" to compile these eBPF samples.

Introduce a kbuild make file trick that allow make to be run from the
"samples/bpf/" directory itself. It basically change to the top level
directory and call "make samples/bpf/" with the "/" slash after the
directory name.

Also add a clean target that only cleans this directory, by taking
advantage of the kbuild external module setting M=$PWD.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 7b01dd57 28-Apr-2016 Jesper Dangaard Brouer <brouer@redhat.com>

samples/bpf: Makefile verify LLVM compiler avail and bpf target is supported

Make compiling samples/bpf more user friendly, by detecting if LLVM
compiler tool 'llc' is available, and also detect if the 'bpf' target
is available in this version of LLVM.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 6ccfba75 28-Apr-2016 Jesper Dangaard Brouer <brouer@redhat.com>

samples/bpf: add back functionality to redefine LLC command

It is practical to be-able-to redefine the location of the LLVM
command 'llc', because not all distros have a LLVM version with bpf
target support. Thus, it is sometimes required to compile LLVM from
source, and sometimes it is not desired to overwrite the distros
default LLVM version.

This feature was removed with 128d1514be35 ("samples/bpf: Use llc in
PATH, rather than a hardcoded value").

Add this features back. Note that it is possible to redefine the LLC
on the make command like:

make samples/bpf/ LLC=~/git/llvm/build/bin/llc

Fixes: 128d1514be35 ("samples/bpf: Use llc in PATH, rather than a hardcoded value")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# e3edfdec 06-Apr-2016 Alexei Starovoitov <ast@kernel.org>

samples/bpf: add tracepoint vs kprobe performance tests

the first microbenchmark does
fd=open("/proc/self/comm");
for() {
write(fd, "test");
}
and on 4 cpus in parallel:
writes per sec
base (no tracepoints, no kprobes) 930k
with kprobe at __set_task_comm() 420k
with tracepoint at task:task_rename 730k

For kprobe + full bpf program manully fetches oldcomm, newcomm via bpf_probe_read.
For tracepint bpf program does nothing, since arguments are copied by tracepoint.

2nd microbenchmark does:
fd=open("/dev/urandom");
for() {
read(fd, buf);
}
and on 4 cpus in parallel:
reads per sec
base (no tracepoints, no kprobes) 300k
with kprobe at urandom_read() 279k
with tracepoint at random:urandom_read 290k

bpf progs attached to kprobe and tracepoint are noop.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 128d1514 04-Apr-2016 Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>

samples/bpf: Use llc in PATH, rather than a hardcoded value

While at it, remove the generation of .s files and fix some typos in the
related comment.

Cc: Alexei Starovoitov <ast@fb.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 26e90931 08-Mar-2016 Alexei Starovoitov <ast@kernel.org>

samples/bpf: add map performance test

performance tests for hash map and per-cpu hash map
with and without pre-allocation

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 9d8b612d 08-Mar-2016 Alexei Starovoitov <ast@kernel.org>

samples/bpf: add bpf map stress test

this test calls bpf programs from different contexts:
from inside of slub, from rcu, from pretty much everywhere,
since it kprobes all spin_lock functions.
It stresses the bpf hash and percpu map pre-allocation,
deallocation logic and call_rcu mechanisms.
User space part adding more stress by walking and deleting map elements.

Note that due to nature bpf_load.c the earlier kprobe+bpf programs are
already active while loader loads new programs, creates new kprobes and
attaches them.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a6ffe7b9 17-Feb-2016 Alexei Starovoitov <ast@kernel.org>

samples/bpf: offwaketime example

This is simplified version of Brendan Gregg's offwaketime:
This program shows kernel stack traces and task names that were blocked and
"off-CPU", along with the stack traces and task names for the threads that woke
them, and the total elapsed time from when they blocked to when they were woken
up. The combined stacks, task names, and total time is summarized in kernel
context for efficiency.

Example:
$ sudo ./offwaketime | flamegraph.pl > demo.svg
Open demo.svg in the browser as FlameGraph visualization.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 30b50aa6 12-Nov-2015 Yang Shi <yang.shi@linaro.org>

bpf: samples: exclude asm/sysreg.h for arm64

commit 338d4f49d6f7114a017d294ccf7374df4f998edc
("arm64: kernel: Add support for Privileged Access Never") includes sysreg.h
into futex.h and uaccess.h. But, the inline assembly used by asm/sysreg.h is
incompatible with llvm so it will cause BPF samples build failure for ARM64.
Since sysreg.h is useless for BPF samples, just exclude it from Makefile via
defining __ASM_SYSREG_H.

Signed-off-by: Yang Shi <yang.shi@linaro.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 42984d7c 29-Oct-2015 Daniel Borkmann <daniel@iogearbox.net>

bpf: add sample usages for persistent maps/progs

This patch adds a couple of stand-alone examples on how BPF_OBJ_PIN
and BPF_OBJ_GET commands can be used.

Example with maps:

# ./fds_example -F /sys/fs/bpf/m -P -m -k 1 -v 42
bpf: map fd:3 (Success)
bpf: pin ret:(0,Success)
bpf: fd:3 u->(1:42) ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/m -G -m -k 1
bpf: get fd:3 (Success)
bpf: fd:3 l->(1):42 ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/m -G -m -k 1 -v 24
bpf: get fd:3 (Success)
bpf: fd:3 u->(1:24) ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/m -G -m -k 1
bpf: get fd:3 (Success)
bpf: fd:3 l->(1):24 ret:(0,Success)

# ./fds_example -F /sys/fs/bpf/m2 -P -m
bpf: map fd:3 (Success)
bpf: pin ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/m2 -G -m -k 1
bpf: get fd:3 (Success)
bpf: fd:3 l->(1):0 ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/m2 -G -m
bpf: get fd:3 (Success)

Example with progs:

# ./fds_example -F /sys/fs/bpf/p -P -p
bpf: prog fd:3 (Success)
bpf: pin ret:(0,Success)
bpf sock:4 <- fd:3 attached ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/p -G -p
bpf: get fd:3 (Success)
bpf: sock:4 <- fd:3 attached ret:(0,Success)

# ./fds_example -F /sys/fs/bpf/p2 -P -p -o ./sockex1_kern.o
bpf: prog fd:5 (Success)
bpf: pin ret:(0,Success)
bpf: sock:3 <- fd:5 attached ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/p2 -G -p
bpf: get fd:3 (Success)
bpf: sock:4 <- fd:3 attached ret:(0,Success)

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 39111695 20-Oct-2015 Alexei Starovoitov <ast@kernel.org>

samples: bpf: add bpf_perf_event_output example

Performance test and example of bpf_perf_event_output().
kprobe is attached to sys_write() and trivial bpf program streams
pid+cookie into userspace via PERF_COUNT_SW_BPF_OUTPUT event.

Usage:
$ sudo ./bld_x64/samples/bpf/trace_output
recv 2968913 events per sec

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 47efb302 06-Aug-2015 Kaixu Xia <xiakaixu@huawei.com>

samples/bpf: example of get selected PMU counter value

This is a simple example and shows how to use the new ability
to get the selected Hardware PMU counter value.

Signed-off-by: Kaixu Xia <xiakaixu@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 0fb1170e 19-Jun-2015 Daniel Wagner <daniel.wagner@bmw-carit.de>

bpf: BPF based latency tracing

BPF offers another way to generate latency histograms. We attach
kprobes at trace_preempt_off and trace_preempt_on and calculate the
time it takes to from seeing the off/on transition.

The first array is used to store the start time stamp. The key is the
CPU id. The second array stores the log2(time diff). We need to use
static allocation here (array and not hash tables). The kprobes
hooking into trace_preempt_on|off should not calling any dynamic
memory allocation or free path. We need to avoid recursivly
getting called. Besides that, it reduces jitter in the measurement.

CPU 0
latency : count distribution
1 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 166723 |*************************************** |
4096 -> 8191 : 19870 |*** |
8192 -> 16383 : 6324 | |
16384 -> 32767 : 1098 | |
32768 -> 65535 : 190 | |
65536 -> 131071 : 179 | |
131072 -> 262143 : 18 | |
262144 -> 524287 : 4 | |
524288 -> 1048575 : 1363 | |
CPU 1
latency : count distribution
1 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 114042 |*************************************** |
4096 -> 8191 : 9587 |** |
8192 -> 16383 : 4140 | |
16384 -> 32767 : 673 | |
32768 -> 65535 : 179 | |
65536 -> 131071 : 29 | |
131072 -> 262143 : 4 | |
262144 -> 524287 : 1 | |
524288 -> 1048575 : 364 | |
CPU 2
latency : count distribution
1 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 40147 |*************************************** |
4096 -> 8191 : 2300 |* |
8192 -> 16383 : 828 | |
16384 -> 32767 : 178 | |
32768 -> 65535 : 59 | |
65536 -> 131071 : 2 | |
131072 -> 262143 : 0 | |
262144 -> 524287 : 1 | |
524288 -> 1048575 : 174 | |
CPU 3
latency : count distribution
1 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 29626 |*************************************** |
4096 -> 8191 : 2704 |** |
8192 -> 16383 : 1090 | |
16384 -> 32767 : 160 | |
32768 -> 65535 : 72 | |
65536 -> 131071 : 32 | |
131072 -> 262143 : 26 | |
262144 -> 524287 : 12 | |
524288 -> 1048575 : 298 | |

All this is based on the trace3 examples written by
Alexei Starovoitov <ast@plumgrid.com>.

Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 530b2c86 19-May-2015 Alexei Starovoitov <ast@kernel.org>

samples/bpf: bpf_tail_call example for networking

Usage:
$ sudo ./sockex3
IP src.port -> dst.port bytes packets
127.0.0.1.42010 -> 127.0.0.1.12865 1568 8
127.0.0.1.59526 -> 127.0.0.1.33778 11422636 173070
127.0.0.1.33778 -> 127.0.0.1.59526 11260224828 341974
127.0.0.1.12865 -> 127.0.0.1.42010 1832 12
IP src.port -> dst.port bytes packets
127.0.0.1.42010 -> 127.0.0.1.12865 1568 8
127.0.0.1.59526 -> 127.0.0.1.33778 23198092 351486
127.0.0.1.33778 -> 127.0.0.1.59526 22972698518 698616
127.0.0.1.12865 -> 127.0.0.1.42010 1832 12

this example is similar to sockex2 in a way that it accumulates per-flow
statistics, but it does packet parsing differently.
sockex2 inlines full packet parser routine into single bpf program.
This sockex3 example have 4 independent programs that parse vlan, mpls, ip, ipv6
and one main program that starts the process.
bpf_tail_call() mechanism allows each program to be small and be called
on demand potentially multiple times, so that many vlan, mpls, ip in ip,
gre encapsulations can be parsed. These and other protocol parsers can
be added or removed at runtime. TLVs can be parsed in similar manner.
Note, tail_call_cnt dynamic check limits the number of tail calls to 32.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 5bacd780 19-May-2015 Alexei Starovoitov <ast@kernel.org>

samples/bpf: bpf_tail_call example for tracing

kprobe example that demonstrates how future seccomp programs may look like.
It attaches to seccomp_phase1() function and tail-calls other BPF programs
depending on syscall number.

Existing optimized classic BPF seccomp programs generated by Chrome look like:
if (sd.nr < 121) {
if (sd.nr < 57) {
if (sd.nr < 22) {
if (sd.nr < 7) {
if (sd.nr < 4) {
if (sd.nr < 1) {
check sys_read
} else {
if (sd.nr < 3) {
check sys_write and sys_open
} else {
check sys_close
}
}
} else {
} else {
} else {
} else {
} else {
}

the future seccomp using native eBPF may look like:
bpf_tail_call(&sd, &syscall_jmp_table, sd.nr);
which is simpler, faster and leaves more room for per-syscall checks.

Usage:
$ sudo ./tracex5
<...>-366 [001] d... 4.870033: : read(fd=1, buf=00007f6d5bebf000, size=771)
<...>-369 [003] d... 4.870066: : mmap
<...>-369 [003] d... 4.870077: : syscall=110 (one of get/set uid/pid/gid)
<...>-369 [003] d... 4.870089: : syscall=107 (one of get/set uid/pid/gid)
sh-369 [000] d... 4.891740: : read(fd=0, buf=00000000023d1000, size=512)
sh-369 [000] d... 4.891747: : write(fd=1, buf=00000000023d3000, size=512)
sh-369 [000] d... 4.891747: : read(fd=1, buf=00000000023d3000, size=512)

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# b88c06e3 11-May-2015 Brenden Blanco <bblanco@plumgrid.com>

samples/bpf: fix in-source build of samples with clang

in-source build of 'make samples/bpf/' was incorrectly
using default compiler instead of invoking clang/llvm.
out-of-source build was ok.

Fixes: a80857822b0c ("samples: bpf: trivial eBPF program in C")
Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 91bc4822 01-Apr-2015 Alexei Starovoitov <ast@kernel.org>

tc: bpf: add checksum helpers

Commit 608cd71a9c7c ("tc: bpf: generalize pedit action") has added the
possibility to mangle packet data to BPF programs in the tc pipeline.
This patch adds two helpers bpf_l3_csum_replace() and bpf_l4_csum_replace()
for fixing up the protocol checksums after the packet mangling.

It also adds 'flags' argument to bpf_skb_store_bytes() helper to avoid
unnecessary checksum recomputations when BPF programs adjusting l3/l4
checksums and documents all three helpers in uapi header.

Moreover, a sample program is added to show how BPF programs can make use
of the mangle and csum helpers.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 9811e353 25-Mar-2015 Alexei Starovoitov <ast@kernel.org>

samples/bpf: Add kmem_alloc()/free() tracker tool

One BPF program attaches to kmem_cache_alloc_node() and
remembers all allocated objects in the map.
Another program attaches to kmem_cache_free() and deletes
corresponding object from the map.

User space walks the map every second and prints any objects
which are older than 1 second.

Usage:

$ sudo tracex4

Then start few long living processes. The 'tracex4' will print
something like this:

obj 0xffff880465928000 is 13sec old was allocated at ip ffffffff8105dc32
obj 0xffff88043181c280 is 13sec old was allocated at ip ffffffff8105dc32
obj 0xffff880465848000 is 8sec old was allocated at ip ffffffff8105dc32
obj 0xffff8804338bc280 is 15sec old was allocated at ip ffffffff8105dc32

$ addr2line -fispe vmlinux ffffffff8105dc32
do_fork at fork.c:1665

As soon as processes exit the memory is reclaimed and 'tracex4'
prints nothing.

Similar experiment can be done with the __kmalloc()/kfree() pair.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1427312966-8434-10-git-send-email-ast@plumgrid.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 5c7fc2d2 25-Mar-2015 Alexei Starovoitov <ast@kernel.org>

samples/bpf: Add IO latency analysis (iosnoop/heatmap) tool

BPF C program attaches to
blk_mq_start_request()/blk_update_request() kprobe events to
calculate IO latency.

For every completed block IO event it computes the time delta
in nsec and records in a histogram map:

map[log10(delta)*10]++

User space reads this histogram map every 2 seconds and prints
it as a 'heatmap' using gray shades of text terminal. Black
spaces have many events and white spaces have very few events.
Left most space is the smallest latency, right most space is
the largest latency in the range.

Usage:

$ sudo ./tracex3
and do 'sudo dd if=/dev/sda of=/dev/null' in other terminal.

Observe IO latencies and how different activity (like 'make
kernel') affects it.

Similar experiments can be done for network transmit latencies,
syscalls, etc.

'-t' flag prints the heatmap using normal ascii characters:

$ sudo ./tracex3 -t
heatmap of IO latency
# - many events with this latency
- few events
|1us |10us |100us |1ms |10ms |100ms |1s |10s
*ooo. *O.#. # 221
. *# . # 125
.. .o#*.. # 55
. . . . .#O # 37
.# # 175
.#*. # 37
# # 199
. . *#*. # 55
*#..* # 42
# # 266
...***Oo#*OO**o#* . # 629
# # 271
. .#o* o.*o* # 221
. . o* *#O.. # 50

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1427312966-8434-9-git-send-email-ast@plumgrid.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# d822a192 25-Mar-2015 Alexei Starovoitov <ast@kernel.org>

samples/bpf: Add counting example for kfree_skb() function calls and the write() syscall

this example has two probes in one C file that attach to
different kprove events and use two different maps.

1st probe is x64 specific equivalent of dropmon. It attaches to
kfree_skb, retrevies 'ip' address of kfree_skb() caller and
counts number of packet drops at that 'ip' address. User space
prints 'location - count' map every second.

2nd probe attaches to kprobe:sys_write and computes a histogram
of different write sizes

Usage:
$ sudo tracex2
location 0xffffffff81695995 count 1
location 0xffffffff816d0da9 count 2

location 0xffffffff81695995 count 2
location 0xffffffff816d0da9 count 2

location 0xffffffff81695995 count 3
location 0xffffffff816d0da9 count 2

557145+0 records in
557145+0 records out
285258240 bytes (285 MB) copied, 1.02379 s, 279 MB/s
syscall write() stats
byte_size : count distribution
1 -> 1 : 3 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 2 | |
32 -> 63 : 3 | |
64 -> 127 : 1 | |
128 -> 255 : 1 | |
256 -> 511 : 0 | |
512 -> 1023 : 1118968 |************************************* |

Ctrl-C at any time. Kernel will auto cleanup maps and programs

$ addr2line -ape ./bld_x64/vmlinux 0xffffffff81695995
0xffffffff816d0da9 0xffffffff81695995:
./bld_x64/../net/ipv4/icmp.c:1038 0xffffffff816d0da9:
./bld_x64/../net/unix/af_unix.c:1231

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1427312966-8434-8-git-send-email-ast@plumgrid.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# b896c4f9 25-Mar-2015 Alexei Starovoitov <ast@kernel.org>

samples/bpf: Add simple non-portable kprobe filter example

tracex1_kern.c - C program compiled into BPF.

It attaches to kprobe:netif_receive_skb()

When skb->dev->name == "lo", it prints sample debug message into
trace_pipe via bpf_trace_printk() helper function.

tracex1_user.c - corresponding user space component that:
- loads BPF program via bpf() syscall
- opens kprobes:netif_receive_skb event via perf_event_open()
syscall
- attaches the program to event via ioctl(event_fd,
PERF_EVENT_IOC_SET_BPF, prog_fd);
- prints from trace_pipe

Note, this BPF program is non-portable. It must be recompiled
with current kernel headers. kprobe is not a stable ABI and
BPF+kprobe scripts may no longer be meaningful when kernel
internals change.

No matter in what way the kernel changes, neither the kprobe,
nor the BPF program can ever crash or corrupt the kernel,
assuming the kprobes, perf and BPF subsystem has no bugs.

The verifier will detect that the program is using
bpf_trace_printk() and the kernel will print 'this is a DEBUG
kernel' warning banner, which means that bpf_trace_printk()
should be used for debugging of the BPF program only.

Usage:
$ sudo tracex1
ping-19826 [000] d.s2 63103.382648: : skb ffff880466b1ca00 len 84
ping-19826 [000] d.s2 63103.382684: : skb ffff880466b1d300 len 84

ping-19826 [000] d.s2 63104.382533: : skb ffff880466b1ca00 len 84
ping-19826 [000] d.s2 63104.382594: : skb ffff880466b1d300 len 84

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1427312966-8434-7-git-send-email-ast@plumgrid.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# fbe33108 01-Dec-2014 Alexei Starovoitov <ast@kernel.org>

samples: bpf: large eBPF program in C

sockex2_kern.c is purposefully large eBPF program in C.
llvm compiles ~200 lines of C code into ~300 eBPF instructions.

It's similar to __skb_flow_dissect() to demonstrate that complex packet parsing
can be done by eBPF.
Then it uses (struct flow_keys)->dst IP address (or hash of ipv6 dst) to keep
stats of number of packets per IP.
User space loads eBPF program, attaches it to loopback interface and prints
dest_ip->#packets stats every second.

Usage:
$sudo samples/bpf/sockex2
ip 127.0.0.1 count 19
ip 127.0.0.1 count 178115
ip 127.0.0.1 count 369437
ip 127.0.0.1 count 559841
ip 127.0.0.1 count 750539

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a8085782 01-Dec-2014 Alexei Starovoitov <ast@kernel.org>

samples: bpf: trivial eBPF program in C

this example does the same task as previous socket example
in assembler, but this one does it in C.

eBPF program in kernel does:
/* assume that packet is IPv4, load one byte of IP->proto */
int index = load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol));
long *value;

value = bpf_map_lookup_elem(&my_map, &index);
if (value)
__sync_fetch_and_add(value, 1);

Corresponding user space reads map[tcp], map[udp], map[icmp]
and prints protocol stats every second

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 03f4723e 01-Dec-2014 Alexei Starovoitov <ast@kernel.org>

samples: bpf: example of stateful socket filtering

this socket filter example does:
- creates arraymap in kernel with key 4 bytes and value 8 bytes

- loads eBPF program which assumes that packet is IPv4 and loads one byte of
IP->proto from the packet and uses it as a key in a map

r0 = skb->data[ETH_HLEN + offsetof(struct iphdr, protocol)];
*(u32*)(fp - 4) = r0;
value = bpf_map_lookup_elem(map_fd, fp - 4);
if (value)
(*(u64*)value) += 1;

- attaches this program to raw socket

- every second user space reads map[IPPROTO_TCP], map[IPPROTO_UDP], map[IPPROTO_ICMP]
to see how many packets of given protocol were seen on loopback interface

Usage:
$sudo samples/bpf/sock_example
TCP 0 UDP 0 ICMP 0 packets
TCP 187600 UDP 0 ICMP 4 packets
TCP 376504 UDP 0 ICMP 8 packets
TCP 563116 UDP 0 ICMP 12 packets
TCP 753144 UDP 0 ICMP 16 packets

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# ffb65f27 13-Nov-2014 Alexei Starovoitov <ast@kernel.org>

bpf: add a testsuite for eBPF maps

. check error conditions and sanity of hash and array map APIs
. check large maps (that kernel gracefully switches to vmalloc from kmalloc)
. check multi-process parallel access and stress test

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 3c731eba 26-Sep-2014 Alexei Starovoitov <ast@kernel.org>

bpf: mini eBPF library, test stubs and verifier testsuite

1.
the library includes a trivial set of BPF syscall wrappers:
int bpf_create_map(int key_size, int value_size, int max_entries);
int bpf_update_elem(int fd, void *key, void *value);
int bpf_lookup_elem(int fd, void *key, void *value);
int bpf_delete_elem(int fd, void *key);
int bpf_get_next_key(int fd, void *key, void *next_key);
int bpf_prog_load(enum bpf_prog_type prog_type,
const struct sock_filter_int *insns, int insn_len,
const char *license);
bpf_prog_load() stores verifier log into global bpf_log_buf[] array

and BPF_*() macros to build instructions

2.
test stubs configure eBPF infra with 'unspec' map and program types.
These are fake types used by user space testsuite only.

3.
verifier tests valid and invalid programs and expects predefined
error log messages from kernel.
40 tests so far.

$ sudo ./test_verifier
#0 add+sub+mul OK
#1 unreachable OK
#2 unreachable2 OK
#3 out of range jump OK
#4 out of range jump2 OK
#5 test1 ld_imm64 OK
...

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>