History log of /linux-master/arch/arm/net/bpf_jit_32.h
Revision Date Author Comments
# 5097faa5 07-Sep-2023 Puranjay Mohan <puranjay12@gmail.com>

arm32, bpf: add support for 32-bit signed division

The cpuv4 added a new BPF_SDIV instruction that does signed division.
The encoding is similar to BPF_DIV but BPF_SDIV sets offset=1.

ARM32 already supports 32-bit BPF_DIV which can be easily extended to
support BPF_SDIV as ARM32 has the SDIV instruction. When the CPU is not
ARM-v7, we implement that SDIV/SMOD with the function call similar to
the implementation of DIV/MOD.

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20230907230550.1417590-6-puranjay12@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# f9e6981b 07-Sep-2023 Puranjay Mohan <puranjay12@gmail.com>

arm32, bpf: add support for sign-extension load instruction

The cpuv4 added the support of an instruction that is similar to load
but also sign-extends the result after the load.

BPF_MEMSX | <size> | BPF_LDX means dst = *(signed size *) (src + offset)
here <size> can be one of BPF_B, BPF_H, BPF_W.

ARM32 has instructions to load a byte or a half word with sign
extension into a 32bit register. As the JIT uses two 32 bit registers
to simulate a 64-bit BPF register, an extra instruction is emitted to
sign-extent the result up to the second register.

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20230907230550.1417590-3-puranjay12@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# c648c9c7 30-Apr-2020 Luke Nelson <lukenels@cs.washington.edu>

bpf, arm: Optimize ALU ARSH K using asr immediate instruction

This patch adds an optimization that uses the asr immediate instruction
for BPF_ALU BPF_ARSH BPF_K, rather than loading the immediate to
a temporary register. This is similar to existing code for handling
BPF_ALU BPF_{LSH,RSH} BPF_K. This optimization saves two instructions
and is more consistent with LSH and RSH.

Example of the code generated for BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 5)
before the optimization:

2c: mov r8, #5
30: mov r9, #0
34: asr r0, r0, r8

and after optimization:

2c: asr r0, r0, #5

Tested on QEMU using lib/test_bpf and test_verifier.

Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200501020210.32294-3-luke.r.nels@gmail.com


# b886d83c 01-Jun-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation version 2 of the license

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 315 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Armijn Hemel <armijn@tjaldur.nl>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531190115.503150771@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# b85062ac 25-Jan-2019 Jiong Wang <jiong.wang@netronome.com>

arm: bpf: implement jitting of JMP32

This patch implements code-gen for new JMP32 instructions on arm.

For JSET, "ands" (AND with flags updated) is used, so corresponding
encoding helper is added.

Cc: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 8c9602d3 11-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: use double-word load/stores where available

Use double-word load and stores where support for this instruction is
supported by the CPU architecture.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 2b6958ef 11-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: use ldr instructions with shifted rm register

Rather than pre-shifting the rm register for the ldr in the tail call,
shift it in the load instruction. This eliminates one unnecessary
instruction.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 828e2b90 11-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: use immediate forms of instructions where possible

Rather than moving constants to a register and then using them in a
subsequent instruction, use them directly in the desired instruction
cutting out the "middle" register. This removes two instructions from
the tail call code path.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# a8ef95a0 11-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: provide load/store ops with negative immediates

Provide a set of load/store opcode generators that work with negative
immediates as well as positive ones.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 39c13c20 21-Aug-2017 Shubham Bansal <illusionist.neo@gmail.com>

arm: eBPF JIT compiler

The JIT compiler emits ARM 32 bit instructions. Currently, It supports
eBPF only. Classic BPF is supported because of the conversion by BPF core.

This patch is essentially changing the current implementation of JIT compiler
of Berkeley Packet Filter from classic to internal with almost all
instructions from eBPF ISA supported except the following
BPF_ALU64 | BPF_DIV | BPF_K
BPF_ALU64 | BPF_DIV | BPF_X
BPF_ALU64 | BPF_MOD | BPF_K
BPF_ALU64 | BPF_MOD | BPF_X
BPF_STX | BPF_XADD | BPF_W
BPF_STX | BPF_XADD | BPF_DW

Implementation is using scratch space to emulate 64 bit eBPF ISA on 32 bit
ARM because of deficiency of general purpose registers on ARM. Currently,
only LITTLE ENDIAN machines are supported in this eBPF JIT Compiler.

Tested on ARMv7 with QEMU by me (Shubham Bansal).

Testing results on ARMv7:

1) test_bpf: Summary: 341 PASSED, 0 FAILED, [312/333 JIT'ed]
2) test_tag: OK (40945 tests)
3) test_progs: Summary: 30 PASSED, 0 FAILED
4) test_lpm: OK
5) test_lru_map: OK

Above tests are all done with following flags enabled discreatly.

1) bpf_jit_enable=1
a) CONFIG_FRAME_POINTER enabled
b) CONFIG_FRAME_POINTER disabled
2) bpf_jit_enable=1 and bpf_jit_harden=2
a) CONFIG_FRAME_POINTER enabled
b) CONFIG_FRAME_POINTER disabled

See Documentation/networking/filter.txt for more information.

Signed-off-by: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 4560cdff 02-Oct-2015 Nicolas Schichan <nschichan@freebox.fr>

ARM: net: support BPF_ALU | BPF_MOD instructions in the BPF JIT.

For ARMv7 with UDIV instruction support, generate an UDIV instruction
followed by an MLS instruction.

For other ARM variants, generate code calling a C wrapper similar to
the jit_udiv() function used for BPF_ALU | BPF_DIV instructions.

Some performance numbers reported by the test_bpf module (the duration
per filter run is reported in nanoseconds, between "jitted:<x>" and
"PASS":

ARMv7 QEMU nojit: test_bpf: #3 DIV_MOD_KX jited:0 2196 PASS
ARMv7 QEMU jit: test_bpf: #3 DIV_MOD_KX jited:1 104 PASS
ARMv5 QEMU nojit: test_bpf: #3 DIV_MOD_KX jited:0 2176 PASS
ARMv5 QEMU jit: test_bpf: #3 DIV_MOD_KX jited:1 1104 PASS
ARMv5 kirkwood nojit: test_bpf: #3 DIV_MOD_KX jited:0 1103 PASS
ARMv5 kirkwood jit: test_bpf: #3 DIV_MOD_KX jited:1 311 PASS

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 5bf705b4 27-Jul-2015 Nicolas Schichan <nschichan@freebox.fr>

ARM: net: add support for BPF_ANC | SKF_AD_HATYPE in ARM JIT.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>


# e8b56d55 19-Sep-2014 Daniel Borkmann <daniel@iogearbox.net>

net: bpf: arm: make hole-faulting more robust

Will Deacon pointed out, that the currently used opcode for filling holes,
that is 0xe7ffffff, seems not robust enough ...

$ echo 0xffffffe7 | xxd -r > test.bin
$ arm-linux-gnueabihf-objdump -m arm -D -b binary test.bin
...
0: e7ffffff udf #65535 ; 0xffff

... while for Thumb, it ends up as ...

0: ffff e7ff vqshl.u64 q15, <illegal reg q15.5>, #63

... which is a bit fragile. The ARM specification defines some *permanently*
guaranteed undefined instruction (UDF) space, for example for ARM in ARMv7-AR,
section A5.4 and for Thumb in ARMv7-M, section A5.2.6.

Similarly, ptrace, kprobes, kgdb, bug and uprobes make use of such instruction
as well to trap. Given mentioned section from the specification, we can find
such a universe as (where 'x' denotes 'don't care'):

ARM: xxxx 0111 1111 xxxx xxxx xxxx 1111 xxxx
Thumb: 1101 1110 xxxx xxxx

We therefore should use a more robust opcode that fits both. Russell King
suggested that we can even reuse a single 32-bit word, that is, 0xe7fddef1
which will fault if executed in ARM *or* Thumb mode as done in f928d4f2a86f
("ARM: poison the vectors page"). That will still hold our requirements:

$ echo 0xf1defde7 | xxd -r > test.bin
$ arm-unknown-linux-gnueabi-objdump -m arm -D -b binary test.bin
...
0: e7fddef1 udf #56801 ; 0xdde1
$ echo 0xf1defde7f1defde7f1defde7 | xxd -r > test.bin
$ arm-unknown-linux-gnueabi-objdump -marm -Mforce-thumb -D -b binary test.bin
...
0: def1 udf #241 ; 0xf1
2: e7fd b.n 0x0
4: def1 udf #241 ; 0xf1
6: e7fd b.n 0x4
8: def1 udf #241 ; 0xf1
a: e7fd b.n 0x8

So on ARM 0xe7fddef1 conforms to the above UDF pattern, and the low 16 bit
likewise correspond to UDF in Thumb case. The 0xe7fd part is an unconditional
branch back to the UDF instruction.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mircea Gherzan <mgherzan@gmail.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 3cbe2041 07-Nov-2012 Daniel Borkmann <daniel@iogearbox.net>

ARM: net: bpf_jit_32: add XOR instruction for BPF JIT

This patch is a follow-up for patch "filter: add XOR instruction for use
with X/K" that implements BPF ARM JIT parts for the BPF XOR operation.

Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch>
Cc: Mircea Gherzan <mgherzan@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mircea Gherzan <mgherzan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 2bea29b7 11-Jun-2012 Mircea Gherzan <mgherzan@gmail.com>

ARM: 7421/1: bpf_jit: BPF_S_ANC_ALU_XOR_X support

JIT support for the XOR operation introduced by the commit
ffe06c17afbb.

Signed-off-by: Mircea Gherzan <mgherzan@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>


# ddecdfce 16-Mar-2012 Mircea Gherzan <mgherzan@gmail.com>

ARM: 7259/3: net: JIT compiler for packet filters

Based of Matt Evans's PPC64 implementation.

The compiler generates ARM instructions but interworking is
supported for Thumb2 kernels.

Supports both little and big endian. Unaligned loads are emitted
for ARMv6+. Not all the BPF opcodes that deal with ancillary data
are supported. The scratch memory of the filter lives on the stack.
Hardware integer division is used if it is available.

Enabled in the same way as for x86-64 and PPC64:

echo 1 > /proc/sys/net/core/bpf_jit_enable

A value greater than 1 enables opcode output.

Signed-off-by: Mircea Gherzan <mgherzan@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>