History log of /linux-master/arch/arm/net/bpf_jit_32.c
Revision Date Author Comments
# 71086041 07-Sep-2023 Puranjay Mohan <puranjay12@gmail.com>

arm32, bpf: add support for 64 bit division instruction

ARM32 doesn't have instructions to do 64-bit/64-bit divisions. So, to
implement the following instructions:
BPF_ALU64 | BPF_DIV
BPF_ALU64 | BPF_MOD
BPF_ALU64 | BPF_SDIV
BPF_ALU64 | BPF_SMOD

We implement the above instructions by doing function calls to div64_u64()
and div64_u64_rem() for unsigned division/mod and calls to div64_s64()
for signed division/mod.

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20230907230550.1417590-7-puranjay12@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 5097faa5 07-Sep-2023 Puranjay Mohan <puranjay12@gmail.com>

arm32, bpf: add support for 32-bit signed division

The cpuv4 added a new BPF_SDIV instruction that does signed division.
The encoding is similar to BPF_DIV but BPF_SDIV sets offset=1.

ARM32 already supports 32-bit BPF_DIV which can be easily extended to
support BPF_SDIV as ARM32 has the SDIV instruction. When the CPU is not
ARM-v7, we implement that SDIV/SMOD with the function call similar to
the implementation of DIV/MOD.

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20230907230550.1417590-6-puranjay12@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 1cfb7eae 07-Sep-2023 Puranjay Mohan <puranjay12@gmail.com>

arm32, bpf: add support for unconditional bswap instruction

The cpuv4 added a new unconditional bswap instruction with following
behaviour:

BPF_ALU64 | BPF_TO_LE | BPF_END with imm = 16/32/64 means:
dst = bswap16(dst)
dst = bswap32(dst)
dst = bswap64(dst)

As we already support converting to big-endian from little-endian we can
use the same for unconditional bswap. just treat the unconditional scenario
the same as big-endian conversion.

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20230907230550.1417590-5-puranjay12@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# fc832653 07-Sep-2023 Puranjay Mohan <puranjay12@gmail.com>

arm32, bpf: add support for sign-extension mov instruction

The cpuv4 added a new BPF_MOVSX instruction that sign extends the src
before moving it to the destination.

BPF_ALU | BPF_MOVSX sign extends 8-bit and 16-bit operands into 32-bit
operands, and zeroes the remaining upper 32 bits.

BPF_ALU64 | BPF_MOVSX sign extends 8-bit, 16-bit, and 32-bit operands
into 64-bit operands.

The offset field of the instruction is used to tell the number of bit to
use for sign-extension. BPF_MOV and BPF_MOVSX have the same code but the
former sets offset to 0 and the later one sets the offset to 8, 16 or 32

The behaviour of this instruction is dst = (s8,s16,s32)src

On ARM32 the implementation uses LSH and ARSH to extend the 8/16 bits to
a 32-bit register and then it is sign extended to the upper 32-bit
register using ARSH. For 32-bit we just move it to the destination
register and use ARSH to extend it to the upper 32-bit register.

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20230907230550.1417590-4-puranjay12@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# f9e6981b 07-Sep-2023 Puranjay Mohan <puranjay12@gmail.com>

arm32, bpf: add support for sign-extension load instruction

The cpuv4 added the support of an instruction that is similar to load
but also sign-extends the result after the load.

BPF_MEMSX | <size> | BPF_LDX means dst = *(signed size *) (src + offset)
here <size> can be one of BPF_B, BPF_H, BPF_W.

ARM32 has instructions to load a byte or a half word with sign
extension into a 32bit register. As the JIT uses two 32 bit registers
to simulate a 64-bit BPF register, an extra instruction is emitted to
sign-extent the result up to the second register.

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20230907230550.1417590-3-puranjay12@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 471f3d4e 07-Sep-2023 Puranjay Mohan <puranjay12@gmail.com>

arm32, bpf: add support for 32-bit offset jmp instruction

The cpuv4 adds unconditional jump with 32-bit offset where the immediate
field of the instruction is to be used to calculate the jump offset.

BPF_JA | BPF_K | BPF_JMP32 => gotol +imm => PC += imm.

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20230907230550.1417590-2-puranjay12@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# fc386ba7 10-Jun-2022 YueHaibing <yuehaibing@huawei.com>

bpf, arm: Remove unused function emit_a32_alu_r()

Since commit b18bea2a45b1 ("ARM: net: bpf: improve 64-bit ALU implementation")
this is unused anymore, so can remove it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220611040904.8976-1-yuehaibing@huawei.com


# d8dc09a4 18-Mar-2022 Julia Lawall <Julia.Lawall@inria.fr>

bpf, arm: Fix various typos in comments

Various spelling mistakes in comments. Detected with the help of
Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220318103729.157574-9-Julia.Lawall@inria.fr


# 06edc59c 19-Nov-2021 Christoph Hellwig <hch@lst.de>

bpf, docs: Prune all references to "internal BPF"

The eBPF name has completely taken over from eBPF in general usage for
the actual eBPF representation, or BPF for any general in-kernel use.
Prune all remaining references to "internal BPF".

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211119163215.971383-4-hch@lst.de


# ebf7f6f0 04-Nov-2021 Tiezhu Yang <yangtiezhu@loongson.cn>

bpf: Change value of MAX_TAIL_CALL_CNT from 32 to 33

In the current code, the actual max tail call count is 33 which is greater
than MAX_TAIL_CALL_CNT (defined as 32). The actual limit is not consistent
with the meaning of MAX_TAIL_CALL_CNT and thus confusing at first glance.
We can see the historical evolution from commit 04fd61ab36ec ("bpf: allow
bpf programs to tail-call other bpf programs") and commit f9dabe016b63
("bpf: Undo off-by-one in interpreter tail call count limit"). In order
to avoid changing existing behavior, the actual limit is 33 now, this is
reasonable.

After commit 874be05f525e ("bpf, tests: Add tail call test suite"), we can
see there exists failed testcase.

On all archs when CONFIG_BPF_JIT_ALWAYS_ON is not set:
# echo 0 > /proc/sys/net/core/bpf_jit_enable
# modprobe test_bpf
# dmesg | grep -w FAIL
Tail call error path, max count reached jited:0 ret 34 != 33 FAIL

On some archs:
# echo 1 > /proc/sys/net/core/bpf_jit_enable
# modprobe test_bpf
# dmesg | grep -w FAIL
Tail call error path, max count reached jited:1 ret 34 != 33 FAIL

Although the above failed testcase has been fixed in commit 18935a72eb25
("bpf/tests: Fix error in tail call limit tests"), it would still be good
to change the value of MAX_TAIL_CALL_CNT from 32 to 33 to make the code
more readable.

The 32-bit x86 JIT was using a limit of 32, just fix the wrong comments and
limit to 33 tail calls as the constant MAX_TAIL_CALL_CNT updated. For the
mips64 JIT, use "ori" instead of "addiu" as suggested by Johan Almbladh.
For the riscv JIT, use RV_REG_TCC directly to save one register move as
suggested by Björn Töpel. For the other implementations, no function changes,
it does not change the current limit 33, the new value of MAX_TAIL_CALL_CNT
can reflect the actual max tail call count, the related tail call testcases
in test_bpf module and selftests can work well for the interpreter and the
JIT.

Here are the test results on x86_64:

# uname -m
x86_64
# echo 0 > /proc/sys/net/core/bpf_jit_enable
# modprobe test_bpf test_suite=test_tail_calls
# dmesg | tail -1
test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [0/8 JIT'ed]
# rmmod test_bpf
# echo 1 > /proc/sys/net/core/bpf_jit_enable
# modprobe test_bpf test_suite=test_tail_calls
# dmesg | tail -1
test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [8/8 JIT'ed]
# rmmod test_bpf
# ./test_progs -t tailcalls
#142 tailcalls:OK
Summary: 1/11 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/bpf/1636075800-3264-1-git-send-email-yangtiezhu@loongson.cn


# 90982e13 06-Oct-2021 Daniel Borkmann <daniel@iogearbox.net>

bpf, arm: Remove dummy bpf_jit_compile stub

The BPF core defines a __weak bpf_jit_compile() dummy function already
which should only be overridden by JITs if they actually implement a
legacy cBPF JIT. Given arm implements an eBPF JIT, this stub is not
needed.

Now that MIPS cBPF JIT is finally gone, the only JIT left that is still
implementing bpf_jit_compile() is the sparc32 one.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>


# 79e3445b 28-Sep-2021 Johan Almbladh <johan.almbladh@anyfinetworks.com>

bpf, arm: Fix register clobbering in div/mod implementation

On ARM CPUs that lack div/mod instructions, ALU32 BPF_DIV and BPF_MOD are
implemented using a call to a helper function. Before, the emitted code
for those function calls failed to preserve caller-saved ARM registers.
Since some of those registers happen to be mapped to BPF registers, it
resulted in eBPF register values being overwritten.

This patch emits code to push and pop the remaining caller-saved ARM
registers r2-r3 into the stack during the div/mod function call. ARM
registers r0-r1 are used as arguments and return value, and those were
already saved and restored correctly.

Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# f5e81d11 13-Jul-2021 Daniel Borkmann <daniel@iogearbox.net>

bpf: Introduce BPF nospec instruction for mitigating Spectre v4

In case of JITs, each of the JIT backends compiles the BPF nospec instruction
/either/ to a machine instruction which emits a speculation barrier /or/ to
/no/ machine instruction in case the underlying architecture is not affected
by Speculative Store Bypass or has different mitigations in place already.

This covers both x86 and (implicitly) arm64: In case of x86, we use 'lfence'
instruction for mitigation. In case of arm64, we rely on the firmware mitigation
as controlled via the ssbd kernel parameter. Whenever the mitigation is enabled,
it works for all of the kernel code with no need to provide any additional
instructions here (hence only comment in arm64 JIT). Other archs can follow
as needed. The BPF nospec instruction is specifically targeting Spectre v4
since i) we don't use a serialization barrier for the Spectre v1 case, and
ii) mitigation instructions for v1 and v4 might be different on some archs.

The BPF nospec is required for a future commit, where the BPF verifier does
annotate intermediate BPF programs with speculation barriers.

Co-developed-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Benedict Schlueter <benedict.schlueter@rub.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Signed-off-by: Benedict Schlueter <benedict.schlueter@rub.de>
Acked-by: Alexei Starovoitov <ast@kernel.org>


# 91c960b0 14-Jan-2021 Brendan Jackman <jackmanb@google.com>

bpf: Rename BPF_XADD and prepare to encode other atomics in .imm

A subsequent patch will add additional atomic operations. These new
operations will use the same opcode field as the existing XADD, with
the immediate discriminating different operations.

In preparation, rename the instruction mode BPF_ATOMIC and start
calling the zero immediate BPF_ADD.

This is possible (doesn't break existing valid BPF progs) because the
immediate field is currently reserved MBZ and BPF_ADD is zero.

All uses are removed from the tree but the BPF_XADD definition is
kept around to avoid breaking builds for people including kernel
headers.

Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Björn Töpel <bjorn.topel@gmail.com>
Link: https://lore.kernel.org/bpf/20210114181751.768687-5-jackmanb@google.com


# c648c9c7 30-Apr-2020 Luke Nelson <lukenels@cs.washington.edu>

bpf, arm: Optimize ALU ARSH K using asr immediate instruction

This patch adds an optimization that uses the asr immediate instruction
for BPF_ALU BPF_ARSH BPF_K, rather than loading the immediate to
a temporary register. This is similar to existing code for handling
BPF_ALU BPF_{LSH,RSH} BPF_K. This optimization saves two instructions
and is more consistent with LSH and RSH.

Example of the code generated for BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 5)
before the optimization:

2c: mov r8, #5
30: mov r9, #0
34: asr r0, r0, r8

and after optimization:

2c: asr r0, r0, #5

Tested on QEMU using lib/test_bpf and test_verifier.

Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200501020210.32294-3-luke.r.nels@gmail.com


# cf48db69 30-Apr-2020 Luke Nelson <lukenels@cs.washington.edu>

bpf, arm: Optimize ALU64 ARSH X using orrpl conditional instruction

This patch optimizes the code generated by emit_a32_arsh_r64, which
handles the BPF_ALU64 BPF_ARSH BPF_X instruction.

The original code uses a conditional B followed by an unconditional ORR.
The optimization saves one instruction by removing the B instruction
and using a conditional ORR (with an inverted condition).

Example of the code generated for BPF_ALU64_REG(BPF_ARSH, BPF_REG_0,
BPF_REG_1), before optimization:

34: rsb ip, r2, #32
38: subs r9, r2, #32
3c: lsr lr, r0, r2
40: orr lr, lr, r1, lsl ip
44: bmi 0x4c
48: orr lr, lr, r1, asr r9
4c: asr ip, r1, r2
50: mov r0, lr
54: mov r1, ip

and after optimization:

34: rsb ip, r2, #32
38: subs r9, r2, #32
3c: lsr lr, r0, r2
40: orr lr, lr, r1, lsl ip
44: orrpl lr, lr, r1, asr r9
48: asr ip, r1, r2
4c: mov r0, lr
50: mov r1, ip

Tested on QEMU using lib/test_bpf and test_verifier.

Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200501020210.32294-2-luke.r.nels@gmail.com


# 4178417c 09-Apr-2020 Luke Nelson <lukenels@cs.washington.edu>

arm, bpf: Fix offset overflow for BPF_MEM BPF_DW

This patch fixes an incorrect check in how immediate memory offsets are
computed for BPF_DW on arm.

For BPF_LDX/ST/STX + BPF_DW, the 32-bit arm JIT breaks down an 8-byte
access into two separate 4-byte accesses using off+0 and off+4. If off
fits in imm12, the JIT emits a ldr/str instruction with the immediate
and avoids the use of a temporary register. While the current check off
<= 0xfff ensures that the first immediate off+0 doesn't overflow imm12,
it's not sufficient for the second immediate off+4, which may cause the
second access of BPF_DW to read/write the wrong address.

This patch fixes the problem by changing the check to
off <= 0xfff - 4 for BPF_DW, ensuring off+4 will never overflow.

A side effect of simplifying the check is that it now allows using
negative immediate offsets in ldr/str. This means that small negative
offsets can also avoid the use of a temporary register.

This patch introduces no new failures in test_verifier or test_bpf.c.

Fixes: c5eae692571d6 ("ARM: net: bpf: improve 64-bit store implementation")
Fixes: ec19e02b343db ("ARM: net: bpf: fix LDX instructions")
Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200409221752.28448-1-luke.r.nels@gmail.com


# bb9562cf 08-Apr-2020 Luke Nelson <lukenels@cs.washington.edu>

arm, bpf: Fix bugs with ALU64 {RSH, ARSH} BPF_K shift by 0

The current arm BPF JIT does not correctly compile RSH or ARSH when the
immediate shift amount is 0. This causes the "rsh64 by 0 imm" and "arsh64
by 0 imm" BPF selftests to hang the kernel by reaching an instruction
the verifier determines to be unreachable.

The root cause is in how immediate right shifts are encoded on arm.
For LSR and ASR (logical and arithmetic right shift), a bit-pattern
of 00000 in the immediate encodes a shift amount of 32. When the BPF
immediate is 0, the generated code shifts by 32 instead of the expected
behavior (a no-op).

This patch fixes the bugs by adding an additional check if the BPF
immediate is 0. After the change, the above mentioned BPF selftests pass.

Fixes: 39c13c204bb11 ("arm: eBPF JIT compiler")
Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200408181229.10909-1-luke.r.nels@gmail.com


# c4533128 09-Dec-2019 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: Improve prologue code sequence

Improve the prologue code sequence to be able to take advantage of
64-bit stores, changing the code from:

push {r4, r5, r6, r7, r8, r9, fp, lr}
mov fp, sp
sub ip, sp, #80 ; 0x50
sub sp, sp, #600 ; 0x258
str ip, [fp, #-100] ; 0xffffff9c
mov r6, #0
str r6, [fp, #-96] ; 0xffffffa0
mov r4, #0
mov r3, r4
mov r2, r0
str r4, [fp, #-104] ; 0xffffff98
str r4, [fp, #-108] ; 0xffffff94

to the tighter:

push {r4, r5, r6, r7, r8, r9, fp, lr}
mov fp, sp
mov r3, #0
sub r2, sp, #80 ; 0x50
sub sp, sp, #600 ; 0x258
strd r2, [fp, #-100] ; 0xffffff9c
mov r2, #0
strd r2, [fp, #-108] ; 0xffffff94
mov r2, r0

resulting in a saving of three instructions.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/E1ieH2g-0004ih-Rb@rmk-PC.armlinux.org.uk


# b886d83c 01-Jun-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation version 2 of the license

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 315 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Armijn Hemel <armijn@tjaldur.nl>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531190115.503150771@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 163541e6 24-May-2019 Jiong Wang <jiong.wang@netronome.com>

arm: bpf: eliminate zero extension code-gen

Cc: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# b85062ac 25-Jan-2019 Jiong Wang <jiong.wang@netronome.com>

arm: bpf: implement jitting of JMP32

This patch implements code-gen for new JMP32 instructions on arm.

For JSET, "ands" (AND with flags updated) is used, so corresponding
encoding helper is added.

Cc: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# b18bea2a 12-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: improve 64-bit ALU implementation

Improbe the 64-bit ALU implementation from:

movw r8, #65532
movt r8, #65535
movw r9, #65535
movt r9, #65535
ldr r7, [fp, #-44]
adds r7, r7, r8
str r7, [fp, #-44]
ldr r7, [fp, #-40]
adc r7, r7, r9
str r7, [fp, #-40]

to:

movw r8, #65532
movt r8, #65535
movw r9, #65535
movt r9, #65535
ldrd r6, [fp, #-44]
adds r6, r6, r8
adc r7, r7, r9
strd r6, [fp, #-44]

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# c5eae692 12-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: improve 64-bit store implementation

Improve the 64-bit store implementation from:

ldr r6, [fp, #-8]
str r8, [r6]
ldr r6, [fp, #-8]
mov r7, #4
add r7, r6, r7
str r9, [r7]

to:

ldr r6, [fp, #-8]
str r8, [r6]
str r9, [r6, #4]

We leave the store as two separate STR instructions rather than using
STRD as the store may not be aligned, and STR can handle misalignment.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 077513b8 12-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: improve 64-bit sign-extended immediate load

Improve the 64-bit sign-extended immediate from:

mov r6, #1
str r6, [fp, #-52] ; 0xffffffcc
mov r6, #0
str r6, [fp, #-48] ; 0xffffffd0

to:

mov r6, #1
mov r7, #0
strd r6, [fp, #-52] ; 0xffffffcc

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# f9ff5018 12-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: improve 64-bit load immediate implementation

Rather than writing each 32-bit half of the 64-bit immediate value
separately when the register is on the stack:

movw r6, #45056 ; 0xb000
movt r6, #60979 ; 0xee33
str r6, [fp, #-44] ; 0xffffffd4
mov r6, #0
str r6, [fp, #-40] ; 0xffffffd8

arrange to use the double-word store when available instead:

movw r6, #45056 ; 0xb000
movt r6, #60979 ; 0xee33
mov r7, #0
strd r6, [fp, #-44] ; 0xffffffd4

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 8c9602d3 11-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: use double-word load/stores where available

Use double-word load and stores where support for this instruction is
supported by the CPU architecture.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# bef8968d 11-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: always use odd/even register pair

Always use an odd/even register pair for our 64-bit registers, so that
we're able to use the double-word load/store instructions in the future.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# b5045229 11-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: avoid reloading 'array'

Rearranging the order of the initial tail call code a little allows is
to avoid reloading the 'array' pointer.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# aaffd2f5 11-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: avoid reloading 'index'

Avoid reloading 'index' after we have validated it - it remains in
tmp2[1] up to the point that we begin the code to index the pointer
array, so with a little rearrangement of the registers, we can use
the already loaded value.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 2b6958ef 11-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: use ldr instructions with shifted rm register

Rather than pre-shifting the rm register for the ldr in the tail call,
shift it in the load instruction. This eliminates one unnecessary
instruction.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 828e2b90 11-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: use immediate forms of instructions where possible

Rather than moving constants to a register and then using them in a
subsequent instruction, use them directly in the desired instruction
cutting out the "middle" register. This removes two instructions from
the tail call code path.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 1ca3b17b 11-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: imm12 constant conversion

Provide a version of the imm8m() function that the compiler can optimise
when used with a constant expression.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 96cced4e 11-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: access eBPF scratch space using ARM FP register

Access the eBPF scratch space using the frame pointer rather than our
stack pointer, as the offsets from the ARM frame pointer are constant
across all eBPF programs.

Since we no longer reference the scratch space registers from the stack
pointer, this simplifies emit_push_r64() as it no longer needs to know
how many words are pushed onto the stack.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# a6eccac5 11-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: 64-bit accessor functions for BPF registers

Provide a couple of 64-bit register accessors, and use them where
appropriate

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 7a987025 11-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: provide accessor functions for BPF registers

Many of the code paths need to have knowledge about whether a register
is stacked or in a CPU register. Move this decision making to a pair
of helper functions instead of having it scattered throughout the
code.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 47b9c3bf 11-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: remove is_on_stack() and sstk/dstk

The decision about whether a BPF register is on the stack or in a CPU
register is detected at the top BPF insn processing level, and then
percolated throughout the remainder of the code. Since we now use
negative register values to represent stacked registers, we can detect
where a BPF register is stored without restoring to carrying this
additional metadata through all code paths.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 1c35ba12 11-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: use negative numbers for stacked registers

Use negative numbers for eBPF registers that live on the stack.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# a8ef95a0 11-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: provide load/store ops with negative immediates

Provide a set of load/store opcode generators that work with negative
immediates as well as positive ones.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# d449ceb1 11-Jul-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: enumerate the JIT scratch stack layout

Enumerate the contents of the JIT scratch stack layout used for storing
some of the JITs 64-bit registers, tail call counter and AX register.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 18d405af 28-Jun-2018 Daniel Borkmann <daniel@iogearbox.net>

bpf, arm32: fix to use bpf_jit_binary_lock_ro api

Any eBPF JIT that where its underlying arch supports ARCH_HAS_SET_MEMORY
would need to use bpf_jit_binary_{un,}lock_ro() pair instead of the
set_memory_{ro,rw}() pair directly as otherwise changes to the former
might break. arm32's eBPF conversion missed to change it, so fix this
up here.

Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 68565a1a 10-May-2018 Wang YanQing <udknight@gmail.com>

bpf, arm32: fix inconsistent naming about emit_a32_lsr_{r64,i64}

The names for BPF_ALU64 | BPF_ARSH are emit_a32_arsh_*,
the names for BPF_ALU64 | BPF_LSH are emit_a32_lsh_*, but
the names for BPF_ALU64 | BPF_RSH are emit_a32_lsr_*.

For consistence reason, let's rename emit_a32_lsr_* to
emit_a32_rsh_*.

This patch also corrects a wrong comment.

Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
Signed-off-by: Wang YanQing <udknight@gmail.com>
Cc: Shubham Bansal <illusionist.neo@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux@armlinux.org.uk
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 2b589a7e 10-May-2018 Wang YanQing <udknight@gmail.com>

bpf, arm32: correct check_imm24

imm24 is signed, so the right range is:

[-(1<<(24 - 1)), (1<<(24 - 1)) - 1]

Note: this patch also fix a typo.

Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
Signed-off-by: Wang YanQing <udknight@gmail.com>
Cc: Shubham Bansal <illusionist.neo@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux@armlinux.org.uk
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 38ca9306 14-May-2018 Daniel Borkmann <daniel@iogearbox.net>

bpf, arm32: save 4 bytes of unneeded stack space

The extra skb_copy_bits() buffer is not used anymore, therefore
remove the extra 4 byte stack space requirement.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 0d2d0ced 03-May-2018 Daniel Borkmann <daniel@iogearbox.net>

bpf, arm32: remove ld_abs/ld_ind

Since LD_ABS/LD_IND instructions are now removed from the core and
reimplemented through a combination of inlined BPF instructions and
a slow-path helper, we can get rid of the complexity from arm32 JIT.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 73ae3c04 26-Jan-2018 Daniel Borkmann <daniel@iogearbox.net>

bpf, arm: remove obsolete exception handling from div/mod

Since we've changed div/mod exception handling for src_reg in
eBPF verifier itself, remove the leftovers from arm32 JIT.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# fa9dd599 19-Jan-2018 Daniel Borkmann <daniel@iogearbox.net>

bpf: get rid of pure_initcall dependency to enable jits

Having a pure_initcall() callback just to permanently enable BPF
JITs under CONFIG_BPF_JIT_ALWAYS_ON is unnecessary and could leave
a small race window in future where JIT is still disabled on boot.
Since we know about the setting at compilation time anyway, just
initialize it properly there. Also consolidate all the individual
bpf_jit_enable variables into a single one and move them under one
location. Moreover, don't allow for setting unspecified garbage
values on them.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>


# 091f0248 12-Jan-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: clarify tail_call index

As per 90caccdd8cc0 ("bpf: fix bpf_tail_call() x64 JIT"), the index used
for array lookup is defined to be 32-bit wide. Update a misleading
comment that suggests it is 64-bit wide.

Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>


# ec19e02b 13-Jan-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: fix LDX instructions

When the source and destination register are identical, our JIT does not
generate correct code, which leads to kernel oopses.

Fix this by (a) generating more efficient code, and (b) making use of
the temporary earlier if we will overwrite the address register.

Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>


# 02088d9b 13-Jan-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: fix register saving

When an eBPF program tail-calls another eBPF program, it enters it after
the prologue to avoid having complex stack manipulations. This can lead
to kernel oopses, and similar.

Resolve this by always using a fixed stack layout, a CPU register frame
pointer, and using this when reloading registers before returning.

Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>


# 0005e55a 13-Jan-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: correct stack layout documentation

The stack layout documentation incorrectly suggests that the BPF JIT
scratch space starts immediately below BPF_FP. This is not correct,
so let's fix the documentation to reflect reality.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>


# 70ec3a6c 13-Jan-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: move stack documentation

Move the stack documentation towards the top of the file, where it's
relevant for things like the register layout.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>


# d1220efd 13-Jan-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: fix stack alignment

As per 2dede2d8e925 ("ARM EABI: stack pointer must be 64-bit aligned
after a CPU exception") the stack should be aligned to a 64-bit boundary
on EABI systems. Ensure that the eBPF JIT appropraitely aligns the
stack.

Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>


# f4483f2c 13-Jan-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: fix tail call jumps

When a tail call fails, it is documented that the tail call should
continue execution at the following instruction. An example tail call
sequence is:

12: (85) call bpf_tail_call#12
13: (b7) r0 = 0
14: (95) exit

The ARM assembler for the tail call in this case ends up branching to
instruction 14 instead of instruction 13, resulting in the BPF filter
returning a non-zero value:

178: ldr r8, [sp, #588] ; insn 12
17c: ldr r6, [r8, r6]
180: ldr r8, [sp, #580]
184: cmp r8, r6
188: bcs 0x1e8
18c: ldr r6, [sp, #524]
190: ldr r7, [sp, #528]
194: cmp r7, #0
198: cmpeq r6, #32
19c: bhi 0x1e8
1a0: adds r6, r6, #1
1a4: adc r7, r7, #0
1a8: str r6, [sp, #524]
1ac: str r7, [sp, #528]
1b0: mov r6, #104
1b4: ldr r8, [sp, #588]
1b8: add r6, r8, r6
1bc: ldr r8, [sp, #580]
1c0: lsl r7, r8, #2
1c4: ldr r6, [r6, r7]
1c8: cmp r6, #0
1cc: beq 0x1e8
1d0: mov r8, #32
1d4: ldr r6, [r6, r8]
1d8: add r6, r6, #44
1dc: bx r6
1e0: mov r0, #0 ; insn 13
1e4: mov r1, #0
1e8: add sp, sp, #596 ; insn 14
1ec: pop {r4, r5, r6, r7, r8, sl, pc}

For other sequences, the tail call could end up branching midway through
the following BPF instructions, or maybe off the end of the function,
leading to unknown behaviours.

Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>


# e9062481 13-Jan-2018 Russell King <rmk+kernel@armlinux.org.uk>

ARM: net: bpf: avoid 'bx' instruction on non-Thumb capable CPUs

Avoid the 'bx' instruction on CPUs that have no support for Thumb and
thus do not implement this instruction by moving the generation of this
opcode to a separate function that selects between:

bx reg

and

mov pc, reg

according to the capabilities of the CPU.

Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>


# 60b58afc 14-Dec-2017 Alexei Starovoitov <ast@kernel.org>

bpf: fix net.core.bpf_jit_enable race

global bpf_jit_enable variable is tested multiple times in JITs,
blinding and verifier core. The malicious root can try to toggle
it while loading the programs. This race condition was accounted
for and there should be no issues, but it's safer to avoid
this race condition.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>


# 39c13c20 21-Aug-2017 Shubham Bansal <illusionist.neo@gmail.com>

arm: eBPF JIT compiler

The JIT compiler emits ARM 32 bit instructions. Currently, It supports
eBPF only. Classic BPF is supported because of the conversion by BPF core.

This patch is essentially changing the current implementation of JIT compiler
of Berkeley Packet Filter from classic to internal with almost all
instructions from eBPF ISA supported except the following
BPF_ALU64 | BPF_DIV | BPF_K
BPF_ALU64 | BPF_DIV | BPF_X
BPF_ALU64 | BPF_MOD | BPF_K
BPF_ALU64 | BPF_MOD | BPF_X
BPF_STX | BPF_XADD | BPF_W
BPF_STX | BPF_XADD | BPF_DW

Implementation is using scratch space to emulate 64 bit eBPF ISA on 32 bit
ARM because of deficiency of general purpose registers on ARM. Currently,
only LITTLE ENDIAN machines are supported in this eBPF JIT Compiler.

Tested on ARMv7 with QEMU by me (Shubham Bansal).

Testing results on ARMv7:

1) test_bpf: Summary: 341 PASSED, 0 FAILED, [312/333 JIT'ed]
2) test_tag: OK (40945 tests)
3) test_progs: Summary: 30 PASSED, 0 FAILED
4) test_lpm: OK
5) test_lru_map: OK

Above tests are all done with following flags enabled discreatly.

1) bpf_jit_enable=1
a) CONFIG_FRAME_POINTER enabled
b) CONFIG_FRAME_POINTER disabled
2) bpf_jit_enable=1 and bpf_jit_harden=2
a) CONFIG_FRAME_POINTER enabled
b) CONFIG_FRAME_POINTER disabled

See Documentation/networking/filter.txt for more information.

Signed-off-by: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 74d86a70 08-May-2017 Laura Abbott <labbott@redhat.com>

arm: use set_memory.h header

set_memory_* functions have moved to set_memory.h. Switch to this
explicitly

Link: http://lkml.kernel.org/r/1488920133-27229-3-git-send-email-labbott@redhat.com
Signed-off-by: Laura Abbott <labbott@redhat.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# f941461c 05-Jan-2016 Rabin Vincent <rabin@rab.in>

ARM: net: bpf: fix zero right shift

The LSR instruction cannot be used to perform a zero right shift since a
0 as the immediate value (imm5) in the LSR instruction encoding means
that a shift of 32 is perfomed. See DecodeIMMShift() in the ARM ARM.

Make the JIT skip generation of the LSR if a zero-shift is requested.

This was found using american fuzzy lop.

Signed-off-by: Rabin Vincent <rabin@rab.in>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 55795ef5 05-Jan-2016 Rabin Vincent <rabin@rab.in>

net: filter: make JITs zero A for SKF_AD_ALU_XOR_X

The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
instructions since it XORs A with X while all the others replace A with
some loaded value. All the BPF JITs fail to clear A if this is used as
the first instruction in a filter. This was found using american fuzzy
lop.

Add a helper to determine if A needs to be cleared given the first
instruction in a filter, and use this in the JITs. Except for ARM, the
rest have only been compile-tested.

Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum")
Signed-off-by: Rabin Vincent <rabin@rab.in>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# ebaef649 13-Nov-2015 Daniel Borkmann <daniel@iogearbox.net>

bpf, arm: start flushing icache range from header

During review I noticed that the icache range we're flushing should
start at header already and not at ctx.image.

Reason is that after 55309dd3d4cd ("net: bpf: arm: address randomize
and write protect JIT code"), we also want to make sure to flush the
random-sized trap in front of the start of the actual program (analogous
to x86). No operational differences from user side.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Nicolas Schichan <nschichan@freebox.fr>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 4560cdff 02-Oct-2015 Nicolas Schichan <nschichan@freebox.fr>

ARM: net: support BPF_ALU | BPF_MOD instructions in the BPF JIT.

For ARMv7 with UDIV instruction support, generate an UDIV instruction
followed by an MLS instruction.

For other ARM variants, generate code calling a C wrapper similar to
the jit_udiv() function used for BPF_ALU | BPF_DIV instructions.

Some performance numbers reported by the test_bpf module (the duration
per filter run is reported in nanoseconds, between "jitted:<x>" and
"PASS":

ARMv7 QEMU nojit: test_bpf: #3 DIV_MOD_KX jited:0 2196 PASS
ARMv7 QEMU jit: test_bpf: #3 DIV_MOD_KX jited:1 104 PASS
ARMv5 QEMU nojit: test_bpf: #3 DIV_MOD_KX jited:0 2176 PASS
ARMv5 QEMU jit: test_bpf: #3 DIV_MOD_KX jited:1 1104 PASS
ARMv5 kirkwood nojit: test_bpf: #3 DIV_MOD_KX jited:0 1103 PASS
ARMv5 kirkwood jit: test_bpf: #3 DIV_MOD_KX jited:1 311 PASS

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 8690f47d 02-Oct-2015 Nicolas Schichan <nschichan@freebox.fr>

ARM: net: make BPF_LD | BPF_IND instruction trigger r_X initialisation to 0.

Without this patch, if the only instructions using r_X are of the
BPF_LD | BPF_IND type, r_X would not be reset to 0, using whatever
value was there when entering the jited code. With this patch, r_X
will be correctly marked as used so it will be reset to 0 in the
prologue code.

This fix also makes the test "LD_IND byte default X" pass in the
test_bpf module when the ARM JIT is enabled.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a91263d5 29-Sep-2015 Daniel Borkmann <daniel@iogearbox.net>

ebpf: migrate bpf_prog's flags to bitfield

As we need to add further flags to the bpf_prog structure, lets migrate
both bools to a bitfield representation. The size of the base structure
(excluding insns) remains unchanged at 40 bytes.

Add also tags for the kmemchecker, so that it doesn't throw false
positives. Even in case gcc would generate suboptimal code, it's not
being accessed in performance critical paths.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 5bf705b4 27-Jul-2015 Nicolas Schichan <nschichan@freebox.fr>

ARM: net: add support for BPF_ANC | SKF_AD_HATYPE in ARM JIT.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 303249ab 27-Jul-2015 Nicolas Schichan <nschichan@freebox.fr>

ARM: net: add support for BPF_ANC | SKF_AD_PAY_OFFSET in ARM JIT.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 1447f93f 27-Jul-2015 Nicolas Schichan <nschichan@freebox.fr>

ARM: net: add support for BPF_ANC | SKF_AD_PKTTYPE in ARM JIT.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>


# c18fe54b 21-Jul-2015 Nicolas Schichan <nschichan@freebox.fr>

ARM: net: fix vlan access instructions in ARM JIT.

This makes BPF_ANC | SKF_AD_VLAN_TAG and BPF_ANC | SKF_AD_VLAN_TAG_PRESENT
have the same behaviour as the in kernel VM and makes the test_bpf LD_VLAN_TAG
and LD_VLAN_TAG_PRESENT tests pass.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 6d715e30 21-Jul-2015 Nicolas Schichan <nschichan@freebox.fr>

ARM: net: handle negative offsets in BPF JIT.

Previously, the JIT would reject negative offsets known during code
generation and mishandle negative offsets provided at runtime.

Fix that by calling bpf_internal_load_pointer_neg_helper()
appropriately in the jit_get_skb_{b,h,w} slow path helpers and by forcing
the execution flow to the slow path helpers when the offset is
negative.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 7aed35cb 21-Jul-2015 Nicolas Schichan <nschichan@freebox.fr>

ARM: net: fix condition for load_order > 0 when translating load instructions.

To check whether the load should take the fast path or not, the code
would check that (r_skb_hlen - load_order) is greater than the offset
of the access using an "Unsigned higher or same" condition. For
halfword accesses and an skb length of 1 at offset 0, that test is
valid, as we end up comparing 0xffffffff(-1) and 0, so the fast path
is taken and the filter allows the load to wrongly succeed. A similar
issue exists for word loads at offset 0 and an skb length of less than
4.

Fix that by using the condition "Signed greater than or equal"
condition for the fast path code for load orders greater than 0.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 24e737c1 07-May-2015 Nicolas Schichan <nschichan@freebox.fr>

ARM: net: add JIT support for loads from struct seccomp_data.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 0b59d880 07-May-2015 Nicolas Schichan <nschichan@freebox.fr>

ARM: net: delegate filter to kernel interpreter when imm_offset() return value can't fit into 12bits.

The ARM JIT code emits "ldr rX, [pc, #offset]" to access the literal
pool. #offset maximum value is 4095 and if the generated code is too
large, the #offset value can overflow and not point to the expected
slot in the literal pool. Additionally, when overflow occurs, bits of
the overflow can end up changing the destination register of the ldr
instruction.

Fix that by detecting the overflow in imm_offset() and setting a flag
that is checked for each BPF instructions converted in
build_body(). As of now it can only be detected in the second pass. As
a result the second build_body() call can now fail, so add the
corresponding cleanup code in that case.

Using multiple literal pools in the JITed code is going to require
lots of intrusive changes to the JIT code (which would better be done
as a feature instead of fix), just delegating to the kernel BPF
interpreter in that case is a more straight forward, minimal fix and
easy to backport.

Fixes: ddecdfcea0ae ("ARM: 7259/3: net: JIT compiler for packet filters")
Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 19fc99d0 06-May-2015 Nicolas Schichan <nschichan@freebox.fr>

ARM: net fix emit_udiv() for BPF_ALU | BPF_DIV | BPF_K intruction.

In that case, emit_udiv() will be called with rn == ARM_R0 (r_scratch)
and loading rm first into ARM_R0 will result in jit_udiv() function
being called the same dividend and divisor. Fix that by loading rn
first into ARM_R1 and then rm into ARM_R0.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Cc: <stable@vger.kernel.org> # v3.13+
Fixes: aee636c4809f (bpf: do not use reciprocal divide)
Acked-by: Mircea Gherzan <mgherzan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# e8b56d55 19-Sep-2014 Daniel Borkmann <daniel@iogearbox.net>

net: bpf: arm: make hole-faulting more robust

Will Deacon pointed out, that the currently used opcode for filling holes,
that is 0xe7ffffff, seems not robust enough ...

$ echo 0xffffffe7 | xxd -r > test.bin
$ arm-linux-gnueabihf-objdump -m arm -D -b binary test.bin
...
0: e7ffffff udf #65535 ; 0xffff

... while for Thumb, it ends up as ...

0: ffff e7ff vqshl.u64 q15, <illegal reg q15.5>, #63

... which is a bit fragile. The ARM specification defines some *permanently*
guaranteed undefined instruction (UDF) space, for example for ARM in ARMv7-AR,
section A5.4 and for Thumb in ARMv7-M, section A5.2.6.

Similarly, ptrace, kprobes, kgdb, bug and uprobes make use of such instruction
as well to trap. Given mentioned section from the specification, we can find
such a universe as (where 'x' denotes 'don't care'):

ARM: xxxx 0111 1111 xxxx xxxx xxxx 1111 xxxx
Thumb: 1101 1110 xxxx xxxx

We therefore should use a more robust opcode that fits both. Russell King
suggested that we can even reuse a single 32-bit word, that is, 0xe7fddef1
which will fault if executed in ARM *or* Thumb mode as done in f928d4f2a86f
("ARM: poison the vectors page"). That will still hold our requirements:

$ echo 0xf1defde7 | xxd -r > test.bin
$ arm-unknown-linux-gnueabi-objdump -m arm -D -b binary test.bin
...
0: e7fddef1 udf #56801 ; 0xdde1
$ echo 0xf1defde7f1defde7f1defde7 | xxd -r > test.bin
$ arm-unknown-linux-gnueabi-objdump -marm -Mforce-thumb -D -b binary test.bin
...
0: def1 udf #241 ; 0xf1
2: e7fd b.n 0x0
4: def1 udf #241 ; 0xf1
6: e7fd b.n 0x4
8: def1 udf #241 ; 0xf1
a: e7fd b.n 0x8

So on ARM 0xe7fddef1 conforms to the above UDF pattern, and the low 16 bit
likewise correspond to UDF in Thumb case. The 0xe7fd part is an unconditional
branch back to the UDF instruction.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mircea Gherzan <mgherzan@gmail.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 286aad3c 08-Sep-2014 Daniel Borkmann <daniel@iogearbox.net>

net: bpf: be friendly to kmemcheck

Reported by Mikulas Patocka, kmemcheck currently barks out a
false positive since we don't have special kmemcheck annotation
for bitfields used in bpf_prog structure.

We currently have jited:1, len:31 and thus when accessing len
while CONFIG_KMEMCHECK enabled, kmemcheck throws a warning that
we're reading uninitialized memory.

As we don't need the whole bit universe for pages member, we
can just split it to u16 and use a bool flag for jited instead
of a bitfield.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 55309dd3 08-Sep-2014 Daniel Borkmann <daniel@iogearbox.net>

net: bpf: arm: address randomize and write protect JIT code

This is the ARM variant for 314beb9bcab ("x86: bpf_jit_comp: secure bpf
jit against spraying attacks").

It is now possible to implement it due to commits 75374ad47c64 ("ARM: mm:
Define set_memory_* functions for ARM") and dca9aa92fc7c ("ARM: add
DEBUG_SET_MODULE_RONX option to Kconfig") which added infrastructure for
this facility.

Thus, this patch makes sure the BPF generated JIT code is marked RO, as
other kernel text sections, and also lets the generated JIT code start
at a pseudo random offset instead on a page boundary. The holes are filled
with illegal instructions.

JIT tested on armv7hl with BPF test suite.

Reference: http://mainisusuallyafunction.blogspot.com/2012/11/attacking-hardened-linux-systems-with.html
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Mircea Gherzan <mgherzan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 60a3b225 02-Sep-2014 Daniel Borkmann <daniel@iogearbox.net>

net: bpf: make eBPF interpreter images read-only

With eBPF getting more extended and exposure to user space is on it's way,
hardening the memory range the interpreter uses to steer its command flow
seems appropriate. This patch moves the to be interpreted bytecode to
read-only pages.

In case we execute a corrupted BPF interpreter image for some reason e.g.
caused by an attacker which got past a verifier stage, it would not only
provide arbitrary read/write memory access but arbitrary function calls
as well. After setting up the BPF interpreter image, its contents do not
change until destruction time, thus we can setup the image on immutable
made pages in order to mitigate modifications to that code. The idea
is derived from commit 314beb9bcabf ("x86: bpf_jit_comp: secure bpf jit
against spraying attacks").

This is possible because bpf_prog is not part of sk_filter anymore.
After setup bpf_prog cannot be altered during its life-time. This prevents
any modifications to the entire bpf_prog structure (incl. function/JIT
image pointer).

Every eBPF program (including classic BPF that are migrated) have to call
bpf_prog_select_runtime() to select either interpreter or a JIT image
as a last setup step, and they all are being freed via bpf_prog_free(),
including non-JIT. Therefore, we can easily integrate this into the
eBPF life-time, plus since we directly allocate a bpf_prog, we have no
performance penalty.

Tested with seccomp and test_bpf testsuite in JIT/non-JIT mode and manual
inspection of kernel_page_tables. Brad Spengler proposed the same idea
via Twitter during development of this patch.

Joint work with Hannes Frederic Sowa.

Suggested-by: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 7ae457c1 30-Jul-2014 Alexei Starovoitov <ast@kernel.org>

net: filter: split 'struct sk_filter' into socket and bpf parts

clean up names related to socket filtering and bpf in the following way:
- everything that deals with sockets keeps 'sk_*' prefix
- everything that is pure BPF is changed to 'bpf_*' prefix

split 'struct sk_filter' into
struct sk_filter {
atomic_t refcnt;
struct rcu_head rcu;
struct bpf_prog *prog;
};
and
struct bpf_prog {
u32 jited:1,
len:31;
struct sock_fprog_kern *orig_prog;
unsigned int (*bpf_func)(const struct sk_buff *skb,
const struct bpf_insn *filter);
union {
struct sock_filter insns[0];
struct bpf_insn insnsi[0];
struct work_struct work;
};
};
so that 'struct bpf_prog' can be used independent of sockets and cleans up
'unattached' bpf use cases

split SK_RUN_FILTER macro into:
SK_RUN_FILTER to be used with 'struct sk_filter *' and
BPF_PROG_RUN to be used with 'struct bpf_prog *'

__sk_filter_release(struct sk_filter *) gains
__bpf_prog_release(struct bpf_prog *) helper function

also perform related renames for the functions that work
with 'struct bpf_prog *', since they're on the same lines:

sk_filter_size -> bpf_prog_size
sk_filter_select_runtime -> bpf_prog_select_runtime
sk_filter_free -> bpf_prog_free
sk_unattached_filter_create -> bpf_prog_create
sk_unattached_filter_destroy -> bpf_prog_destroy
sk_store_orig_filter -> bpf_prog_store_orig_filter
sk_release_orig_filter -> bpf_release_orig_filter
__sk_migrate_filter -> bpf_migrate_filter
__sk_prepare_filter -> bpf_prepare_filter

API for attaching classic BPF to a socket stays the same:
sk_attach_filter(prog, struct sock *)/sk_detach_filter(struct sock *)
and SK_RUN_FILTER(struct sk_filter *, ctx) to execute a program
which is used by sockets, tun, af_packet

API for 'unattached' BPF programs becomes:
bpf_prog_create(struct bpf_prog **)/bpf_prog_destroy(struct bpf_prog *)
and BPF_PROG_RUN(struct bpf_prog *, ctx) to execute a program
which is used by isdn, ppp, team, seccomp, ptp, xt_bpf, cls_bpf, test_bpf

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 34805931 29-May-2014 Daniel Borkmann <daniel@iogearbox.net>

net: filter: get rid of BPF_S_* enum

This patch finally allows us to get rid of the BPF_S_* enum.
Currently, the code performs unnecessary encode and decode
workarounds in seccomp and filter migration itself when a filter
is being attached in order to overcome BPF_S_* encoding which
is not used anymore by the new interpreter resp. JIT compilers.

Keeping it around would mean that also in future we would need
to extend and maintain this enum and related encoders/decoders.
We can get rid of all that and save us these operations during
filter attaching. Naturally, also JIT compilers need to be updated
by this.

Before JIT conversion is being done, each compiler checks if A
is being loaded at startup to obtain information if it needs to
emit instructions to clear A first. Since BPF extensions are a
subset of BPF_LD | BPF_{W,H,B} | BPF_ABS variants, case statements
for extensions can be removed at that point. To ease and minimalize
code changes in the classic JITs, we have introduced bpf_anc_helper().

Tested with test_bpf on x86_64 (JIT, int), s390x (JIT, int),
arm (JIT, int), i368 (int), ppc64 (JIT, int); for sparc we
unfortunately didn't have access, but changes are analogous to
the rest.

Joint work with Alexei Starovoitov.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mircea Gherzan <mgherzan@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Chema Gonzalez <chemag@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# f8bbbfc3 28-Mar-2014 Daniel Borkmann <daniel@iogearbox.net>

net: filter: add jited flag to indicate jit compiled filters

This patch adds a jited flag into sk_filter struct in order to indicate
whether a filter is currently jited or not. The size of sk_filter is
not being expanded as the 32 bit 'len' member allows upper bits to be
reused since a filter can currently only grow as large as BPF_MAXINSNS.

Therefore, there's enough room also for other in future needed flags to
reuse 'len' field if necessary. The jited flag also allows for having
alternative interpreter functions running as currently, we can only
detect jit compiled filters by testing fp->bpf_func to not equal the
address of sk_run_filter().

Joint work with Alexei Starovoitov.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 61b905da 24-Mar-2014 Tom Herbert <therbert@google.com>

net: Rename skb->rxhash to skb->hash

The packet hash can be considered a property of the packet, not just
on RX path.

This patch changes name of rxhash and l4_rxhash skbuff fields to be
hash and l4_hash respectively. This includes changing uses of the
field in the code which don't call the access functions.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# aee636c4 15-Jan-2014 Eric Dumazet <edumazet@google.com>

bpf: do not use reciprocal divide

At first Jakub Zawadzki noticed that some divisions by reciprocal_divide
were not correct. (off by one in some cases)
http://www.wireshark.org/~darkjames/reciprocal-buggy.c

He could also show this with BPF:
http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c

The reciprocal divide in linux kernel is not generic enough,
lets remove its use in BPF, as it is not worth the pain with
current cpus.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl>
Cc: Mircea Gherzan <mgherzan@gmail.com>
Cc: Daniel Borkmann <dxchgb@gmail.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Matt Evans <matt@ozlabs.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 3460743e 24-Jul-2013 Ben Dooks <ben.dooks@codethink.co.uk>

ARM: net: fix arm instruction endian-ness in bpf_jit_32.c

Use <asm/opcodes.h> to correctly transform instruction byte ordering
into in-memory ordering.

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>


# d45ed4a4 04-Oct-2013 Alexei Starovoitov <ast@kernel.org>

net: fix unsafe set_memory_rw from softirq

on x86 system with net.core.bpf_jit_enable = 1

sudo tcpdump -i eth1 'tcp port 22'

causes the warning:
[ 56.766097] Possible unsafe locking scenario:
[ 56.766097]
[ 56.780146] CPU0
[ 56.786807] ----
[ 56.793188] lock(&(&vb->lock)->rlock);
[ 56.799593] <Interrupt>
[ 56.805889] lock(&(&vb->lock)->rlock);
[ 56.812266]
[ 56.812266] *** DEADLOCK ***
[ 56.812266]
[ 56.830670] 1 lock held by ksoftirqd/1/13:
[ 56.836838] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff8118f44c>] vm_unmap_aliases+0x8c/0x380
[ 56.849757]
[ 56.849757] stack backtrace:
[ 56.862194] CPU: 1 PID: 13 Comm: ksoftirqd/1 Not tainted 3.12.0-rc3+ #45
[ 56.868721] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
[ 56.882004] ffffffff821944c0 ffff88080bbdb8c8 ffffffff8175a145 0000000000000007
[ 56.895630] ffff88080bbd5f40 ffff88080bbdb928 ffffffff81755b14 0000000000000001
[ 56.909313] ffff880800000001 ffff880800000000 ffffffff8101178f 0000000000000001
[ 56.923006] Call Trace:
[ 56.929532] [<ffffffff8175a145>] dump_stack+0x55/0x76
[ 56.936067] [<ffffffff81755b14>] print_usage_bug+0x1f7/0x208
[ 56.942445] [<ffffffff8101178f>] ? save_stack_trace+0x2f/0x50
[ 56.948932] [<ffffffff810cc0a0>] ? check_usage_backwards+0x150/0x150
[ 56.955470] [<ffffffff810ccb52>] mark_lock+0x282/0x2c0
[ 56.961945] [<ffffffff810ccfed>] __lock_acquire+0x45d/0x1d50
[ 56.968474] [<ffffffff810cce6e>] ? __lock_acquire+0x2de/0x1d50
[ 56.975140] [<ffffffff81393bf5>] ? cpumask_next_and+0x55/0x90
[ 56.981942] [<ffffffff810cef72>] lock_acquire+0x92/0x1d0
[ 56.988745] [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380
[ 56.995619] [<ffffffff817628f1>] _raw_spin_lock+0x41/0x50
[ 57.002493] [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380
[ 57.009447] [<ffffffff8118f52a>] vm_unmap_aliases+0x16a/0x380
[ 57.016477] [<ffffffff8118f44c>] ? vm_unmap_aliases+0x8c/0x380
[ 57.023607] [<ffffffff810436b0>] change_page_attr_set_clr+0xc0/0x460
[ 57.030818] [<ffffffff810cfb8d>] ? trace_hardirqs_on+0xd/0x10
[ 57.037896] [<ffffffff811a8330>] ? kmem_cache_free+0xb0/0x2b0
[ 57.044789] [<ffffffff811b59c3>] ? free_object_rcu+0x93/0xa0
[ 57.051720] [<ffffffff81043d9f>] set_memory_rw+0x2f/0x40
[ 57.058727] [<ffffffff8104e17c>] bpf_jit_free+0x2c/0x40
[ 57.065577] [<ffffffff81642cba>] sk_filter_release_rcu+0x1a/0x30
[ 57.072338] [<ffffffff811108e2>] rcu_process_callbacks+0x202/0x7c0
[ 57.078962] [<ffffffff81057f17>] __do_softirq+0xf7/0x3f0
[ 57.085373] [<ffffffff81058245>] run_ksoftirqd+0x35/0x70

cannot reuse jited filter memory, since it's readonly,
so use original bpf insns memory to hold work_struct

defer kfree of sk_filter until jit completed freeing

tested on x86_64 and i386

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# aafc787e 20-May-2013 Daniel Borkmann <daniel@iogearbox.net>

arm: bpf_jit: can call module_free() from any context

Follow-up on module_free()/vfree() that takes care of the rest, so no
longer this workaround with work_struct needed.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Mircea Gherzan <mgherzan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 79617801 21-Mar-2013 Daniel Borkmann <daniel@iogearbox.net>

filter: bpf_jit_comp: refactor and unify BPF JIT image dump output

If bpf_jit_enable > 1, then we dump the emitted JIT compiled image
after creation. Currently, only SPARC and PowerPC has similar output
as in the reference implementation on x86_64. Make a small helper
function in order to reduce duplicated code and make the dump output
uniform across architectures x86_64, SPARC, PPC, ARM (e.g. on ARM
flen, pass and proglen are currently not shown, but would be
interesting to know as well), also for future BPF JIT implementations
on other archs.

Cc: Mircea Gherzan <mgherzan@gmail.com>
Cc: Matt Evans <matt@ozlabs.org>
Cc: Eric Dumazet <eric.dumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 45549a68 09-Mar-2013 Chen Gang <gang.chen@asianux.com>

ARM:net: an issue for k which is u32, never < 0

k is u32 which never < 0, need type cast, or cause issue.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Mircea Gherzan <mgherzan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 462738f4 13-Feb-2013 Nicolas Schichan <nschichan@freebox.fr>

ARM: net: bpf_jit: fix emit_swap16() for non ARMv6+.

The original code was generating an lsl instructions using the value
of ARM_R8 (skb_headlen, possibly uninitialized if no skb_headlen
access was required) as a shift amount.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Acked-by: Mircea Gherzan <mgherzan@gmail.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>


# fe15f3f1 10-Dec-2012 Schichan Nicolas <nschichan@freebox.fr>

ARM: 7598/1: net: bpf_jit_32: fix sp-relative load/stores offsets.

The offset must be multiplied by 4 to be sure to access the correct
32bit word in the stack scratch space.

For instance, a store at scratch memory cell #1 was generating the
following:

st r4, [sp, #1]

While the correct code for this is:

st r4, [sp, #4]

To reproduce the bug (assuming your system has a NIC with the mac
address 52:54:00:12:34:56):

echo 0 > /proc/sys/net/core/bpf_jit_enable
tcpdump -ni eth0 "ether[1] + ether[2] - ether[3] * ether[4] - ether[5] \
== -0x3AA" # this will capture packets as expected

echo 1 > /proc/sys/net/core/bpf_jit_enable
tcpdump -ni eth0 "ether[1] + ether[2] - ether[3] * ether[4] - ether[5] \
== -0x3AA" # this will not.

This bug was present since the original inclusion of bpf_jit for ARM
(ddecdfce: ARM: 7259/3: net: JIT compiler for packet filters).

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>


# 89c2e009 10-Dec-2012 Schichan Nicolas <nschichan@freebox.fr>

ARM: 7597/1: net: bpf_jit_32: fix kzalloc gfp/size mismatch.

Official prototype for kzalloc is:

void *kzalloc(size_t, gfp_t);

The ARM bpf_jit code was having the assumption that it was:

void *kzalloc(gfp_t, size);

This was resulting the use of some random GFP flags depending on the
size requested and some random overflows once the really needed size
was more than the value of GFP_KERNEL.

This bug was present since the original inclusion of bpf_jit for ARM
(ddecdfce: ARM: 7259/3: net: JIT compiler for packet filters).

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>


# bf0098f2 07-Nov-2012 Daniel Borkmann <daniel@iogearbox.net>

ARM: net: bpf_jit_32: add VLAN instructions for BPF JIT

This patch is a follow-up for patch "net: filter: add vlan tag access"
to support the new VLAN_TAG/VLAN_TAG_PRESENT accessors in BPF JIT.

Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch>
Cc: Mircea Gherzan <mgherzan@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mircea Gherzan <mgherzan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 3cbe2041 07-Nov-2012 Daniel Borkmann <daniel@iogearbox.net>

ARM: net: bpf_jit_32: add XOR instruction for BPF JIT

This patch is a follow-up for patch "filter: add XOR instruction for use
with X/K" that implements BPF ARM JIT parts for the BPF XOR operation.

Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch>
Cc: Mircea Gherzan <mgherzan@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mircea Gherzan <mgherzan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 2bea29b7 11-Jun-2012 Mircea Gherzan <mgherzan@gmail.com>

ARM: 7421/1: bpf_jit: BPF_S_ANC_ALU_XOR_X support

JIT support for the XOR operation introduced by the commit
ffe06c17afbb.

Signed-off-by: Mircea Gherzan <mgherzan@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>


# ddecdfce 16-Mar-2012 Mircea Gherzan <mgherzan@gmail.com>

ARM: 7259/3: net: JIT compiler for packet filters

Based of Matt Evans's PPC64 implementation.

The compiler generates ARM instructions but interworking is
supported for Thumb2 kernels.

Supports both little and big endian. Unaligned loads are emitted
for ARMv6+. Not all the BPF opcodes that deal with ancillary data
are supported. The scratch memory of the filter lives on the stack.
Hardware integer division is used if it is available.

Enabled in the same way as for x86-64 and PPC64:

echo 1 > /proc/sys/net/core/bpf_jit_enable

A value greater than 1 enables opcode output.

Signed-off-by: Mircea Gherzan <mgherzan@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>