Cross Reference: /linux-master/arch/arm64/lib/xor-neon.c

History log of /linux-master/arch/arm64/lib/xor-neon.c
Revision	Date	Author	Comments
# 320a93d4	16-May-2023	Arnd Bergmann <arnd@arndb.de>	arm64: xor-neon: mark xor_arm64_neon_*() static The only references to these functions are in the same file, and there is no prototype, which causes a harmless warning: arch/arm64/lib/xor-neon.c:13:6: error: no previous prototype for 'xor_arm64_neon_2' [-Werror=missing-prototypes] arch/arm64/lib/xor-neon.c:40:6: error: no previous prototype for 'xor_arm64_neon_3' [-Werror=missing-prototypes] arch/arm64/lib/xor-neon.c:76:6: error: no previous prototype for 'xor_arm64_neon_4' [-Werror=missing-prototypes] arch/arm64/lib/xor-neon.c:121:6: error: no previous prototype for 'xor_arm64_neon_5' [-Werror=missing-prototypes] Fixes: cc9f8349cb33 ("arm64: crypto: add NEON accelerated XOR implementation") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20230516160642.523862-2-arnd@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
# 297565aa2	05-Feb-2022	Ard Biesheuvel <ardb@kernel.org>	lib/xor: make xor prototypes more friendly to compiler vectorization Modern compilers are perfectly capable of extracting parallelism from the XOR routines, provided that the prototypes reflect the nature of the input accurately, in particular, the fact that the input vectors are expected not to overlap. This is not documented explicitly, but is implied by the interchangeability of the various C routines, some of which use temporary variables while others don't: this means that these routines only behave identically for non-overlapping inputs. So let's decorate these input vectors with the __restrict modifier, which informs the compiler that there is no overlap. While at it, make the input-only vectors pointer-to-const as well. Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://github.com/ClangBuiltLinux/linux/issues/563 Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# 2c54b423	13-Dec-2021	Ard Biesheuvel <ardb@kernel.org>	arm64/xor: use EOR3 instructions when available Use the EOR3 instruction to implement xor_blocks() if the instruction is available, which is the case if the CPU implements the SHA-3 extension. This is about 20% faster on Apple M1 when using the 5-way version. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20211213140252.2856053-1-ardb@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
# d2912cb1	04-Jun-2019	Thomas Gleixner <tglx@linutronix.de>	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
# cc9f8349	03-Dec-2018	Jackie Liu <liuyun01@kylinos.cn>	arm64: crypto: add NEON accelerated XOR implementation This is a NEON acceleration method that can improve performance by approximately 20%. I got the following data from the centos 7.5 on Huawei's HISI1616 chip: [ 93.837726] xor: measuring software checksum speed [ 93.874039] 8regs : 7123.200 MB/sec [ 93.914038] 32regs : 7180.300 MB/sec [ 93.954043] arm64_neon: 9856.000 MB/sec [ 93.954047] xor: using function: arm64_neon (9856.000 MB/sec) I believe this code can bring some optimization for all arm64 platform. thanks for Ard Biesheuvel's suggestions. Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>