Cross Reference: /linux-master/arch/arm64/crypto/aes-ce.S

History log of /linux-master/arch/arm64/crypto/aes-ce.S
Revision	Date	Author	Comments
# b8e50548	18-Feb-2020	Mark Brown <broonie@kernel.org>	arm64: crypto: Modernize names for AES function macros Now that the rest of the code has been converted to the modern START/END macros the AES_ENTRY() and AES_ENDPROC() macros look out of place and like they need updating. Rename them to AES_FUNC_START() and AES_FUNC_END() to line up with the modern style assembly macros. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
# 0e89640b	13-Dec-2019	Mark Brown <broonie@kernel.org>	crypto: arm64 - Use modern annotations for assembly functions In an effort to clarify and simplify the annotation of assembly functions in the kernel new macros have been introduced. These replace ENTRY and ENDPROC and also add a new annotation for static functions which previously had no ENTRY equivalent. Update the annotations in the crypto code to the new macros. There are a small number of files imported from OpenSSL where the assembly is generated using perl programs, these are not currently annotated at all and have not been modified. Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# 67cfa5d3	03-Sep-2019	Ard Biesheuvel <ardb@kernel.org>	crypto: arm64/aes-neonbs - implement ciphertext stealing for XTS Update the AES-XTS implementation based on NEON instructions so that it can deal with inputs whose size is not a multiple of the cipher block size. This is part of the original XTS specification, but was never implemented before in the Linux kernel. Since the bit slicing driver is only faster if it can operate on at least 7 blocks of input at the same time, let's reuse the alternate path we are adding for CTS to process any data tail whose size is not a multiple of 128 bytes. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# 7367bfeb	24-Jun-2019	Ard Biesheuvel <ardb@kernel.org>	crypto: arm64/aes-ce - implement 5 way interleave for ECB, CBC and CTR This implements 5-way interleaving for ECB, CBC decryption and CTR, resulting in a speedup of ~11% on Marvell ThunderX2, which has a very deep pipeline and therefore a high issue latency for NEON instructions operating on the same registers. Note that XTS is left alone: implementing 5-way interleave there would either involve spilling of the calculated tweaks to the stack, or recalculating them after the encryption operation, and doing either of those would most likely penalize low end cores. For ECB, this is not a concern at all, given that we have plenty of spare registers. For CTR and CBC decryption, we take advantage of the fact that v16 is not used by the CE version of the code (which is the only one targeted by the optimization), and so we can reshuffle the code a bit and avoid having to spill to memory (with the exception of one extra reload in the CBC routine) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# e2174139	24-Jun-2019	Ard Biesheuvel <ardb@kernel.org>	crypto: arm64/aes-ce - add 5 way interleave routines In preparation of tweaking the accelerated AES chaining mode routines to be able to use a 5-way stride, implement the core routines to support processing 5 blocks of input at a time. While at it, drop the 2 way versions, which have been unused for a while now. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# d2912cb1	04-Jun-2019	Thomas Gleixner <tglx@linutronix.de>	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
# 2e5d2f33	10-Sep-2018	Ard Biesheuvel <ardb@kernel.org>	crypto: arm64/aes-blk - improve XTS mask handling The Crypto Extension instantiation of the aes-modes.S collection of skciphers uses only 15 NEON registers for the round key array, whereas the pure NEON flavor uses 16 NEON registers for the AES S-box. This means we have a spare register available that we can use to hold the XTS mask vector, removing the need to reload it at every iteration of the inner loop. Since the pure NEON version does not permit this optimization, tweak the macros so we can factor out this functionality. Also, replace the literal load with a short sequence to compose the mask vector. On Cortex-A53, this results in a ~4% speedup. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# 0c8f838a	30-Apr-2018	Ard Biesheuvel <ardb@kernel.org>	crypto: arm64/aes-blk - yield NEON after every block of input Avoid excessive scheduling delays under a preemptible kernel by yielding the NEON after every block of input. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# f402e311	24-Jul-2017	Ard Biesheuvel <ardb@kernel.org>	crypto: arm64/aes-ce-cipher - match round key endianness with generic code In order to be able to reuse the generic AES code as a fallback for situations where the NEON may not be used, update the key handling to match the byte order of the generic code: it stores round keys as sequences of 32-bit quantities rather than streams of bytes, and so our code needs to be updated to reflect that. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# caf4b9e2	11-Oct-2016	Ard Biesheuvel <ardb@kernel.org>	crypto: arm64/aes-xts-ce: fix for big endian Emit the XTS tweak literal constants in the appropriate order for a single 128-bit scalar literal load. Fixes: 49788fe2a128 ("arm64/crypto: AES-ECB/CBC/CTR/XTS using ARMv8 NEON and Crypto Extensions") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# 4a97abd4	17-Mar-2015	Ard Biesheuvel <ardb@kernel.org>	arm64/crypto: issue aese/aesmc instructions in pairs This changes the AES core transform implementations to issue aese/aesmc (and aesd/aesimc) in pairs. This enables a micro-architectural optimization in recent Cortex-A5x cores that improves performance by 50-90%. Measured performance in cycles per byte (Cortex-A57): CBC enc CBC dec CTR before 3.64 1.34 1.32 after 1.95 0.85 0.93 Note that this results in a ~5% performance decrease for older cores. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
# 49788fe2	21-Mar-2014	Ard Biesheuvel <ardb@kernel.org>	arm64/crypto: AES-ECB/CBC/CTR/XTS using ARMv8 NEON and Crypto Extensions This adds ARMv8 implementations of AES in ECB, CBC, CTR and XTS modes, both for ARMv8 with Crypto Extensions and for plain ARMv8 NEON. The Crypto Extensions version can only run on ARMv8 implementations that have support for these optional extensions. The plain NEON version is a table based yet time invariant implementation. All S-box substitutions are performed in parallel, leveraging the wide range of ARMv8's tbl/tbx instructions, and the huge NEON register file, which can comfortably hold the entire S-box and still have room to spare for doing the actual computations. The key expansion routines were borrowed from aes_generic. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>