History log of /linux-master/arch/x86/crypto/curve25519-x86_64.c
Revision Date Author Comments
# acd93f8a 14-Dec-2021 Jason A. Donenfeld <Jason@zx2c4.com>

crypto: x86/curve25519 - use in/out register constraints more precisely

Rather than passing all variables as modified, pass ones that are only
read into that parameter. This helps with old gcc versions when
alternatives are additionally used, and lets gcc's codegen be a little
bit more efficient. This also syncs up with the latest Vale/EverCrypt
output.

Reported-by: Mathias Krause <minipli@grsecurity.net>
Cc: Aymeric Fromherz <aymeric.fromherz@inria.fr>
Link: https://lore.kernel.org/wireguard/1554725710.1290070.1639240504281.JavaMail.zimbra@inria.fr/
Link: https://github.com/project-everest/hacl-star/pull/501
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Mathias Krause <minipli@grsecurity.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 1b82435d 02-Jun-2021 Hangbin Liu <liuhangbin@gmail.com>

crypto: x86/curve25519 - fix cpu feature checking logic in mod_exit

In curve25519_mod_init() the curve25519_alg will be registered only when
(X86_FEATURE_BMI2 && X86_FEATURE_ADX). But in curve25519_mod_exit()
it still checks (X86_FEATURE_BMI2 || X86_FEATURE_ADX) when do crypto
unregister. This will trigger a BUG_ON in crypto_unregister_alg() as
alg->cra_refcnt is 0 if the cpu only supports one of X86_FEATURE_BMI2
and X86_FEATURE_ADX.

Fixes: 07b586fe0662 ("crypto: x86/curve25519 - replace with formally verified implementation")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# d9f6e12f 18-Mar-2021 Ingo Molnar <mingo@kernel.org>

x86: Fix various typos in comments

Fix ~144 single-word typos in arch/x86/ code comments.

Doing this in a single commit should reduce the churn.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-kernel@vger.kernel.org


# db719539 27-Aug-2020 Uros Bizjak <ubizjak@gmail.com>

crypto: curve25519-x86_64 - Use XORL r32,32

x86_64 zero extends 32bit operations, so for 64bit operands,
XORL r32,r32 is functionally equal to XORL r64,r64, but avoids
a REX prefix byte when legacy registers are used.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 0c3dc787 19-Aug-2020 Herbert Xu <herbert@gondor.apana.org.au>

crypto: algapi - Remove skbuff.h inclusion

The header file algapi.h includes skbuff.h unnecessarily since
all we need is a forward declaration for struct sk_buff. This
patch removes that inclusion.

Unfortunately skbuff.h pulls in a lot of things and drivers over
the years have come to rely on it so this patch adds a lot of
missing inclusions that result from this.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 054a5540 23-Jul-2020 Herbert Xu <herbert@gondor.apana.org.au>

crypto: x86/curve25519 - Remove unused carry variables

The carry variables are assigned but never used, which upsets
the compiler. This patch removes them.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Karthikeyan Bhargavan <karthik.bhargavan@gmail.com>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# dc7fc3a5 01-Mar-2020 Jason A. Donenfeld <Jason@zx2c4.com>

crypto: x86/curve25519 - leave r12 as spare register

This updates to the newer register selection proved by HACL*, which
leads to a more compact instruction encoding, and saves around 100
cycles.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 07b586fe 20-Jan-2020 Jason A. Donenfeld <Jason@zx2c4.com>

crypto: x86/curve25519 - replace with formally verified implementation

This comes from INRIA's HACL*/Vale. It implements the same algorithm and
implementation strategy as the code it replaces, only this code has been
formally verified, sans the base point multiplication, which uses code
similar to prior, only it uses the formally verified field arithmetic
alongside reproducable ladder generation steps. This doesn't have a
pure-bmi2 version, which means haswell no longer benefits, but the
increased (doubled) code complexity is not worth it for a single
generation of chips that's already old.

Performance-wise, this is around 1% slower on older microarchitectures,
and slightly faster on newer microarchitectures, mainly 10nm ones or
backports of 10nm to 14nm. This implementation is "everest" below:

Xeon E5-2680 v4 (Broadwell)

armfazh: 133340 cycles per call
everest: 133436 cycles per call

Xeon Gold 5120 (Sky Lake Server)

armfazh: 112636 cycles per call
everest: 113906 cycles per call

Core i5-6300U (Sky Lake Client)

armfazh: 116810 cycles per call
everest: 117916 cycles per call

Core i7-7600U (Kaby Lake)

armfazh: 119523 cycles per call
everest: 119040 cycles per call

Core i7-8750H (Coffee Lake)

armfazh: 113914 cycles per call
everest: 113650 cycles per call

Core i9-9880H (Coffee Lake Refresh)

armfazh: 112616 cycles per call
everest: 114082 cycles per call

Core i3-8121U (Cannon Lake)

armfazh: 113202 cycles per call
everest: 111382 cycles per call

Core i7-8265U (Whiskey Lake)

armfazh: 127307 cycles per call
everest: 127697 cycles per call

Core i7-8550U (Kaby Lake Refresh)

armfazh: 127522 cycles per call
everest: 127083 cycles per call

Xeon Platinum 8275CL (Cascade Lake)

armfazh: 114380 cycles per call
everest: 114656 cycles per call

Achieving these kind of results with formally verified code is quite
remarkable, especialy considering that performance is favorable for
newer chips.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 8394bfec 25-Nov-2019 Jason A. Donenfeld <Jason@zx2c4.com>

crypto: arch - conditionalize crypto api in arch glue for lib code

For glue code that's used by Zinc, the actual Crypto API functions might
not necessarily exist, and don't need to exist either. Before this
patch, there are valid build configurations that lead to a unbuildable
kernel. This fixes it to conditionalize those symbols on the existence
of the proper config entry.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# bb611bdf 08-Nov-2019 Jason A. Donenfeld <Jason@zx2c4.com>

crypto: curve25519 - x86_64 library and KPP implementations

This implementation is the fastest available x86_64 implementation, and
unlike Sandy2x, it doesn't requie use of the floating point registers at
all. Instead it makes use of BMI2 and ADX, available on recent
microarchitectures. The implementation was written by Armando
Faz-Hernández with contributions (upstream) from Samuel Neves and me,
in addition to further changes in the kernel implementation from us.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Co-developed-by: Samuel Neves <sneves@dei.uc.pt>
[ardb: - move to arch/x86/crypto
- wire into lib/crypto framework
- implement crypto API KPP hooks ]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>