History log of /linux-master/crypto/adiantum.c
Revision Date Author Comments
# a312e07a 27-Oct-2023 Eric Biggers <ebiggers@google.com>

crypto: adiantum - flush destination page before unmapping

Upon additional review, the new fast path in adiantum_finish() is
missing the call to flush_dcache_page() that scatterwalk_map_and_copy()
was doing. It's apparently debatable whether flush_dcache_page() is
actually needed, as per the discussion at
https://lore.kernel.org/lkml/YYP1lAq46NWzhOf0@casper.infradead.org/T/#u.
However, it appears that currently all the helper functions that write
to a page, such as scatterwalk_map_and_copy(), memcpy_to_page(), and
memzero_page(), do the dcache flush. So do it to be consistent.

Fixes: dadf5e56c967 ("crypto: adiantum - add fast path for single-page messages")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 321dfe97 18-Oct-2023 Eric Biggers <ebiggers@google.com>

crypto: adiantum - stop using alignmask of shash_alg

Now that the shash algorithm type does not support nonzero alignmasks,
shash_alg::base.cra_alignmask is always 0, so OR-ing it into another
value is a no-op.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# dadf5e56 09-Oct-2023 Eric Biggers <ebiggers@google.com>

crypto: adiantum - add fast path for single-page messages

When the source scatterlist is a single page, optimize the first hash
step of adiantum to use crypto_shash_digest() instead of
init/update/final, and use the same local kmap for both hashing the bulk
part and loading the narrow part of the source data.

Likewise, when the destination scatterlist is a single page, optimize
the second hash step of adiantum to use crypto_shash_digest() instead of
init/update/final, and use the same local kmap for both hashing the bulk
part and storing the narrow part of the destination data.

In some cases these optimizations improve performance significantly.

Note: ideally, for optimal performance each architecture should
implement the full "adiantum(xchacha12,aes)" algorithm and fully
optimize the contiguous buffer case to use no indirect calls. That's
not something I've gotten around to doing, though. This commit just
makes a relatively small change that provides some benefit with the
existing template-based approach.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 3c45b578 02-Oct-2023 Herbert Xu <herbert@gondor.apana.org.au>

crypto: adiantum - Only access common skcipher fields on spawn

As skcipher spawns may be of the type lskcipher, only the common
fields may be accessed. This was already the case but use the
correct helpers to make this more obvious.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 255e48eb 07-Feb-2023 Herbert Xu <herbert@gondor.apana.org.au>

crypto: api - Use data directly in completion function

This patch does the final flag day conversion of all completion
functions which are now all contained in the Crypto API.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 0eb76ba2 11-Dec-2020 Ard Biesheuvel <ardb@kernel.org>

crypto: remove cipher routines from public crypto API

The cipher routines in the crypto API are mostly intended for templates
implementing skcipher modes generically in software, and shouldn't be
used outside of the crypto subsystem. So move the prototypes and all
related definitions to a new header file under include/crypto/internal.
Also, let's use the new module namespace feature to move the symbol
exports into a new namespace CRYPTO_INTERNAL.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 453431a5 07-Aug-2020 Waiman Long <longman@redhat.com>

mm, treewide: rename kzfree() to kfree_sensitive()

As said by Linus:

A symmetric naming is only helpful if it implies symmetries in use.
Otherwise it's actively misleading.

In "kzalloc()", the z is meaningful and an important part of what the
caller wants.

In "kzfree()", the z is actively detrimental, because maybe in the
future we really _might_ want to use that "memfill(0xdeadbeef)" or
something. The "zero" part of the interface isn't even _relevant_.

The main reason that kzfree() exists is to clear sensitive information
that should not be leaked to other future users of the same memory
objects.

Rename kzfree() to kfree_sensitive() to follow the example of the recently
added kvfree_sensitive() and make the intention of the API more explicit.
In addition, memzero_explicit() is used to clear the memory to make sure
that it won't get optimized away by the compiler.

The renaming is done by using the command sequence:

git grep -w --name-only kzfree |\
xargs sed -i 's/kzfree/kfree_sensitive/'

followed by some editing of the kfree_sensitive() kerneldoc and adding
a kzfree backward compatibility macro in slab.h.

[akpm@linux-foundation.org: fs/crypto/inline_crypt.c needs linux/slab.h]
[akpm@linux-foundation.org: fix fs/crypto/inline_crypt.c some more]

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Joe Perches <joe@perches.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: "Jason A . Donenfeld" <Jason@zx2c4.com>
Link: http://lkml.kernel.org/r/20200616154311.12314-3-longman@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 7bcb2c99 10-Jul-2020 Eric Biggers <ebiggers@google.com>

crypto: algapi - use common mechanism for inheriting flags

The flag CRYPTO_ALG_ASYNC is "inherited" in the sense that when a
template is instantiated, the template will have CRYPTO_ALG_ASYNC set if
any of the algorithms it uses has CRYPTO_ALG_ASYNC set.

We'd like to add a second flag (CRYPTO_ALG_ALLOCATES_MEMORY) that gets
"inherited" in the same way. This is difficult because the handling of
CRYPTO_ALG_ASYNC is hardcoded everywhere. Address this by:

- Add CRYPTO_ALG_INHERITED_FLAGS, which contains the set of flags that
have these inheritance semantics.

- Add crypto_algt_inherited_mask(), for use by template ->create()
methods. It returns any of these flags that the user asked to be
unset and thus must be passed in the 'mask' to crypto_grab_*().

- Also modify crypto_check_attr_type() to handle computing the 'mask'
so that most templates can just use this.

- Make crypto_grab_*() propagate these flags to the template instance
being created so that templates don't have to do this themselves.

Make crypto/simd.c propagate these flags too, since it "wraps" another
algorithm, similar to a template.

Based on a patch by Mikulas Patocka <mpatocka@redhat.com>
(https://lore.kernel.org/r/alpine.LRH.2.02.2006301414580.30526@file01.intranet.prod.int.rdu2.redhat.com).

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 1c08a104 05-Jan-2020 Jason A. Donenfeld <Jason@zx2c4.com>

crypto: poly1305 - add new 32 and 64-bit generic versions

These two C implementations from Zinc -- a 32x32 one and a 64x64 one,
depending on the platform -- come from Andrew Moon's public domain
poly1305-donna portable code, modified for usage in the kernel. The
precomputation in the 32-bit version and the use of 64x64 multiplies in
the 64-bit version make these perform better than the code it replaces.
Moon's code is also very widespread and has received many eyeballs of
scrutiny.

There's a bit of interference between the x86 implementation, which
relies on internal details of the old scalar implementation. In the next
commit, the x86 implementation will be replaced with a faster one that
doesn't rely on this, so none of this matters much. But for now, to keep
this passing the tests, we inline the bits of the old implementation
that the x86 implementation relied on. Also, since we now support a
slightly larger key space, via the union, some offsets had to be fixed
up.

Nonce calculation was folded in with the emit function, to take
advantage of 64x64 arithmetic. However, Adiantum appeared to rely on no
nonce handling in emit, so this path was conditionalized. We also
introduced a new struct, poly1305_core_key, to represent the precise
amount of space that particular implementation uses.

Testing with kbench9000, depending on the CPU, the update function for
the 32x32 version has been improved by 4%-7%, and for the 64x64 by
19%-30%. The 32x32 gains are small, but I think there's great value in
having a parallel implementation to the 64x64 one so that the two can be
compared side-by-side as nice stand-alone units.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# d5ed3b65 02-Jan-2020 Eric Biggers <ebiggers@google.com>

crypto: cipher - make crypto_spawn_cipher() take a crypto_cipher_spawn

Now that all users of single-block cipher spawns have been converted to
use 'struct crypto_cipher_spawn' rather than the less specifically typed
'struct crypto_spawn', make crypto_spawn_cipher() take a pointer to a
'struct crypto_cipher_spawn' rather than a 'struct crypto_spawn'.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# ba448407 02-Jan-2020 Eric Biggers <ebiggers@google.com>

crypto: adiantum - use crypto_grab_{cipher,shash} and simplify error paths

Make the adiantum template use the new functions crypto_grab_cipher()
and crypto_grab_shash() to initialize its cipher and shash spawns.

This is needed to make all spawns be initialized in a consistent way.

Also simplify the error handling by taking advantage of crypto_drop_*()
now accepting (as a no-op) spawns that haven't been initialized yet, and
by taking advantage of crypto_grab_*() now handling ERR_PTR() names.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# de95c957 02-Jan-2020 Eric Biggers <ebiggers@google.com>

crypto: algapi - pass instance to crypto_grab_spawn()

Currently, crypto_spawn::inst is first used temporarily to pass the
instance to crypto_grab_spawn(). Then crypto_init_spawn() overwrites it
with crypto_spawn::next, which shares the same union. Finally,
crypto_spawn::inst is set again when the instance is registered.

Make this less convoluted by just passing the instance as an argument to
crypto_grab_spawn() instead.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# b9f76ddd 02-Jan-2020 Eric Biggers <ebiggers@google.com>

crypto: skcipher - pass instance to crypto_grab_skcipher()

Initializing a crypto_skcipher_spawn currently requires:

1. Set spawn->base.inst to point to the instance.
2. Call crypto_grab_skcipher().

But there's no reason for these steps to be separate, and in fact this
unneeded complication has caused at least one bug, the one fixed by
commit 6db43410179b ("crypto: adiantum - initialize crypto_spawn::inst")

So just make crypto_grab_skcipher() take the instance as an argument.

To keep the function calls from getting too unwieldy due to this extra
argument, also introduce a 'mask' variable into the affected places
which weren't already using one.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# af5034e8 30-Dec-2019 Eric Biggers <ebiggers@google.com>

crypto: remove propagation of CRYPTO_TFM_RES_* flags

The CRYPTO_TFM_RES_* flags were apparently meant as a way to make the
->setkey() functions provide more information about errors. But these
flags weren't actually being used or tested, and in many cases they
weren't being set correctly anyway. So they've now been removed.

Also, if someone ever actually needs to start better distinguishing
->setkey() errors (which is somewhat unlikely, as this has been unneeded
for a long time), we'd be much better off just defining different return
values, like -EINVAL if the key is invalid for the algorithm vs.
-EKEYREJECTED if the key was rejected by a policy like "no weak keys".
That would be much simpler, less error-prone, and easier to test.

So just remove CRYPTO_TFM_RES_MASK and all the unneeded logic that
propagates these flags around.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# c593642c 09-Dec-2019 Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>

treewide: Use sizeof_field() macro

Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except
at places where these are defined. Later patches will remove the unused
definition of FIELD_SIZEOF().

This patch is generated using following script:

EXCLUDE_FILES="include/linux/stddef.h|include/linux/kernel.h"

git grep -l -e "\bFIELD_SIZEOF\b" | while read file;
do

if [[ "$file" =~ $EXCLUDE_FILES ]]; then
continue
fi
sed -i -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file;
done

Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: David Miller <davem@davemloft.net> # for net


# 48ea8c6e 08-Nov-2019 Ard Biesheuvel <ardb@kernel.org>

crypto: poly1305 - move core routines into a separate library

Move the core Poly1305 routines shared between the generic Poly1305
shash driver and the Adiantum and NHPoly1305 drivers into a separate
library so that using just this pieces does not pull in the crypto
API pieces of the generic Poly1305 routine.

In a subsequent patch, we will augment this generic library with
init/update/final routines so that Poyl1305 algorithm can be used
directly without the need for using the crypto API's shash abstraction.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 877b5691 14-Apr-2019 Eric Biggers <ebiggers@google.com>

crypto: shash - remove shash_desc::flags

The flags field in 'struct shash_desc' never actually does anything.
The only ostensibly supported flag is CRYPTO_TFM_REQ_MAY_SLEEP.
However, no shash algorithm ever sleeps, making this flag a no-op.

With this being the case, inevitably some users who can't sleep wrongly
pass MAY_SLEEP. These would all need to be fixed if any shash algorithm
actually started sleeping. For example, the shash_ahash_*() functions,
which wrap a shash algorithm with the ahash API, pass through MAY_SLEEP
from the ahash API to the shash API. However, the shash functions are
called under kmap_atomic(), so actually they're assumed to never sleep.

Even if it turns out that some users do need preemption points while
hashing large buffers, we could easily provide a helper function
crypto_shash_update_large() which divides the data into smaller chunks
and calls crypto_shash_update() and cond_resched() for each chunk. It's
not necessary to have a flag in 'struct shash_desc', nor is it necessary
to make individual shash algorithms aware of this at all.

Therefore, remove shash_desc::flags, and document that the
crypto_shash_*() functions can be called from any context.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# c4741b23 11-Apr-2019 Eric Biggers <ebiggers@google.com>

crypto: run initcalls for generic implementations earlier

Use subsys_initcall for registration of all templates and generic
algorithm implementations, rather than module_init. Then change
cryptomgr to use arch_initcall, to place it before the subsys_initcalls.

This is needed so that when both a generic and optimized implementation
of an algorithm are built into the kernel (not loadable modules), the
generic implementation is registered before the optimized one.
Otherwise, the self-tests for the optimized implementation are unable to
allocate the generic implementation for the new comparison fuzz tests.

Note that on arm, a side effect of this change is that self-tests for
generic implementations may run before the unaligned access handler has
been installed. So, unaligned accesses will crash the kernel. This is
arguably a good thing as it makes it easier to detect that type of bug.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 6db43410 06-Jan-2019 Eric Biggers <ebiggers@google.com>

crypto: adiantum - initialize crypto_spawn::inst

crypto_grab_*() doesn't set crypto_spawn::inst, so templates must set it
beforehand. Otherwise it will be left NULL, which causes a crash in
certain cases where algorithms are dynamically loaded/unloaded. E.g.
with CONFIG_CRYPTO_CHACHA20_X86_64=m, the following caused a crash:

insmod chacha-x86_64.ko
python -c 'import socket; socket.socket(socket.AF_ALG, 5, 0).bind(("skcipher", "adiantum(xchacha12,aes)"))'
rmmod chacha-x86_64.ko
python -c 'import socket; socket.socket(socket.AF_ALG, 5, 0).bind(("skcipher", "adiantum(xchacha12,aes)"))'

Fixes: 059c2a4d8e16 ("crypto: adiantum - add Adiantum support")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 00c9fe37 10-Dec-2018 Eric Biggers <ebiggers@google.com>

crypto: adiantum - fix leaking reference to hash algorithm

crypto_alg_mod_lookup() takes a reference to the hash algorithm but
crypto_init_shash_spawn() doesn't take ownership of it, hence the
reference needs to be dropped in adiantum_create().

Fixes: 059c2a4d8e16 ("crypto: adiantum - add Adiantum support")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# c6018e1a 06-Dec-2018 Eric Biggers <ebiggers@google.com>

crypto: adiantum - adjust some comments to match latest paper

The 2018-11-28 revision of the Adiantum paper has revised some notation:

- 'M' was replaced with 'L' (meaning "Left", for the left-hand part of
the message) in the definition of Adiantum hashing, to avoid confusion
with the full message
- ε-almost-∆-universal is now abbreviated as ε-∆U instead of εA∆U
- "block" is now used only to mean block cipher and Poly1305 blocks

Also, Adiantum hashing was moved from the appendix to the main paper.

To avoid confusion, update relevant comments in the code to match.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# b299362e 04-Dec-2018 Eric Biggers <ebiggers@google.com>

crypto: adiantum - propagate CRYPTO_ALG_ASYNC flag to instance

If the stream cipher implementation is asynchronous, then the Adiantum
instance must be flagged as asynchronous as well. Otherwise someone
asking for a synchronous algorithm can get an asynchronous algorithm.

There are no asynchronous xchacha12 or xchacha20 implementations yet
which makes this largely a theoretical issue, but it should be fixed.

Fixes: 059c2a4d8e16 ("crypto: adiantum - add Adiantum support")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 059c2a4d 16-Nov-2018 Eric Biggers <ebiggers@google.com>

crypto: adiantum - add Adiantum support

Add support for the Adiantum encryption mode. Adiantum was designed by
Paul Crowley and is specified by our paper:

Adiantum: length-preserving encryption for entry-level processors
(https://eprint.iacr.org/2018/720.pdf)

See our paper for full details; this patch only provides an overview.

Adiantum is a tweakable, length-preserving encryption mode designed for
fast and secure disk encryption, especially on CPUs without dedicated
crypto instructions. Adiantum encrypts each sector using the XChaCha12
stream cipher, two passes of an ε-almost-∆-universal (εA∆U) hash
function, and an invocation of the AES-256 block cipher on a single
16-byte block. On CPUs without AES instructions, Adiantum is much
faster than AES-XTS; for example, on ARM Cortex-A7, on 4096-byte sectors
Adiantum encryption is about 4 times faster than AES-256-XTS encryption,
and decryption about 5 times faster.

Adiantum is a specialization of the more general HBSH construction. Our
earlier proposal, HPolyC, was also a HBSH specialization, but it used a
different εA∆U hash function, one based on Poly1305 only. Adiantum's
εA∆U hash function, which is based primarily on the "NH" hash function
like that used in UMAC (RFC4418), is about twice as fast as HPolyC's;
consequently, Adiantum is about 20% faster than HPolyC.

This speed comes with no loss of security: Adiantum is provably just as
secure as HPolyC, in fact slightly *more* secure. Like HPolyC,
Adiantum's security is reducible to that of XChaCha12 and AES-256,
subject to a security bound. XChaCha12 itself has a security reduction
to ChaCha12. Therefore, one need not "trust" Adiantum; one need only
trust ChaCha12 and AES-256. Note that the εA∆U hash function is only
used for its proven combinatorical properties so cannot be "broken".

Adiantum is also a true wide-block encryption mode, so flipping any
plaintext bit in the sector scrambles the entire ciphertext, and vice
versa. No other such mode is available in the kernel currently; doing
the same with XTS scrambles only 16 bytes. Adiantum also supports
arbitrary-length tweaks and naturally supports any length input >= 16
bytes without needing "ciphertext stealing".

For the stream cipher, Adiantum uses XChaCha12 rather than XChaCha20 in
order to make encryption feasible on the widest range of devices.
Although the 20-round variant is quite popular, the best known attacks
on ChaCha are on only 7 rounds, so ChaCha12 still has a substantial
security margin; in fact, larger than AES-256's. 12-round Salsa20 is
also the eSTREAM recommendation. For the block cipher, Adiantum uses
AES-256, despite it having a lower security margin than XChaCha12 and
needing table lookups, due to AES's extensive adoption and analysis
making it the obvious first choice. Nevertheless, for flexibility this
patch also permits the "adiantum" template to be instantiated with
XChaCha20 and/or with an alternate block cipher.

We need Adiantum support in the kernel for use in dm-crypt and fscrypt,
where currently the only other suitable options are block cipher modes
such as AES-XTS. A big problem with this is that many low-end mobile
devices (e.g. Android Go phones sold primarily in developing countries,
as well as some smartwatches) still have CPUs that lack AES
instructions, e.g. ARM Cortex-A7. Sadly, AES-XTS encryption is much too
slow to be viable on these devices. We did find that some "lightweight"
block ciphers are fast enough, but these suffer from problems such as
not having much cryptanalysis or being too controversial.

The ChaCha stream cipher has excellent performance but is insecure to
use directly for disk encryption, since each sector's IV is reused each
time it is overwritten. Even restricting the threat model to offline
attacks only isn't enough, since modern flash storage devices don't
guarantee that "overwrites" are really overwrites, due to wear-leveling.
Adiantum avoids this problem by constructing a
"tweakable super-pseudorandom permutation"; this is the strongest
possible security model for length-preserving encryption.

Of course, storing random nonces along with the ciphertext would be the
ideal solution. But doing that with existing hardware and filesystems
runs into major practical problems; in most cases it would require data
journaling (like dm-integrity) which severely degrades performance.
Thus, for now length-preserving encryption is still needed.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>