Cross Reference: /linux-master/crypto/adiantum.c

History log of /linux-master/crypto/adiantum.c
Revision	Date	Author	Comments
# a312e07a	27-Oct-2023	Eric Biggers <ebiggers@google.com>	crypto: adiantum - flush destination page before unmapping Upon additional review, the new fast path in adiantum_finish() is missing the call to flush_dcache_page() that scatterwalk_map_and_copy() was doing. It's apparently debatable whether flush_dcache_page() is actually needed, as per the discussion at https://lore.kernel.org/lkml/YYP1lAq46NWzhOf0@casper.infradead.org/T/#u. However, it appears that currently all the helper functions that write to a page, such as scatterwalk_map_and_copy(), memcpy_to_page(), and memzero_page(), do the dcache flush. So do it to be consistent. Fixes: dadf5e56c967 ("crypto: adiantum - add fast path for single-page messages") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# 321dfe97	18-Oct-2023	Eric Biggers <ebiggers@google.com>	crypto: adiantum - stop using alignmask of shash_alg Now that the shash algorithm type does not support nonzero alignmasks, shash_alg::base.cra_alignmask is always 0, so OR-ing it into another value is a no-op. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# dadf5e56	09-Oct-2023	Eric Biggers <ebiggers@google.com>	crypto: adiantum - add fast path for single-page messages When the source scatterlist is a single page, optimize the first hash step of adiantum to use crypto_shash_digest() instead of init/update/final, and use the same local kmap for both hashing the bulk part and loading the narrow part of the source data. Likewise, when the destination scatterlist is a single page, optimize the second hash step of adiantum to use crypto_shash_digest() instead of init/update/final, and use the same local kmap for both hashing the bulk part and storing the narrow part of the destination data. In some cases these optimizations improve performance significantly. Note: ideally, for optimal performance each architecture should implement the full "adiantum(xchacha12,aes)" algorithm and fully optimize the contiguous buffer case to use no indirect calls. That's not something I've gotten around to doing, though. This commit just makes a relatively small change that provides some benefit with the existing template-based approach. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# 3c45b578	02-Oct-2023	Herbert Xu <herbert@gondor.apana.org.au>	crypto: adiantum - Only access common skcipher fields on spawn As skcipher spawns may be of the type lskcipher, only the common fields may be accessed. This was already the case but use the correct helpers to make this more obvious. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# 255e48eb	07-Feb-2023	Herbert Xu <herbert@gondor.apana.org.au>	crypto: api - Use data directly in completion function This patch does the final flag day conversion of all completion functions which are now all contained in the Crypto API. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# 0eb76ba2	11-Dec-2020	Ard Biesheuvel <ardb@kernel.org>	crypto: remove cipher routines from public crypto API The cipher routines in the crypto API are mostly intended for templates implementing skcipher modes generically in software, and shouldn't be used outside of the crypto subsystem. So move the prototypes and all related definitions to a new header file under include/crypto/internal. Also, let's use the new module namespace feature to move the symbol exports into a new namespace CRYPTO_INTERNAL. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# 453431a5	07-Aug-2020	Waiman Long <longman@redhat.com>	mm, treewide: rename kzfree() to kfree_sensitive() As said by Linus: A symmetric naming is only helpful if it implies symmetries in use. Otherwise it's actively misleading. In "kzalloc()", the z is meaningful and an important part of what the caller wants. In "kzfree()", the z is actively detrimental, because maybe in the future we really _might_ want to use that "memfill(0xdeadbeef)" or something. The "zero" part of the interface isn't even _relevant_. The main reason that kzfree() exists is to clear sensitive information that should not be leaked to other future users of the same memory objects. Rename kzfree() to kfree_sensitive() to follow the example of the recently added kvfree_sensitive() and make the intention of the API more explicit. In addition, memzero_explicit() is used to clear the memory to make sure that it won't get optimized away by the compiler. The renaming is done by using the command sequence: git grep -w --name-only kzfree \|\ xargs sed -i 's/kzfree/kfree_sensitive/' followed by some editing of the kfree_sensitive() kerneldoc and adding a kzfree backward compatibility macro in slab.h. [akpm@linux-foundation.org: fs/crypto/inline_crypt.c needs linux/slab.h] [akpm@linux-foundation.org: fix fs/crypto/inline_crypt.c some more] Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Cc: James Morris <jmorris@namei.org> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Joe Perches <joe@perches.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: David Rientjes <rientjes@google.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: "Jason A . Donenfeld" <Jason@zx2c4.com> Link: http://lkml.kernel.org/r/20200616154311.12314-3-longman@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 7bcb2c99	10-Jul-2020	Eric Biggers <ebiggers@google.com>	crypto: algapi - use common mechanism for inheriting flags The flag CRYPTO_ALG_ASYNC is "inherited" in the sense that when a template is instantiated, the template will have CRYPTO_ALG_ASYNC set if any of the algorithms it uses has CRYPTO_ALG_ASYNC set. We'd like to add a second flag (CRYPTO_ALG_ALLOCATES_MEMORY) that gets "inherited" in the same way. This is difficult because the handling of CRYPTO_ALG_ASYNC is hardcoded everywhere. Address this by: - Add CRYPTO_ALG_INHERITED_FLAGS, which contains the set of flags that have these inheritance semantics. - Add crypto_algt_inherited_mask(), for use by template ->create() methods. It returns any of these flags that the user asked to be unset and thus must be passed in the 'mask' to crypto_grab_(). - Also modify crypto_check_attr_type() to handle computing the 'mask' so that most templates can just use this. - Make crypto_grab_() propagate these flags to the template instance being created so that templates don't have to do this themselves. Make crypto/simd.c propagate these flags too, since it "wraps" another algorithm, similar to a template. Based on a patch by Mikulas Patocka <mpatocka@redhat.com> (https://lore.kernel.org/r/alpine.LRH.2.02.2006301414580.30526@file01.intranet.prod.int.rdu2.redhat.com). Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# 1c08a104	05-Jan-2020	Jason A. Donenfeld <Jason@zx2c4.com>	crypto: poly1305 - add new 32 and 64-bit generic versions These two C implementations from Zinc -- a 32x32 one and a 64x64 one, depending on the platform -- come from Andrew Moon's public domain poly1305-donna portable code, modified for usage in the kernel. The precomputation in the 32-bit version and the use of 64x64 multiplies in the 64-bit version make these perform better than the code it replaces. Moon's code is also very widespread and has received many eyeballs of scrutiny. There's a bit of interference between the x86 implementation, which relies on internal details of the old scalar implementation. In the next commit, the x86 implementation will be replaced with a faster one that doesn't rely on this, so none of this matters much. But for now, to keep this passing the tests, we inline the bits of the old implementation that the x86 implementation relied on. Also, since we now support a slightly larger key space, via the union, some offsets had to be fixed up. Nonce calculation was folded in with the emit function, to take advantage of 64x64 arithmetic. However, Adiantum appeared to rely on no nonce handling in emit, so this path was conditionalized. We also introduced a new struct, poly1305_core_key, to represent the precise amount of space that particular implementation uses. Testing with kbench9000, depending on the CPU, the update function for the 32x32 version has been improved by 4%-7%, and for the 64x64 by 19%-30%. The 32x32 gains are small, but I think there's great value in having a parallel implementation to the 64x64 one so that the two can be compared side-by-side as nice stand-alone units. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# d5ed3b65	02-Jan-2020	Eric Biggers <ebiggers@google.com>	crypto: cipher - make crypto_spawn_cipher() take a crypto_cipher_spawn Now that all users of single-block cipher spawns have been converted to use 'struct crypto_cipher_spawn' rather than the less specifically typed 'struct crypto_spawn', make crypto_spawn_cipher() take a pointer to a 'struct crypto_cipher_spawn' rather than a 'struct crypto_spawn'. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# ba448407	02-Jan-2020	Eric Biggers <ebiggers@google.com>	crypto: adiantum - use crypto_grab_{cipher,shash} and simplify error paths Make the adiantum template use the new functions crypto_grab_cipher() and crypto_grab_shash() to initialize its cipher and shash spawns. This is needed to make all spawns be initialized in a consistent way. Also simplify the error handling by taking advantage of crypto_drop_() now accepting (as a no-op) spawns that haven't been initialized yet, and by taking advantage of crypto_grab_() now handling ERR_PTR() names. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# de95c957	02-Jan-2020	Eric Biggers <ebiggers@google.com>	crypto: algapi - pass instance to crypto_grab_spawn() Currently, crypto_spawn::inst is first used temporarily to pass the instance to crypto_grab_spawn(). Then crypto_init_spawn() overwrites it with crypto_spawn::next, which shares the same union. Finally, crypto_spawn::inst is set again when the instance is registered. Make this less convoluted by just passing the instance as an argument to crypto_grab_spawn() instead. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# b9f76ddd	02-Jan-2020	Eric Biggers <ebiggers@google.com>	crypto: skcipher - pass instance to crypto_grab_skcipher() Initializing a crypto_skcipher_spawn currently requires: 1. Set spawn->base.inst to point to the instance. 2. Call crypto_grab_skcipher(). But there's no reason for these steps to be separate, and in fact this unneeded complication has caused at least one bug, the one fixed by commit 6db43410179b ("crypto: adiantum - initialize crypto_spawn::inst") So just make crypto_grab_skcipher() take the instance as an argument. To keep the function calls from getting too unwieldy due to this extra argument, also introduce a 'mask' variable into the affected places which weren't already using one. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# af5034e8	30-Dec-2019	Eric Biggers <ebiggers@google.com>	crypto: remove propagation of CRYPTO_TFM_RES_* flags The CRYPTO_TFM_RES_* flags were apparently meant as a way to make the ->setkey() functions provide more information about errors. But these flags weren't actually being used or tested, and in many cases they weren't being set correctly anyway. So they've now been removed. Also, if someone ever actually needs to start better distinguishing ->setkey() errors (which is somewhat unlikely, as this has been unneeded for a long time), we'd be much better off just defining different return values, like -EINVAL if the key is invalid for the algorithm vs. -EKEYREJECTED if the key was rejected by a policy like "no weak keys". That would be much simpler, less error-prone, and easier to test. So just remove CRYPTO_TFM_RES_MASK and all the unneeded logic that propagates these flags around. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# c593642c	09-Dec-2019	Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>	treewide: Use sizeof_field() macro Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except at places where these are defined. Later patches will remove the unused definition of FIELD_SIZEOF(). This patch is generated using following script: EXCLUDE_FILES="include/linux/stddef.h\|include/linux/kernel.h" git grep -l -e "\bFIELD_SIZEOF\b" \| while read file; do if [[ "$file" =~ $EXCLUDE_FILES ]]; then continue fi sed -i -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file; done Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: David Miller <davem@davemloft.net> # for net
# 48ea8c6e	08-Nov-2019	Ard Biesheuvel <ardb@kernel.org>	crypto: poly1305 - move core routines into a separate library Move the core Poly1305 routines shared between the generic Poly1305 shash driver and the Adiantum and NHPoly1305 drivers into a separate library so that using just this pieces does not pull in the crypto API pieces of the generic Poly1305 routine. In a subsequent patch, we will augment this generic library with init/update/final routines so that Poyl1305 algorithm can be used directly without the need for using the crypto API's shash abstraction. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# 877b5691	14-Apr-2019	Eric Biggers <ebiggers@google.com>	crypto: shash - remove shash_desc::flags The flags field in 'struct shash_desc' never actually does anything. The only ostensibly supported flag is CRYPTO_TFM_REQ_MAY_SLEEP. However, no shash algorithm ever sleeps, making this flag a no-op. With this being the case, inevitably some users who can't sleep wrongly pass MAY_SLEEP. These would all need to be fixed if any shash algorithm actually started sleeping. For example, the shash_ahash_() functions, which wrap a shash algorithm with the ahash API, pass through MAY_SLEEP from the ahash API to the shash API. However, the shash functions are called under kmap_atomic(), so actually they're assumed to never sleep. Even if it turns out that some users do need preemption points while hashing large buffers, we could easily provide a helper function crypto_shash_update_large() which divides the data into smaller chunks and calls crypto_shash_update() and cond_resched() for each chunk. It's not necessary to have a flag in 'struct shash_desc', nor is it necessary to make individual shash algorithms aware of this at all. Therefore, remove shash_desc::flags, and document that the crypto_shash_() functions can be called from any context. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# c4741b23	11-Apr-2019	Eric Biggers <ebiggers@google.com>	crypto: run initcalls for generic implementations earlier Use subsys_initcall for registration of all templates and generic algorithm implementations, rather than module_init. Then change cryptomgr to use arch_initcall, to place it before the subsys_initcalls. This is needed so that when both a generic and optimized implementation of an algorithm are built into the kernel (not loadable modules), the generic implementation is registered before the optimized one. Otherwise, the self-tests for the optimized implementation are unable to allocate the generic implementation for the new comparison fuzz tests. Note that on arm, a side effect of this change is that self-tests for generic implementations may run before the unaligned access handler has been installed. So, unaligned accesses will crash the kernel. This is arguably a good thing as it makes it easier to detect that type of bug. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# 6db43410	06-Jan-2019	Eric Biggers <ebiggers@google.com>	crypto: adiantum - initialize crypto_spawn::inst crypto_grab_*() doesn't set crypto_spawn::inst, so templates must set it beforehand. Otherwise it will be left NULL, which causes a crash in certain cases where algorithms are dynamically loaded/unloaded. E.g. with CONFIG_CRYPTO_CHACHA20_X86_64=m, the following caused a crash: insmod chacha-x86_64.ko python -c 'import socket; socket.socket(socket.AF_ALG, 5, 0).bind(("skcipher", "adiantum(xchacha12,aes)"))' rmmod chacha-x86_64.ko python -c 'import socket; socket.socket(socket.AF_ALG, 5, 0).bind(("skcipher", "adiantum(xchacha12,aes)"))' Fixes: 059c2a4d8e16 ("crypto: adiantum - add Adiantum support") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# 00c9fe37	10-Dec-2018	Eric Biggers <ebiggers@google.com>	crypto: adiantum - fix leaking reference to hash algorithm crypto_alg_mod_lookup() takes a reference to the hash algorithm but crypto_init_shash_spawn() doesn't take ownership of it, hence the reference needs to be dropped in adiantum_create(). Fixes: 059c2a4d8e16 ("crypto: adiantum - add Adiantum support") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# c6018e1a	06-Dec-2018	Eric Biggers <ebiggers@google.com>	crypto: adiantum - adjust some comments to match latest paper The 2018-11-28 revision of the Adiantum paper has revised some notation: - 'M' was replaced with 'L' (meaning "Left", for the left-hand part of the message) in the definition of Adiantum hashing, to avoid confusion with the full message - ε-almost-∆-universal is now abbreviated as ε-∆U instead of εA∆U - "block" is now used only to mean block cipher and Poly1305 blocks Also, Adiantum hashing was moved from the appendix to the main paper. To avoid confusion, update relevant comments in the code to match. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# b299362e	04-Dec-2018	Eric Biggers <ebiggers@google.com>	crypto: adiantum - propagate CRYPTO_ALG_ASYNC flag to instance If the stream cipher implementation is asynchronous, then the Adiantum instance must be flagged as asynchronous as well. Otherwise someone asking for a synchronous algorithm can get an asynchronous algorithm. There are no asynchronous xchacha12 or xchacha20 implementations yet which makes this largely a theoretical issue, but it should be fixed. Fixes: 059c2a4d8e16 ("crypto: adiantum - add Adiantum support") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
# 059c2a4d	16-Nov-2018	Eric Biggers <ebiggers@google.com>	crypto: adiantum - add Adiantum support Add support for the Adiantum encryption mode. Adiantum was designed by Paul Crowley and is specified by our paper: Adiantum: length-preserving encryption for entry-level processors (https://eprint.iacr.org/2018/720.pdf) See our paper for full details; this patch only provides an overview. Adiantum is a tweakable, length-preserving encryption mode designed for fast and secure disk encryption, especially on CPUs without dedicated crypto instructions. Adiantum encrypts each sector using the XChaCha12 stream cipher, two passes of an ε-almost-∆-universal (εA∆U) hash function, and an invocation of the AES-256 block cipher on a single 16-byte block. On CPUs without AES instructions, Adiantum is much faster than AES-XTS; for example, on ARM Cortex-A7, on 4096-byte sectors Adiantum encryption is about 4 times faster than AES-256-XTS encryption, and decryption about 5 times faster. Adiantum is a specialization of the more general HBSH construction. Our earlier proposal, HPolyC, was also a HBSH specialization, but it used a different εA∆U hash function, one based on Poly1305 only. Adiantum's εA∆U hash function, which is based primarily on the "NH" hash function like that used in UMAC (RFC4418), is about twice as fast as HPolyC's; consequently, Adiantum is about 20% faster than HPolyC. This speed comes with no loss of security: Adiantum is provably just as secure as HPolyC, in fact slightly more secure. Like HPolyC, Adiantum's security is reducible to that of XChaCha12 and AES-256, subject to a security bound. XChaCha12 itself has a security reduction to ChaCha12. Therefore, one need not "trust" Adiantum; one need only trust ChaCha12 and AES-256. Note that the εA∆U hash function is only used for its proven combinatorical properties so cannot be "broken". Adiantum is also a true wide-block encryption mode, so flipping any plaintext bit in the sector scrambles the entire ciphertext, and vice versa. No other such mode is available in the kernel currently; doing the same with XTS scrambles only 16 bytes. Adiantum also supports arbitrary-length tweaks and naturally supports any length input >= 16 bytes without needing "ciphertext stealing". For the stream cipher, Adiantum uses XChaCha12 rather than XChaCha20 in order to make encryption feasible on the widest range of devices. Although the 20-round variant is quite popular, the best known attacks on ChaCha are on only 7 rounds, so ChaCha12 still has a substantial security margin; in fact, larger than AES-256's. 12-round Salsa20 is also the eSTREAM recommendation. For the block cipher, Adiantum uses AES-256, despite it having a lower security margin than XChaCha12 and needing table lookups, due to AES's extensive adoption and analysis making it the obvious first choice. Nevertheless, for flexibility this patch also permits the "adiantum" template to be instantiated with XChaCha20 and/or with an alternate block cipher. We need Adiantum support in the kernel for use in dm-crypt and fscrypt, where currently the only other suitable options are block cipher modes such as AES-XTS. A big problem with this is that many low-end mobile devices (e.g. Android Go phones sold primarily in developing countries, as well as some smartwatches) still have CPUs that lack AES instructions, e.g. ARM Cortex-A7. Sadly, AES-XTS encryption is much too slow to be viable on these devices. We did find that some "lightweight" block ciphers are fast enough, but these suffer from problems such as not having much cryptanalysis or being too controversial. The ChaCha stream cipher has excellent performance but is insecure to use directly for disk encryption, since each sector's IV is reused each time it is overwritten. Even restricting the threat model to offline attacks only isn't enough, since modern flash storage devices don't guarantee that "overwrites" are really overwrites, due to wear-leveling. Adiantum avoids this problem by constructing a "tweakable super-pseudorandom permutation"; this is the strongest possible security model for length-preserving encryption. Of course, storing random nonces along with the ciphertext would be the ideal solution. But doing that with existing hardware and filesystems runs into major practical problems; in most cases it would require data journaling (like dm-integrity) which severely degrades performance. Thus, for now length-preserving encryption is still needed. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>