#
e9b10713 |
|
20-Feb-2024 |
Gabriel Krisman Bertazi <krisman@suse.de> |
fscrypt: Drop d_revalidate once the key is added When a key is added, existing directory dentries in the DCACHE_NOKEY_NAME form are moved by the VFS to the plaintext version. But, since they have the DCACHE_OP_REVALIDATE flag set, revalidation will be done at each lookup only to return immediately, since plaintext dentries can't go stale until eviction. This patch optimizes this case, by dropping the flag once the nokey_name dentry becomes plain-text. Note that non-directory dentries are not moved this way, so they won't be affected. Of course, this can only be done if fscrypt is the only thing requiring revalidation for a dentry. For this reason, we only disable d_revalidate if the .d_revalidate hook is fscrypt_d_revalidate itself. It is safe to do it here because when moving the dentry to the plain-text version, we are holding the d_lock. We might race with a concurrent RCU lookup but this is harmless because, at worst, we will get an extra d_revalidate on the keyed dentry, which will still find the dentry to be valid. Finally, now that we do more than just clear the DCACHE_NOKEY_NAME in fscrypt_handle_d_move, skip it entirely for plaintext dentries, to avoid extra costs. Reviewed-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20240221171412.10710-5-krisman@suse.de Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
|
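The flag handling described in e9b10713 can be sketched in user-space C. This is only a toy model, not the kernel code: the `toy_` types, the flag values, and the function names are all hypothetical stand-ins, and the real code runs with the dentry's `d_lock` held.

```c
#include <assert.h>
#include <stddef.h>

/* Toy flag values; these are NOT the kernel's actual bit assignments. */
#define TOY_DCACHE_NOKEY_NAME    0x1u
#define TOY_DCACHE_OP_REVALIDATE 0x2u

struct toy_dentry;
typedef int (*toy_revalidate_fn)(struct toy_dentry *);

/* Hypothetical, heavily reduced stand-in for struct dentry. */
struct toy_dentry {
	unsigned int flags;
	toy_revalidate_fn d_revalidate;
};

/* Placeholder for fscrypt's own revalidation hook. */
int toy_fscrypt_d_revalidate(struct toy_dentry *d) { (void)d; return 1; }

/* Placeholder for a hook installed by some other filesystem feature. */
int toy_other_d_revalidate(struct toy_dentry *d) { (void)d; return 1; }

/*
 * Model of the move from a no-key name to the plaintext name.
 * Plaintext dentries are skipped entirely. For no-key dentries the
 * no-key flag is cleared, and revalidation is dropped only when
 * fscrypt is the sole reason the dentry needed it.
 */
void toy_handle_d_move(struct toy_dentry *d)
{
	assert(d != NULL);
	if (!(d->flags & TOY_DCACHE_NOKEY_NAME))
		return;
	d->flags &= ~TOY_DCACHE_NOKEY_NAME;
	if (d->d_revalidate == toy_fscrypt_d_revalidate)
		d->flags &= ~TOY_DCACHE_OP_REVALIDATE;
}
```

Comparing the hook pointer, rather than clearing the flag unconditionally, is what keeps revalidation alive for filesystems that need it for reasons other than fscrypt.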
#
e86e6638 |
|
20-Feb-2024 |
Gabriel Krisman Bertazi <krisman@suse.de> |
fscrypt: Drop d_revalidate for valid dentries during lookup Unencrypted dentries and encrypted dentries where the key is available don't need to be revalidated by fscrypt, since they don't go stale from under VFS and the key cannot be removed for the encrypted case without evicting the dentry. Disable their d_revalidate hook on the first lookup, to avoid repeated revalidation later. This is done in preparation for always configuring d_op through sb->s_d_op. The only detail is that, since the filesystem might have other features that require revalidation, we only apply this optimization if the d_revalidate handler is fscrypt_d_revalidate itself. Finally, we need to clean the dentry->flags even for unencrypted dentries, so the ->d_lock might be acquired even for them. In order to avoid doing it for filesystems that don't care about fscrypt at all, we peek ->d_flags without the lock at first, and only acquire it if we actually need to write the flag. Reviewed-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20240221171412.10710-4-krisman@suse.de Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
|
#
8b6bb995 |
|
20-Feb-2024 |
Gabriel Krisman Bertazi <krisman@suse.de> |
fscrypt: Factor out a helper to configure the lookup dentry Both fscrypt_prepare_lookup_partial and fscrypt_prepare_lookup will set DCACHE_NOKEY_NAME for dentries when the key is not available. Extract out a helper to set this flag in a single place, in preparation to also add the optimization that will disable ->d_revalidate if possible. Reviewed-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20240221171412.10710-3-krisman@suse.de Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
|
#
3e7807d5 |
|
04-Oct-2023 |
Josef Bacik <josef@toxicpanda.com> |
fscrypt: rename fscrypt_info => fscrypt_inode_info We are going to track per-extent information, so it'll be necessary to distinguish between inode infos and extent infos. Rename fscrypt_info to fscrypt_inode_info, adjusting any lines that now exceed 80 characters. Signed-off-by: Josef Bacik <josef@toxicpanda.com> [ebiggers: rebased onto fscrypt tree, renamed fscrypt_get_info(), adjusted two comments, and fixed some lines over 80 characters] Link: https://lore.kernel.org/r/20231005025757.33521-1-ebiggers@kernel.org Reviewed-by: Neal Gompa <neal@gompa.dev> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
5b118884 |
|
24-Sep-2023 |
Eric Biggers <ebiggers@google.com> |
fscrypt: support crypto data unit size less than filesystem block size Until now, fscrypt has always used the filesystem block size as the granularity of file contents encryption. Two scenarios have come up where a sub-block granularity of contents encryption would be useful: 1. Inline crypto hardware that only supports a crypto data unit size that is less than the filesystem block size. 2. Support for direct I/O at a granularity less than the filesystem block size, for example at the block device's logical block size in order to match the traditional direct I/O alignment requirement. (1) first came up with older eMMC inline crypto hardware that only supports a crypto data unit size of 512 bytes. That specific case ultimately went away because all systems with that hardware continued using out of tree code and never actually upgraded to the upstream inline crypto framework. But, now it's coming back in a new way: some current UFS controllers only support a data unit size of 4096 bytes, and there is a proposal to increase the filesystem block size to 16K. (2) was discussed as a "nice to have" feature, though not essential, when support for direct I/O on encrypted files was being upstreamed. Still, the fact that this feature has come up several times does suggest it would be wise to have available. Therefore, this patch implements it by using one of the reserved bytes in fscrypt_policy_v2 to allow users to select a sub-block data unit size. Supported data unit sizes are powers of 2 between 512 and the filesystem block size, inclusively. Support is implemented for both the FS-layer and inline crypto cases. This patch focuses on the basic support for sub-block data units. Some things are out of scope for this patch but may be addressed later: - Supporting sub-block data units in combination with FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64, in most cases. 
Unfortunately this combination usually causes data unit indices to exceed 32 bits, and thus fscrypt_supported_policy() correctly disallows it. The users who potentially need this combination are using f2fs. To support it, f2fs would need to provide an option to slightly reduce its max file size. - Supporting sub-block data units in combination with FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32. This has the same problem described above, but also it will need special code to make DUN wraparound still happen on a FS block boundary. - Supporting use case (2) mentioned above. The encrypted direct I/O code will need to stop requiring and assuming FS block alignment. This won't be hard, but it belongs in a separate patch. - Supporting this feature on filesystems other than ext4 and f2fs. (Filesystems declare support for it via their fscrypt_operations.) On UBIFS, sub-block data units don't make sense because UBIFS encrypts variable-length blocks as a result of compression. CephFS could support it, but a bit more work would be needed to make the fscrypt_*_block_inplace functions play nicely with sub-block data units. I don't think there's a use case for this on CephFS anyway. Link: https://lore.kernel.org/r/20230925055451.59499-6-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
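The rule stated in 5b118884 for supported data unit sizes (powers of 2 between 512 and the filesystem block size, inclusive) can be sketched as a small check. The function names are hypothetical, the filesystem block size is assumed to be a power of 2 itself, and this is not the kernel's actual validation code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if du_size is a power of 2 in [512, fs_block_size].
 * Assumption: fs_block_size is itself a power of 2.
 */
bool toy_data_unit_size_ok(uint32_t du_size, uint32_t fs_block_size)
{
	assert(fs_block_size != 0);
	if (du_size < 512 || du_size > fs_block_size)
		return false;
	/* A power of 2 has exactly one bit set. */
	return (du_size & (du_size - 1)) == 0;
}

/* Number of crypto data units that make up one filesystem block. */
uint32_t toy_data_units_per_block(uint32_t du_size, uint32_t fs_block_size)
{
	assert(toy_data_unit_size_ok(du_size, fs_block_size));
	return fs_block_size / du_size;
}
```

For the UFS scenario in the commit message, a 4096-byte data unit on a 16K-block filesystem gives four data units per block, which also shows why data unit indices grow faster than block indices.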
#
7a0263dc |
|
24-Sep-2023 |
Eric Biggers <ebiggers@google.com> |
fscrypt: replace get_ino_and_lblk_bits with just has_32bit_inodes Now that fs/crypto/ computes the filesystem's lblk_bits from its maximum file size, it is no longer necessary for filesystems to provide lblk_bits via fscrypt_operations::get_ino_and_lblk_bits. It is still necessary for fs/crypto/ to retrieve ino_bits from the filesystem. However, this is used only to decide whether inode numbers fit in 32 bits. Also, ino_bits is static for all relevant filesystems, i.e. it doesn't depend on the filesystem instance. Therefore, in the interest of keeping things as simple as possible, replace 'get_ino_and_lblk_bits' with a flag 'has_32bit_inodes'. This can always be changed back to a function if a filesystem needs it to be dynamic, but for now a static flag is all that's needed. Link: https://lore.kernel.org/r/20230925055451.59499-5-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
40e13e18 |
|
24-Sep-2023 |
Eric Biggers <ebiggers@google.com> |
fscrypt: make the bounce page pool opt-in instead of opt-out Replace FS_CFLG_OWN_PAGES with a bit flag 'needs_bounce_pages' which has the opposite meaning. I.e., filesystems now opt into the bounce page pool instead of opt out. Make fscrypt_alloc_bounce_page() check that the bounce page pool has been initialized. I believe the opt-in makes more sense, since nothing else in fscrypt_operations is opt-out, and these days filesystems can choose to use blk-crypto which doesn't need the fscrypt bounce page pool. Also, I happen to be planning to add two more flags, and I wanted to fix the "FS_CFLG_" name anyway as it wasn't prefixed with "FSCRYPT_". Link: https://lore.kernel.org/r/20230925055451.59499-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
5970fbad |
|
24-Sep-2023 |
Eric Biggers <ebiggers@google.com> |
fscrypt: make it clearer that key_prefix is deprecated fscrypt_operations::key_prefix should not be set by any filesystems that aren't setting it already. This is already documented, but apparently it's not sufficiently clear, as both ceph and btrfs have tried to set it. Rename the field to legacy_key_prefix and improve the documentation to hopefully make it clearer. Link: https://lore.kernel.org/r/20230925055451.59499-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
c76e14dc |
|
24-Mar-2023 |
Matthew Wilcox <willy@infradead.org> |
fscrypt: Add some folio helper functions fscrypt_is_bounce_folio() is the equivalent of fscrypt_is_bounce_page() and fscrypt_pagecache_folio() is the equivalent of fscrypt_pagecache_page(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/20230324180129.1220691-3-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
6f2656ea |
|
16-Mar-2023 |
Luís Henriques <lhenriques@suse.de> |
fscrypt: new helper function - fscrypt_prepare_lookup_partial() This patch introduces a new helper function which can be used both in lookups and in atomic_open operations by filesystems that want to handle filename encryption and no-key dentries themselves. The reason for this function to be used in atomic open is that this operation can act as a lookup if handed a dentry that is negative. And in this case we may need to set DCACHE_NOKEY_NAME. Signed-off-by: Luís Henriques <lhenriques@suse.de> Tested-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Xiubo Li <xiubli@redhat.com> [ebiggers: improved the function comment, and moved the function to just below __fscrypt_prepare_lookup()] Link: https://lore.kernel.org/r/20230320220149.21863-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
097d7c1f |
|
07-Feb-2023 |
Eric Biggers <ebiggers@google.com> |
fscrypt: clean up fscrypt_add_test_dummy_key() Now that fscrypt_add_test_dummy_key() is only called by setup_file_encryption_key() and not by the individual filesystems, un-export it. Also change its prototype to take the fscrypt_key_specifier directly, as the caller already has it. Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20230208062107.199831-6-ebiggers@kernel.org
|
#
51e4e315 |
|
27-Jan-2023 |
Eric Biggers <ebiggers@google.com> |
fscrypt: support decrypting data from large folios Try to make the filesystem-level decryption functions in fs/crypto/ aware of large folios. This includes making fscrypt_decrypt_bio() support the case where the bio contains large folios, and making fscrypt_decrypt_pagecache_blocks() take a folio instead of a page. There's no way to actually test this with large folios yet, but I've tested that this doesn't cause any regressions. Note that this patch just handles *decryption*, not encryption which will be a little more difficult. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20230127224202.355629-1-ebiggers@kernel.org
|
#
ccd30a47 |
|
11-Oct-2022 |
Eric Biggers <ebiggers@google.com> |
fscrypt: fix keyring memory leak on mount failure Commit d7e7b9af104c ("fscrypt: stop using keyrings subsystem for fscrypt_master_key") moved the keyring destruction from __put_super() to generic_shutdown_super() so that the filesystem's block device(s) are still available. Unfortunately, this causes a memory leak in the case where a mount is attempted with the test_dummy_encryption mount option, but the mount fails after the option has already been processed. To fix this, attempt the keyring destruction in both places. Reported-by: syzbot+104c2a89561289cec13e@syzkaller.appspotmail.com Fixes: d7e7b9af104c ("fscrypt: stop using keyrings subsystem for fscrypt_master_key") Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Link: https://lore.kernel.org/r/20221011213838.209879-1-ebiggers@kernel.org
|
#
0e91fc1e |
|
01-Sep-2022 |
Christoph Hellwig <hch@lst.de> |
fscrypt: work on block_devices instead of request_queues request_queues are a block layer implementation detail that should not leak into file systems. Change the fscrypt inline crypto code to retrieve block devices instead of request_queues from the file system. As part of that, clean up the interaction with multi-device file systems by returning both the number of devices and the actual device array in a single method call. Signed-off-by: Christoph Hellwig <hch@lst.de> [ebiggers: bug fixes and minor tweaks] Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220901193208.138056-4-ebiggers@kernel.org
|
#
d7e7b9af |
|
01-Sep-2022 |
Eric Biggers <ebiggers@google.com> |
fscrypt: stop using keyrings subsystem for fscrypt_master_key The approach of fs/crypto/ internally managing the fscrypt_master_key structs as the payloads of "struct key" objects contained in a "struct key" keyring has outlived its usefulness. The original idea was to simplify the code by reusing code from the keyrings subsystem. However, several issues have arisen that can't easily be resolved: - When a master key struct is destroyed, blk_crypto_evict_key() must be called on any per-mode keys embedded in it. (This started being the case when inline encryption support was added.) Yet, the keyrings subsystem can arbitrarily delay the destruction of keys, even past the time the filesystem was unmounted. Therefore, currently there is no easy way to call blk_crypto_evict_key() when a master key is destroyed. Currently, this is worked around by holding an extra reference to the filesystem's request_queue(s). But it was overlooked that the request_queue reference is *not* guaranteed to pin the corresponding blk_crypto_profile too; for device-mapper devices that support inline crypto, it doesn't. This can cause a use-after-free. - When the last inode that was using an incompletely-removed master key is evicted, the master key removal is completed by removing the key struct from the keyring. Currently this is done via key_invalidate(). Yet, key_invalidate() takes the key semaphore. This can deadlock when called from the shrinker, since in fscrypt_ioctl_add_key(), memory is allocated with GFP_KERNEL under the same semaphore. - More generally, the fact that the keyrings subsystem can arbitrarily delay the destruction of keys (via garbage collection delay, or via random processes getting temporary key references) is undesirable, as it means we can't strictly guarantee that all secrets are ever wiped. - Doing the master key lookups via the keyrings subsystem results in the key_permission LSM hook being called. 
fscrypt doesn't want this, as all access control for encrypted files is designed to happen via the files themselves, like any other files. The workaround which SELinux users are using is to change their SELinux policy to grant key search access to all domains. This works, but it is an odd extra step that shouldn't really have to be done. The fix for all these issues is to change the implementation to what I should have done originally: don't use the keyrings subsystem to keep track of the filesystem's fscrypt_master_key structs. Instead, just store them in a regular kernel data structure, and rework the reference counting, locking, and lifetime accordingly. Retain support for RCU-mode key lookups by using a hash table. Replace fscrypt_sb_free() with fscrypt_sb_delete(), which releases the keys synchronously and runs a bit earlier during unmount, so that block devices are still available. A side effect of this patch is that neither the master keys themselves nor the filesystem keyrings will be listed in /proc/keys anymore. ("Master key users" and the master key users keyrings will still be listed.) However, this was mostly an implementation detail, and it was intended just for debugging purposes. I don't know of anyone using it. This patch does *not* change how "master key users" (->mk_users) works; that still uses the keyrings subsystem. That is still needed for key quotas, and changing that isn't necessary to solve the issues listed above. If we decide to change that too, it would be a separate patch. I've marked this as fixing the original commit that added the fscrypt keyring, but as noted above the most important issue that this patch fixes wasn't introduced until the addition of inline encryption support. Fixes: 22d94f493bfb ("fscrypt: add FS_IOC_ADD_ENCRYPTION_KEY ioctl") Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220901193208.138056-2-ebiggers@kernel.org
|
#
53dd3f80 |
|
27-Aug-2022 |
Eric Biggers <ebiggers@google.com> |
fscrypt: change fscrypt_dio_supported() to prepare for STATX_DIOALIGN To prepare for STATX_DIOALIGN support, make two changes to fscrypt_dio_supported(). First, remove the filesystem-block-alignment check and make the filesystems handle it instead. It previously made sense to have it in fs/crypto/; however, to support STATX_DIOALIGN the alignment restriction would have to be returned to filesystems. It ends up being simpler if filesystems handle this part themselves, especially for f2fs which only allows fs-block-aligned DIO in the first place. Second, make fscrypt_dio_supported() work on inodes whose encryption key hasn't been set up yet, by making it set up the key if needed. This is required for statx(), since statx() doesn't require a file descriptor. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220827065851.135710-4-ebiggers@kernel.org
|
#
14db0b3c |
|
15-Aug-2022 |
Eric Biggers <ebiggers@google.com> |
fscrypt: stop using PG_error to track error status As a step towards freeing the PG_error flag for other uses, change ext4 and f2fs to stop using PG_error to track decryption errors. Instead, if a decryption error occurs, just mark the whole bio as failed. The coarser granularity isn't really a problem since it isn't any worse than what the block layer provides, and errors from a multi-page readahead aren't reported to applications unless a single-page read fails too. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <chao@kernel.org> # for f2fs part Link: https://lore.kernel.org/r/20220815235052.86545-2-ebiggers@kernel.org
|
#
272ac150 |
|
15-Aug-2022 |
Eric Biggers <ebiggers@google.com> |
fscrypt: remove fscrypt_set_test_dummy_encryption() Now that all its callers have been converted to fscrypt_parse_test_dummy_encryption() and fscrypt_add_test_dummy_key() instead, fscrypt_set_test_dummy_encryption() can be removed. Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220513231605.175121-6-ebiggers@kernel.org
|
#
637fa738 |
|
31-Aug-2020 |
Jeff Layton <jlayton@kernel.org> |
fscrypt: add fscrypt_context_for_new_inode Most filesystems just call fscrypt_set_context on new inodes, which usually causes a setxattr. That's a bit late for ceph, which can send along a full set of attributes with the create request. Doing so allows it to avoid race windows where the new inode could be seen by other clients without the crypto context attached. It also avoids the separate round trip to the server. Refactor the fscrypt code a bit to allow us to create a new crypto context, attach it to the inode, and write it to the buffer, but without calling set_context on it. ceph can later use this to marshal the context into the attributes we send along with the create request. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Acked-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
d3e94fdc |
|
08-Jan-2021 |
Jeff Layton <jlayton@kernel.org> |
fscrypt: export fscrypt_fname_encrypt and fscrypt_fname_encrypted_size For ceph, we want to use our own scheme for handling filenames that are longer than NAME_MAX after encryption and Base64 encoding. This allows us to have a consistent view of the encrypted filenames for clients that don't support fscrypt and clients that do but that don't have the key. Currently, fs/crypto only supports encrypting filenames using fscrypt_setup_filename, but that also handles encoding nokey names. Ceph can't use that because it handles nokey names in a different way. Export fscrypt_fname_encrypt. Rename fscrypt_fname_encrypted_size to __fscrypt_fname_encrypted_size and add a new wrapper called fscrypt_fname_encrypted_size that takes an inode argument rather than a pointer to a fscrypt_policy union. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Acked-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
218d921b |
|
30-Apr-2022 |
Eric Biggers <ebiggers@google.com> |
fscrypt: add new helper functions for test_dummy_encryption Unfortunately the design of fscrypt_set_test_dummy_encryption() doesn't work properly for the new mount API, as it combines too many steps into one function: - Parse the argument to test_dummy_encryption - Check the setting against the filesystem instance - Apply the setting to the filesystem instance The new mount API has split these into separate steps. ext4 partially worked around this by duplicating some of the logic, but it still had some bugs. To address this, add some new helper functions that split up the steps of fscrypt_set_test_dummy_encryption(): - fscrypt_parse_test_dummy_encryption() - fscrypt_dummy_policies_equal() - fscrypt_add_test_dummy_key() While we're at it, also add a function fscrypt_is_dummy_policy_set() which will be useful to avoid some #ifdef's. Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220501050857.538984-5-ebiggers@kernel.org
|
#
63cec138 |
|
04-Apr-2022 |
Eric Biggers <ebiggers@google.com> |
fscrypt: split up FS_CRYPTO_BLOCK_SIZE FS_CRYPTO_BLOCK_SIZE is neither the filesystem block size nor the granularity of encryption. Rather, it defines two logically separate constraints that both arise from the block size of the AES cipher: - The alignment required for the lengths of file contents blocks - The minimum input/output length for the filenames encryption modes Since there are way too many things called the "block size", and the connection with the AES block size is not easily understood, split FS_CRYPTO_BLOCK_SIZE into two constants FSCRYPT_CONTENTS_ALIGNMENT and FSCRYPT_FNAME_MIN_MSG_LEN that more clearly describe what they are. Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220405010914.18519-1-ebiggers@kernel.org
|
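The two constraints that 63cec138 separates can be sketched with two independent constants. The `TOY_` names and helper functions are hypothetical. The value 16 is the AES block size, which the commit message names as the origin of both constraints. Any further padding that a real encryption policy applies to filenames is deliberately left out of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_AES_BLOCK_SIZE     16u
/* Alignment required for the lengths of file contents blocks. */
#define TOY_CONTENTS_ALIGNMENT TOY_AES_BLOCK_SIZE
/* Minimum input/output length for the filenames encryption modes. */
#define TOY_FNAME_MIN_MSG_LEN  TOY_AES_BLOCK_SIZE

/* Contents constraint: the length must be a multiple of the alignment. */
bool toy_contents_len_ok(size_t len)
{
	return len % TOY_CONTENTS_ALIGNMENT == 0;
}

/* Filename constraint: short names are extended to the minimum length. */
size_t toy_fname_min_padded_len(size_t name_len)
{
	assert(name_len > 0);
	return name_len < TOY_FNAME_MIN_MSG_LEN ? TOY_FNAME_MIN_MSG_LEN
						: name_len;
}
```

Keeping the two constants separate makes it clear that one is an alignment rule and the other a lower bound, even though both currently derive from the same cipher block size.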
#
c6c89783 |
|
28-Jan-2022 |
Eric Biggers <ebiggers@google.com> |
fscrypt: add functions for direct I/O support Encrypted files traditionally haven't supported DIO, due to the need to encrypt/decrypt the data. However, when the encryption is implemented using inline encryption (blk-crypto) instead of the traditional filesystem-layer encryption, it is straightforward to support DIO. In preparation for supporting this, add the following functions: - fscrypt_dio_supported() checks whether a DIO request is supported as far as encryption is concerned. Encrypted files will only support DIO when inline encryption is used and the I/O request is properly aligned; this function checks these preconditions. - fscrypt_limit_io_blocks() limits the length of a bio to avoid crossing a place in the file that a bio with an encryption context cannot cross due to a DUN discontiguity. This function is needed by filesystems that use the iomap DIO implementation (which operates directly on logical ranges, so it won't use fscrypt_mergeable_bio()) and that support FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32. Co-developed-by: Satya Tangirala <satyat@google.com> Signed-off-by: Satya Tangirala <satyat@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220128233940.79464-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
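The DUN discontiguity that fscrypt_limit_io_blocks() guards against can be sketched as follows. This assumes, as the fscrypt documentation describes for FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32, that the 32-bit data unit number is the hashed inode number plus the logical block number, modulo 2^32. The function name and signature are hypothetical and are not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Clamp nr_blocks so that a bio starting at logical block lblk does not
 * cross the point where the 32-bit DUN wraps around to zero.
 * Assumption: DUN = (hashed_ino + lblk) mod 2^32.
 */
uint64_t toy_limit_io_blocks(uint32_t hashed_ino, uint64_t lblk,
			     uint64_t nr_blocks)
{
	/* The cast truncates the 64-bit sum to its low 32 bits. */
	uint32_t dun = (uint32_t)(hashed_ino + lblk);
	/* Blocks remaining before the DUN wraps to zero. */
	uint64_t until_wrap = (uint64_t)UINT32_MAX + 1 - dun;

	assert(nr_blocks > 0);
	return nr_blocks < until_wrap ? nr_blocks : until_wrap;
}
```

A caller that splits I/O on logical ranges would issue at most the returned number of blocks in one bio and start a new bio for the remainder.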
#
4373b3dc |
|
09-Sep-2021 |
Eric Biggers <ebiggers@google.com> |
fscrypt: remove fscrypt_operations::max_namelen The max_namelen field is unnecessary, as it is set to 255 (NAME_MAX) on all filesystems that support fscrypt (or plan to support fscrypt). For simplicity, just use NAME_MAX directly instead. Link: https://lore.kernel.org/r/20210909184513.139281-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
38ef66b0 |
|
28-Jul-2021 |
Eric Biggers <ebiggers@google.com> |
fscrypt: document struct fscrypt_operations Document all fields of struct fscrypt_operations so that it's more clear what filesystems that use (or plan to use) fs/crypto/ need to implement. Link: https://lore.kernel.org/r/20210729043728.18480-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
d1876056 |
|
02-Jul-2021 |
Eric Biggers <ebiggers@google.com> |
fscrypt: add fscrypt_symlink_getattr() for computing st_size Add a helper function fscrypt_symlink_getattr() which will be called from the various filesystems' ->getattr() methods to read and decrypt the target of encrypted symlinks in order to report the correct st_size. Detailed explanation: As required by POSIX and as documented in various man pages, st_size for a symlink is supposed to be the length of the symlink target. Unfortunately, st_size has always been wrong for encrypted symlinks because st_size is populated from i_size from disk, which intentionally contains the length of the encrypted symlink target. That's slightly greater than the length of the decrypted symlink target (which is the symlink target that userspace usually sees), and usually won't match the length of the no-key encoded symlink target either. This hadn't been fixed yet because reporting the correct st_size would require reading the symlink target from disk and decrypting or encoding it, which historically has been considered too heavyweight to do in ->getattr(). Also historically, the wrong st_size had only broken a test (LTP lstat03) and there were no known complaints from real users. (This is probably because the st_size of symlinks isn't used too often, and when it is, typically it's for a hint for what buffer size to pass to readlink() -- which a slightly-too-large size still works for.) However, a couple things have changed now. First, there have recently been complaints about the current behavior from real users: - Breakage in rpmbuild: https://github.com/rpm-software-management/rpm/issues/1682 https://github.com/google/fscrypt/issues/305 - Breakage in toybox cpio: https://www.mail-archive.com/toybox@lists.landley.net/msg07193.html - Breakage in libgit2: https://issuetracker.google.com/issues/189629152 (on Android public issue tracker, requires login) Second, we now cache decrypted symlink targets in ->i_link. 
Therefore, taking the performance hit of reading and decrypting the symlink target in ->getattr() wouldn't be as big a deal as it used to be, since usually it will just save having to do the same thing later. Also note that eCryptfs ended up having to read and decrypt symlink targets in ->getattr() as well, to fix this same issue; see commit 3a60a1686f0d ("eCryptfs: Decrypt symlink target for stat size"). So, let's just bite the bullet, and read and decrypt the symlink target in ->getattr() in order to report the correct st_size. Add a function fscrypt_symlink_getattr() which the filesystems will call to do this. (Alternatively, we could store the decrypted size of symlinks on-disk. But there isn't a great place to do so, and encryption is meant to hide the original size to some extent; that property would be lost.) Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210702065350.209646-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
bb9cd910 |
|
18-Nov-2020 |
Daniel Rosenberg <drosen@google.com> |
fscrypt: Have filesystems handle their d_ops This shifts the responsibility of setting up dentry operations from fscrypt to the individual filesystems, allowing them to have their own operations while still setting fscrypt's d_revalidate as appropriate. Most filesystems can just use generic_set_encrypted_ci_d_ops, unless they have their own specific dentry operations as well. That operation will set the minimal d_ops required under the circumstances. Since the fscrypt d_ops are set later on, we must set all d_ops there, since we cannot adjust those later on. This should not result in any change in behavior. Signed-off-by: Daniel Rosenberg <drosen@google.com> Acked-by: Theodore Ts'o <tytso@mit.edu> Acked-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
#
a14d0b67 |
|
02-Dec-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: allow deleting files with unsupported encryption policy Currently it's impossible to delete files that use an unsupported encryption policy, as the kernel will just return an error when performing any operation on the top-level encrypted directory, even just a path lookup into the directory or opening the directory for readdir. More specifically, this occurs in any of the following cases: - The encryption context has an unrecognized version number. Current kernels know about v1 and v2, but there could be more versions in the future. - The encryption context has unrecognized encryption modes (FSCRYPT_MODE_*) or flags (FSCRYPT_POLICY_FLAG_*), an unrecognized combination of modes, or reserved bits set. - The encryption key has been added and the encryption modes are recognized but aren't available in the crypto API -- for example, a directory is encrypted with FSCRYPT_MODE_ADIANTUM but the kernel doesn't have CONFIG_CRYPTO_ADIANTUM enabled. It's desirable to return errors for most operations on files that use an unsupported encryption policy, but the current behavior is too strict. We need to allow enough to delete files, so that people can't be stuck with undeletable files when downgrading kernel versions. That includes allowing directories to be listed and allowing dentries to be looked up. Fix this by modifying the key setup logic to treat an unsupported encryption policy in the same way as "key unavailable" in the cases that are required for a recursive delete to work: preparing for a readdir or a dentry lookup, revalidating a dentry, or checking whether an inode has the same encryption policy as its parent directory. Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20201203022041.230976-10-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
5b421f08 |
|
02-Dec-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: unexport fscrypt_get_encryption_info() Now that fscrypt_get_encryption_info() is only called from files in fs/crypto/ (due to all key setup now being handled by higher-level helper functions instead of directly by filesystems), unexport it and move its declaration to fscrypt_private.h. Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20201203022041.230976-9-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
de3cdc6e |
|
02-Dec-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: move fscrypt_require_key() to fscrypt_private.h fscrypt_require_key() is now only used by files in fs/crypto/. So reduce its visibility to fscrypt_private.h. This is also a prerequisite for unexporting fscrypt_get_encryption_info(). Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20201203022041.230976-8-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
7622350e |
|
02-Dec-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: move body of fscrypt_prepare_setattr() out-of-line In preparation for reducing the visibility of fscrypt_require_key() by moving it to fscrypt_private.h, move the call to it from fscrypt_prepare_setattr() to an out-of-line function. Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20201203022041.230976-7-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
ec0caa97 |
|
02-Dec-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: introduce fscrypt_prepare_readdir() The last remaining use of fscrypt_get_encryption_info() from filesystems is for readdir (->iterate_shared()). Every other call is now in fs/crypto/ as part of some other higher-level operation. We need to add a new argument to fscrypt_get_encryption_info() to indicate whether the encryption policy is allowed to be unrecognized or not. Doing this is easier if we can work with high-level operations rather than direct filesystem use of fscrypt_get_encryption_info(). So add a function fscrypt_prepare_readdir() which wraps the call to fscrypt_get_encryption_info() for the readdir use case. Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20201203022041.230976-6-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
234f1b7f |
|
18-Nov-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: remove unnecessary calls to fscrypt_require_key() In an encrypted directory, a regular dentry (one that doesn't have the no-key name flag) can only be created if the directory's encryption key is available. Therefore the calls to fscrypt_require_key() in __fscrypt_prepare_link() and __fscrypt_prepare_rename() are unnecessary, as these functions already check that the dentries they're given aren't no-key names. Remove these unnecessary calls to fscrypt_require_key(). Link: https://lore.kernel.org/r/20201118075609.120337-6-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
159e1de2 |
|
18-Nov-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: add fscrypt_is_nokey_name() It's possible to create a duplicate filename in an encrypted directory by creating a file concurrently with adding the encryption key. Specifically, sys_open(O_CREAT) (or sys_mkdir(), sys_mknod(), or sys_symlink()) can look up the target filename while the directory's encryption key hasn't been added yet, resulting in a negative no-key dentry. The VFS then calls ->create() (or ->mkdir(), ->mknod(), or ->symlink()) because the dentry is negative. Normally, ->create() would return -ENOKEY due to the directory's key being unavailable. However, if the key was added between the dentry lookup and ->create(), then the filesystem will go ahead and try to create the file. If the target filename happens to already exist as a normal name (not a no-key name), a duplicate filename may be added to the directory. In order to fix this, we need to fix the filesystems to prevent ->create(), ->mkdir(), ->mknod(), and ->symlink() on no-key names. (->rename() and ->link() need it too, but those are already handled correctly by fscrypt_prepare_rename() and fscrypt_prepare_link().) In preparation for this, add a helper function fscrypt_is_nokey_name() that filesystems can use to do this check. Use this helper function for the existing checks that fs/crypto/ does for rename and link. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201118075609.120337-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
5b2a828b |
|
23-Sep-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: export fscrypt_d_revalidate() Dentries that represent no-key names must have a dentry_operations that includes fscrypt_d_revalidate(). Currently, this is handled by fscrypt_prepare_lookup() installing fscrypt_d_ops. However, ceph support for encryption (https://lore.kernel.org/r/20200914191707.380444-1-jlayton@kernel.org) can't use fscrypt_d_ops, since ceph already has its own dentry_operations. Similarly, ext4 and f2fs support for directories that are both encrypted and casefolded (https://lore.kernel.org/r/20200923010151.69506-1-drosen@google.com) can't use fscrypt_d_ops either, since casefolding requires some dentry operations too. To satisfy both users, we need to move the responsibility of installing the dentry_operations to filesystems. In preparation for this, export fscrypt_d_revalidate() and give it a !CONFIG_FS_ENCRYPTION stub. Reviewed-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20200924054721.187797-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
501e43fb |
|
23-Sep-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: rename DCACHE_ENCRYPTED_NAME to DCACHE_NOKEY_NAME Originally we used the term "encrypted name" or "ciphertext name" to mean the encoded filename that is shown when an encrypted directory is listed without its key. But these terms are ambiguous since they also mean the filename stored on-disk. "Encrypted name" is especially ambiguous since it could also be understood to mean "this filename is encrypted on-disk", similar to "encrypted file". So we've started calling these encoded names "no-key names" instead. Therefore, rename DCACHE_ENCRYPTED_NAME to DCACHE_NOKEY_NAME to avoid confusion about what this flag means. Link: https://lore.kernel.org/r/20200924042624.98439-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
70fb2612 |
|
23-Sep-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: don't call no-key names "ciphertext names" Currently we're using the term "ciphertext name" ambiguously because it can mean either the actual ciphertext filename, or the encoded filename that is shown when an encrypted directory is listed without its key. The latter we're now usually calling the "no-key name"; and while it's derived from the ciphertext name, it's not the same thing. To avoid this ambiguity, rename fscrypt_name::is_ciphertext_name to fscrypt_name::is_nokey_name, and update comments that say "ciphertext name" (or "encrypted name") to say "no-key name" instead when warranted. Link: https://lore.kernel.org/r/20200924042624.98439-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
c8c868ab |
|
16-Sep-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: make fscrypt_set_test_dummy_encryption() take a 'const char *' fscrypt_set_test_dummy_encryption() requires that the optional argument to the test_dummy_encryption mount option be specified as a substring_t. That doesn't work well with filesystems that use the new mount API, since the new way of parsing mount options doesn't use substring_t. Make it take the argument as a 'const char *' instead. Instead of moving the match_strdup() into the callers in ext4 and f2fs, make them just use arg->from directly. Since the pattern is "test_dummy_encryption=%s", the argument will be null-terminated. Acked-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20200917041136.178600-14-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
ac4acb1f |
|
16-Sep-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: handle test_dummy_encryption in more logical way The behavior of the test_dummy_encryption mount option is that when a new file (or directory or symlink) is created in an unencrypted directory, it's automatically encrypted using a dummy encryption policy. That's it; in particular, the encryption (or lack thereof) of existing files (or directories or symlinks) doesn't change. Unfortunately the implementation of test_dummy_encryption is a bit weird and confusing. When test_dummy_encryption is enabled and a file is being created in an unencrypted directory, we set up an encryption key (->i_crypt_info) for the directory. This isn't actually used to do any encryption, however, since the directory is still unencrypted! Instead, ->i_crypt_info is only used for inheriting the encryption policy. One consequence of this is that the filesystem ends up providing a "dummy context" (policy + nonce) instead of a "dummy policy". In commit ed318a6cc0b6 ("fscrypt: support test_dummy_encryption=v2"), I mistakenly thought this was required. However, actually the nonce only ends up being used to derive a key that is never used. Another consequence of this implementation is that it allows for 'inode->i_crypt_info != NULL && !IS_ENCRYPTED(inode)', which is an edge case that can be forgotten about. For example, currently FS_IOC_GET_ENCRYPTION_POLICY on an unencrypted directory may return the dummy encryption policy when the filesystem is mounted with test_dummy_encryption. That seems like the wrong thing to do, since again, the directory itself is not actually encrypted. Therefore, switch to a more logical and maintainable implementation where the dummy encryption policy inheritance is done without setting up keys for unencrypted directories. This involves: - Adding a function fscrypt_policy_to_inherit() which returns the encryption policy to inherit from a directory. This can be a real policy, a dummy policy, or no policy. 
- Replacing struct fscrypt_dummy_context, ->get_dummy_context(), etc. with struct fscrypt_dummy_policy, ->get_dummy_policy(), etc. - Making fscrypt_fname_encrypted_size() take an fscrypt_policy instead of an inode. Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> Acked-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20200917041136.178600-13-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
31114726 |
|
16-Sep-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: move fscrypt_prepare_symlink() out-of-line In preparation for moving the logic for "get the encryption policy inherited by new files in this directory" to a single place, make fscrypt_prepare_symlink() a regular function rather than an inline function that wraps __fscrypt_prepare_symlink(). This way, the new function fscrypt_policy_to_inherit() won't need to be exported to filesystems. Acked-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20200917041136.178600-12-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
e9d5e31d |
|
16-Sep-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: remove fscrypt_inherit_context() Now that all filesystems have been converted to use fscrypt_prepare_new_inode() and fscrypt_set_context(), fscrypt_inherit_context() is no longer used. Remove it. Acked-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20200917041136.178600-8-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
a992b20c |
|
16-Sep-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: add fscrypt_prepare_new_inode() and fscrypt_set_context() fscrypt_get_encryption_info() is intended to be GFP_NOFS-safe. But actually it isn't, since it uses functions like crypto_alloc_skcipher() which aren't GFP_NOFS-safe, even when called under memalloc_nofs_save(). Therefore it can deadlock when called from a context that needs GFP_NOFS, e.g. during an ext4 transaction or between f2fs_lock_op() and f2fs_unlock_op(). This happens when creating a new encrypted file. We can't fix this by just not setting up the key for new inodes right away, since new symlinks need their key to encrypt the symlink target. So we need to set up the new inode's key before starting the transaction. But just calling fscrypt_get_encryption_info() earlier doesn't work, since it assumes the encryption context is already set, and the encryption context can't be set until the transaction. The recently proposed fscrypt support for the ceph filesystem (https://lkml.kernel.org/linux-fscrypt/20200821182813.52570-1-jlayton@kernel.org/T/#u) will have this same ordering problem too, since ceph will need to encrypt new symlinks before setting their encryption context. Finally, f2fs can deadlock when the filesystem is mounted with '-o test_dummy_encryption' and a new file is created in an existing unencrypted directory. Similarly, this is caused by holding too many locks when calling fscrypt_get_encryption_info(). To solve all these problems, add new helper functions: - fscrypt_prepare_new_inode() sets up a new inode's encryption key (fscrypt_info), using the parent directory's encryption policy and a new random nonce. It neither reads nor writes the encryption context. - fscrypt_set_context() persists the encryption context of a new inode, using the information from the fscrypt_info already in memory. This replaces fscrypt_inherit_context(). Temporarily keep fscrypt_inherit_context() around until all filesystems have been converted to use fscrypt_set_context(). 
Acked-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20200917041136.178600-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
8b10fe68 |
|
10-Aug-2020 |
Jeff Layton <jlayton@kernel.org> |
fscrypt: drop unused inode argument from fscrypt_fname_alloc_buffer Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20200810142139.487631-1-jlayton@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
ab673b98 |
|
21-Jul-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: use smp_load_acquire() for ->i_crypt_info Normally smp_store_release() or cmpxchg_release() is paired with smp_load_acquire(). Sometimes smp_load_acquire() can be replaced with the more lightweight READ_ONCE(). However, for this to be safe, all the published memory must only be accessed in a way that involves the pointer itself. This may not be the case if allocating the object also involves initializing a static or global variable, for example. fscrypt_info includes various sub-objects which are internal to and are allocated by other kernel subsystems such as keyrings and crypto. So by using READ_ONCE() for ->i_crypt_info, we're relying on internal implementation details of these other kernel subsystems. Remove this fragile assumption by using smp_load_acquire() instead. (Note: I haven't seen any real-world problems here. This change is just fixing the code to be guaranteed correct and less fragile.) Fixes: e37a784d8b6a ("fscrypt: use READ_ONCE() to access ->i_crypt_info") Link: https://lore.kernel.org/r/20200721225920.114347-5-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
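The release/acquire pairing described in this commit can be illustrated in userspace with C11 atomics. This is only a sketch under stated assumptions: `struct crypt_info_model`, `published_info`, `publish_info()` and `get_info()` are hypothetical names, and `memory_order_release` / `memory_order_acquire` stand in for the kernel's smp_store_release() (or cmpxchg_release()) and smp_load_acquire(). It is not the kernel code.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an object that is published through a pointer. */
struct crypt_info_model {
	int mode;
};

/* Hypothetical stand-in for a published pointer such as ->i_crypt_info. */
static _Atomic(struct crypt_info_model *) published_info;

/*
 * Publisher: fully initialize the object first, then store the pointer with
 * release ordering (userspace analogue of smp_store_release()).
 */
static void publish_info(struct crypt_info_model *ci, int mode)
{
	ci->mode = mode;
	atomic_store_explicit(&published_info, ci, memory_order_release);
}

/*
 * Reader: load the pointer with acquire ordering (userspace analogue of
 * smp_load_acquire()). Returns NULL if nothing has been published yet.
 */
static struct crypt_info_model *get_info(void)
{
	return atomic_load_explicit(&published_info, memory_order_acquire);
}
```

With the acquire load, a reader that sees the pointer is also guaranteed to see every write the publisher made before the release store, including writes that are not reached through the pointer itself. That is the guarantee the commit message says a weaker load does not give in general.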
|
#
5fee3609 |
|
01-Jul-2020 |
Satya Tangirala <satyat@google.com> |
fscrypt: add inline encryption support Add support for inline encryption to fs/crypto/. With "inline encryption", the block layer handles the decryption/encryption as part of the bio, instead of the filesystem doing the crypto itself via Linux's crypto API. This model is needed in order to take advantage of the inline encryption hardware present on most modern mobile SoCs. To use inline encryption, the filesystem needs to be mounted with '-o inlinecrypt'. Blk-crypto will then be used instead of the traditional filesystem-layer crypto whenever possible to encrypt the contents of any encrypted files in that filesystem. Fscrypt still provides the key and IV to use, and the actual ciphertext on-disk is still the same; therefore it's testable using the existing fscrypt ciphertext verification tests. Note that since blk-crypto has a fallback to Linux's crypto API, and also supports all the encryption modes currently supported by fscrypt, this feature is usable and testable even without actual inline encryption hardware. Per-filesystem changes will be needed to set encryption contexts when submitting bios and to implement the 'inlinecrypt' mount option. This patch just adds the common code. Signed-off-by: Satya Tangirala <satyat@google.com> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20200702015607.1215430-3-satyat@google.com Co-developed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
ed318a6c |
|
12-May-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: support test_dummy_encryption=v2 v1 encryption policies are deprecated in favor of v2, and some new features (e.g. encryption+casefolding) are only being added for v2. Therefore, the "test_dummy_encryption" mount option (which is used for encryption I/O testing with xfstests) needs to support v2 policies. To do this, extend its syntax to be "test_dummy_encryption=v1" or "test_dummy_encryption=v2". The existing "test_dummy_encryption" (no argument) also continues to be accepted, to specify the default setting -- currently v1, but the next patch changes it to v2. To cleanly support both v1 and v2 while also making it easy to support specifying other encryption settings in the future (say, accepting "$contents_mode:$filenames_mode:v2"), make ext4 and f2fs maintain a pointer to the dummy fscrypt_context rather than using mount flags. To avoid concurrency issues, don't allow test_dummy_encryption to be set or changed during a remount. (The former restriction is new, but xfstests doesn't run into it, so no one should notice.) Tested with 'gce-xfstests -c {ext4,f2fs}/encrypt -g auto'. On ext4, there are two regressions, both of which are test bugs: ext4/023 and ext4/028 fail because they set an xattr and expect it to be stored inline, but the increase in size of the fscrypt_context from 24 to 40 bytes causes this xattr to be spilled into an external block. Link: https://lore.kernel.org/r/20200512233251.118314-4-ebiggers@kernel.org Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
60700902 |
|
11-May-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: remove unnecessary extern keywords Remove the unnecessary 'extern' keywords from function declarations. This makes it so that we don't have a mix of both styles, so it won't be ambiguous what to use in new fscrypt patches. This also makes the code shorter and matches the 'checkpatch --strict' expectation. Link: https://lore.kernel.org/r/20200511191358.53096-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
fe015a78 |
|
11-May-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: name all function parameters Name all the function parameters. This makes it so that we don't have a mix of both styles, so it won't be ambiguous what to use in new fscrypt patches. This also matches the checkpatch expectation. Link: https://lore.kernel.org/r/20200511191358.53096-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
d2fe9754 |
|
11-May-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: fix all kerneldoc warnings Fix all kerneldoc warnings in fs/crypto/ and include/linux/fscrypt.h. Most of these were due to missing documentation for function parameters. Detected with: scripts/kernel-doc -v -none fs/crypto/*.{c,h} include/linux/fscrypt.h This cleanup makes it possible to check new patches for kerneldoc warnings without having to filter out all the existing ones. For consistency, also adjust some function "brief descriptions" to include the parentheses and to wrap at 80 characters. (The latter matches the checkpatch expectation.) Link: https://lore.kernel.org/r/20200511191358.53096-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
e98ad464 |
|
14-Mar-2020 |
Eric Biggers <ebiggers@google.com> |
fscrypt: add FS_IOC_GET_ENCRYPTION_NONCE ioctl Add an ioctl FS_IOC_GET_ENCRYPTION_NONCE which retrieves the nonce from an encrypted file or directory. The nonce is the 16-byte random value stored in the inode's encryption xattr. It is normally used together with the master key to derive the inode's actual encryption key. The nonces are needed by automated tests that verify the correctness of the ciphertext on-disk. Except for the IV_INO_LBLK_64 case, there's no way to replicate a file's ciphertext without knowing that file's nonce. The nonces aren't secret, and the existing ciphertext verification tests in xfstests retrieve them from disk using debugfs or dump.f2fs. But in environments that lack these debugging tools, getting the nonces by manually parsing the filesystem structure would be very hard. To make this important type of testing much easier, let's just add an ioctl that retrieves the nonce. Link: https://lore.kernel.org/r/20200314205052.93294-2-ebiggers@kernel.org Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
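A test tool might call the new ioctl roughly as follows. This is a minimal sketch, assuming a Linux system whose UAPI headers provide FS_IOC_GET_ENCRYPTION_NONCE (a fallback definition is included for older headers); the helper name `get_encryption_nonce()` and the `NONCE_SIZE` macro are hypothetical and not part of the kernel API.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>
#include <linux/types.h>

/* Fallback for UAPI headers that predate this ioctl. */
#ifndef FS_IOC_GET_ENCRYPTION_NONCE
#define FS_IOC_GET_ENCRYPTION_NONCE _IOR('f', 27, __u8[16])
#endif

#define NONCE_SIZE 16 /* the nonce is a 16-byte value */

/*
 * Hypothetical helper: read the nonce of the encrypted file or directory at
 * 'path' into 'nonce'. Returns 0 on success, or -1 on failure with errno set
 * by open() or ioctl().
 */
static int get_encryption_nonce(const char *path,
				unsigned char nonce[NONCE_SIZE])
{
	int fd = open(path, O_RDONLY);
	int ret;
	int saved_errno;

	if (fd < 0)
		return -1;
	ret = ioctl(fd, FS_IOC_GET_ENCRYPTION_NONCE, nonce);
	saved_errno = errno;
	close(fd);
	errno = saved_errno; /* keep the ioctl's error, not close()'s */
	return ret == 0 ? 0 : -1;
}
```

A caller would pass a 16-byte buffer and print or compare the result only when the helper returns 0; the call is expected to fail on paths that cannot be opened and on files that are not encrypted.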
|
#
edc440e3 |
|
20-Jan-2020 |
Daniel Rosenberg <drosen@google.com> |
fscrypt: improve format of no-key names When an encrypted directory is listed without the key, the filesystem must show "no-key names" that uniquely identify directory entries, are at most 255 (NAME_MAX) bytes long, and don't contain '/' or '\0'. Currently, for short names the no-key name is the base64 encoding of the ciphertext filename, while for long names it's the base64 encoding of the ciphertext filename's dirhash and second-to-last 16-byte block. This format has the following problems: - Since it doesn't always include the dirhash, it's incompatible with directories that will use a secret-keyed dirhash over the plaintext filenames. In this case, the dirhash won't be computable from the ciphertext name without the key, so it instead must be retrieved from the directory entry and always included in the no-key name. Casefolded encrypted directories will use this type of dirhash. - It's ambiguous: it's possible to craft two filenames that map to the same no-key name, since the method used to abbreviate long filenames doesn't use a proper cryptographic hash function. Solve both these problems by switching to a new no-key name format that is the base64 encoding of a variable-length structure that contains the dirhash, up to 149 bytes of the ciphertext filename, and (if any bytes remain) the SHA-256 of the remaining bytes of the ciphertext filename. This ensures that each no-key name contains everything needed to find the directory entry again, contains only legal characters, doesn't exceed NAME_MAX, is unambiguous unless there's a SHA-256 collision, and that we only take the performance hit of SHA-256 on very long filenames. Note: this change does *not* address the existing issue where users can modify the 'dirhash' part of a no-key name and the filesystem may still accept the name. 
Signed-off-by: Daniel Rosenberg <drosen@google.com> [EB: improved comments and commit message, fixed checking return value of base64_decode(), check for SHA-256 error, continue to set disk_name for short names to keep matching simpler, and many other cleanups] Link: https://lore.kernel.org/r/20200120223201.241390-7-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
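The NAME_MAX claim for the new format can be checked with a little arithmetic. The sketch below is a model, not the kernel's definition: `struct nokey_name_model` and both helper functions are hypothetical, the dirhash is assumed to occupy two 32-bit words (8 bytes), and the base64 variant is assumed to emit no padding characters. Only the 149-byte limit and the SHA-256 digest come from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_NAME_BYTES 149 /* from the commit message */
#define MODEL_SHA256_LEN 32  /* SHA-256 digest size */

/* Hypothetical model of the variable-length no-key name structure. */
struct nokey_name_model {
	uint32_t dirhash[2];               /* assumption: 8 bytes */
	uint8_t bytes[MODEL_NAME_BYTES];   /* leading ciphertext bytes */
	uint8_t sha256[MODEL_SHA256_LEN];  /* hash of the remaining bytes */
};

/* Number of structure bytes used for a ciphertext name of the given length. */
static size_t nokey_struct_len(size_t ciphertext_len)
{
	if (ciphertext_len <= MODEL_NAME_BYTES)
		return offsetof(struct nokey_name_model, bytes) + ciphertext_len;
	/* Long name: all 149 bytes plus the digest of the remainder. */
	return offsetof(struct nokey_name_model, sha256) + MODEL_SHA256_LEN;
}

/* Length of the unpadded base64 encoding of that many structure bytes. */
static size_t nokey_encoded_len(size_t ciphertext_len)
{
	return (nokey_struct_len(ciphertext_len) * 4 + 2) / 3;
}
```

Under these assumptions the longest possible structure is 8 + 149 + 32 = 189 bytes, which encodes to 252 characters and therefore stays within the 255-byte NAME_MAX limit.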
|
#
aa408f83 |
|
20-Jan-2020 |
Daniel Rosenberg <drosen@google.com> |
fscrypt: derive dirhash key for casefolded directories When we allow indexed directories to use both encryption and casefolding, for the dirhash we can't just hash the ciphertext filenames that are stored on-disk (as is done currently) because the dirhash must be case insensitive, but the stored names are case-preserving. Nor can we hash the plaintext names with an unkeyed hash (or a hash keyed with a value stored on-disk like ext4's s_hash_seed), since that would leak information about the names that encryption is meant to protect. Instead, if we can accept a dirhash that's only computable when the fscrypt key is available, we can hash the plaintext names with a keyed hash using a secret key derived from the directory's fscrypt master key. We'll use SipHash-2-4 for this purpose. Prepare for this by deriving a SipHash key for each casefolded encrypted directory. Make sure to handle deriving the key not only when setting up the directory's fscrypt_info, but also in the case where the casefold flag is enabled after the fscrypt_info was already set up. (We could just always derive the key regardless of casefolding, but that would introduce unnecessary overhead for people not using casefolding.) Signed-off-by: Daniel Rosenberg <drosen@google.com> [EB: improved commit message, updated fscrypt.rst, squashed with change that avoids unnecessarily deriving the key, and many other cleanups] Link: https://lore.kernel.org/r/20200120223201.241390-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
6e1918cf |
|
20-Jan-2020 |
Daniel Rosenberg <drosen@google.com> |
fscrypt: don't allow v1 policies with casefolding Casefolded encrypted directories will use a new dirhash method that requires a secret key. If the directory uses a v2 encryption policy, it's easy to derive this key from the master key using HKDF. However, v1 encryption policies don't provide a way to derive additional keys. Therefore, don't allow casefolding on directories that use a v1 policy. Specifically, make it so that trying to enable casefolding on a directory that has a v1 policy fails, trying to set a v1 policy on a casefolded directory fails, and trying to open a casefolded directory that has a v1 policy (if one somehow exists on-disk) fails. Signed-off-by: Daniel Rosenberg <drosen@google.com> [EB: improved commit message, updated fscrypt.rst, and other cleanups] Link: https://lore.kernel.org/r/20200120223201.241390-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
56dce717 |
|
09-Dec-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: introduce fscrypt_needs_contents_encryption() Add a function fscrypt_needs_contents_encryption() which takes an inode and returns true if it's an encrypted regular file and the kernel was built with fscrypt support. This will allow replacing duplicated checks of IS_ENCRYPTED() && S_ISREG() on the I/O paths in ext4 and f2fs, while also optimizing out unneeded code when !CONFIG_FS_ENCRYPTION. Link: https://lore.kernel.org/r/20191209205021.231767-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
8a4ab0b8 |
|
15-Dec-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: constify inode parameter to filename encryption functions Constify the struct inode parameter to fscrypt_fname_disk_to_usr() and the other filename encryption functions so that users don't have to pass in a non-const inode when they are dealing with a const one, as in [1]. [1] https://lkml.kernel.org/linux-ext4/20191203051049.44573-6-drosen@google.com/ Cc: Daniel Rosenberg <drosen@google.com> Link: https://lore.kernel.org/r/20191215213947.9521-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
b103fb76 |
|
24-Oct-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: add support for IV_INO_LBLK_64 policies Inline encryption hardware compliant with the UFS v2.1 standard or with the upcoming version of the eMMC standard has the following properties: (1) Per I/O request, the encryption key is specified by a previously loaded keyslot. There might be only a small number of keyslots. (2) Per I/O request, the starting IV is specified by a 64-bit "data unit number" (DUN). IV bits 64-127 are assumed to be 0. The hardware automatically increments the DUN for each "data unit" of configurable size in the request, e.g. for each filesystem block. Property (1) makes it inefficient to use the traditional fscrypt per-file keys. Property (2) precludes the use of the existing DIRECT_KEY fscrypt policy flag, which needs at least 192 IV bits. Therefore, add a new fscrypt policy flag IV_INO_LBLK_64 which causes the encryption to be modified as follows: - The encryption keys are derived from the master key, encryption mode number, and filesystem UUID. - The IVs are chosen as (inode_number << 32) | file_logical_block_num. For filenames encryption, file_logical_block_num is 0. Since the file nonces aren't used in the key derivation, many files may share the same encryption key. This is much more efficient on the target hardware. Including the inode number in the IVs and mixing the filesystem UUID into the keys ensures that data in different files is nevertheless still encrypted differently. Additionally, limiting the inode and block numbers to 32 bits and placing the block number in the low bits maintains compatibility with the 64-bit DUN convention (property (2) above). Since this scheme assumes that inode numbers are stable (which may preclude filesystem shrinking) and that inode and file logical block numbers are at most 32-bit, IV_INO_LBLK_64 will only be allowed on filesystems that meet these constraints. These are acceptable limitations for the cases where this format would actually be used. 
Note that IV_INO_LBLK_64 is an on-disk format, not an implementation. This patch just adds support for it using the existing filesystem layer encryption. A later patch will add support for inline encryption. Reviewed-by: Paul Crowley <paulcrowley@google.com> Co-developed-by: Satya Tangirala <satyat@google.com> Signed-off-by: Satya Tangirala <satyat@google.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
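The IV layout from this commit message, (inode_number << 32) | file_logical_block_num, can be sketched as a small C function. The function name `iv_ino_lblk_64()` and its error convention are hypothetical; the 32-bit limits on both inputs follow the constraints stated in the message. The result is the low 64 bits of the IV (the DUN), with IV bits 64-127 taken to be 0.

```c
#include <stdint.h>

/*
 * Hypothetical helper: compute the low 64 bits of an IV_INO_LBLK_64 IV.
 * Returns 0 and stores the value in *dun on success, or -1 if either input
 * does not fit in 32 bits (the scheme is not usable in that case).
 * For filenames encryption, pass lblk = 0.
 */
static int iv_ino_lblk_64(uint64_t inode_number, uint64_t lblk, uint64_t *dun)
{
	if (inode_number > UINT32_MAX || lblk > UINT32_MAX)
		return -1;
	/* Inode number in the high 32 bits, logical block in the low 32 bits. */
	*dun = (inode_number << 32) | lblk;
	return 0;
}
```

For example, inode 5 and logical block 7 give 0x0000000500000007. Because the block number sits in the low bits, incrementing the value by one for each data unit moves to the next logical block of the same file, which matches the hardware's DUN increment described in property (2).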
|
#
1565bdad |
|
09-Oct-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: remove struct fscrypt_ctx Now that ext4 and f2fs implement their own post-read workflow that supports both fscrypt and fsverity, the fscrypt-only workflow based around struct fscrypt_ctx is no longer used. So remove the unused code. This is based on a patch from Chandan Rajendra's "Consolidate FS read I/O callbacks code" patchset, but rebased onto the latest kernel, folded __fscrypt_decrypt_bio() into fscrypt_decrypt_bio(), cleaned up fscrypt_initialize(), and updated the commit message. Originally-from: Chandan Rajendra <chandan@linux.ibm.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
78a1b96b |
|
04-Aug-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS ioctl Add a root-only variant of the FS_IOC_REMOVE_ENCRYPTION_KEY ioctl which removes all users' claims of the key, not just the current user's claim. I.e., it always removes the key itself, no matter how many users have added it. This is useful for forcing a directory to be locked, without having to figure out which user ID(s) the key was added under. This is planned to be used by a command like 'sudo fscrypt lock DIR --all-users' in the fscrypt userspace tool (http://github.com/google/fscrypt). Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
5dae460c |
|
04-Aug-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: v2 encryption policy support Add a new fscrypt policy version, "v2". It has the following changes from the original policy version, which we call "v1" (*): - Master keys (the user-provided encryption keys) are only ever used as input to HKDF-SHA512. This is more flexible and less error-prone, and it avoids the quirks and limitations of the AES-128-ECB based KDF. Three classes of cryptographically isolated subkeys are defined: - Per-file keys, like used in v1 policies except for the new KDF. - Per-mode keys. These implement the semantics of the DIRECT_KEY flag, which for v1 policies made the master key be used directly. These are also planned to be used for inline encryption when support for it is added. - Key identifiers (see below). - Each master key is identified by a 16-byte master_key_identifier, which is derived from the key itself using HKDF-SHA512. This prevents users from associating the wrong key with an encrypted file or directory. This was easily possible with v1 policies, which identified the key by an arbitrary 8-byte master_key_descriptor. - The key must be provided in the filesystem-level keyring, not in a process-subscribed keyring. The following UAPI additions are made: - The existing ioctl FS_IOC_SET_ENCRYPTION_POLICY can now be passed a fscrypt_policy_v2 to set a v2 encryption policy. It's disambiguated from fscrypt_policy/fscrypt_policy_v1 by the version code prefix. - A new ioctl FS_IOC_GET_ENCRYPTION_POLICY_EX is added. It allows getting the v1 or v2 encryption policy of an encrypted file or directory. The existing FS_IOC_GET_ENCRYPTION_POLICY ioctl could not be used because it did not have a way for userspace to indicate which policy structure is expected. The new ioctl includes a size field, so it is extensible to future fscrypt policy versions. - The ioctls FS_IOC_ADD_ENCRYPTION_KEY, FS_IOC_REMOVE_ENCRYPTION_KEY, and FS_IOC_GET_ENCRYPTION_KEY_STATUS now support managing keys for v2 encryption policies. 
Such keys are kept logically separate from keys for v1 encryption policies, and are identified by 'identifier' rather than by 'descriptor'. The 'identifier' need not be provided when adding a key, since the kernel will calculate it anyway. This patch temporarily keeps adding/removing v2 policy keys behind the same permission check done for adding/removing v1 policy keys: capable(CAP_SYS_ADMIN). However, the next patch will carefully take advantage of the cryptographically secure master_key_identifier to allow non-root users to add/remove v2 policy keys, thus providing a full replacement for v1 policies. (*) Actually, in the API fscrypt_policy::version is 0 while on-disk fscrypt_context::format is 1. But I believe it makes the most sense to advance both to '2' to have them be in sync, and to consider the numbering to start at 1 except for the API quirk. Reviewed-by: Paul Crowley <paulcrowley@google.com> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
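The "version code prefix" disambiguation described in the v2 policy commit can be sketched in userspace as follows. This is a minimal illustration, not the kernel's code: the `policy_v1`/`policy_v2` structs are hypothetical mirrors of the UAPI structures, the 4-byte reserved field is an assumption, and `policy_size()` is an invented helper. The version values 0 and 2 follow the footnote in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirrors of the two policy layouts (assumed, not copied from the UAPI header). */
enum { POLICY_V1 = 0, POLICY_V2 = 2 };

struct policy_v1 {
    uint8_t version;                   /* 0 in the API, per the commit's footnote */
    uint8_t contents_mode;
    uint8_t filenames_mode;
    uint8_t flags;
    uint8_t master_key_descriptor[8];  /* arbitrary 8-byte descriptor */
};

struct policy_v2 {
    uint8_t version;                   /* 2 */
    uint8_t contents_mode;
    uint8_t filenames_mode;
    uint8_t flags;
    uint8_t reserved[4];               /* assumed padding for future use */
    uint8_t master_key_identifier[16]; /* derived from the key itself */
};

/* Return the struct size implied by the leading version byte, or 0 if unknown. */
static size_t policy_size(const void *buf, size_t len)
{
    if (len < 1)
        return 0;
    switch (*(const uint8_t *)buf) {
    case POLICY_V1:
        return sizeof(struct policy_v1);
    case POLICY_V2:
        return sizeof(struct policy_v2);
    default:
        return 0;
    }
}
```

Because every policy structure starts with the same one-byte version field, a single entry point can accept either layout and reject unknown versions before reading further.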
#
5a7e2992 |
|
04-Aug-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: add FS_IOC_GET_ENCRYPTION_KEY_STATUS ioctl Add a new fscrypt ioctl, FS_IOC_GET_ENCRYPTION_KEY_STATUS. Given a key specified by 'struct fscrypt_key_specifier' (the same way a key is specified for the other fscrypt key management ioctls), it returns status information in a 'struct fscrypt_get_key_status_arg'. The main motivation for this is that applications need to be able to check whether an encrypted directory is "unlocked" or not, so that they can add the key if it is not, and avoid adding the key (which may involve prompting the user for a passphrase) if it already is. It's possible to use some workarounds such as checking whether opening a regular file fails with ENOKEY, or checking whether the filenames "look like gibberish" or not. However, no workaround is usable in all cases. Like the other key management ioctls, the keyrings syscalls may seem at first to be a good fit for this. Unfortunately, they are not. Even if we exposed the keyring ID of the ->s_master_keys keyring and gave everyone Search permission on it (note: currently the keyrings permission system would also allow everyone to "invalidate" the keyring too), the fscrypt keys have an additional state that doesn't map cleanly to the keyrings API: the secret can be removed, but we can be still tracking the files that were using the key, and the removal can be re-attempted or the secret added again. After later patches, some applications will also need a way to determine whether a key was added by the current user vs. by some other user. Reserved fields are included in fscrypt_get_key_status_arg for this and other future extensions. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
b1c0ec35 |
|
04-Aug-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctl Add a new fscrypt ioctl, FS_IOC_REMOVE_ENCRYPTION_KEY. This ioctl removes an encryption key that was added by FS_IOC_ADD_ENCRYPTION_KEY. It wipes the secret key itself, then "locks" the encrypted files and directories that had been unlocked using that key -- implemented by evicting the relevant dentries and inodes from the VFS caches. The problem this solves is that many fscrypt users want the ability to remove encryption keys, causing the corresponding encrypted directories to appear "locked" (presented in ciphertext form) again. Moreover, users want removing an encryption key to *really* remove it, in the sense that the removed keys cannot be recovered even if kernel memory is compromised, e.g. by the exploit of a kernel security vulnerability or by a physical attack. This is desirable after a user logs out of the system, for example. In many cases users even already assume this to be the case and are surprised to hear when it's not. It is not sufficient to simply unlink the master key from the keyring (or to revoke or invalidate it), since the actual encryption transform objects are still pinned in memory by their inodes. Therefore, to really remove a key we must also evict the relevant inodes. Currently one workaround is to run 'sync && echo 2 > /proc/sys/vm/drop_caches'. But, that evicts all unused inodes in the system rather than just the inodes associated with the key being removed, causing severe performance problems. Moreover, it requires root privileges, so regular users can't "lock" their encrypted files. Another workaround, used in Chromium OS kernels, is to add a new VFS-level ioctl FS_IOC_DROP_CACHE which is a more restricted version of drop_caches that operates on a single super_block. It does: shrink_dcache_sb(sb); invalidate_inodes(sb, false); But it's still a hack. Yet, the major users of filesystem encryption want this feature badly enough that they are actually using these hacks. 
To properly solve the problem, start maintaining a list of the inodes which have been "unlocked" using each master key. Originally this wasn't possible because the kernel didn't keep track of in-use master keys at all. But, with the ->s_master_keys keyring it is now possible. Then, add an ioctl FS_IOC_REMOVE_ENCRYPTION_KEY. It finds the specified master key in ->s_master_keys, then wipes the secret key itself, which prevents any additional inodes from being unlocked with the key. Then, it syncs the filesystem and evicts the inodes in the key's list. The normal inode eviction code will free and wipe the per-file keys (in ->i_crypt_info). Note that freeing ->i_crypt_info without evicting the inodes was also considered, but would have been racy. Some inodes may still be in use when a master key is removed, and we can't simply revoke random file descriptors, mmap's, etc. Thus, the ioctl simply skips in-use inodes, and returns -EBUSY to indicate that some inodes weren't evicted. The master key *secret* is still removed, but the fscrypt_master_key struct remains to keep track of the remaining inodes. Userspace can then retry the ioctl to evict the remaining inodes. Alternatively, if userspace adds the key again, the refreshed secret will be associated with the existing list of inodes so they remain correctly tracked for future key removals. The ioctl doesn't wipe pagecache pages. Thus, we tolerate that after a kernel compromise some portions of plaintext file contents may still be recoverable from memory. This can be solved by enabling page poisoning system-wide, which security conscious users may choose to do. But it's very difficult to solve otherwise, e.g. note that plaintext file contents may have been read in other places than pagecache pages. Like FS_IOC_ADD_ENCRYPTION_KEY, FS_IOC_REMOVE_ENCRYPTION_KEY is initially restricted to privileged users only. This is sufficient for some use cases, but not all. 
A later patch will relax this restriction, but it will require introducing key hashes, among other changes. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
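The removal semantics described in the FS_IOC_REMOVE_ENCRYPTION_KEY commit (wipe the secret first, evict unused inodes, skip in-use ones and report -EBUSY, allow a retry) can be modelled with a toy sketch. All `toy_*` names are hypothetical, and the model ignores syncing, locking, and the real VFS eviction path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct toy_inode {
    bool in_use;   /* e.g. still open or mmap'ed */
    bool evicted;
};

struct toy_master_key {
    unsigned char secret[64];
    bool secret_present;
    struct toy_inode *inodes; /* inodes "unlocked" with this key */
    size_t n_inodes;
};

/* Wipe the secret, then evict every inode that is not in use. */
static int toy_remove_key(struct toy_master_key *mk)
{
    size_t busy = 0;

    /* Real code would use a wipe that the compiler cannot elide. */
    memset(mk->secret, 0, sizeof(mk->secret));
    mk->secret_present = false;

    for (size_t i = 0; i < mk->n_inodes; i++) {
        struct toy_inode *inode = &mk->inodes[i];

        if (inode->evicted)
            continue;
        if (inode->in_use) {
            busy++;          /* skipped; the caller may retry later */
            continue;
        }
        inode->evicted = true;
    }
    return busy ? -EBUSY : 0;
}
```

In this model a second call after the busy inode is released returns 0, which mirrors the retry behaviour the commit describes; the key structure itself outlives the secret so the remaining inodes stay tracked.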
#
22d94f49 |
|
04-Aug-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: add FS_IOC_ADD_ENCRYPTION_KEY ioctl Add a new fscrypt ioctl, FS_IOC_ADD_ENCRYPTION_KEY. This ioctl adds an encryption key to the filesystem's fscrypt keyring ->s_master_keys, making any files encrypted with that key appear "unlocked". Why we need this ~~~~~~~~~~~~~~~~ The main problem is that the "locked/unlocked" (ciphertext/plaintext) status of encrypted files is global, but the fscrypt keys are not. fscrypt only looks for keys in the keyring(s) the process accessing the filesystem is subscribed to: the thread keyring, process keyring, and session keyring, where the session keyring may contain the user keyring. Therefore, userspace has to put fscrypt keys in the keyrings for individual users or sessions. But this means that when a process with a different keyring tries to access encrypted files, whether they appear "unlocked" or not is nondeterministic. This is because it depends on whether the files are currently present in the inode cache. Fixing this by consistently providing each process its own view of the filesystem depending on whether it has the key or not isn't feasible due to how the VFS caches work. Furthermore, while sometimes users expect this behavior, it is misguided for two reasons. First, it would be an OS-level access control mechanism largely redundant with existing access control mechanisms such as UNIX file permissions, ACLs, LSMs, etc. Encryption is actually for protecting the data at rest. Second, almost all users of fscrypt actually do need the keys to be global. The largest users of fscrypt, Android and Chromium OS, achieve this by having PID 1 create a "session keyring" that is inherited by every process. This works, but it isn't scalable because it prevents session keyrings from being used for any other purpose. On general-purpose Linux distros, the 'fscrypt' userspace tool [1] can't similarly abuse the session keyring, so to make 'sudo' work on all systems it has to link all the user keyrings into root's user keyring [2]. 
This is ugly and raises security concerns. Moreover it can't make the keys available to system services, such as sshd trying to access the user's '~/.ssh' directory (see [3], [4]) or NetworkManager trying to read certificates from the user's home directory (see [5]); or to Docker containers (see [6], [7]). By having an API to add a key to the *filesystem* we'll be able to fix the above bugs, remove userspace workarounds, and clearly express the intended semantics: the locked/unlocked status of an encrypted directory is global, and encryption is orthogonal to OS-level access control. Why not use the add_key() syscall ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ We use an ioctl for this API rather than the existing add_key() system call because the ioctl gives us the flexibility needed to implement fscrypt-specific semantics that will be introduced in later patches: - Supporting key removal with the semantics such that the secret is removed immediately and any unused inodes using the key are evicted; also, the eviction of any in-use inodes can be retried. - Calculating a key-dependent cryptographic identifier and returning it to userspace. - Allowing keys to be added and removed by non-root users, but only keys for v2 encryption policies; and to prevent denial-of-service attacks, users can only remove keys they themselves have added, and a key is only really removed after all users who added it have removed it. Trying to shoehorn these semantics into the keyrings syscalls would be very difficult, whereas the ioctls make things much easier. However, to reuse code the implementation still uses the keyrings service internally. Thus we get lockless RCU-mode key lookups without having to re-implement it, and the keys automatically show up in /proc/keys for debugging purposes. 
References: [1] https://github.com/google/fscrypt [2] https://goo.gl/55cCrI#heading=h.vf09isp98isb [3] https://github.com/google/fscrypt/issues/111#issuecomment-444347939 [4] https://github.com/google/fscrypt/issues/116 [5] https://bugs.launchpad.net/ubuntu/+source/fscrypt/+bug/1770715 [6] https://github.com/google/fscrypt/issues/128 [7] https://askubuntu.com/questions/1130306/cannot-run-docker-on-an-encrypted-filesystem Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
feed8258 |
|
04-Aug-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: rename keyinfo.c to keysetup.c Rename keyinfo.c to keysetup.c since this better describes what the file does (sets up the key), and it matches the new file keysetup_v1.c. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
7af0ab0d |
|
04-Aug-2019 |
Eric Biggers <ebiggers@google.com> |
fs, fscrypt: move uapi definitions to new header <linux/fscrypt.h> More fscrypt definitions are being added, and we shouldn't use a disproportionate amount of space in <linux/fs.h> for fscrypt stuff. So move the fscrypt definitions to a new header <linux/fscrypt.h>. For source compatibility with existing userspace programs, <linux/fs.h> still includes the new header. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
aa8bc1ac |
|
20-May-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: support decrypting multiple filesystem blocks per page Rename fscrypt_decrypt_page() to fscrypt_decrypt_pagecache_blocks() and redefine its behavior to decrypt all filesystem blocks in the given region of the given page, rather than assuming that the region consists of just one filesystem block. Also remove the 'inode' and 'lblk_num' parameters, since they can be retrieved from the page as it's already assumed to be a pagecache page. This is in preparation for allowing encryption on ext4 filesystems with blocksize != PAGE_SIZE. This is based on work by Chandan Rajendra. Reviewed-by: Chandan Rajendra <chandan@linux.ibm.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
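The change from "one block per page" to "all blocks in a region of the page" can be sketched as a loop over block-sized slices. This is a toy under stated assumptions: `toy_crypt_block()` is a stand-in XOR transform rather than a real cipher, the page size is fixed at 4096, and the starting logical block number is derived from a caller-supplied page index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u

/* Stand-in for the per-block cipher: XOR with a value derived from the block number. */
static void toy_crypt_block(uint8_t *block, size_t blocksize, uint64_t lblk_num)
{
    for (size_t i = 0; i < blocksize; i++)
        block[i] ^= (uint8_t)(lblk_num + 1);
}

/* Process every filesystem block in [offs, offs + len) of one page. */
static int toy_crypt_pagecache_blocks(uint8_t *page, uint64_t page_index,
                                      size_t len, size_t offs, size_t blocksize)
{
    if (blocksize == 0 || TOY_PAGE_SIZE % blocksize != 0)
        return -1;
    if (offs % blocksize != 0 || len % blocksize != 0 || offs + len > TOY_PAGE_SIZE)
        return -1;

    /* First logical block covered by the region. */
    uint64_t lblk = page_index * (TOY_PAGE_SIZE / blocksize) + offs / blocksize;

    for (size_t i = offs; i < offs + len; i += blocksize, lblk++)
        toy_crypt_block(page + i, blocksize, lblk);
    return 0;
}
```

Since the block number is computed from the page index and offset, the caller no longer passes it in, which matches the motivation given for dropping the 'inode' and 'lblk_num' parameters.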
#
41adbcb7 |
|
20-May-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: introduce fscrypt_decrypt_block_inplace() Currently fscrypt_decrypt_page() does one of two logically distinct things depending on whether FS_CFLG_OWN_PAGES is set in the filesystem's fscrypt_operations: decrypt a pagecache page in-place, or decrypt a filesystem block in-place in any page. Currently these happen to share the same implementation, but this conflates the notion of blocks and pages. It also makes it so that all callers have to provide inode and lblk_num, when fscrypt could determine these itself for pagecache pages. Therefore, move the FS_CFLG_OWN_PAGES behavior into a new function fscrypt_decrypt_block_inplace(). This mirrors fscrypt_encrypt_block_inplace(). This is in preparation for allowing encryption on ext4 filesystems with blocksize != PAGE_SIZE. Reviewed-by: Chandan Rajendra <chandan@linux.ibm.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
53bc1d85 |
|
20-May-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: support encrypting multiple filesystem blocks per page Rename fscrypt_encrypt_page() to fscrypt_encrypt_pagecache_blocks() and redefine its behavior to encrypt all filesystem blocks from the given region of the given page, rather than assuming that the region consists of just one filesystem block. Also remove the 'inode' and 'lblk_num' parameters, since they can be retrieved from the page as it's already assumed to be a pagecache page. This is in preparation for allowing encryption on ext4 filesystems with blocksize != PAGE_SIZE. This is based on work by Chandan Rajendra. Reviewed-by: Chandan Rajendra <chandan@linux.ibm.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
03569f2f |
|
20-May-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: introduce fscrypt_encrypt_block_inplace() fscrypt_encrypt_page() behaves very differently depending on whether the filesystem set FS_CFLG_OWN_PAGES in its fscrypt_operations. This makes the function difficult to understand and document. It also makes it so that all callers have to provide inode and lblk_num, when fscrypt could determine these itself for pagecache pages. Therefore, move the FS_CFLG_OWN_PAGES behavior into a new function fscrypt_encrypt_block_inplace(). This is in preparation for allowing encryption on ext4 filesystems with blocksize != PAGE_SIZE. Reviewed-by: Chandan Rajendra <chandan@linux.ibm.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
2a415a02 |
|
20-May-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: remove the "write" part of struct fscrypt_ctx Now that fscrypt_ctx is not used for writes, remove the 'w' fields. Reviewed-by: Chandan Rajendra <chandan@linux.ibm.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
d2d0727b |
|
20-May-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: simplify bounce page handling Currently, bounce page handling for writes to encrypted files is unnecessarily complicated. A fscrypt_ctx is allocated along with each bounce page, page_private(bounce_page) points to this fscrypt_ctx, and fscrypt_ctx::w::control_page points to the original pagecache page. However, because writes don't use the fscrypt_ctx for anything else, there's no reason why page_private(bounce_page) can't just point to the original pagecache page directly. Therefore, this patch makes this change. In the process, it also cleans up the API exposed to filesystems that allows testing whether a page is a bounce page, getting the pagecache page from a bounce page, and freeing a bounce page. Reviewed-by: Chandan Rajendra <chandan@linux.ibm.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
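The simplified bounce page scheme, where the bounce page's private field points straight at the original pagecache page, might look like the toy below. The `toy_page` struct and helpers are hypothetical, and using a NULL `mapping` to recognise a bounce page is an assumption made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_page {
    void *mapping;          /* non-NULL for pagecache pages in this toy */
    struct toy_page *priv;  /* for a bounce page: the original pagecache page */
};

/* Assumption: bounce pages are the ones with no mapping. */
static bool toy_is_bounce_page(const struct toy_page *page)
{
    return page->mapping == NULL;
}

/* Link a bounce page directly to the page whose data it carries. */
static void toy_attach_bounce(struct toy_page *bounce, struct toy_page *orig)
{
    bounce->mapping = NULL;
    bounce->priv = orig;
}

/* Recover the pagecache page without any intermediate context object. */
static struct toy_page *toy_pagecache_page(struct toy_page *bounce)
{
    assert(toy_is_bounce_page(bounce));
    return bounce->priv;
}
```

The point of the sketch is the single pointer hop: no per-write context allocation sits between the bounce page and the original page.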
#
eea2c05d |
|
26-Mar-2019 |
Sascha Hauer <s.hauer@pengutronix.de> |
ubifs: Remove #ifdef around CONFIG_FS_ENCRYPTION ifdefs reduce readability and compile coverage. This removes the ifdefs around CONFIG_FS_ENCRYPTION by using IS_ENABLED and relying on static inline wrappers. A new static inline wrapper for setting sb->s_cop is introduced to allow filesystems to unconditionally compile in their s_cop operations. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
|
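The pattern of replacing call-site #ifdefs with a static inline wrapper can be sketched in plain C. This is a userspace approximation: `TOY_FS_ENCRYPTION` is a hypothetical 0/1 macro standing in for the config option (the kernel's IS_ENABLED also copes with undefined symbols, which this does not), and the `toy_*` types are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the config option: define as 1 to "enable". */
#ifndef TOY_FS_ENCRYPTION
#define TOY_FS_ENCRYPTION 0
#endif

struct toy_crypt_ops { int unused; };
struct toy_super_block { const struct toy_crypt_ops *s_cop; };

/* The #if lives in one place; callers never need their own #ifdef. */
#if TOY_FS_ENCRYPTION
static inline void toy_set_ops(struct toy_super_block *sb,
                               const struct toy_crypt_ops *ops)
{
    sb->s_cop = ops;
}
#else
static inline void toy_set_ops(struct toy_super_block *sb,
                               const struct toy_crypt_ops *ops)
{
    (void)sb;   /* stub: nothing to set when the feature is off */
    (void)ops;
}
#endif

static const struct toy_crypt_ops toy_ops = { 0 };

static int toy_fill_super(struct toy_super_block *sb)
{
    sb->s_cop = NULL;
    toy_set_ops(sb, &toy_ops);   /* unconditional call site */

    /* A plain C condition keeps both branches compiled and type-checked. */
    if (TOY_FS_ENCRYPTION)
        return sb->s_cop == &toy_ops ? 1 : -1;
    return 0;
}
```

Both configurations compile the same caller code, so a build with the feature disabled still type-checks the operations table and its users.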
#
2c58d548 |
|
10-Apr-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: cache decrypted symlink target in ->i_link Path lookups that traverse encrypted symlink(s) are very slow because each encrypted symlink needs to be decrypted each time it's followed. This also involves dropping out of rcu-walk mode. Make encrypted symlinks faster by caching the decrypted symlink target in ->i_link. The first call to fscrypt_get_symlink() sets it. Then, the existing VFS path lookup code uses the non-NULL ->i_link to take the fast path where ->get_link() isn't called, and lookups in rcu-walk mode remain in rcu-walk mode. Also set ->i_link immediately when a new encrypted symlink is created. To safely free the symlink target after an RCU grace period has elapsed, introduce a new function fscrypt_free_inode(), and make the relevant filesystems call it just before actually freeing the inode. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
b01531db |
|
20-Mar-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: fix race where ->lookup() marks plaintext dentry as ciphertext ->lookup() in an encrypted directory begins as follows: 1. fscrypt_prepare_lookup(): a. Try to load the directory's encryption key. b. If the key is unavailable, mark the dentry as a ciphertext name via d_flags. 2. fscrypt_setup_filename(): a. Try to load the directory's encryption key. b. If the key is available, encrypt the name (treated as a plaintext name) to get the on-disk name. Otherwise decode the name (treated as a ciphertext name) to get the on-disk name. But if the key is concurrently added, it may be found at (2a) but not at (1a). In this case, the dentry will be wrongly marked as a ciphertext name even though it was actually treated as plaintext. This will cause the dentry to be wrongly invalidated on the next lookup, potentially causing problems. For example, if the racy ->lookup() was part of sys_mount(), then the new mount will be detached when anything tries to access it. This is despite the mountpoint having a plaintext path, which should remain valid now that the key was added. Of course, this is only possible if there's a userspace race. Still, the additional kernel-side race is confusing and unexpected. Close the kernel-side race by changing fscrypt_prepare_lookup() to also set the on-disk filename (step 2b), consistent with the d_flags update. Fixes: 28b4c263961c ("ext4 crypto: revalidate dentry after adding or removing the key") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
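The lookup race, and why deciding everything from a single key check closes it, can be shown with a deterministic toy. The `toy_*` names are hypothetical, and the "concurrent" key addition is simulated by a probe function whose answer changes between its first and second call.

```c
#include <assert.h>
#include <stdbool.h>

typedef bool (*key_probe_fn)(void *ctx);

struct toy_result {
    bool flagged_ciphertext;    /* dentry marked as a ciphertext name */
    bool treated_as_plaintext;  /* name was encrypted to get the on-disk name */
};

/* Old shape: the key is checked twice, so the two decisions can disagree. */
static struct toy_result toy_lookup_racy(key_probe_fn probe, void *ctx)
{
    struct toy_result r;

    r.flagged_ciphertext = !probe(ctx);   /* steps 1a and 1b */
    r.treated_as_plaintext = probe(ctx);  /* steps 2a and 2b */
    return r;
}

/* Fixed shape: one check drives both the flag and the name handling. */
static struct toy_result toy_lookup_fixed(key_probe_fn probe, void *ctx)
{
    struct toy_result r;
    bool have_key = probe(ctx);

    r.flagged_ciphertext = !have_key;
    r.treated_as_plaintext = have_key;
    return r;
}

/* Simulates a key that is added right after the first probe. */
static bool toy_key_added_after_first_probe(void *ctx)
{
    int *calls = ctx;

    return (*calls)++ > 0;
}
```

With the simulated probe, the racy variant yields a dentry that is flagged as ciphertext yet was handled as plaintext, which is the inconsistent state the commit describes; the fixed variant cannot produce that combination.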
#
0bf3d5c1 |
|
20-Mar-2019 |
Eric Biggers <ebiggers@google.com> |
fs, fscrypt: clear DCACHE_ENCRYPTED_NAME when unaliasing directory Make __d_move() clear DCACHE_ENCRYPTED_NAME on the source dentry. This is needed for when d_splice_alias() moves a directory's encrypted alias to its decrypted alias as a result of the encryption key being added. Otherwise, the decrypted alias will incorrectly be invalidated on the next lookup, causing problems such as unmounting a mount the user just mount()ed there. Note that we don't have to support arbitrary moves of this flag because fscrypt doesn't allow dentries with DCACHE_ENCRYPTED_NAME to be the source or target of a rename(). Fixes: 28b4c263961c ("ext4 crypto: revalidate dentry after adding or removing the key") Reported-by: Sarthak Kukreti <sarthakkukreti@chromium.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
968dd6d0 |
|
20-Mar-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: fix race allowing rename() and link() of ciphertext dentries Close some race conditions where fscrypt allowed rename() and link() on ciphertext dentries that had been looked up just prior to the key being concurrently added. It's better to return -ENOKEY in this case. This avoids doing the nonsensical thing of encrypting the names a second time when searching for the actual on-disk dir entries. It also guarantees that DCACHE_ENCRYPTED_NAME dentries are never rename()d, so the dcache won't have to support all possible combinations of moving DCACHE_ENCRYPTED_NAME around during __d_move(). Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
6cc24868 |
|
20-Mar-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: clean up and improve dentry revalidation Make various improvements to fscrypt dentry revalidation: - Don't try to handle the case where the per-directory key is removed, as this can't happen without the inode (and dentries) being evicted. - Flag ciphertext dentries rather than plaintext dentries, since it's ciphertext dentries that need the special handling. - Avoid doing unnecessary work for non-ciphertext dentries. - When revalidating ciphertext dentries, try to set up the directory's i_crypt_info to make sure the key is really still absent, rather than invalidating all negative dentries as the previous code did. An old comment suggested we can't do this for locking reasons, but AFAICT this comment was outdated and it actually works fine. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
e37a784d |
|
11-Apr-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: use READ_ONCE() to access ->i_crypt_info ->i_crypt_info starts out NULL and may later be locklessly set to a non-NULL value by the cmpxchg() in fscrypt_get_encryption_info(). But ->i_crypt_info is used directly, which technically is incorrect. It's a data race, and it doesn't include the data dependency barrier needed to safely dereference the pointer on at least one architecture. Fix this by using READ_ONCE() instead. Note: we don't need to use smp_load_acquire(), since dereferencing the pointer only requires a data dependency barrier, which is already included in READ_ONCE(). We also don't need READ_ONCE() in places where ->i_crypt_info is unconditionally dereferenced, since it must have already been checked. Also downgrade the cmpxchg() to cmpxchg_release(), since RELEASE semantics are sufficient on the write side. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
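The lockless publish-once pattern behind the ->i_crypt_info commit can be approximated with C11 atomics. This is an analogy, not the kernel code: `toy_inode` and `toy_info` are hypothetical, the release compare-exchange plays the role of cmpxchg_release(), and an acquire load is used where the kernel relies on the dependency ordering of READ_ONCE(), since C11 has no widely implemented equivalent.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct toy_info { int mode; };

struct toy_inode {
    _Atomic(struct toy_info *) crypt_info;  /* starts NULL, set at most once */
};

/* Reader side: a single atomic load, then dereference the result if non-NULL. */
static struct toy_info *toy_get_info(struct toy_inode *inode)
{
    return atomic_load_explicit(&inode->crypt_info, memory_order_acquire);
}

/* Writer side: fully initialise the object, then publish it with RELEASE. */
static struct toy_info *toy_setup_info(struct toy_inode *inode, int mode)
{
    struct toy_info *ci = toy_get_info(inode);
    struct toy_info *expected = NULL;

    if (ci)
        return ci;

    ci = malloc(sizeof(*ci));
    if (!ci)
        return NULL;
    ci->mode = mode;

    if (atomic_compare_exchange_strong_explicit(&inode->crypt_info, &expected, ci,
                                                memory_order_release,
                                                memory_order_relaxed))
        return ci;

    /* Lost the race: discard ours and reload the winner with proper ordering. */
    free(ci);
    return toy_get_info(inode);
}
```

Release ordering on the successful exchange guarantees that the initialised fields are visible to any reader that observes the non-NULL pointer, which is why a stronger full barrier is not needed on the write side.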
#
cd0265fc |
|
18-Mar-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: drop inode argument from fscrypt_get_ctx() The only reason the inode is being passed to fscrypt_get_ctx() is to verify that the encryption key is available. However, all callers already ensure this because if we get as far as trying to do I/O to an encrypted file without the key, there's already a bug. Therefore, remove this unnecessary argument. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
f5e55e77 |
|
22-Jan-2019 |
Eric Biggers <ebiggers@google.com> |
fscrypt: return -EXDEV for incompatible rename or link into encrypted dir Currently, trying to rename or link a regular file, directory, or symlink into an encrypted directory fails with EPERM when the source file is unencrypted or is encrypted with a different encryption policy, and is on the same mountpoint. It is correct for the operation to fail, but the choice of EPERM breaks tools like 'mv' that know to copy rather than rename if they see EXDEV, but don't know what to do with EPERM. Our original motivation for EPERM was to encourage users to securely handle their data. Encrypting files by "moving" them into an encrypted directory can be insecure because the unencrypted data may remain in free space on disk, where it can later be recovered by an attacker. It's much better to encrypt the data from the start, or at least try to securely delete the source data e.g. using the 'shred' program. However, the current behavior hasn't been effective at achieving its goal because users tend to be confused, hack around it, and complain; see e.g. https://github.com/google/fscrypt/issues/76. And in some cases it's actually inconsistent or unnecessary. For example, 'mv'-ing files between differently encrypted directories doesn't work even in cases where it can be secure, such as when in userspace the same passphrase protects both directories. Yet, you *can* already 'mv' unencrypted files into an encrypted directory if the source files are on a different mountpoint, even though doing so is often insecure. There are probably better ways to teach users to securely handle their files. For example, the 'fscrypt' userspace tool could provide a command that migrates unencrypted files into an encrypted directory, acting like 'shred' on the source files and providing appropriate warnings depending on the type of the source filesystem and disk. Receiving errors on unimportant files might also force some users to disable encryption, thus making the behavior counterproductive. 
It's desirable to make encryption as unobtrusive as possible. Therefore, change the error code from EPERM to EXDEV so that tools looking for EXDEV will fall back to a copy. This, of course, doesn't prevent users from still doing the right things to securely manage their files. Note that this also matches the behavior when a file is renamed between two project quota hierarchies; so there's precedent for using EXDEV for things other than mountpoints. xfstests generic/398 will require an update with this change. [Rewritten from an earlier patch series by Michael Halcrow.] Cc: Michael Halcrow <mhalcrow@google.com> Cc: Joe Richey <joerichey@google.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
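The userspace behaviour the EXDEV commit depends on can be sketched as follows: a move tool tries rename() first and falls back to copy-and-delete only when it sees EXDEV. The enum and function names are hypothetical, and the copy step itself is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

enum move_action {
    MOVE_DONE,           /* rename() succeeded */
    MOVE_FALLBACK_COPY,  /* caller should copy the data, then unlink the source */
    MOVE_FAIL            /* report the error to the user */
};

/* Decide what a 'mv'-like tool does with the result of rename(). */
static enum move_action classify_rename_result(int ret, int err)
{
    if (ret == 0)
        return MOVE_DONE;
    if (err == EXDEV)
        return MOVE_FALLBACK_COPY;
    return MOVE_FAIL;   /* EPERM lands here, so no fallback happens */
}

static enum move_action try_move(const char *src, const char *dst)
{
    int ret = rename(src, dst);

    return classify_rename_result(ret, ret ? errno : 0);
}
```

Under this logic an EPERM from the kernel ends the operation with an error, while EXDEV lets the tool carry on with a copy, which is the difference the commit relies on.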
#
643fa961 |
|
12-Dec-2018 |
Chandan Rajendra <chandan@linux.vnet.ibm.com> |
fscrypt: remove filesystem specific build config option In order to have a common code base for fscrypt "post read" processing for all filesystems which support encryption, this commit removes the filesystem-specific build config options (e.g. CONFIG_EXT4_FS_ENCRYPTION) and replaces them with a single build option (i.e. CONFIG_FS_ENCRYPTION) whose value affects all the filesystems making use of fscrypt. Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
|
#
0eaab5b1 |
|
11-Jan-2018 |
Eric Biggers <ebiggers@google.com> |
fscrypt: move fscrypt_symlink_data to fscrypt_private.h Now that all filesystems have been converted to use the symlink helper functions, they no longer need the declaration of 'struct fscrypt_symlink_data'. Move it from fscrypt.h to fscrypt_private.h. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
76e81d6d |
|
05-Jan-2018 |
Eric Biggers <ebiggers@google.com> |
fscrypt: new helper functions for ->symlink() Currently, filesystems supporting fscrypt need to implement some tricky logic when creating encrypted symlinks, including handling a peculiar on-disk format (struct fscrypt_symlink_data) and correctly calculating the size of the encrypted symlink. Introduce helper functions to make things a bit easier: - fscrypt_prepare_symlink() computes and validates the size the symlink target will require on-disk. - fscrypt_encrypt_symlink() creates the encrypted target if needed. The new helpers actually fix some subtle bugs. First, when checking whether the symlink target was too long, filesystems didn't account for the fact that the NUL padding is meant to be truncated if it would cause the maximum length to be exceeded, as is done for filenames in directories. Consequently users would receive ENAMETOOLONG when creating symlinks close to what is supposed to be the maximum length. For example, with EXT4 with a 4K block size, the maximum symlink target length in an encrypted directory is supposed to be 4093 bytes (in comparison to 4095 in an unencrypted directory), but in FS_POLICY_FLAGS_PAD_32-mode only up to 4064 bytes were accepted. Second, symlink targets of "." and ".." were not being encrypted, even though they should be, as these names are special in *directory entries* but not in symlink targets. Fortunately, we can fix this simply by starting to encrypt them, as old kernels already accept them in encrypted form. Third, the output string length the filesystems were providing when doing the actual encryption was incorrect, as it was forgotten to exclude 'sizeof(struct fscrypt_symlink_data)'. Fortunately though, this bug didn't make a difference. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
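The symlink length figures quoted in the ->symlink() helpers commit (4093 and 4064 for a 4K block with 32-byte padding) can be reproduced with a short arithmetic sketch. The constants are assumptions for illustration: a 3-byte on-disk overhead is inferred from 4096 - 4093, and a 16-byte minimum ciphertext size is assumed; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SYMLINK_OVERHEAD 3u  /* assumed: 4096 - 4093 */
#define TOY_MIN_CIPHERTEXT  16u  /* assumed minimum encrypted size */

static size_t toy_round_up(size_t n, size_t multiple)
{
    return (n + multiple - 1) / multiple * multiple;
}

static size_t toy_max_ciphertext(size_t blocksize)
{
    return blocksize - TOY_SYMLINK_OVERHEAD;
}

static size_t toy_padded_len(size_t len, size_t padding)
{
    return toy_round_up(len < TOY_MIN_CIPHERTEXT ? TOY_MIN_CIPHERTEXT : len, padding);
}

/* Old behaviour: reject whenever the padded length exceeds the maximum. */
static bool toy_old_accepts(size_t len, size_t padding, size_t max_ciphertext)
{
    return toy_padded_len(len, padding) <= max_ciphertext;
}

/* New behaviour: truncate the padding; reject only if the target itself is too long. */
static bool toy_new_encrypted_size(size_t len, size_t padding,
                                   size_t max_ciphertext, size_t *out)
{
    size_t padded;

    if (len > max_ciphertext)
        return false;
    padded = toy_padded_len(len, padding);
    *out = padded < max_ciphertext ? padded : max_ciphertext;
    return true;
}
```

With a 4096-byte block and 32-byte padding, the old check stops accepting targets above 4064 bytes because 4065 rounds up to 4096, whereas the truncating version accepts everything up to 4093 bytes.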
#
a575784c |
|
05-Jan-2018 |
Eric Biggers <ebiggers@google.com> |
fscrypt: trim down fscrypt.h includes fscrypt.h included way too many other headers, given that it is included by filesystems both with and without encryption support. Trim down the includes list by moving the needed includes into more appropriate places, and removing the unneeded ones. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
dcf0db9e |
|
05-Jan-2018 |
Eric Biggers <ebiggers@google.com> |
fscrypt: move fscrypt_is_dot_dotdot() to fs/crypto/fname.c Only fs/crypto/fname.c cares about treating the "." and ".." filenames specially with regards to encryption, so move fscrypt_is_dot_dotdot() from fscrypt.h to there. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
bb8179e5 |
|
05-Jan-2018 |
Eric Biggers <ebiggers@google.com> |
fscrypt: move fscrypt_valid_enc_modes() to fscrypt_private.h The encryption modes are validated by fs/crypto/, not by individual filesystems. Therefore, move fscrypt_valid_enc_modes() from fscrypt.h to fscrypt_private.h. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
bdd23476 |
|
05-Jan-2018 |
Eric Biggers <ebiggers@google.com> |
fscrypt: move fscrypt_operations declaration to fscrypt_supp.h Filesystems now only define their fscrypt_operations when they are compiled with encryption support, so move the fscrypt_operations declaration from fscrypt.h to fscrypt_supp.h. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
1493651b |
|
05-Jan-2018 |
Eric Biggers <ebiggers@google.com> |
fscrypt: split fscrypt_dummy_context_enabled() into supp/notsupp versions fscrypt_dummy_context_enabled() accesses ->s_cop, which now is only set when the filesystem is built with encryption support. This didn't actually matter because no filesystems called it. However, it will start being used soon, so fix it by moving it from fscrypt.h to fscrypt_supp.h and stubbing it out in fscrypt_notsupp.h. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
542060c0 |
|
05-Jan-2018 |
Eric Biggers <ebiggers@google.com> |
fscrypt: move fscrypt_ctx declaration to fscrypt_supp.h Filesystems only ever access 'struct fscrypt_ctx' through fscrypt functions. But when a filesystem is built without encryption support, these functions are all stubbed out, so the declaration of fscrypt_ctx is unneeded. Therefore, move it from fscrypt.h to fscrypt_supp.h. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
4fd4b15c |
|
05-Jan-2018 |
Eric Biggers <ebiggers@google.com> |
fscrypt: move fscrypt_control_page() to supp/notsupp headers fscrypt_control_page() is already split into two versions depending on whether the filesystem is being built with encryption support or not. Move them into the appropriate headers. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
3d463f28 |
|
05-Jan-2018 |
Eric Biggers <ebiggers@google.com> |
fscrypt: move fscrypt_has_encryption_key() to supp/notsupp headers fscrypt_has_encryption_key() is already split into two versions depending on whether the filesystem is being built with encryption support or not. Move them into the appropriate headers. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
815dac33 |
|
09-Oct-2017 |
Eric Biggers <ebiggers@google.com> |
fscrypt: new helper function - fscrypt_prepare_setattr() Introduce a helper function for filesystems to call when processing ->setattr() on a possibly-encrypted inode. It handles enforcing that an encrypted file can only be truncated if its encryption key is available. Acked-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
32c3cf02 |
|
09-Oct-2017 |
Eric Biggers <ebiggers@google.com> |
fscrypt: new helper function - fscrypt_prepare_lookup() Introduce a helper function which prepares to look up the given dentry in the given directory. If the directory is encrypted, it handles loading the directory's encryption key, setting the dentry's ->d_op to fscrypt_d_ops, and setting DCACHE_ENCRYPTED_WITH_KEY if the directory's encryption key is available. Note: once all filesystems switch over to this, we'll be able to move fscrypt_d_ops and fscrypt_set_encrypted_dentry() to fscrypt_private.h. Acked-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
94b26f36 |
|
09-Oct-2017 |
Eric Biggers <ebiggers@google.com> |
fscrypt: new helper function - fscrypt_prepare_rename() Introduce a helper function which prepares to rename a file into a possibly encrypted directory. It handles loading the encryption keys for the source and target directories if needed, and it handles enforcing that if the target directory (and the source directory for a cross-rename) is encrypted, then the file being moved into the directory has the same encryption policy as its containing directory. Acked-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
0ea87a96 |
|
09-Oct-2017 |
Eric Biggers <ebiggers@google.com> |
fscrypt: new helper function - fscrypt_prepare_link() Introduce a helper function which prepares to link an inode into a possibly-encrypted directory. It handles setting up the target directory's encryption key, then verifying that the link won't violate the constraint that all files in an encrypted directory tree use the same encryption policy. Acked-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
d293c3e4 |
|
09-Oct-2017 |
Eric Biggers <ebiggers@google.com> |
fscrypt: new helper function - fscrypt_require_key() Add a helper function which checks if an inode is encrypted, and if so, tries to set up its encryption key. This is a pattern which is duplicated in multiple places in each of ext4, f2fs, and ubifs --- for example, when a regular file is asked to be opened or truncated. Acked-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
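The check-then-set-up pattern that fscrypt_require_key() consolidates can be sketched in userspace C roughly as follows. This is a simplified illustration, not the kernel implementation: struct demo_inode, demo_get_encryption_info(), demo_require_key(), and DEMO_ERR_NO_KEY are hypothetical names, and reporting an unavailable key as an error is an assumption drawn from the description above.

```c
#include <stdbool.h>

/* Hypothetical error code for "encrypted, but no key could be set up". */
#define DEMO_ERR_NO_KEY 1

/* Hypothetical stand-in for the inode state the helper looks at. */
struct demo_inode {
    bool encrypted;      /* the inode is encrypted */
    bool key_available;  /* a key exists that could be set up */
    bool key_loaded;     /* the key has been set up for this inode */
};

/* Tries to set up the key; assumed to succeed quietly when no key exists. */
static int demo_get_encryption_info(struct demo_inode *inode)
{
    if (inode->key_available)
        inode->key_loaded = true;
    return 0;
}

/*
 * One helper replaces the open-coded sequence at each call site:
 * unencrypted inodes pass straight through, encrypted inodes must
 * end up with a usable key.
 */
static int demo_require_key(struct demo_inode *inode)
{
    if (inode->encrypted) {
        int err = demo_get_encryption_info(inode);

        if (err)
            return err;
        if (!inode->key_loaded)
            return -DEMO_ERR_NO_KEY;
    }
    return 0;
}
```

A caller handling open or truncate would then only need `err = demo_require_key(inode); if (err) return err;` instead of repeating the three-step check.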
#
f7293e48 |
|
09-Oct-2017 |
Eric Biggers <ebiggers@google.com> |
fscrypt: remove ->is_encrypted() Now that all callers of fscrypt_operations.is_encrypted() have been switched to IS_ENCRYPTED(), remove ->is_encrypted(). Reviewed-by: Chao Yu <yuchao0@huawei.com> Acked-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
#
734f0d24 |
|
09-Oct-2017 |
Dave Chinner <dchinner@redhat.com> |
fscrypt: clean up include file mess Filesystems have to include different header files based on whether they are compiled with encryption support or not. That's nasty and messy. Instead, rationalise the headers so we have a single include fscrypt.h and let it decide what internal implementation to include based on the __FS_HAS_ENCRYPTION define. Filesystems set __FS_HAS_ENCRYPTION to 1 before including linux/fscrypt.h if they are built with encryption support. Otherwise, they must set __FS_HAS_ENCRYPTION to 0. Add guards to prevent fscrypt_supp.h and fscrypt_notsupp.h from being directly included by filesystems. Signed-off-by: Dave Chinner <dchinner@redhat.com> [EB: use 1 and 0 rather than defined/undefined] Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
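The single-include scheme described above can be sketched as one compile-time switch selecting between a real helper and a stub. This is a minimal single-file illustration, not the actual header layout: DEMO_FS_HAS_ENCRYPTION stands in for __FS_HAS_ENCRYPTION, and struct demo_hdr_inode, demo_has_encryption_key(), and demo_open() are hypothetical names.

```c
#include <stdbool.h>

/*
 * The "filesystem" sets the switch to 1 or 0 before the shared
 * declarations, mirroring the 1/0 convention noted in the commit.
 */
#define DEMO_FS_HAS_ENCRYPTION 0

/* Hypothetical stand-in for per-inode encryption state. */
struct demo_hdr_inode {
    bool key_loaded;
};

#if DEMO_FS_HAS_ENCRYPTION
/* Plays the role of the "supp" header: the real check. */
static inline bool demo_has_encryption_key(const struct demo_hdr_inode *inode)
{
    return inode->key_loaded;
}
#else
/* Plays the role of the "notsupp" header: a stub that always says no. */
static inline bool demo_has_encryption_key(const struct demo_hdr_inode *inode)
{
    (void)inode;
    return false;
}
#endif

/* Caller code is identical in both configurations; no #ifdef needed here. */
static int demo_open(const struct demo_hdr_inode *inode)
{
    return demo_has_encryption_key(inode) ? 0 : -1;
}
```

Using an explicit 1 or 0 rather than defined/undefined keeps the selection in a plain `#if`, so each filesystem states its choice in one visible place.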