#
38d46409 |
|
11-Jun-2023 |
Xiubo Li <xiubli@redhat.com> |
ceph: print cluster fsid and client global_id in all debug logs Multiple CephFS mounts on a host is increasingly common so disambiguating messages like this is necessary and will make it easier to debug issues. At the same this will improve the debug logs to make them easier to troubleshooting issues, such as print the ino# instead only printing the memory addresses of the corresponding inodes and print the dentry names instead of the corresponding memory addresses for the dentry,etc. Link: https://tracker.ceph.com/issues/61590 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Milind Changire <mchangir@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
5995d90d |
|
11-Jun-2023 |
Xiubo Li <xiubli@redhat.com> |
ceph: rename _to_client() to _to_fs_client() We need to covert the inode to ceph_client in the following commit, and will add one new helper for that, here we rename the old helper to _fs_client(). Link: https://tracker.ceph.com/issues/61590 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Milind Changire <mchangir@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
10f9fbe9 |
|
29-Sep-2023 |
Wedson Almeida Filho <walmeida@microsoft.com> |
ceph: move ceph_xattr_handlers to .rodata This makes it harder for accidental or malicious changes to ceph_xattr_handlers at runtime. Cc: Xiubo Li <xiubli@redhat.com> Cc: Ilya Dryomov <idryomov@gmail.com> Cc: Jeff Layton <jlayton@kernel.org> Cc: ceph-devel@vger.kernel.org Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com> Link: https://lore.kernel.org/r/20230930050033.41174-7-wedsonaf@gmail.com Acked-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
f061feda |
|
28-Jul-2020 |
Jeff Layton <jlayton@kernel.org> |
ceph: add fscrypt ioctls and ceph.fscrypt.auth vxattr We gate most of the ioctls on MDS feature support. The exception is the key removal and status functions that we still want to work if the MDS's were to (inexplicably) lose the feature. For the set_policy ioctl, we take Fs caps to ensure that nothing can create files in the directory while the ioctl is running. That should be enough to ensure that the "empty_dir" check is reliable. The vxattr is read-only, added mostly for future debugging purposes. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-and-tested-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Milind Changire <mchangir@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
6b5717bd |
|
08-Sep-2020 |
Jeff Layton <jlayton@kernel.org> |
ceph: implement -o test_dummy_encryption mount option Add support for the test_dummy_encryption mount option. This allows us to test the encrypted codepaths in ceph without having to manually set keys, etc. [ lhenriques: fix potential fsc->fsc_dummy_enc_policy memory leak in ceph_real_mount() ] Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-and-tested-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Milind Changire <mchangir@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
7795aef0 |
|
05-Jul-2023 |
Jeff Layton <jlayton@kernel.org> |
ceph: convert to ctime accessor functions In later patches, we're going to change how the inode's ctime field is used. Switch to using accessor functions instead of raw accesses of inode->i_ctime. Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230705190309.579783-28-jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
7a6c3a03 |
|
28-Feb-2023 |
Xiubo Li <xiubli@redhat.com> |
ceph: do not print the whole xattr value if it's too long If the xattr's value size is long enough the kernel will warn and then will fail the xfstests test case. Just print part of the value string if it's too long. At the same time fix the function name issue in the debug logs. Link: https://tracker.ceph.com/issues/58404 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
0c95c025 |
|
01-Feb-2023 |
Christian Brauner <brauner@kernel.org> |
fs: drop unused posix acl handlers Remove struct posix_acl_{access,default}_handler for all filesystems that don't depend on the xattr handler in their inode->i_op->listxattr() method in any way. There's nothing more to do than to simply remove the handler. It's been effectively unused ever since we introduced the new posix acl api. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
#
39f60c1c |
|
12-Jan-2023 |
Christian Brauner <brauner@kernel.org> |
fs: port xattr to mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
#
d93231a6 |
|
03-Jun-2022 |
Luís Henriques <lhenriques@suse.de> |
ceph: prevent a client from exceeding the MDS maximum xattr size The MDS tries to enforce a limit on the total key/values in extended attributes. However, this limit is enforced only if doing a synchronous operation (MDS_OP_SETXATTR) -- if we're buffering the xattrs, the MDS doesn't have a chance to enforce these limits. This patch adds support for decoding the xattrs maximum size setting that is distributed in the mdsmap. Then, when setting an xattr, the kernel client will revert to do a synchronous operation if that maximum size is exceeded. While there, fix a dout() that would trigger a printk warning: [ 98.718078] ------------[ cut here ]------------ [ 98.719012] precision 65536 too large [ 98.719039] WARNING: CPU: 1 PID: 3755 at lib/vsprintf.c:2703 vsnprintf+0x5e3/0x600 ... Link: https://tracker.ceph.com/issues/55725 Signed-off-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
874c8ca1 |
|
09-Jun-2022 |
David Howells <dhowells@redhat.com> |
netfs: Fix gcc-12 warning by embedding vfs inode in netfs_i_context While randstruct was satisfied with using an open-coded "void *" offset cast for the netfs_i_context <-> inode casting, __builtin_object_size() as used by FORTIFY_SOURCE was not as easily fooled. This was causing the following complaint[1] from gcc v12: In file included from include/linux/string.h:253, from include/linux/ceph/ceph_debug.h:7, from fs/ceph/inode.c:2: In function 'fortify_memset_chk', inlined from 'netfs_i_context_init' at include/linux/netfs.h:326:2, inlined from 'ceph_alloc_inode' at fs/ceph/inode.c:463:2: include/linux/fortify-string.h:242:25: warning: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning] 242 | __write_overflow_field(p_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix this by embedding a struct inode into struct netfs_i_context (which should perhaps be renamed to struct netfs_inode). The struct inode vfs_inode fields are then removed from the 9p, afs, ceph and cifs inode structs and vfs_inode is then simply changed to "netfs.inode" in those filesystems. Further, rename netfs_i_context to netfs_inode, get rid of the netfs_inode() function that converted a netfs_i_context pointer to an inode pointer (that can now be done with &ctx->inode) and rename the netfs_i_context() function to netfs_inode() (which is now a wrapper around container_of()). Most of the changes were done with: perl -p -i -e 's/vfs_inode/netfs.inode/'g \ `git grep -l 'vfs_inode' -- fs/{9p,afs,ceph,cifs}/*.[ch]` Kees suggested doing it with a pair structure[2] and a special declarator to insert that into the network filesystem's inode wrapper[3], but I think it's cleaner to embed it - and then it doesn't matter if struct randomisation reorders things. Dave Chinner suggested using a filesystem-specific VFS_I() function in each filesystem to convert that filesystem's own inode wrapper struct into the VFS inode struct[4]. Version #2: - Fix a couple of missed name changes due to a disabled cifs option. - Rename nfs_i_context to nfs_inode - Use "netfs" instead of "nic" as the member name in per-fs inode wrapper structs. [ This also undoes commit 507160f46c55 ("netfs: gcc-12: temporarily disable '-Wattribute-warning' for now") that is no longer needed ] Fixes: bc899ee1c898 ("netfs: Add a netfs inode context") Reported-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> cc: Jonathan Corbet <corbet@lwn.net> cc: Eric Van Hensbergen <ericvh@gmail.com> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Ilya Dryomov <idryomov@gmail.com> cc: Steve French <smfrench@gmail.com> cc: William Kucharski <william.kucharski@oracle.com> cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> cc: Dave Chinner <david@fromorbit.com> cc: linux-doc@vger.kernel.org cc: v9fs-developer@lists.sourceforge.net cc: linux-afs@lists.infradead.org cc: ceph-devel@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: samba-technical@lists.samba.org cc: linux-fsdevel@vger.kernel.org cc: linux-hardening@vger.kernel.org Link: https://lore.kernel.org/r/d2ad3a3d7bdd794c6efb562d2f2b655fb67756b9.camel@kernel.org/ [1] Link: https://lore.kernel.org/r/20220517210230.864239-1-keescook@chromium.org/ [2] Link: https://lore.kernel.org/r/20220518202212.2322058-1-keescook@chromium.org/ [3] Link: https://lore.kernel.org/r/20220524101205.GI2306852@dread.disaster.area/ [4] Link: https://lore.kernel.org/r/165296786831.3591209.12111293034669289733.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/165305805651.4094995.7763502506786714216.stgit@warthog.procyon.org.uk # v2 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
d7a2dc52 |
|
10-Mar-2022 |
Venky Shankar <vshankar@redhat.com> |
ceph: allow ceph.dir.rctime xattr to be updatable `rctime' has been a pain point in cephfs due to its buggy nature - inconsistent values reported and those sorts. Fixing rctime is non-trivial needing an overall redesign of the entire nested statistics infrastructure. As a workaround, PR http://github.com/ceph/ceph/pull/37938 allows this extended attribute to be manually set. This allows users to "fixup" inconsistent rctime values. While this sounds messy, its probably the wisest approach allowing users/scripts to workaround buggy rctime values. The above PR enables Ceph MDS to allow manually setting rctime extended attribute with the corresponding user-land changes. We may as well allow the same to be done via kclient for parity. Signed-off-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
6ddf5f16 |
|
13-Feb-2022 |
Milind Changire <milindchangire@gmail.com> |
ceph: add getvxattr op Problem: Some directory vxattrs (e.g. ceph.dir.pin.random) are governed by information that isn't necessarily shared with the client. Add support for the new GETVXATTR operation, which allows the client to query the MDS directly for vxattrs. When the client is queried for a vxattr that doesn't have a special handler, have it issue a GETVXATTR to the MDS directly. Solution: Adds new getvxattr op to fetch ceph.dir.pin*, ceph.dir.layout* and ceph.file.layout* vxattrs. If the entire layout for a dir or a file is being set, then it is expected that the layout be set in standard JSON format. Individual field value retrieval is not wrapped in JSON. The JSON format also applies while setting the vxattr if the entire layout is being set in one go. As a temporary measure, setting a vxattr can also be done in the old format. The old format will be deprecated in the future. URL: https://tracker.ceph.com/issues/51062 Signed-off-by: Milind Changire <mchangir@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
15bf3239 |
|
12-Oct-2021 |
Vivek Goyal <vgoyal@redhat.com> |
security: Return xattr name from security_dentry_init_security() Right now security_dentry_init_security() only supports single security label and is used by SELinux only. There are two users of this hook, namely ceph and nfs. NFS does not care about xattr name. Ceph hardcodes the xattr name to security.selinux (XATTR_NAME_SELINUX). I am making changes to fuse/virtiofs to send security label to virtiofsd and I need to send xattr name as well. I also hardcoded the name of xattr to security.selinux. Stephen Smalley suggested that it probably is a good idea to modify security_dentry_init_security() to also return name of xattr so that we can avoid this hardcoding in the callers. This patch adds a new parameter "const char **xattr_name" to security_dentry_init_security() and LSM puts the name of xattr too if caller asked for it (xattr_name != NULL). Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: James Morris <jamorris@linux.microsoft.com> [PM: fixed typos in the commit description] Signed-off-by: Paul Moore <paul@paul-moore.com>
|
#
40e309de |
|
26-Jul-2021 |
Jeff Layton <jlayton@kernel.org> |
ceph: add a new vxattr to return auth mds for an inode Add a new vxattr that shows what MDS is authoritative for an inode (if we happen to have auth caps). If we don't have an auth cap for the inode then just return -1. URL: https://tracker.ceph.com/issues/1276 Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Luis Henriques <lhenriques@suse.de> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
e7f72952 |
|
27-Aug-2020 |
Yanhu Cao <gmayyyha@gmail.com> |
ceph: support getting ceph.dir.rsnaps vxattr Add support for grabbing the rsnaps value out of the inode info in traces, and exposing that via ceph.dir.rsnaps xattr. Signed-off-by: Yanhu Cao <gmayyyha@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
e65ce2a5 |
|
21-Jan-2021 |
Christian Brauner <christian.brauner@ubuntu.com> |
acl: handle idmapped mounts The posix acl permission checking helpers determine whether a caller is privileged over an inode according to the acls associated with the inode. Add helpers that make it possible to handle acls on idmapped mounts. The vfs and the filesystems targeted by this first iteration make use of posix_acl_fix_xattr_from_user() and posix_acl_fix_xattr_to_user() to translate basic posix access and default permissions such as the ACL_USER and ACL_GROUP type according to the initial user namespace (or the superblock's user namespace) to and from the caller's current user namespace. Adapt these two helpers to handle idmapped mounts whereby we either map from or into the mount's user namespace depending on in which direction we're translating. Similarly, cap_convert_nscap() is used by the vfs to translate user namespace and non-user namespace aware filesystem capabilities from the superblock's user namespace to the caller's user namespace. Enable it to handle idmapped mounts by accounting for the mount's user namespace. In addition the fileystems targeted in the first iteration of this patch series make use of the posix_acl_chmod() and, posix_acl_update_mode() helpers. Both helpers perform permission checks on the target inode. Let them handle idmapped mounts. These two helpers are called when posix acls are set by the respective filesystems to handle this case we extend the ->set() method to take an additional user namespace argument to pass the mount's user namespace down. Link: https://lore.kernel.org/r/20210121131959.646623-9-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
|
#
968cd14e |
|
08-Dec-2020 |
Xiubo Li <xiubli@redhat.com> |
ceph: set osdmap epoch for setxattr When setting the file/dir layout, it may need data pool info. So in mds server, it needs to check the osdmap. At present, if mds doesn't find the data pool specified, it will try to get the latest osdmap. Now if pass the osd epoch for setxattr, the mds server can only check this epoch of osdmap. URL: https://tracker.ceph.com/issues/48504 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
dd980fc0 |
|
23-Nov-2020 |
Luis Henriques <lhenriques@suse.de> |
ceph: add ceph.caps vxattr Add a new vxattr that allows userspace to list the caps for a specific directory or file. [ jlayton: change format delimiter to '/' ] Signed-off-by: Luis Henriques <lhenriques@suse.de> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
5a9e2f5d |
|
10-Nov-2020 |
Xiubo Li <xiubli@redhat.com> |
ceph: add ceph.{cluster_fsid/client_id} vxattrs These two vxattrs will only exist in local client side, with which we can easily know which mountpoint the file belongs to and also they can help locate the debugfs path quickly. URL: https://tracker.ceph.com/issues/48057 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
81048c00 |
|
03-Nov-2020 |
Jeff Layton <jlayton@kernel.org> |
ceph: acquire Fs caps when getting dir stats We only update the inode's dirstats when we have Fs caps from the MDS. Declare a new VXATTR_FLAG_DIRSTAT that we set on all dirstats, and have the vxattr handling code acquire those caps when it's set. URL: https://tracker.ceph.com/issues/48104 Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
f6fbdcd9 |
|
17-Sep-2020 |
Ilya Dryomov <idryomov@gmail.com> |
ceph: mark ceph_fmt_xattr() as printf-like for better type checking Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
c00e4522 |
|
08-Jul-2020 |
Xu Wang <vulab@iscas.ac.cn> |
ceph: remove unnecessary cast in kfree() Remove unnecassary casts in the argument to kfree. Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
1af16d54 |
|
19-Mar-2020 |
Xiubo Li <xiubli@redhat.com> |
ceph: add caps perf metric for each superblock Count hits and misses in the caps cache. If the client has all of the necessary caps when a task needs references, then it's counted as a hit. Any other situation is a miss. URL: https://tracker.ceph.com/issues/43215 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
d36e0b62 |
|
07-Jan-2020 |
Jeff Layton <jlayton@kernel.org> |
ceph: print name of xattr in __ceph_{get,set}xattr() douts Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
0eb30853 |
|
18-Dec-2019 |
Xiubo Li <xiubli@redhat.com> |
ceph: print dentry offset in hex and fix xattr_version type In the debug logs about the di->offset or ctx->pos it is in hex format, but some others are using the dec format. It is a little hard to read. For the xattr version, it is u64 type, using a shorter type may truncate it. Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
b8fe918b |
|
06-Aug-2019 |
Jeff Layton <jlayton@kernel.org> |
ceph: allow arbitrary security.* xattrs Most filesystems don't limit what security.* xattrs can be set or fetched. I see no reason that we need to limit that on cephfs either. Drop the special xattr handler for "security." xattrs, and allow the "other" xattr handler to handle security xattrs as well. In addition to fixing xfstest generic/093, this allows us to support per-file capabilities (a'la setcap(8)). Link: https://tracker.ceph.com/issues/41135 Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
026105eb |
|
06-Aug-2019 |
Jeff Layton <jlayton@kernel.org> |
ceph: only set CEPH_I_SEC_INITED if we got a MAC label __ceph_getxattr will set the CEPH_I_SEC_INITED flag whenever it gets any xattr that starts with "security.". We only want to set that flag when fetching the MAC label for the currently-active LSM, however. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
668959a5 |
|
06-Aug-2019 |
Jeff Layton <jlayton@kernel.org> |
ceph: turn ceph_security_invalidate_secctx into static inline No need to do an extra jump here. Also add some comments on the endifs. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
e09580b3 |
|
23-Jul-2019 |
Jeff Layton <jlayton@kernel.org> |
ceph: don't list vxattrs in listxattr() Most filesystems that provide virtual xattrs (e.g. CIFS) don't display them via listxattr(). Ceph does, and that causes some of the tests in xfstests to fail. Have cephfs stop listing vxattrs in listxattr. Userspace can always query them directly when the name is known. Signed-off-by: Jeff Layton <jlayton@kernel.org> Acked-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
12fe3dda |
|
19-Jul-2019 |
Luis Henriques <lhenriques@suse.com> |
ceph: fix buffer free while holding i_ceph_lock in __ceph_build_xattrs_blob() Calling ceph_buffer_put() in __ceph_build_xattrs_blob() may result in freeing the i_xattrs.blob buffer while holding the i_ceph_lock. This can be fixed by having this function returning the old blob buffer and have the callers of this function freeing it when the lock is released. The following backtrace was triggered by fstests generic/117. BUG: sleeping function called from invalid context at mm/vmalloc.c:2283 in_atomic(): 1, irqs_disabled(): 0, pid: 649, name: fsstress 4 locks held by fsstress/649: #0: 00000000a7478e7e (&type->s_umount_key#19){++++}, at: iterate_supers+0x77/0xf0 #1: 00000000f8de1423 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: ceph_check_caps+0x7b/0xc60 #2: 00000000562f2b27 (&s->s_mutex){+.+.}, at: ceph_check_caps+0x3bd/0xc60 #3: 00000000f83ce16a (&mdsc->snap_rwsem){++++}, at: ceph_check_caps+0x3ed/0xc60 CPU: 1 PID: 649 Comm: fsstress Not tainted 5.2.0+ #439 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x67/0x90 ___might_sleep.cold+0x9f/0xb1 vfree+0x4b/0x60 ceph_buffer_release+0x1b/0x60 __ceph_build_xattrs_blob+0x12b/0x170 __send_cap+0x302/0x540 ? __lock_acquire+0x23c/0x1e40 ? __mark_caps_flushing+0x15c/0x280 ? _raw_spin_unlock+0x24/0x30 ceph_check_caps+0x5f0/0xc60 ceph_flush_dirty_caps+0x7c/0x150 ? __ia32_sys_fdatasync+0x20/0x20 ceph_sync_fs+0x5a/0x130 iterate_supers+0x8f/0xf0 ksys_sync+0x4f/0xb0 __ia32_sys_sync+0xa/0x10 do_syscall_64+0x50/0x1c0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fc6409ab617 Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
86968ef2 |
|
19-Jul-2019 |
Luis Henriques <lhenriques@suse.com> |
ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr() Calling ceph_buffer_put() in __ceph_setxattr() may end up freeing the i_xattrs.prealloc_blob buffer while holding the i_ceph_lock. This can be fixed by postponing the call until later, when the lock is released. The following backtrace was triggered by fstests generic/117. BUG: sleeping function called from invalid context at mm/vmalloc.c:2283 in_atomic(): 1, irqs_disabled(): 0, pid: 650, name: fsstress 3 locks held by fsstress/650: #0: 00000000870a0fe8 (sb_writers#8){.+.+}, at: mnt_want_write+0x20/0x50 #1: 00000000ba0c4c74 (&type->i_mutex_dir_key#6){++++}, at: vfs_setxattr+0x55/0xa0 #2: 000000008dfbb3f2 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: __ceph_setxattr+0x297/0x810 CPU: 1 PID: 650 Comm: fsstress Not tainted 5.2.0+ #437 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x67/0x90 ___might_sleep.cold+0x9f/0xb1 vfree+0x4b/0x60 ceph_buffer_release+0x1b/0x60 __ceph_setxattr+0x2b4/0x810 __vfs_setxattr+0x66/0x80 __vfs_setxattr_noperm+0x59/0xf0 vfs_setxattr+0x81/0xa0 setxattr+0x115/0x230 ? filename_lookup+0xc9/0x140 ? rcu_read_lock_sched_held+0x74/0x80 ? rcu_sync_lockdep_assert+0x2e/0x60 ? __sb_start_write+0x142/0x1a0 ? mnt_want_write+0x20/0x50 path_setxattr+0xba/0xd0 __x64_sys_lsetxattr+0x24/0x30 do_syscall_64+0x50/0x1c0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7ff23514359a Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
26350535 |
|
24-Jun-2019 |
Jeff Layton <jlayton@kernel.org> |
ceph: don't NULL terminate virtual xattrs The convention with xattrs is to not store the termination with string data, given that it returns the length. This is how setfattr/getfattr operate. Most of ceph's virtual xattr routines use snprintf to plop the string directly into the destination buffer, but snprintf always NULL terminates the string. This means that if we send the kernel a buffer that is the exact length needed to hold the string, it'll end up truncated. Add a ceph_fmt_xattr helper function to format the string into an on-stack buffer that should always be large enough to hold the whole thing and then memcpy the result into the destination buffer. If it does turn out that the formatted string won't fit in the on-stack buffer, then return -E2BIG and do a WARN_ONCE(). Change over most of the virtual xattr routines to use the new helper. A couple of the xattrs are sourced from strings however, and it's difficult to know how long they'll be. Just have those memcpy the result in place after verifying the length. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Acked-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
3b421018 |
|
13-Jun-2019 |
Jeff Layton <jlayton@kernel.org> |
ceph: return -ERANGE if virtual xattr value didn't fit in buffer The getxattr manpage states that we should return ERANGE if the destination buffer size is too small to hold the value. ceph_vxattrcb_layout does this internally, but we should be doing this for all vxattrs. Fix the only caller of getxattr_cb to check the returned size against the buffer length and return -ERANGE if it doesn't fit. Drop the same check in ceph_vxattrcb_layout and just rely on the caller to handle it. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Acked-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
f1d1b51d |
|
24-Jun-2019 |
Jeff Layton <jlayton@kernel.org> |
ceph: make getxattr_cb return ssize_t The getxattr_cb functions return size_t, which is unsigned and then cast that value to int and then ssize_t before returning it. While all of this works, it relies on implicit casting rules for signed/unsigned conversions. Change getxattr_cb to return ssize_t to better conform with what the caller actually wants. Also, remove some suspicious casts. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Acked-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
ac6713cc |
|
26-May-2019 |
Yan, Zheng <zyan@redhat.com> |
ceph: add selinux support When creating new file/directory, use security_dentry_init_security() to prepare selinux context for the new inode, then send openc/mkdir request to MDS, together with selinux xattr. security_dentry_init_security() only supports single security module and only selinux has dentry_init_security hook. So only selinux is supported for now. We can add support for other security modules once kernel has a generic version of dentry_init_security() Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
5c31e92d |
|
26-May-2019 |
Yan, Zheng <zyan@redhat.com> |
ceph: rename struct ceph_acls_info to ceph_acl_sec_ctx Also rename ceph_release_acls_info() to ceph_release_acl_sec_ctx(). And move their definitions to different files. This is preparation for security label support. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
05729781 |
|
27-May-2019 |
Yan, Zheng <zyan@redhat.com> |
ceph: fix debug print format in __set_xattr() name is not '\0' terminated. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
71880728 |
|
15-May-2019 |
David Disseldorp <ddiss@suse.de> |
ceph: fix "ceph.dir.rctime" vxattr value The vxattr value incorrectly places a "09" prefix to the nanoseconds field, instead of providing it as a zero-pad width specifier after '%'. Fixes: 3489b42a72a4 ("ceph: fix three bugs, two in ceph_vxattrcb_file_layout()") Link: https://tracker.ceph.com/issues/39943 Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
d0f191d2 |
|
18-Apr-2019 |
David Disseldorp <ddiss@suse.de> |
ceph: remove unused vxattr length helpers ceph_listxattr() now calculates the length of vxattrs dynamically, so these helpers, which incorrectly ignore vxattr.exists_cb(), can be removed. Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
2b2abcac |
|
18-Apr-2019 |
David Disseldorp <ddiss@suse.de> |
ceph: fix listxattr vxattr buffer length calculation ceph_listxattr() incorrectly returns a length based on the static ceph_vxattrs_name_size() value, which only takes into account whether vxattrs are hidden, ignoring vxattr.exists_cb(). When filling the xattr buffer ceph_listxattr() checks VXATTR_FLAG_HIDDEN and vxattr.exists_cb(). If both are false, we return an incorrect (oversize) length. Fix this behaviour by always calculating the vxattrs length at runtime, taking both vxattr.hidden and vxattr.exists_cb() into account. This bug is only exposed with the new "ceph.snap.btime" vxattr, as all other vxattrs with a non-null exists_cb also carry VXATTR_FLAG_HIDDEN. Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
100cc610 |
|
18-Apr-2019 |
David Disseldorp <ddiss@suse.de> |
ceph: add ceph.snap.btime vxattr The ceph.snap.btime virtual xattr provides the snapshot creation (birth) time in $secs.$nsecs format. Link: https://tracker.ceph.com/issues/38838 Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
e1b81439 |
|
18-Apr-2019 |
David Disseldorp <ddiss@suse.de> |
ceph: clean up ceph.dir.pin vxattr name sizeof() .name_size should use the same string as .name. Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
08796873 |
|
08-Jan-2019 |
Yan, Zheng <zyan@redhat.com> |
ceph: support getting ceph.dir.pin vxattr Link: http://tracker.ceph.com/issues/37576 Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
33165d47 |
|
28-Sep-2018 |
Ilya Dryomov <idryomov@gmail.com> |
libceph: introduce ceph_pagelist_alloc() struct ceph_pagelist cannot be embedded into anything else because it has its own refcount. Merge allocation and initialization together. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
9bbeab41 |
|
13-Jul-2018 |
Arnd Bergmann <arnd@arndb.de> |
ceph: use timespec64 for inode timestamp Since the vfs structures are all using timespec64, we can now change the internal representation, using ceph_encode_timespec64 and ceph_decode_timespec64. In case of ceph_aux_inode however, we need to avoid doing a memcmp() on uninitialized padding data, so the members of the i_mtime field get copied individually into 64-bit integers. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
49a9f4f6 |
|
25-Apr-2018 |
Yan, Zheng <zyan@redhat.com> |
ceph: always get rstat from auth mds rstat is not tracked by capability. client can't know if rstat from non-auth mds is uptodate or not. Link: http://tracker.ceph.com/issues/23538 Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
4e9906e7 |
|
25-Apr-2018 |
Yan, Zheng <zyan@redhat.com> |
ceph: use bit flags to define vxattr attributes Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
f1919826 |
|
07-Apr-2018 |
Yan, Zheng <zyan@redhat.com> |
ceph: check if mds create snaprealm when setting quota If mds does not, return -EOPNOTSUPP. Link: http://tracker.ceph.com/issues/23491 Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
fb18a575 |
|
05-Jan-2018 |
Luis Henriques <lhenriques@suse.com> |
ceph: quota: add initial infrastructure to support cephfs quotas This patch adds the infrastructure required to support cephfs quotas as it is currently implemented in the ceph fuse client. Cephfs quotas can be set on any directory, and can restrict the number of bytes or the number of files stored beneath that point in the directory hierarchy. Quotas are set using the extended attributes 'ceph.quota.max_files' and 'ceph.quota.max_bytes', and can be removed by setting these attributes to '0'. Link: http://tracker.ceph.com/issues/22372 Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
b2441318 |
|
01-Nov-2017 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
d37b1d99 |
|
20-Aug-2017 |
Markus Elfring <elfring@users.sourceforge.net> |
ceph: adjust 36 checks for NULL pointers The script “checkpatch.pl” pointed information out like the following. Comparison to NULL could be written ... Thus fix the affected source code places. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Reviewed-by: Yan, Zheng <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
1684dd03 |
|
14-Jun-2017 |
Yan, Zheng <zyan@redhat.com> |
ceph: getattr before read on ceph.* xattrs Previously we were returning values for quota, layout xattrs without any kind of update -- the user just got whatever happened to be in our cache. Clearly this extra round trip has a cost, but reads of these xattrs are fairly rare, happening on admin intervention rather than in normal operation. Link: http://tracker.ceph.com/issues/17939 Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
eeca958d |
|
28-Apr-2017 |
Luis Henriques <lhenriques@suse.com> |
ceph: fix memory leak in __ceph_setxattr() The ceph_inode_xattr needs to be released when removing an xattr. Easily reproducible running the 'generic/020' test from xfstests or simply by doing: attr -s attr0 -V 0 /mnt/test && attr -r attr0 /mnt/test While there, also fix the error path. Here's the kmemleak splat: unreferenced object 0xffff88001f86fbc0 (size 64): comm "attr", pid 244, jiffies 4294904246 (age 98.464s) hex dump (first 32 bytes): 40 fa 86 1f 00 88 ff ff 80 32 38 1f 00 88 ff ff @........28..... 00 01 00 00 00 00 ad de 00 02 00 00 00 00 ad de ................ backtrace: [<ffffffff81560199>] kmemleak_alloc+0x49/0xa0 [<ffffffff810f3e5b>] kmem_cache_alloc+0x9b/0xf0 [<ffffffff812b157e>] __ceph_setxattr+0x17e/0x820 [<ffffffff812b1c57>] ceph_set_xattr_handler+0x37/0x40 [<ffffffff8111fb4b>] __vfs_removexattr+0x4b/0x60 [<ffffffff8111fd37>] vfs_removexattr+0x77/0xd0 [<ffffffff8111fdd1>] removexattr+0x41/0x60 [<ffffffff8111fe65>] path_removexattr+0x75/0xa0 [<ffffffff81120aeb>] SyS_lremovexattr+0xb/0x10 [<ffffffff81564b20>] entry_SYSCALL_64_fastpath+0x13/0x94 [<ffffffffffffffff>] 0xffffffffffffffff Cc: stable@vger.kernel.org Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
5130ccea |
|
17-Oct-2016 |
Wei Yongjun <weiyongjun1@huawei.com> |
ceph: fix non static symbol warning Fixes the following sparse warning: fs/ceph/xattr.c:19:28: warning: symbol 'ceph_other_xattr_handler' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
c2050a45 |
|
14-Sep-2016 |
Deepa Dinamani <deepa.kernel@gmail.com> |
fs: Replace current_fs_time() with current_time() current_fs_time() uses struct super_block* as an argument. As per Linus's suggestion, this is changed to take struct inode* as a parameter instead. This is because the function is primarily meant for vfs inode timestamps. Also the function was renamed as per Arnd's suggestion. Change all calls to current_fs_time() to use the new current_time() function instead. current_fs_time() will be deleted. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
779fe0fb |
|
06-Mar-2016 |
Yan, Zheng <zyan@redhat.com> |
ceph: rados pool namespace support This patch adds codes that decode pool namespace information in cap message and request reply. Pool namespace is saved in i_layout, it will be passed to libceph when doing read/write. Signed-off-by: Yan, Zheng <zyan@redhat.com>
|
#
7627151e |
|
03-Feb-2016 |
Yan, Zheng <zyan@redhat.com> |
libceph: define new ceph_file_layout structure Define new ceph_file_layout structure and rename old ceph_file_layout to ceph_file_layout_legacy. This is preparation for adding namespace to ceph_file_layout structure. Signed-off-by: Yan, Zheng <zyan@redhat.com>
|
#
59301226 |
|
27-May-2016 |
Al Viro <viro@zeniv.linux.org.uk> |
switch xattr_handler->set() to passing dentry and inode separately preparation for similar switch in ->setxattr() (see the next commit for rationale). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
04303d8a |
|
20-Apr-2016 |
Yan, Zheng <zyan@redhat.com> |
ceph: use CEPH_MDS_OP_RMXATTR request to remove xattr Setxattr with NULL value and XATTR_REPLACE flag should be equivalent to removexattr. But current MDS does not support deleting vxattrs through MDS_OP_SETXATTR request. The workaround is sending MDS_OP_RMXATTR request if setxattr actually removs xattr. Signed-off-by: Yan, Zheng <zyan@redhat.com>
|
#
5aea3dcd |
|
28-Apr-2016 |
Ilya Dryomov <idryomov@gmail.com> |
libceph: a major OSD client update This is a major sync up, up to ~Jewel. The highlights are: - per-session request trees (vs a global per-client tree) - per-session locking (vs a global per-client rwlock) - homeless OSD session - no ad-hoc global per-client lists - support for pool quotas - foundation for watch/notify v2 support - foundation for map check (pool deletion detection) support The switchover is incomplete: lingering requests can be setup and teared down but aren't ever reestablished. This functionality is restored with the introduction of the new lingering infrastructure (ceph_osd_linger_request, linger_work, etc) in a later commit. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
#
b971e94e |
|
13-Apr-2016 |
Yan, Zheng <zyan@redhat.com> |
ceph: kill __ceph_removexattr() when removing a xattr, generic_removexattr() calls __ceph_setxattr() with NULL value and XATTR_REPLACE flag. __ceph_removexattr() is not used any more. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
2cdeb1e4 |
|
13-Apr-2016 |
Andreas Gruenbacher <agruenba@redhat.com> |
ceph: Switch to generic xattr handlers Add a catch-all xattr handler at the end of ceph_xattr_handlers. Check for valid attribute names there, and remove those checks from __ceph_{get,set,remove}xattr instead. No "system.*" xattrs need to be handled by the catch-all handler anymore. The set xattr handler is called with a NULL value to indicate that the attribute should be removed; __ceph_setxattr already handles that case correctly (ceph_set_acl could already calling __ceph_setxattr with a NULL value). Move the check for snapshots from ceph_{set,remove}xattr into __ceph_{set,remove}xattr. With that, ceph_{get,set,remove}xattr can be replaced with the generic iops. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
a26fecca |
|
13-Apr-2016 |
Andreas Gruenbacher <agruenba@redhat.com> |
ceph: Get rid of d_find_alias in ceph_set_acl Create a variant of ceph_setattr that takes an inode instead of a dentry. Change __ceph_setxattr (and also __ceph_removexattr) to take an inode instead of a dentry. Use those in ceph_set_acl so that we no longer need a dentry there. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
ce23e640 |
|
10-Apr-2016 |
Al Viro <viro@zeniv.linux.org.uk> |
->getxattr(): pass dentry and inode as separate arguments Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
315f2408 |
|
06-Mar-2016 |
Yan, Zheng <zyan@redhat.com> |
ceph: fix security xattr deadlock When security is enabled, security module can call filesystem's getxattr/setxattr callbacks during d_instantiate(). For cephfs, d_instantiate() is usually called by MDS' dispatch thread, while handling MDS reply. If the MDS reply does not include xattrs and corresponding caps, getxattr/setxattr need to send a new request to MDS and waits for the reply. This makes MDS' dispatch sleep, nobody handles later MDS replies. The fix is make sure lookup/atomic_open reply include xattrs and corresponding caps. So getxattr can be handled by cached xattrs. This requires some modification to both MDS and request message. (Client tells MDS what caps it wants; MDS encodes proper caps in the reply) Smack security module may call setxattr during d_instantiate(). Unlike getxattr, we can't force MDS to issue CEPH_CAP_XATTR_EXCL to us. So just make setxattr return error when called by MDS' dispatch thread. Signed-off-by: Yan, Zheng <zyan@redhat.com>
|
#
29dccfa5 |
|
11-Mar-2016 |
Yan, Zheng <zyan@redhat.com> |
ceph: don't request vxattrs from MDS It's uselese because MDS reply does not carry any vxattr. Signed-off-by: Yan, Zheng <zyan@redhat.com>
|
#
8bbd4714 |
|
02-Feb-2016 |
Deepa Dinamani <deepa.kernel@gmail.com> |
ceph: replace CURRENT_TIME by current_fs_time() CURRENT_TIME macro is not appropriate for filesystems as it doesn't use the right granularity for filesystem timestamps. Use current_fs_time() instead. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: Yan, Zheng <zyan@redhat.com>
|
#
f66fd9f0 |
|
10-Jun-2015 |
Yan, Zheng <zyan@redhat.com> |
ceph: pre-allocate data structure that tracks caps flushing Signed-off-by: Yan, Zheng <zyan@redhat.com>
|
#
604d1b02 |
|
01-May-2015 |
Yan, Zheng <zyan@redhat.com> |
ceph: take snap_rwsem when accessing snap realm's cached_context When ceph inode's i_head_snapc is NULL, __ceph_mark_dirty_caps() accesses snap realm's cached_context. So we need take read lock of snap_rwsem. Signed-off-by: Yan, Zheng <zyan@redhat.com>
|
#
a149bb9a |
|
20-Mar-2015 |
Sanidhya Kashyap <sanidhya.gatech@gmail.com> |
ceph: kstrdup() memory handling Currently, there is no check for the kstrdup() for r_path2, r_path1 and snapdir_name as various locations as there is a possibility of failure during memory pressure. Therefore, returning ENOMEM where the checks have been missed. Signed-off-by: Sanidhya Kashyap <sanidhya.gatech@gmail.com> Signed-off-by: Yan, Zheng <zyan@redhat.com>
|
#
2b0143b5 |
|
17-Mar-2015 |
David Howells <dhowells@redhat.com> |
VFS: normal filesystems (and lustre): d_inode() annotations that's the bulk of filesystem drivers dealing with inodes of their own Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
0aeff37a |
|
17-Dec-2014 |
Yan, Zheng <zyan@redhat.com> |
ceph: fix setting empty extended attribute make sure 'value' is not null. otherwise __ceph_setxattr will remove the extended attribute. Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
|
#
25e6bae3 |
|
16-Sep-2014 |
Yan, Zheng <zyan@redhat.com> |
ceph: use pagelist to present MDS request data Current code uses page array to present MDS request data. Pages in the array are allocated/freed by caller of ceph_mdsc_do_request(). If request is interrupted, the pages can be freed while they are still being used by the request message. The fix is use pagelist to present MDS request data. Pagelist is reference counted. Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
|
#
0abb43dc |
|
18-Sep-2014 |
Yan, Zheng <zyan@redhat.com> |
ceph: fix llistxattr on symlink only regular file and directory have vxattrs. Signed-off-by: Yan, Zheng <zyan@redhat.com>
|
#
508b32d8 |
|
16-Sep-2014 |
Yan, Zheng <zyan@redhat.com> |
ceph: request xattrs if xattr_version is zero Following sequence of events can happen. - Client releases an inode, queues cap release message. - A 'lookup' reply brings the same inode back, but the reply doesn't contain xattrs because MDS didn't receive the cap release message and thought client already has up-to-data xattrs. The fix is force sending a getattr request to MDS if xattrs_version is 0. The getattr mask is set to CEPH_STAT_CAP_XATTR, so MDS knows client does not have xattr. Signed-off-by: Yan, Zheng <zyan@redhat.com>
|
#
7e8a2952 |
|
25-Jul-2014 |
Ilya Dryomov <ilya.dryomov@inktank.com> |
ceph: fix sizeof(struct tYpO *) typo struct ceph_xattr -> struct ceph_inode_xattr Reported-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
|
#
1a295bd8 |
|
24-Jul-2014 |
Ilya Dryomov <ilya.dryomov@inktank.com> |
ceph: remove redundant memset(0) xattrs array of pointers is allocated with kcalloc() - no need to memset() it to 0 right after that. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
|
#
cc48c3e8 |
|
24-Mar-2014 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: don't include ceph.{file,dir}.layout vxattr in listxattr() This avoids 'cp -a' modifying layout of new files/directories. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
#
1e5c6649 |
|
23-Mar-2014 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: check buffer size in ceph_vxattrcb_layout() If buffer size is zero, return the size of layout vxattr. If buffer size is not zero, check if it is large enough for layout vxattr. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
#
752c8bdc |
|
05-Feb-2013 |
Sage Weil <sage@inktank.com> |
ceph: do not chain inode updates to parent fsync The fsync(dirfd) only covers namespace operations, not inode updates. We do not need to cover setattr variants or O_TRUNC. Reported-by: Al Viro <viro@xeniv.linux.org.uk> Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
|
#
524186ac |
|
10-Feb-2014 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: fix ceph_removexattr() Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
#
bcdfeb2e |
|
10-Feb-2014 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: remove xattr when null value is given to setxattr() For the setxattr request, introduce a new flag CEPH_XATTR_REMOVE to distinguish null value case from the zero-length value case. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
#
fbc0b970 |
|
10-Feb-2014 |
Yan, Zheng <zheng.z.yan@intel.com> |
ceph: properly handle XATTR_CREATE and XATTR_REPLACE return -EEXIST if XATTR_CREATE is set and xattr alread exists. return -ENODATA if XATTR_REPLACE is set but xattr does not exist. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
#
4db658ea |
|
28-Jan-2014 |
Linus Torvalds <torvalds@linux-foundation.org> |
ceph: Fix up after semantic merge conflict The previous ceph-client merge resulted in ceph not even building, because there was a merge conflict that wasn't visible as an actual data conflict: commit 7221fe4c2ed7 ("ceph: add acl for cephfs") added support for POSIX ACL's into Ceph, but unluckily we also had the VFS tree change a lot of the POSIX ACL helper functions to be much more helpful to filesystems (see for example commits 2aeccbe957d0 "fs: add generic xattr_acl handlers", 5bf3258fd2ac "fs: make posix_acl_chmod more useful" and 37bc15392a23 "fs: make posix_acl_create more useful") The reason this conflict wasn't obvious was many-fold: because it was a semantic conflict rather than a data conflict, it wasn't visible in the git merge as a conflict. And because the VFS tree hadn't been in linux-next, people hadn't become aware of it that way. And because I was at jury duty this morning, I was using my laptop and as a result not doing constant "allmodconfig" builds. Anyway, this fixes the build and generally removes a fair chunk of the Ceph POSIX ACL support code, since the improved helpers seem to match really well for Ceph too. But I don't actually have any way to *test* the end result, and I was really hoping for some ACK's for this. Oh, well. Not compiling certainly doesn't make things easier to test, so I'm committing this without the acks after having waited for four hours... Plus it's what I would have done for the merge had I noticed the semantic conflict.. Reported-by: Dave Jones <davej@redhat.com> Cc: Sage Weil <sage@inktank.com> Cc: Guangliang Zhao <lucienchao@gmail.com> Cc: Li Wang <li.wang@ubuntykylin.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
7221fe4c |
|
11-Nov-2013 |
Guangliang Zhao <lucienchao@gmail.com> |
ceph: add acl for cephfs Signed-off-by: Guangliang Zhao <lucienchao@gmail.com> Reviewed-by: Li Wang <li.wang@ubuntykylin.com> Reviewed-by: Zheng Yan <zheng.z.yan@intel.com>
|
#
a1dc1937 |
|
19-Jun-2013 |
majianpeng <majianpeng@gmail.com> |
ceph: fix sleeping function called from invalid context. [ 1121.231883] BUG: sleeping function called from invalid context at kernel/rwsem.c:20 [ 1121.231935] in_atomic(): 1, irqs_disabled(): 0, pid: 9831, name: mv [ 1121.231971] 1 lock held by mv/9831: [ 1121.231973] #0: (&(&ci->i_ceph_lock)->rlock){+.+...},at:[<ffffffffa02bbd38>] ceph_getxattr+0x58/0x1d0 [ceph] [ 1121.231998] CPU: 3 PID: 9831 Comm: mv Not tainted 3.10.0-rc6+ #215 [ 1121.232000] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080015 11/09/2011 [ 1121.232027] ffff88006d355a80 ffff880092f69ce0 ffffffff8168348c ffff880092f69cf8 [ 1121.232045] ffffffff81070435 ffff88006d355a20 ffff880092f69d20 ffffffff816899ba [ 1121.232052] 0000000300000004 ffff8800b76911d0 ffff88006d355a20 ffff880092f69d68 [ 1121.232056] Call Trace: [ 1121.232062] [<ffffffff8168348c>] dump_stack+0x19/0x1b [ 1121.232067] [<ffffffff81070435>] __might_sleep+0xe5/0x110 [ 1121.232071] [<ffffffff816899ba>] down_read+0x2a/0x98 [ 1121.232080] [<ffffffffa02baf70>] ceph_vxattrcb_layout+0x60/0xf0 [ceph] [ 1121.232088] [<ffffffffa02bbd7f>] ceph_getxattr+0x9f/0x1d0 [ceph] [ 1121.232093] [<ffffffff81188d28>] vfs_getxattr+0xa8/0xd0 [ 1121.232097] [<ffffffff8118900b>] getxattr+0xab/0x1c0 [ 1121.232100] [<ffffffff811704f2>] ? final_putname+0x22/0x50 [ 1121.232104] [<ffffffff81155f80>] ? kmem_cache_free+0xb0/0x260 [ 1121.232107] [<ffffffff811704f2>] ? final_putname+0x22/0x50 [ 1121.232110] [<ffffffff8109e63d>] ? trace_hardirqs_on+0xd/0x10 [ 1121.232114] [<ffffffff816957a7>] ? sysret_check+0x1b/0x56 [ 1121.232120] [<ffffffff81189c9c>] SyS_fgetxattr+0x6c/0xc0 [ 1121.232125] [<ffffffff81695782>] system_call_fastpath+0x16/0x1b [ 1121.232129] BUG: scheduling while atomic: mv/9831/0x10000002 [ 1121.232154] 1 lock held by mv/9831: [ 1121.232156] #0: (&(&ci->i_ceph_lock)->rlock){+.+...}, at: [<ffffffffa02bbd38>] ceph_getxattr+0x58/0x1d0 [ceph] I think move the ci->i_ceph_lock down is safe because we can't free ceph_inode_info at there. CC: stable@vger.kernel.org # 3.8+ Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> Reviewed-by: Sage Weil <sage@inktank.com>
|
#
2c3dd4ff |
|
18-Feb-2013 |
Alex Elder <elder@inktank.com> |
ceph: eliminate sparse warnings in fs code Fix the causes for sparse warnings reported in the ceph file system code. Here there are only two (and they're sort of silly but they're easy to fix). This partially resolves: http://tracker.ceph.com/issues/4184 Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
|
#
695b7119 |
|
20-Jan-2013 |
Sage Weil <sage@inktank.com> |
ceph: implement hidden per-field ceph.*.layout.* vxattrs Allow individual fields of the layout to be fetched via getxattr. The ceph.dir.layout.* vxattr with "disappear" if the exists_cb indicates there no dir layout set. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Sam Lang <sam.lang@inktank.com>
|
#
1f08f2b0 |
|
20-Jan-2013 |
Sage Weil <sage@inktank.com> |
ceph: add ceph.dir.layout vxattr This virtual xattr will only appear when there is a dir layout policy set on the directory. It can be set via setxattr and removed via removexattr (implemented by the MDS). Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Sam Lang <sam.lang@inktank.com>
|
#
32ab0bd7 |
|
19-Jan-2013 |
Sage Weil <sage@inktank.com> |
ceph: change ceph.file.layout.* implementation, content Implement a new method to generate the ceph.file.layout vxattr using the new framework. Use 'stripe_unit' instead of 'chunk_size'. Include pool name, either as a string or as an integer. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Sam Lang <sam.lang@inktank.com>
|
#
b65917dd |
|
20-Jan-2013 |
Sage Weil <sage@inktank.com> |
ceph: fix listxattr handling for vxattrs Only include vxattrs in the result if they are not hidden and exist (as determined by the exists_cb callback). Note that the buffer size we return when 0 is passed in always includes vxattrs that *might* exist, forming an upper bound. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Sam Lang <sam.lang@inktank.com>
|
#
0bee82fb |
|
20-Jan-2013 |
Sage Weil <sage@inktank.com> |
ceph: fix getxattr vxattr handling Change the vxattr handling for getxattr so that vxattrs are checked prior to any xattr content, and never after. Enforce vxattr existence via the exists_cb callback. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Sam Lang <sam.lang@inktank.com>
|
#
f36e4472 |
|
20-Jan-2013 |
Sage Weil <sage@inktank.com> |
ceph: add exists_cb to vxattr struct Allow for a callback to dynamically determine if a vxattr exists for the given inode. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Sam Lang <sam.lang@inktank.com>
|
#
d421acb1 |
|
20-Jan-2013 |
Sage Weil <sage@inktank.com> |
ceph: pass ceph.* removexattrs through to MDS If we do not explicitly recognized a vxattr (e.g., as readonly), pass the request through to the MDS and deal with it there. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Sam Lang <sam.lang@inktank.com>
|
#
3adf654d |
|
31-Jan-2013 |
Sage Weil <sage@inktank.com> |
ceph: pass unhandled ceph.* setxattrs through to MDS If we do not specifically understand a setxattr on a ceph.* virtual xattr, send it through to the MDS. This allows us to implement new functionality via the MDS without direct support on the client side. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Sam Lang <sam.lang@inktank.com>
|
#
8860147a |
|
31-Jan-2013 |
Sage Weil <sage@inktank.com> |
ceph: support hidden vxattrs Add ability to flag virtual xattrs as hidden, such that you can getxattr them but they do not appear in listxattr. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Sam Lang <sam.lang@inktank.com>
|
#
39b648d9 |
|
31-Jan-2013 |
Sage Weil <sage@inktank.com> |
ceph: remove 'ceph.layout' virtual xattr This has been deprecated since v3.3, 114fc474. Kill it. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Sam Lang <sam.lang@inktank.com>
|
#
21ec6ffa |
|
20-Jul-2012 |
Alan Cox <alan@linux.intel.com> |
ceph: fix potential double free We re-run the loop but we don't re-set the attrs pointer back to NULL. Signed-off-by: Alan Cox <alan@linux.intel.com> Reviewed-by: Alex Elder <elder@inktank.com>
|
#
3469ac1a |
|
07-May-2012 |
Sage Weil <sage@inktank.com> |
ceph: drop support for preferred_osd pgs This was an ill-conceived feature that has been removed from Ceph. Do this gracefully: - reject attempts to specify a preferred_osd via the ioctl - stop exposing this information via virtual xattrs - always fill in -1 for requests, in case we talk to an older server - don't calculate preferred_osd placements/pgids Reviewed-by: Alex Elder <elder@inktank.com> Signed-off-by: Sage Weil <sage@inktank.com>
|
#
3489b42a |
|
08-Mar-2012 |
Alex Elder <elder@dreamhost.com> |
ceph: fix three bugs, two in ceph_vxattrcb_file_layout() In ceph_vxattrcb_file_layout(), there is a check to determine whether a preferred PG should be formatted into the output buffer. That check assumes that a preferred PG number of 0 indicates "no preference," but that is wrong. No preference is indicated by a negative (specifically, -1) PG number. In addition, if that condition yields true, the preferred value is formatted into a sized buffer, but the size consumed by the earlier snprintf() call is not accounted for, opening up the possibilty of a buffer overrun. Finally, in ceph_vxattrcb_dir_rctime() where the nanoseconds part of the time displayed did not include leading 0's, which led to erroneous (sub-second portion of) time values being shown. This fixes these three issues: http://tracker.newdream.net/issues/2155 http://tracker.newdream.net/issues/2156 http://tracker.newdream.net/issues/2157 Signed-off-by: Alex Elder <elder@dreamhost.com> Reviewed-by: Sage Weil <sage@newdream.net>
|
#
18fa8b3f |
|
23-Jan-2012 |
Alex Elder <elder@dreamhost.com> |
ceph: make ceph_setxattr() and ceph_removexattr() more alike This patch just rearranges a few bits of code to make more portions of ceph_setxattr() and ceph_removexattr() identical. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
3ce6cd12 |
|
23-Jan-2012 |
Alex Elder <elder@dreamhost.com> |
ceph: avoid repeatedly computing the size of constant vxattr names All names defined in the directory and file virtual extended attribute tables are constant, and the size of each is known at compile time. So there's no need to compute their length every time any file's attribute is listed. Record the length of each string and use it when needed to determine the space need to represent them. In addition, compute the aggregate size of strings in each table just once at initialization time. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
aa4066ed |
|
23-Jan-2012 |
Alex Elder <elder@dreamhost.com> |
ceph: encode type in vxattr callback routines The names of the callback functions used for virtual extended attributes are based only on the last component of the attribute name. Because of the way these are defined, this precludes allowing a single (lowest) attribute name for different callbacks, dependent on the type of file being operated on. (For example, it might be nice to support both "ceph.dir.layout" and "ceph.file.layout".) Just change the callback names to avoid this problem. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
881a5fa2 |
|
23-Jan-2012 |
Alex Elder <elder@dreamhost.com> |
ceph: drop "_cb" from name of struct ceph_vxattr_cb A struct ceph_vxattr_cb does not represent a callback at all, but rather a virtual extended attribute itself. Drop the "_cb" suffix from its name to reflect that. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
eb788084 |
|
23-Jan-2012 |
Alex Elder <elder@dreamhost.com> |
ceph: use macros to normalize vxattr table definitions Entries in the ceph virtual extended attribute tables all follow a distinct pattern in their definition. Enforce this pattern through the use of a macro. Also, a null name field signals the end of the table, so make that be the first field in the ceph_vxattr_cb structure. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
22891907 |
|
23-Jan-2012 |
Alex Elder <elder@dreamhost.com> |
ceph: use a symbolic name for "ceph." extended attribute namespace Use symbolic constants to define the top-level prefix for "ceph." extended attribute names. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
06476a69 |
|
23-Jan-2012 |
Alex Elder <elder@dreamhost.com> |
ceph: pass inode rather than table to ceph_match_vxattr() All callers of ceph_match_vxattr() determine what to pass as the first argument by calling ceph_inode_vxattrs(inode). Just do that inside ceph_match_vxattr() itself, changing it to take an inode rather than the vxattr pointer as its first argument. Also ensure the function works correctly for an empty table (i.e., containing only a terminating null entry). Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
b829c195 |
|
23-Jan-2012 |
Alex Elder <elder@dreamhost.com> |
ceph: don't null-terminate xattr values For some reason, ceph_setxattr() allocates an extra byte in which a '\0' is stored past the end of an extended attribute value. This is not needed, and is potentially misleading, so get rid of it. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
114fc474 |
|
11-Jan-2012 |
Alex Elder <elder@dreamhost.com> |
ceph: change "ceph.layout" xattr to be "ceph.file.layout" The virtual extended attribute named "ceph.layout" is meaningful only for regular files. Change its name to be "ceph.file.layout" to more directly reflect that in the ceph xattr namespace. Preserve the old "ceph.layout" name for the time being (until we decide it's safe to get rid of it entirely). Add a missing initializer for "readonly" in the terminating entry. Signed-off-by: Alex Elder <elder@dreamhost.com> Reviewed-by: Sage Weil <sage@newdream.net>
|
#
83eb26af |
|
11-Jan-2012 |
Alex Elder <elder@dreamhost.com> |
ceph: ensure prealloc_blob is in place when removing xattr In __ceph_build_xattrs_blob(), if a ceph inode's extended attributes are marked dirty, all attributes recorded in its rb_tree index are formatted into a "blob" buffer. The target buffer is recorded in ceph_inode->i_xattrs.prealloc_blob, and it is expected to exist and be of sufficient size to hold the attributes. The extended attributes are marked dirty in two cases: when a new attribute is added to the inode; or when one is removed. In the former case work is done to ensure the prealloc_blob buffer is properly set up, but in the latter it is not. Change the logic in ceph_removexattr() so it matches what is done in ceph_setxattr(). Note that this is done in a way that keeps the two blocks of code nearly identical, in anticipation of a subsequent patch that encapsulates some of this logic into one or more helper routines. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
be655596 |
|
30-Nov-2011 |
Sage Weil <sage@newdream.net> |
ceph: use i_ceph_lock instead of i_lock We have been using i_lock to protect all kinds of data structures in the ceph_inode_info struct, including lists of inodes that we need to iterate over while avoiding races with inode destruction. That requires grabbing a reference to the inode with the list lock protected, but igrab() now takes i_lock to check the inode flags. Changing the list lock ordering would be a painful process. However, using a ceph-specific i_ceph_lock in the ceph inode instead of i_lock is a simple mechanical change and avoids the ordering constraints imposed by igrab(). Reported-by: Amon Ott <a.ott@m-privacy.de> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
5f21c96d |
|
26-Jul-2011 |
Sage Weil <sage@newdream.net> |
ceph: protect access to d_parent d_parent is protected by d_lock: use it when looking up a dentry's parent directory inode. Also take a reference and drop it in the caller to avoid a use-after-free. Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
70b666c3 |
|
27-May-2011 |
Sage Weil <sage@newdream.net> |
ceph: use ihold when we already have an inode ref We should use ihold whenever we already have a stable inode ref, even when we aren't holding i_lock. This avoids adding new and unnecessary locking dependencies. Signed-off-by: Sage Weil <sage@newdream.net>
|
#
fca65b4a |
|
04-May-2011 |
Sage Weil <sage@newdream.net> |
ceph: do not call __mark_dirty_inode under i_lock The __mark_dirty_inode helper now takes i_lock as of 250df6ed. Fix the one ceph callers that held i_lock (__ceph_mark_dirty_caps) to return the flags value so that the callers can do it outside of i_lock. Signed-off-by: Sage Weil <sage@newdream.net>
|
#
17db143f |
|
13-Jan-2011 |
Sage Weil <sage@newdream.net> |
ceph: fix xattr rbtree search Fix xattr name comparison in rbtree search for strings that share a prefix. The *name argument is null terminated, but the xattr name is not, so we need to use strncmp, but that means adjusting for the case where name is a prefix of xattr->name. The corresponding case in __set_xattr() already handles this properly (although in that case *name is also not null terminated). Reported-by: Sergiy Kibrik <sakib@meta.ua> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
61413c2f |
|
17-Oct-2010 |
Julia Lawall <julia@diku.dk> |
fs/ceph/xattr.c: Use kmemdup Convert a sequence of kmalloc and memcpy to use kmemdup. The semantic patch that performs this transformation is: (http://coccinelle.lip6.fr/) // <smpl> @@ expression a,flag,len; expression arg,e1,e2; statement S; @@ a = - \(kmalloc\|kzalloc\)(len,flag) + kmemdup(arg,len,flag) <... when != a if (a == NULL || ...) S ...> - memcpy(a,arg,len+1); // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
3d14c5d2 |
|
06-Apr-2010 |
Yehuda Sadeh <yehuda@hq.newdream.net> |
ceph: factor out libceph from Ceph file system This factors out protocol and low-level storage parts of ceph into a separate libceph module living in net/ceph and include/linux/ceph. This is mostly a matter of moving files around. However, a few key pieces of the interface change as well: - ceph_client becomes ceph_fs_client and ceph_client, where the latter captures the mon and osd clients, and the fs_client gets the mds client and file system specific pieces. - Mount option parsing and debugfs setup is correspondingly broken into two pieces. - The mon client gets a generic handler callback for otherwise unknown messages (mds map, in this case). - The basic supported/required feature bits can be expanded (and are by ceph_fs_client). No functional change, aside from some subtle error handling cases that got cleaned up in the refactoring process. Signed-off-by: Sage Weil <sage@newdream.net>
|
#
4a625be4 |
|
22-Aug-2010 |
Sage Weil <sage@newdream.net> |
ceph: include dirty xattrs state in snapped caps When we snapshot dirty metadata that needs to be written back to the MDS, include dirty xattr metadata. Make the capsnap reference the encoded xattr blob so that it will be written back in the FLUSHSNAP op. Also fix the capsnap creation guard to include dirty auth or file bits, not just tests specific to dirty file data or file writes in progress (this fixes auth metadata writeback). Signed-off-by: Sage Weil <sage@newdream.net>
|
#
cd84db6e |
|
11-Jun-2010 |
Yehuda Sadeh <yehuda@hq.newdream.net> |
ceph: code cleanup Mainly fixing minor issues reported by sparse. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
1a756278 |
|
11-May-2010 |
Sage Weil <sage@newdream.net> |
ceph: use ceph. prefix for virtual xattrs Drop the 'user.' prefix and use just 'ceph.' for fs virtual xattrs. Signed-off-by: Sage Weil <sage@newdream.net>
|
#
bddfa3cc |
|
29-Apr-2010 |
Henry C Chang <henry_c_chang@tcloudcomputing.com> |
ceph: listxattr should compare version by >= If the version hasn't changed, don't rebuild the index. Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
640ef79d |
|
26-Mar-2010 |
Cheng Renquan <crquan@gmail.com> |
ceph: use ceph_sb_to_client instead of ceph_client ceph_sb_to_client and ceph_client are really identical, we need to dump one; while function ceph_client is confusing with "struct ceph_client", ceph_sb_to_client's definition is more clear; so we'd better switch all call to ceph_sb_to_client. -static inline struct ceph_client *ceph_client(struct super_block *sb) -{ - return sb->s_fs_info; -} Signed-off-by: Cheng Renquan <crquan@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
c7708075 |
|
20-Mar-2010 |
Dan Carpenter <error27@gmail.com> |
ceph: cleanup: remove dead code "xattr" is never NULL here. We took care of that in the previous if statement block. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
31459fe4 |
|
17-Mar-2010 |
Yehuda Sadeh <yehuda@hq.newdream.net> |
ceph: use __page_cache_alloc and add_to_page_cache_lru Following Nick Piggin patches in btrfs, pagecache pages should be allocated with __page_cache_alloc, so they obey pagecache memory policies. Also, using add_to_page_cache_lru instead of using a private pagevec where applicable. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
5a0e3ad6 |
|
24-Mar-2010 |
Tejun Heo <tj@kernel.org> |
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
|
#
b6c1d5b8 |
|
07-Dec-2009 |
Sage Weil <sage@newdream.net> |
ceph: simplify ceph_buffer interface We never allocate the ceph_buffer and buffer separtely, so use a single constructor. Disallow put on NULL buffer; make the caller check. Signed-off-by: Sage Weil <sage@newdream.net>
|
#
60d87733 |
|
20-Nov-2009 |
Julia Lawall <julia@diku.dk> |
fs/ceph: introduce missing kfree Error handling code following a kmalloc should free the allocated data. The semantic match that finds this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @r exists@ local idexpression x; statement S; expression E; identifier f,f1,l; position p1,p2; expression *ptr != NULL; @@ x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...); ... if (x == NULL) S <... when != x when != if (...) { <+...x...+> } ( x->f1 = E | (x->f1 == NULL || ...) | f(...,x->f1,...) ) ...> ( return \(0\|<+...x...+>\|ptr\); | return@p2 ...; ) @script:python@ p1 << r.p1; p2 << r.p2; @@ print "* file: %s kmalloc %s return %s" % (p1[0].file,p1[0].line,p2[0].line) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Sage Weil <sage@newdream.net>
|
#
63ff78b2 |
|
01-Nov-2009 |
Sage Weil <sage@newdream.net> |
ceph: fix uninitialized err variable Fixes warning fs/ceph/xattr.c: In function '__build_xattrs': fs/ceph/xattr.c:353: warning: 'err' may be used uninitialized in this function Signed-off-by: Sage Weil <sage@newdream.net>
|
#
b195befd |
|
07-Oct-2009 |
Sage Weil <sage@newdream.net> |
ceph: include preferred_osd in file layout virtual xattr Signed-off-by: Sage Weil <sage@newdream.net>
|
#
355da1eb |
|
06-Oct-2009 |
Sage Weil <sage@newdream.net> |
ceph: inode operations Inode cache and inode operations. We also include routines to incorporate metadata structures returned by the MDS into the client cache, and some helpers to deal with file capabilities and metadata leases. The bulk of that work is done by fill_inode() and fill_trace(). Signed-off-by: Sage Weil <sage@newdream.net>
|