364978 |
30-Aug-2020 |
asomers |
MFC r364412:
zfs: fix EIO accessing dataset after resuming interrupted receive
ZFS unmounts a dataset while receiving into it and remounts it afterwards. But if ZFS is resuming an incomplete receive, it screws up and ends up with a dataset that is mounted, but returns EIO for every access. This commit fixes that condition.
While the vulnerable code also exists in OpenZFS, the problem is not reproducible there. Apparently OpenZFS doesn't unmount the destination dataset during receive, like FreeBSD does.
PR: 248606 Reviewed by: mmacy Sponsored by: Axcient Differential Revision: https://reviews.freebsd.org/D26034 |
363954 |
06-Aug-2020 |
markj |
MFC r363447: MFOpenZFS: Fix zpool history unbounded memory usage
PR: 247557 |
359722 |
08-Apr-2020 |
freqlabs |
MFC r359303
MFOpenZFS: ZVOLs should not be allowed to have children
zfs create, receive and rename can bypass this hierarchy rule. Update both userland and kernel module to prevent this issue and use pyzfs unit tests to exercise the ioctls directly.
Note: this commit slightly changes zfs_ioc_create() ABI. This allow to differentiate a generic error (EINVAL) from the specific case where we tried to create a dataset below a ZVOL (ZFS_ERR_WRONG_PARENT).
Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tom Caputi <tcaputi@datto.com> Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Approved by: mav (mentor) openzfs/zfs@d8d418ff0cc90776182534bce10b01e9487b63e4 |
357001 |
22-Jan-2020 |
kevans |
MFC r356876-r356877: add zfs_mount_at
r356876: libzfs: add zfs_mount_at
This will be used in libbe in place of the internal zmount(); libbe only wants to be able to mount a dataset at an arbitrary mountpoint without altering dataset/pool properties. The natural way to do this in a portable way is by creating a zfs_mount_at() interface that's effectively zfs_mount() + a mountpoint parameter. zfs_mount() is now a light wrapper around the new method.
The interface and implementation have already been accepted into ZFS On Linux, and the next commit to switch libbe() over to this new interface will solve the last compatibility issue with ZoL. The next sysutils/openzfs rebase against ZoL should be able to build libbe/bectl with only minor adjustments to build glue.
r356877: libbe: use the new zfs_mount_at()
More background is available in r356876, but this new interface is more portable across ZFS implementations and cleaner for what libbe is attempting to achieve anyways. |
353759 |
19-Oct-2019 |
avg |
MFC r353037: ZFS: add bookmark renaming |
353756 |
19-Oct-2019 |
avg |
MFC r353343: zfs: remove gratuitous divergence from other openzfs flavours |
353339 |
09-Oct-2019 |
avg |
MFC r352591: MFZoL: Retire send space estimation via ZFS_IOC_SEND
Add a small wrapper around libzfs_core's lzc_send_space() to libzfs so that every legacy ZFS_IOC_SEND consumer, along with their userland counterpart estimate_ioctl(), can leverage ZFS_IOC_SEND_SPACE to request send space estimation.
The legacy functionality in zfs_ioc_send() is left untouched for compatibility purposes.
Obtained from: ZoL Obtained from: zfsonlinux/zfs@cf7684bc8d57 Author: loli10K <ezomori.nozomu@gmail.com> |
353338 |
09-Oct-2019 |
avg |
MFC r352580 by sef: Fix a regression introduced in r344601
... and work properly with the -v and -n options.
PR: 240640 |
352722 |
25-Sep-2019 |
avg |
MFC r352590: print summary line for space estimate of zfs send from bookmark |
352598 |
22-Sep-2019 |
avg |
MFC r352447,r352449,r352507: MFZoL: Add -vnP support to 'zfs send' for bookmarks |
352376 |
16-Sep-2019 |
avg |
MFC r351803: ZFS: Always refuse receving non-resume stream when resume state exists |
350402 |
29-Jul-2019 |
bapt |
MFC r350358:
Fix a bug introduced with parallel mounting of zfs
Incorporate a fix from zol: https://github.com/zfsonlinux/zfs/commit/ab5036df1ccbe1b18c1ce6160b5829e8039d94ce
commit log from upstream: Fix race in parallel mount's thread dispatching algorithm
Strategy of parallel mount is as follows.
1) Initial thread dispatching is to select sets of mount points that don't have dependencies on other sets, hence threads can/should run lock-less and shouldn't race with other threads for other sets. Each thread dispatched corresponds to top level directory which may or may not have datasets to be mounted on sub directories.
2) Subsequent recursive thread dispatching for each thread from 1) is to mount datasets for each set of mount points. The mount points within each set have dependencies (i.e. child directories), so child directories are processed only after parent directory completes.
The problem is that the initial thread dispatching in zfs_foreach_mountpoint() can be multi-threaded when it needs to be single-threaded, and this puts threads under race condition. This race appeared as mount/unmount issues on ZoL for ZoL having different timing regarding mount(2) execution due to fork(2)/exec(2) of mount(8). `zfs unmount -a` which expects proper mount order can't unmount if the mounts were reordered by the race condition.
There are currently two known patterns of input list `handles` in `zfs_foreach_mountpoint(..,handles,..)` which cause the race condition.
1) #8833 case where input is `/a /a /a/b` after sorting. The problem is that libzfs_path_contains() can't correctly handle an input list with two same top level directories. There is a race between two POSIX threads A and B, * ThreadA for "/a" for test1 and "/a/b" * ThreadB for "/a" for test0/a and in case of #8833, ThreadA won the race. Two threads were created because "/a" wasn't considered as `"/a" contains "/a"`.
2) #8450 case where input is `/ /var/data /var/data/test` after sorting. The problem is that libzfs_path_contains() can't correctly handle an input list containing "/". There is a race between two POSIX threads A and B, * ThreadA for "/" and "/var/data/test" * ThreadB for "/var/data" and in case of #8450, ThreadA won the race. Two threads were created because "/var/data" wasn't considered as `"/" contains "/var/data"`. In other words, if there is (at least one) "/" in the input list, the initial thread dispatching must be single-threaded since every directory is a child of "/", meaning they all directly or indirectly depend on "/".
In both cases, the first non_descendant_idx() call fails to correctly determine "path1-contains-path2", and as a result the initial thread dispatching creates another thread when it needs to be single-threaded. Fix a conditional in libzfs_path_contains() to consider above two.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Sebastien Roy <sebastien.roy@delphix.com> Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
PR: 237517, 237397, 239243 Submitted by: Matthew D. Fuller <fullermd@over-yonder.net> (by email) |
346690 |
25-Apr-2019 |
mav |
MFC r344569, r344618, r344621 (by bapt):
r344569: Implement parallel mounting for ZFS filesystem
It was first implemented on Illumos and then ported to ZoL. This patch is a port to FreeBSD of the ZoL version. This patch also includes a fix for a race condition that was amended
With such patch Delphix has seen a huge decrease in latency of the mount phase (https://github.com/openzfs/openzfs/commit/a3f0e2b569 for details). With that current change Gandi has measured improvments that are on par with those reported by Delphix.
Zol commits incorporated: https://github.com/zfsonlinux/zfs/commit/a10d50f999511d304f910852c7825c70c9c9e303 https://github.com/zfsonlinux/zfs/commit/e63ac16d25fbe991a356489c86d4077567dfea21
Reviewed by: avg, sef Approved by: avg, sef Obtained from: ZoL Relnotes: yes Sponsored by: Gandi.net Differential Revision: https://reviews.freebsd.org/D19098
r344618: Fix regression introduced in r344569
Reported by: cy Tested by: cy Submitted by: Fatih Acar <fatih@gandi.net>
r344621: Fix a regression introduced in r344569
Import a fix from illumos (thanks Toomas Soomas for pointing at it)
See https://www.illumos.org/issues/10205 for more details Illumos commit: https://github.com/illumos/illumos-gate/commit/247b7da039fd88350c50e3d7fef15bdab6bef215
Submitted by: jack@gandi.net Reported by: cy Reviewed by: tsoome, cy, bapt Obtained from: Illumos |
346685 |
25-Apr-2019 |
mav |
MFC r344601 (by sef): Set process title during zfs send.
This adds a '-V' option to 'zfs send', which sets the process title once a second to the progress information.
This code has been in FreeNAS for a long time now; this is just upstreaming it here. It was originially written by delphij. |
342943 |
11-Jan-2019 |
avg |
MFC r342525: MFV r342469: 9630 add lzc_rename and lzc_destroy to libzfs_core
Relnotes: maybe |
339158 |
03-Oct-2018 |
mav |
MFC r337567 (by mmacy): Performance optimization of AVL tree comparator functions
MFV: commit ee36c709c3d5f7040e1bd11f5c75318aa03e789f Author: Gvozden Neskovic <neskovic@gmail.com> Date: Sat Aug 27 20:12:53 2016 +0200
perf: 2.75x faster ddt_entry_compare() First 256bits of ddt_key_t is a block checksum, which are expected to be close to random data. Hence, on average, comparison only needs to look at first few bytes of the keys. To reduce number of conditional jump instructions, the result is computed as: sign(memcmp(k1, k2)).
Sign of an integer 'a' can be obtained as: `(0 < a) - (a < 0)` := {-1, 0, 1} , which is computed efficiently. Synthetic performance evaluation of original and new algorithm over 1G random keys on 2.6GHz Intel(R) Xeon(R) CPU E5-2660 v3:
old 6.85789 s new 2.49089 s
perf: 2.8x faster vdev_queue_offset_compare() and vdev_queue_timestamp_compare() Compute the result directly instead of using conditionals
perf: zfs_range_compare() Speedup between 1.1x - 2.5x, depending on compiler version and optimization level.
perf: spa_error_entry_compare() `bcmp()` is not suitable for comparator use. Use `memcmp()` instead.
perf: 2.8x faster metaslab_compare() and metaslab_rangesize_compare() perf: 2.8x faster zil_bp_compare() perf: 2.8x faster mze_compare() perf: faster dbuf_compare() perf: faster compares in spa_misc perf: 2.8x faster layout_hash_compare() perf: 2.8x faster space_reftree_compare() perf: libzfs: faster avl tree comparators perf: guid_compare() perf: dsl_deadlist_compare() perf: perm_set_compare() perf: 2x faster range_tree_seg_compare() perf: faster unique_compare() perf: faster vdev_cache _compare() perf: faster vdev_uberblock_compare() perf: faster fuid _compare() perf: faster zfs_znode_hold_compare()
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com> Signed-off-by: Richard Elling <richard.elling@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #5033 |
339142 |
03-Oct-2018 |
mav |
MFC r337215: MFV 337214: 9621 Make createtxg and guid properties public
illumos/illumos-gate@e8d4a73c868afb740396041be80ed2b141065e76
Reviewed by: Andy Stormont <astormont@racktopsystems.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Yuri Pankov <yuripv@yuripv.net> Approved by: Robert Mustacchi <rm@joyent.com> Author: Josh Paetzel <josh@tcbug.org> |
339130 |
03-Oct-2018 |
mav |
MFC r337185: MFV r337184: 9457 libzfs_import.c:add_config() has a memory leak
A memory leak occurs on lines 209 and 213 because the config is not freed in the error case. The interface to add_config() seems less than ideal - it would be better if it copied any data necessary from the config and the caller freed it.
illumos/illumos-gate@ddfe901b12348d31c500fb57f9174e88860a4061
Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: sara hartse <sara.hartse@delphix.com> |
339129 |
03-Oct-2018 |
mav |
MFC r337183: MFV r337182: 9330 stack overflow when creating a deeply nested dataset
Datasets that are deeply nested (~100 levels) are impractical. We just put a limit of 50 levels to newly created datasets. Existing datasets should work without a problem.
illumos/illumos-gate@5ac95da7d61660aa299c287a39277cb0372be959
Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: Matt Ahrens <matt@delphix.com> Approved by: Garrett D'Amore <garrett@damore.org> Author: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com> |
339119 |
03-Oct-2018 |
mav |
MFC r337163: MFV r337161: 9512 zfs remap poolname@snapname coredumps
Only filesystems and volumes are valid "zfs remap" parameters: when passed a snapshot name zfs_remap_indirects() does not handle the EINVAL returned from libzfs_core, which results in failing an assertion and consequently crashing.
illumos/illumos-gate@0b2e8253986c5c761129b58cfdac46d204903de1
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: John Wren Kennedy <john.kennedy@delphix.com> Reviewed by: Sara Hartse <sara.hartse@delphix.com> Approved by: Matt Ahrens <mahrens@delphix.com> Author: loli10K <ezomori.nozomu@gmail.com> |
339118 |
03-Oct-2018 |
mav |
MFC r337160: Do not blindly include illumos kernel headers instead of user-space. It is not needed now, and I doubt it much helped at all, creating more confusions then good. |
339117 |
03-Oct-2018 |
mav |
MFC r337063: MFV r316926: 7955 libshare needs to initialize only those datasets being modified by the consumer
illumos/illumos-gate@8a981c3356b194b3b5c0ae9276a9cc31cd2f93a3 https://github.com/illumos/illumos-gate/commit/8a981c3356b194b3b5c0ae9276a9cc31cd2f93a3
https://www.illumos.org/issues/7955 Libshare currently initializes all available filesystems when doing any libshare operation. This requires iterating through all the filesystem multiple times, which is a huge performance problem for sharing and unsharing operations.
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com> Reviewed by: Sebastien Roy <sebastien.roy@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Yuri Pankov <yuri.pankov@gmail.com> Approved by: Gordon Ross <gordon.w.ross@gmail.com> Author: Daniel Hoffman <dj.hoffman@delphix.com>
For FreeBSD this is practically a NOP, just a diff reduction. |
339112 |
03-Oct-2018 |
mav |
MFC r337017: MFV r337014: 9421 zdb should detect and print out the number of "leaked" objects 9422 zfs diff and zdb should explicitly mark objects that are on the deleted queue
illumos/illumos-gate@20b5dafb425396adaebd0267d29e1026fc4dc413
Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Approved by: Matt Ahrens <mahrens@delphix.com> Author: Paul Dagnelie <pcd@delphix.com> |
339111 |
03-Oct-2018 |
mav |
MFC r337007: MFV r336991, r337001: 9102 zfs should be able to initialize storage devices
The first access to a disk block can incur a performance penalty on some platforms (e.g. AWS's EBS, VMware VMDKs). Therefore it is recommended that volumes be "thick provisioned", where supported by the platform (VMware). Thick provisioning is time consuming and often is ignored. If the thick provision step is omitted, customers will see suboptimal performance until we have written to all parts of the LUN. ZFS should be able to initialize any unused storage to remove any first-write penalty that exists.
illumos/illumos-gate@094e47e980b0796b94b1b8f51f462a64d246e516
Reviewed by: John Wren Kennedy <john.kennedy@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: George Wilson <george.wilson@delphix.com> |
339106 |
03-Oct-2018 |
mav |
MFC r336951: MFV r336950: 9290 device removal reduces redundancy of mirrors
Mirrors are supposed to provide redundancy in the face of whole-disk failure and silent damage (e.g. some data on disk is not right, but ZFS hasn't detected the whole device as being broken). However, the current device removal implementation bypasses some of the mirror's redundancy.
illumos/illumos-gate@3a4b1be953ee5601bab748afa07c26ed4996cde6
Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Prashanth Sreenivasa <pks@delphix.com> Reviewed by: Sara Hartse <sara.hartse@delphix.com> Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Tim Chase <tim@chase2k.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Matthew Ahrens <mahrens@delphix.com> |
339103 |
03-Oct-2018 |
mav |
MFC r336945: MFV r336944: 9286 want refreservation=auto
When a ZFS volume is created with zfs create -V (but without -s), the refreservation property is set to a value that is volsize plus the maximum size of metadata. If refreservation is ever set to another value, it is impossible to set it back to the automatically determined value. There are other cases where refreservation may be wrong. These include receiving a volume that was sent without properties and zfs clone.
We need:
zfs set refreservation=auto <volume> zfs clone -o refreservation=auto <volume>
Each one would use the same function used by zfs create -V to determine the proper value for refreservation.
illumos/illumos-gate@1c10ae76c0cb31326c320e7cef1d3f24a1f47125
Reviewed by: Allan Jude <allanjude@freebsd.org> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: Andy Stormont <astormont@racktopsystems.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Mike Gerdts <mike.gerdts@joyent.com> |
339034 |
01-Oct-2018 |
sef |
MFC r334844, r336180, r336458
r334844
This originated from ZFS On Linux, as https://github.com/zfsonlinux/zfs/commit/d4a72f23863382bdf6d0ae33196f5b5decbc48fd
During scans (scrubs or resilvers), it sorts the blocks in each transaction group by block offset; the result can be a significant improvement. (On my test system just now, which I put some effort to introduce fragmentation into the pool since I set it up yesterday, a scrub went from 1h2m to 33.5m with the changes.) I've seen similar rations on production systems.
r336180
Fix up some missed and mis-merges from the sequential scan code (r334844). Most of the changes involve moving some code around to reduce conflicts with future merges. One of the missing changes included a notification on scrub cancellation.
r336458
Fix a couple of typos in r334844 noticed by Richard Kojedzinszky
Approved by: mav Sponsored by: iXsystems, Inc |
338974 |
27-Sep-2018 |
mav |
MFC r333307 (by sbruno): Cleanup sundry clang warnings for code that is not upstream in illumos. https://github.com/illumos/illumos-gate/edit/master/usr/src/lib/libzfs/common/libzfs_sendrecv.c
Patch our version of it to quiesce warnings until someone decides to sync up our code:
libzfs_sendrecv.c:2555:30: warning: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat] sprintf(guidname, "%lu", thisguid); ~~~ ^~~~~~~~ %llu libzfs_sendrecv.c:2612:29: warning: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat] sprintf(guidname, "%lu", parent_fromsnap_guid); ~~~ ^~~~~~~~~~~~~~~~~~~~ %llu libzfs_sendrecv.c:2645:29: warning: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat] sprintf(guidname, "%lu", parent_fromsnap_guid); ~~~ ^~~~~~~~~~~~~~~~~~~~ %llu |
333194 |
03-May-2018 |
avg |
MFC r332426: allow ZFS pool to have temporary name for duration of current import
The change adds -t <name> option to zpool create and -t option to zpool import in its form with an old name and a new name. This allows to import (or create) a pool under a name that's different from its real, permanent name without affecting that name. This is useful when working with VM images or images of other physical systems if they happen to have a ZFS pool with the same name as the host system.
Sponsored by: Panzura (porting) |
332550 |
16-Apr-2018 |
mav |
MFC r331707: MFV r331706: 9235 rename zpool_rewind_policy_t to zpool_load_policy_t
illumos/illumos-gate@5dafeea3ebd2dd77affc802bcb90f63faf01589f
We want to be able to pass various settings during import/open of a pool, which are not only related to rewind. Instead of adding a new policy and duplicate a bunch of code, we should just rename rewind_policy to a more generic term like load_policy.
For instance, we'd like to set spa->spa_import_flags from the nvlist, rather from a flags parameter passed to spa_import as in some cases we want those flags not only for the import case, but also for the open case. One such flag could be ZFS_IMPORT_MISSING_LOG (as used in zdb) which would allow zfs to open a pool when logs are missing.
Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Pavel Zakharov <pavel.zakharov@delphix.com> |
332547 |
16-Apr-2018 |
mav |
MFC r331701: MFV r331695, 331700: 9166 zfs storage pool checkpoint
illumos/illumos-gate@8671400134a11c848244896ca51a7db4d0f69da4
The idea of Storage Pool Checkpoint (aka zpool checkpoint) deals with exactly that. It can be thought of as a “pool-wide snapshot” (or a variation of extreme rewind that doesn’t corrupt your data). It remembers the entire state of the pool at the point that it was taken and the user can revert back to it later or discard it. Its generic use case is an administrator that is about to perform a set of destructive actions to ZFS as part of a critical procedure. She takes a checkpoint of the pool before performing the actions, then rewinds back to it if one of them fails or puts the pool into an unexpected state. Otherwise, she discards it. With the assumption that no one else is making modifications to ZFS, she basically wraps all these actions into a “high-level transaction”.
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com> |
332539 |
16-Apr-2018 |
mav |
MFC r329808: MFV r329807: 8940 Sending an intra-pool resumable send stream may result in EXDEV
illumos/illumos-gate@544132fce3fa6583f01318f9559adc46614343a7
"zfs send -t <token>" for an incremental send should be able to resume successfully when sending to the same pool: a subtle issue in zfs_iter_children() doesn't currently allow this.
Because resuming from a token requires "guid" -> "dataset" mapping (guid_to_name()), we have to walk the whole hierarchy to find the right snapshots to send. When resuming an incremental send both source and destination live in the same pool and have the same guid: this is where zfs_iter_children() gets confused and picks up the wrong snapshot, so we end up trying to send an incremental "destination@snap1 -> source@snap2" stream instead of "source@snap1 -> source@snap2": this fails with an "Invalid cross-device link" (EXDEV) error.
Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org> Author: loli10K <ezomori.nozomu@gmail.com> |
332536 |
16-Apr-2018 |
mav |
MFC r329798: MFV r329793, r329795: 9075 Improve ZFS pool import/load process and corrupted pool recovery
illumos/illumos-gate@6f7938128a2c5e23f4b970ea101137eadd1470a1
Some work has been done lately to improve the debugability of the ZFS pool load (and import) process. This includes:
https://www.illumos.org/issues/7638: Refactor spa_load_impl into several functions https://www.illumos.org/issues/8961: SPA load/import should tell us why it failed https://www.illumos.org/issues/7277: zdb should be able to print zfs_dbgmsg's
To iterate on top of that, there's a few changes that were made to make the import process more resilient and crash free. One of the first tasks during the pool load process is to parse a config provided from userland that describes what devices the pool is composed of. A vdev tree is generated from that config, and then all the vdevs are opened.
The Meta Object Set (MOS) of the pool is accessed, and several metadata objects that are necessary to load the pool are read. The exact configuration of the pool is also stored inside the MOS. Since the configuration provided from userland is external and might not accurately describe the vdev tree of the pool at the txg that is being loaded, it cannot be relied upon to safely operate the pool. For that reason, the configuration in the MOS is read early on. In the past, the two configurations were compared together and if there was a mismatch then the load process was aborted and an error was returned.
The latter was a good way to ensure a pool does not get corrupted, however it made the pool load process needlessly fragile in cases where the vdev configuration changed or the userland configuration was outdated. Since the MOS is stored in 3 copies, the configuration provided by userland doesn't have to be perfect in order to read its contents. Hence, a new approach has been adopted: The pool is first opened with the untrusted userland configuration just so that the real configuration can be read from the MOS. The trusted MOS configuration is then used to generate a new vdev tree and the pool is re-opened.
When the pool is opened with an untrusted configuration, writes are disabled to avoid accidentally damaging it. During reads, some sanity checks are performed on block pointers to see if each DVA points to a known vdev; when the configuration is untrusted, instead of panicking the system if those checks fail we simply avoid issuing reads to the invalid DVAs.
This new two-step pool load process now allows rewinding pools accross vdev tree changes such as device replacement, addition, etc. Loading a pool from an external config file in a clustering environment also becomes much safer now since the pool will import even if the config is outdated and didn't, for instance, register a recent device addition.
With this code in place, it became relatively easy to implement a long-sought-after feature: the ability to import a pool with missing top level (i.e. non-redundant) devices. Note that since this almost guarantees some loss Of data, this feature is for now restricted to a read-only import.
Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Andrew Stormont <andyjstormont@gmail.com> Approved by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org> Author: Pavel Zakharov <pavel.zakharov@delphix.com> |
332535 |
16-Apr-2018 |
mav |
MFC r329783: 8942 zfs promote .../%recv should be an error
illumos/illumos-gate@add927f8c8d101e16c23eb9cd270be4fd7edf7d5
Reported on the ZFSonLinux https://github.com/zfsonlinux/zfs/issues/4843, fixed by https://github.com/zfsonlinux/zfs/pull/6339:
If we are in the middle of an incremental zfs receive, the child .../%recv will exist. If you concurrently run zfs promote .../%recv, it will "work", but then zfs gets confused. For example, there's no obvious way to destroy the containing filesystem (because it is now a clone of its invisible child).
Attempting to do this promote should be an error. We could fix this by having zfs_ioc_promote() check if zc_name contains a %, similar to zfs_ioc_rename().
Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com> |
332525 |
16-Apr-2018 |
mav |
MFC r329732: MFV r329502: 7614 zfs device evacuation/removal
illumos/illumos-gate@5cabbc6b49070407fb9610cfe73d4c0e0dea3e77
https://www.illumos.org/issues/7614: This project allows top-level vdevs to be removed from the storage pool with “zpool remove”, reducing the total amount of storage in the pool. This operation copies all allocated regions of the device to be removed onto other devices, recording the mapping from old to new location. After the removal is complete, read and free operations to the removed (now “indirect”) vdev must be remapped and performed at the new location on disk. The indirect mapping table is kept in memory whenever the pool is loaded, so there is minimal performance overhead when doing operations on the indirect vdev.
The size of the in-memory mapping table will be reduced when its entries become “obsolete” because they are no longer used by any block pointers in the pool. An entry becomes obsolete when all the blocks that use it are freed. An entry can also become obsolete when all the snapshots that reference it are deleted, and the block pointers that reference it have been “remapped” in all filesystems/zvols (and clones). Whenever an indirect block is written, all the block pointers in it will be “remapped” to their new (concrete) locations if possible. This process can be accelerated by using the “zfs remap” command to proactively rewrite all indirect blocks that reference indirect (removed) vdevs.
Note that when a device is removed, we do not verify the checksum of the data that is copied. This makes the process much faster, but if it were used on redundant vdevs (i.e. mirror or raidz vdevs), it would be possible to copy the wrong data, when we have the correct data on e.g. the other side of the mirror. Therefore, mirror and raidz devices can not be removed.
Reviewed by: Alex Reece <alex@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Richard Laager <rlaager@wiktel.com> Reviewed by: Tim Chase <tim@chase2k.com> Approved by: Garrett D'Amore <garrett@damore.org> Author: Prashanth Sreenivasa <pks@delphix.com> |
332093 |
06-Apr-2018 |
avg |
MFC r330295: ZFS: fix adding vdevs to very large pools
PR: 226096 |
331398 |
23-Mar-2018 |
mav |
MFC r329691: MFV r322231: 8430 dir_is_empty_readdir() doesn't properly handle error from fdopendir()
illumos/illumos-gate@ba6e7e6505150388de6dc6a88741164118a421bf https://github.com/illumos/illumos-gate/commit/ba6e7e6505150388de6dc6a88741164118a421bf
https://www.illumos.org/issues/8430 we should close dirfd if fdopendir() fails.
Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Reviewed by: Igor Kozhukhov <igor@dilos.org> Approved by: Robert Mustacchi <rm@joyent.com> Author: Sowrabha Gopal <sowrabha.gopal@delphix.com> |
331395 |
22-Mar-2018 |
mav |
MFC r329681: MFV r318941: 7446 zpool create should support efi system partition
illumos/illumos-gate@7855d95b30fd903e3918bad5a29b777e765db821 https://github.com/illumos/illumos-gate/commit/7855d95b30fd903e3918bad5a29b777e765db821
https://www.illumos.org/issues/7446 Since we support whole-disk configuration for boot pool, we also will need whole disk support with UEFI boot and for this, zpool create should create efi- system partition. I have borrowed the idea from oracle solaris, and introducing zpool create - B switch to provide an way to specify that boot partition should be created. However, there is still an question, how big should the system partition be. For time being, I have set default size 256MB (thats minimum size for FAT32 with 4k blocks). To support custom size, the set on creation "bootsize" property is created and so the custom size can be set as: zpool create B - o bootsize=34MB rpool c0t0d0 After pool is created, the "bootsize" property is read only. When -B switch is not used, the bootsize defaults to 0 and is shown in zpool get output with value ''. Older zfs/zpool implementations are ignoring this property. https://www.illumos.org/rb/r/219/
Reviewed by: Andrew Stormont <andyjstormont@gmail.com> Reviewed by: Yuri Pankov <yuri.pankov@gmail.com> Approved by: Dan McDonald <danmcd@kebe.com> Author: Toomas Soome <tsoome@me.com>
This commit makes no sense for FreeBSD, that is why I blocked the option, but it should be good to stay closer to upstream. |
331394 |
23-Mar-2018 |
mav |
MFC r329668: MFV r316918: 7990 libzfs: snapspec_cb() does not need to call zfs_strdup()
illumos/illumos-gate@d8584ba6fb7a5e46da1725845b99ae5fab5a4baf https://github.com/illumos/illumos-gate/commit/d8584ba6fb7a5e46da1725845b99ae5fab5a4baf
https://www.illumos.org/issues/7990 The snapspec_cb() callback function in libzfs does not need to call zfs_strdup().
Reviewed by: Yuri Pankov <yuri.pankov@gmail.com> Reviewed by: Toomas Soome <tsoome@me.com> Approved by: Matthew Ahrens <mahrens@delphix.com> Author: Marcel Telka <marcel@telka.sk> |
331393 |
22-Mar-2018 |
mav |
MFC r329667: MFV r316902: 7745 print error if lzc_* is called before libzfs_core_init
illumos/illumos-gate@7c13517fff71be473e47531ef4330160c042bedc https://github.com/illumos/illumos-gate/commit/7c13517fff71be473e47531ef4330160c042bedc
https://www.illumos.org/issues/7745 The problem is that consumers of `libZFS_Core` that forget to call `libzfs_core_init()` before calling any other function of the library are having a hard time realizing their mistake. The library's internal file descriptor is declared as global static, which is ok, but it is not initialized explicitly; therefore, it defaults to 0, which is a valid file descriptor. If `libzfs_core_init()`, which explicitly initializes the correct fd, is skipped, the ioctl functions return errors that do not have anything to do with `libZFS_Core`, where the problem is actually located. Even though assertions for that existed within `libZFS_Core` for debug builds, they were never enabled because the `-DDEBUG` flag was missing from the compiler flags. This patch applies the following changes: 1. It adds `-DDEBUG` for debug builds of `libZFS_Core` and `libzfs`, to enable their assertions on debug builds. 2. It corrects an assertion within `libzfs`, where a function had been spelled incorrectly (`zpool_prop_unsupported()`) and nobody knew because the `-DDEBUG` flag was missing, and the preprocessor was taking that part of the code away. 3. The library's internal fd is initialized to `-1` and `VERIFY` assertions have been placed to check that the fd is not equal to `-1` before issuing any ioctl. It is important here to note, that the `VERIFY` assertions exist in both debug and non-debug builds. 4. In `libzfs_core_fini` we make sure to never increment the refcount of our fd below 0, and also reset the fd to `-1` when no one refers to it. The reason for this, is for the rare case that the consumer closes all references but then calls one of the library's functions without using `libzfs_core_init()` first, and in the mean time, a previous call to `open()` decided to reuse our previous fd. This scenario would have passed our assertion in
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Serapheim Dimitropoulos <serapheim@delphix.com> |
331392 |
23-Mar-2018 |
mav |
MFC r329665: MFV r316901: 7730 libzfs`add_config() leaks config nvl when reading spare/l2cache devices
illumos/illumos-gate@105686550ee9cbf5d033166a8a2a5a763667d436 https://github.com/illumos/illumos-gate/commit/105686550ee9cbf5d033166a8a2a5a763667d436
https://www.illumos.org/issues/7730 antares:root:~# mdb /usr/sbin/zpool > ::sysbp _exit > ::run import pool: data id: 2093977168778024605 state: ONLINE action: The pool can be imported using its name or numeric identifier. config:
data ONLINE c6t0d0 ONLINE c6t1d0 ONLINE cache c6t2d0 mdb: stop on entry to _exit mdb: target stopped at: 0xfee556ba: nop mdb: You've got symbols! Loading modules: [ ld.so.1 libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 ] > ::findleaks -d BYTES LEAKED VMEM_SEG CALLER 4096 10 fda7b000 MMAP 8192 1 fea8d000 MMAP 8192 1 fe76d000 MMAP 8192 1 fe66e000 MMAP 4096 1 fe570000 MMAP 8192 1 fe470000 MMAP 4096 1 fe372000 MMAP 4096 1 fe273000 MMAP
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Yuri Pankov <yuri.pankov@nexenta.com> |
331391 |
22-Mar-2018 |
mav |
MFC r329664: MFV r316893: 7604 if volblocksize property is the default, it displays as "-" rather than 8K
illumos/illumos-gate@4d86c0eab246bdfddc2dd52410ba808433bd6266 https://github.com/illumos/illumos-gate/commit/4d86c0eab246bdfddc2dd52410ba808433bd6266
https://www.illumos.org/issues/7604 If a zvol has the default setting for the "volblocksize" property, it is 8KB. However, it is displayed as "-" (not present), rather than "8K". The problem was introduced by: commit 25228e830e86924a41243343b1de9daf2d7dd43a Author: Matthew Ahrens <mahrens@delphix.com> Date: Thu Nov 17 14:37:24 2016 -0800 7571 non-present readonly numeric ZFS props do not have default value which changed changed get_numeric_property() to indicate that readonly default properties are not present. However, zfs_prop_readonly() returns TRUE for both readonly and set-once properties (e.g. volblocksize). Amusingly, that commit essentially reverted: 6900484 default volblocksize is no longer being reported correctly from November 2009. However, that change was not correct either; the correct solution is to only do this check for "truly readonly" (i.e. not setonce) properties. $ zfs list -t volume -o name,volblocksize NAME VOLBLOCK domain0/group-100/appdata_container-101/appdata_windows_timeflow-102/ archive - domain0/group-100/appdata_container-101/appdata_windows_timeflow-102/ datafile - domain0/group-100/appdata_container-101/appdata_windows_timeflow-102/ external - rpool/dump 128K rpool/swap 4K rpool/swap1 ===============================================================================
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com> |
331390 |
23-Mar-2018 |
mav |
MFC r329663: MFV r316876: 7542 zfs_unmount failed with EZFS_UNSHARENFSFAILED
illumos/illumos-gate@09c9e6dc9b69d10b771bb87e01040ec320a0bfd3 https://github.com/illumos/illumos-gate/commit/09c9e6dc9b69d10b771bb87e01040ec320a0bfd3
https://www.illumos.org/issues/7542 libshare keeps a cached copy of the sharetab listing in memory, which can become out of date if shares are destroyed or created while leaving a libzfs handle open. This results in a spurious unmounting failure when an NFS share exists but isn't in the stale libshare cache.
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Matt Amdur <matt.amdur@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Chris Williamson <chris.williamson@delphix.com> |
331389 |
22-Mar-2018 |
mav |
MFC r329661: MFV r316875: 7336 vfork and O_CLOEXEC causes zfs_mount EBUSY
illumos/illumos-gate@873c4903a52d089cd8234b79d24f5a3fc3bccc82 https://github.com/illumos/illumos-gate/commit/873c4903a52d089cd8234b79d24f5a3fc3bccc82
https://www.illumos.org/issues/7336 We can run into a problem where we call into zfs_mount, which in turn calls is_dir_empty, which opens the directory to try and make sure it's empty. The issue with the current approach is that it holds the directory open while it traverses it with readdir, which, due to subtle interaction with the Java JVM, vfork, and exec can cause a tricky race condition resulting in zfs_mount failures. The approach to resolving the issue in this patch is to drop the usage of readdir altogether, and instead rely on the fact that ZFS stores the number of entries contained in a directory using the st_size field of the stat structure. Thus, if the directory in question is a ZFS directory, we can check to see if it's empty by calling stat() and inspecting the st_size field of structure returned. =============================================================================== The root cause appears to be an interesting race between vfork, exec, and zfs_mount's usage of O_CLOEXEC when calling openat. Here's what is going on: 1. We call zfs_mount, and this in turn calls openat to check if the directory is empty, which results in opening the directory we're trying to mount onto, and increment v_count. 2. As we're in the middle of reading the directory, vfork is called by the JVM and proceeds to exec the jspawnhelper utility. As a result of the vfork, we take an additional hold on the directory, which increments v_count a second time. The semantics of vfork mean the parent process will wait for the child process to exit or exec before the parent can continue; at this point the parent is in the middle of zfs_mount, reading the directory to determine if it's empty or not. 3. The child process exec-ing jspawnhelper gets to the relvm call within exec_args (which is called by exec_common). relvm is the function that releases the parent process, allowing the parent to proceed. The problem is, at this point of calling relvm, the child hasn't yet called close_exec which is responsible for closing the file descriptors inherited from the parent process
Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Robert Mustacchi <rm@joyent.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Prakash Surya <prakash.surya@delphix.com> |
331388 |
22-Mar-2018 |
mav |
MFC r329659: MFV r316873: 7233 dir_is_empty should open directory with CLOEXEC
illumos/illumos-gate@d420209d9c807f782c1d31f5683be74798142198 https://github.com/illumos/illumos-gate/commit/d420209d9c807f782c1d31f5683be74798142198
https://www.illumos.org/issues/7233 This fixes a race where one thread is executing zfs_mount() while another thread forks and execs. If the fork occurs while the directory is open, the child process will inherit (but not necessarily close immediately) the open fd for the directory, preventing the mount.
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Alex Reece <alex@delphix.com> |
331381 |
22-Mar-2018 |
mav |
MFC r329505: MFV r323911: 8502 illumos#7955 broke delegated datasets when libshare is not present
illumos/illumos-gate@1c18e8fbd8db41a9fb39bd3ef7a18ee71ece20da https://github.com/illumos/illumos-gate/commit/1c18e8fbd8db41a9fb39bd3ef7a18ee71ece20da
https://www.illumos.org/issues/8502 The code in lib/libzfs/common/libzfs_mount.c already basically handles the case when libshare is not installed. We just need to not fail in zfs_init_libshare_impl. I tested this in lx and things work as expected. I also tested there trying to set sharenfs and sharesmb on the delegated dataset. Neither is allowed from within a zone. The spew of msgs from a native zone is not ZFS specific. I see the same spew simply running the share command.
Reviewed by: Robert Mustacchi <rm@joyent.com> Reviewed by: Yuri Pankov <yuripv@gmx.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Jerry Jelinek <jerry.jelinek@joyent.com> |
330590 |
07-Mar-2018 |
avg |
MFC r329719: MFV r329718: 8520 7198 lzc_rollback_to should support rolling back to origin |
329493 |
18-Feb-2018 |
mav |
MFC r328252: MFV r328251: 8652 Tautological comparisons with ZPROP_INVAL
illumos/illumos-gate@4ae5f5f06c6c2d1db8167480f7d9e3b5378ba2f2
https://www.illumos.org/issues/8652: Clang and GCC prefer to use unsigned ints to store enums. With Clang, that causes tautological comparison warnings when comparing a zfs_prop_t or zpool_prop_t variable to the macro ZPROP_INVAL. It's likely that error handling code is being silently removed as a result.
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Igor Kozhukhov <igor@dilos.org> Approved by: Gordon Ross <gwr@nexenta.com> Author: Alan Somers <asomers@gmail.com> |
329492 |
18-Feb-2018 |
mav |
MFC r328250: MFV r328249: 8641 "zpool clear" and "zinject" don't work on "spare" or "replacing" vdevs
illumos/illumos-gate@2ba5f978a4f9b02da9db1b8cdd9ea5498eb00ad9
https://www.illumos.org/issues/8641: "zpool clear" and "zinject -d" can both operate on specific vdevs, either leaf or interior. However, due to an oversight, neither works on a "spare" or "replacing" vdev. For example:
sudo zpool create foo raidz1 c1t5000CCA000081D61d0 c1t5000CCA000186235d0 spare c 1t5000CCA000094115d0 sudo zpool replace foo c1t5000CCA000186235d0 c1t5000CCA000094115d0 $ zpool status foo pool: foo state: ONLINE scan: resilvered 81.5K in 0h0m with 0 errors on Fri Sep 8 10:53:03 2017 config:
NAME STATE READ WRITE CKSUM foo ONLINE 0 0 0 raidz1-0 ONLINE 0 0 0 c1t5000CCA000081D61d0 ONLINE 0 0 0 spare-1 ONLINE 0 0 0 c1t5000CCA000186235d0 ONLINE 0 0 0 c1t5000CCA000094115d0 ONLINE 0 0 0 spares c1t5000CCA000094115d0 INUSE currently in use $ sudo zinject -d spare-1 -A degrade foo cannot find device 'spare-1' in pool 'foo' $ sudo zpool clear foo spare-1 cannot clear errors for spare-1: no such device in pool
Even though there was nothing to clear, those commands shouldn't have reported an error. by contrast, trying to clear "raidz1-0" works just fine: $ sudo zpool clear foo raidz1-0
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Gordon Ross <gwr@nexenta.com> Author: Alan Somers <asomers@gmail.com> |
329489 |
18-Feb-2018 |
mav |
MFC r328234: MFV r328233: 8898 creating fs with checksum=skein on the boot pools fails ungracefully
illumos/illumos-gate@9fa2266d9a78b8366e1cd2d5f050e8b5e37d558c
https://www.illumos.org/issues/8898: # zfs create -o checksum=skein rpool/test internal error: Result too large Abort (core dumped)
Not a big deal per se, but should be handled correctly.
Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: Andy Stormont <astormont@racktopsystems.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: Yuri Pankov <yuri.pankov@nexenta.com>
PR: 222199 |
329488 |
18-Feb-2018 |
mav |
MFC r328232: MFV r328231: 8897 zpool online -e fails assertion when run on non-leaf vdevs
illumos/illumos-gate@9a551dd645b478816cb11251b19f5034d885bf01
https://www.illumos.org/issues/8897: # zpool online -e test mirror-1 Assertion failed: nvlist_lookup_string(tgt, "path", &pathname) == 0, file ../common/libzfs_pool.c, line 2558, function zpool_vdev_online Abort (core dumped)
Not a big deal per se, but should be handled gracefully, same way as 'offline' and 'online' without '-e'.
Also reported as: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221408
Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: Igor Kozhukhov <igor@dilos.org> Approved by: Dan McDonald <danmcd@joyent.com> Author: Yuri Pankov <yuri.pankov@nexenta.com> |
329484 |
18-Feb-2018 |
mav |
MFC r328224: MFV r328220: 8677 Open-Context Channel Programs
illumos/illumos-gate@a3b2868063897ff0083dea538f55f9873eec981f
https://www.illumos.org/issues/8677 We want to be able to run channel programs outside of synching context. This would greatly improve performance of channel program that just gather information, as we won't have to wait for synching context anymore.
This feature should introduce the following: - A new command line flag in "zfs program" to specify our intention to run in open context. - A new flag/option within the channel program ioctl which selects the context. - Appropriate error handling whenever we try a channel program in open-context that contains zfs.sync* expressions. - Documentation for the new feature in the manual pages.
Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Chris Williamson <chris.williamson@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Serapheim Dimitropoulos <serapheim@delphix.com> |
326298 |
28-Nov-2017 |
asomers |
MFC r322854, r323995, r324568, r324991
r322854: zfsd(8): Close a race condition when onlining a disk paritition
When inserting a partitioned disk, devfs and geom will announce the whole disk before they announce the partition. If the partition containing ZFS extends to one of the disk's extents, then zfsd will see a ZFS label on the whole disk and attempt to online it. ZFS is smart enough to activate the partition instead of the whole disk, but only if GEOM has already created the partition's provider.
cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c Add a zpool_read_all_labels method. It's similar to zpool_read_label, but it will return the number of labels found.
cddl/usr.sbin/zfsd/zfsd_event.cc When processing a DevFS CREATE event, only online a VDEV if we can read all four ZFS labels.
Reviewed by: mav Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D11920
r323995: Close a memory leak when using zpool_read_all_labels
X-MFC-With: 322854 Sponsored by: Spectra Logic Corp
r324568: Optimize zpool_read_all_labels with AIO
Read all labels in parallel instead of sequentially
X-MFC-With: 322854 Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D12495
r324991: Fix zpool_read_all_labels when vfs.aio.enable_unsafe=0
Previously, zpool_read_all_labels was trying to do 256KB reads, which are greater than the default MAXPHYS and therefore must go through the slow, unsafe AIO path. Shrink these reads to 112KB so they can use the safe, fast AIO path instead.
X-MFC-With: 324568 Sponsored by: Spectra Logic Corp |
325914 |
16-Nov-2017 |
avg |
MFC r325035: MFV r325013,r325034: 640 number_to_scaled_string is duplicated in several commands
FreeBSD note: of all libcmdutils functionality ZFS (and other illumos contrib code) currently uses only nicenum() function (which is similar to humanize_number but has some formatting differences). For this reason I decided to not port the whole library. As a result, nicenum.c from libcmdutils is compiled into libzfs and libzpool. This is a bit ugly, but works. If one day we are forced to create libillumos, then the file should be moved to that library. |
325534 |
08-Nov-2017 |
avg |
MFC r324163: MFV r323530,r323533,r323534: 7431 ZFS Channel Programs, and followups
Also MFC-ed are the following fixes: - r324164 Add several new files to the files enabled by ZFS kernel option - r324178 unbreak kernel builds on sparc64 and powerpc - r324194 fix incorrect use of getzfsvfs_impl in r324163 - r324292 really unbreak kernel builds on sparc64 and powerpc64 |
325151 |
30-Oct-2017 |
avg |
MFC r324348: MFV r316934: 7340 receive manual origin should override automatic origin |
325149 |
30-Oct-2017 |
avg |
MFC r324347: MFV r316933: 5142 libzfs support raidz root pool (loader project)
FreeBSD note: we have long supported this feature, this commit only removes a small difference in libzfs. |
325147 |
30-Oct-2017 |
avg |
MFC r324346: MFV r316931: 6268 zfs diff confused by moving a file to another directory |
325139 |
30-Oct-2017 |
avg |
MFC r324345: MFV r316877: 7571 non-present readonly numeric ZFS props do not have default value |
324584 |
13-Oct-2017 |
avg |
MFC r323525: MFV r323523: 8331 zfs_unshare returns wrong error code for smb unshare failure |
324583 |
13-Oct-2017 |
avg |
MFC r323524: MFV r316932: 6280 libzfs: unshare_one() could fail with EZFS_SHARENFSFAILED |
324255 |
04-Oct-2017 |
avg |
MFC r323791: MFV r323790: 8567 Inconsistent return value in zpool_read_label |
324010 |
26-Sep-2017 |
avg |
MFC r323355: MFV r323107: 8414 Implemented zpool scrub pause/resume
illumos/illumos-gate@1702cce751c5cb7ead878d0205a6c90b027e3de8 https://github.com/illumos/illumos-gate/commit/1702cce751c5cb7ead878d0205a6c90b027e3de8
FreeBSD note: rather than merging the zpool.8 update I copied the zpool scrub section from the illumos zpool.1m to FreeBSD zpool.8 almost verbatim. Now that the illumos page uses the mdoc format, it was an easier option. Perhaps the change is not in perfect compliance with the FreeBSD style, but I think that it is acceptible.
https://www.illumos.org/issues/8414 This issue tracks the port of scrub pause from ZoL: https://github.com/zfsonlinux/zfs/pull/6167 Currently, there is no way to pause a scrub. Pausing may be useful when the pool is busy with other I/O to preserve bandwidth.
Description
This patch adds the ability to pause and resume scrubbing. This is achieved by maintaining a persistent on-disk scrub state. While the state is 'paused' we do not scrub any more blocks. We do however perform regular scan housekeeping such as freeing async destroyed and deadlist blocks while paused.
Motivation and Context
Scrub pausing can be an I/O intensive operation and people have been asking for the ability to pause a scrub for a while. This allows one to preserve scrub progress while freeing up bandwidth for other I/O.
Reviewed by: George Melikov <mail@gmelikov.ru> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Brad Lewis <brad.lewis@delphix.com> Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com> Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: Alek Pinchuk <apinchuk@datto.com> |
323757 |
19-Sep-2017 |
avg |
MFC r322230: MFV r322229: 7600 zfs rollback should pass target snapshot to kernel
illumos/illumos-gate@77b171372ed21642e04c873ef1e87fe2365520df https://github.com/illumos/illumos-gate/commit/77b171372ed21642e04c873ef1e87fe2365520df
https://www.illumos.org/issues/7600 At present, the kernel side code seems to blindly rollback to whatever happens to be the latest snapshot at the time when the rollback task is processed. The expected target's name should be passed to the kernel driver and the sync task should validate that the target exists and that it is the latest snapshot indeed.
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Andriy Gapon <avg@FreeBSD.org> |
323751 |
19-Sep-2017 |
avg |
MFC r322218: MFV r322217: 8418 zfs_prop_get_table() call in zfs_validate_name() is a no-op
illumos/illumos-gate@e09ba01dcda5e24964b8632718777b39166d86e4 https://github.com/illumos/illumos-gate/commit/e09ba01dcda5e24964b8632718777b39166d86e4
https://www.illumos.org/issues/8418 The following line in zfs_validate_name() is just a no-op and it should be removed: 108 (void) zfs_prop_get_table();
Reviewed by: Vitaliy Gusev <gusev.vitaliy@icloud.com> Approved by: Matthew Ahrens <mahrens@delphix.com> Author: Marcel Telka <marcel@telka.sk> |
322078 |
05-Aug-2017 |
mav |
MFC r321921: Add compat shim part missed at r305197.
This fixes compatibility between old kernel and new ZFS tools. It seems to be tradition to forget it.
PR: 221112 |
321610 |
27-Jul-2017 |
mav |
MFC r320156, r320185, r320186, r320262, r320452, r321111: MFV r318946: 8021 ARC buf data scatter-ization
illumos/illumos-gate@770499e185d15678ccb0be57ebc626ad18d93383 https://github.com/illumos/illumos-gate/commit/770499e185d15678ccb0be57ebc626ad1 8d93383
https://www.illumos.org/issues/8021 The ARC buf data project (known simply as "ABD" since its genesis in the ZoL community) changes the way the ARC allocates `b_pdata` memory from using linea r `void *` buffers to using scatter/gather lists of fixed-size 1KB chunks. This improves ZFS's performance by helping to defragment the address space occupied by the ARC, in particular for cases where compressed ARC is enabled. It could also ease future work to allocate pages directly from `segkpm` for minimal- overhead memory allocations, bypassing the `kmem` subsystem. This is essentially the same change as the one which recently landed in ZFS on Linux, although they made some platform-specific changes while adapting this work to their codebase: 1. Implemented the equivalent of the `segkpm` suggestion for future work mentioned above to bypass issues that they've had with the Linux kernel memory allocator. 2. Changed the internal representation of the ABD's scatter/gather list so it could be used to pass I/O directly into Linux block device drivers. (This feature is not available in the illumos block device interface yet.)
FreeBSD notes: - the actual (default) chunk size is 4KB (despite the text above saying 1KB) - we can try to reimplement ABDs, so that they are not permanently mapped into the KVA unless explicitly requested, especially on platforms with scarce KVA - we can try to use unmapped I/O and avoid intermediate allocation of a linear, virtual memory mapped buffer - we can try to avoid extra data copying by referring to chunks / pages in the original ABD
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Prashanth Sreenivasa <pks@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Chris Williamson <chris.williamson@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Dan Kimmel <dan.kimmel@delphix.com> |
321577 |
26-Jul-2017 |
mav |
MFC r319947: MFV r319945,r319946: 8264 want support for promoting datasets in libzfs_core
illumos/illumos-gate@a4b8c9aa65a0a735aba318024a424a90d7b06c37 https://github.com/illumos/illumos-gate/commit/a4b8c9aa65a0a735aba318024a424a90d7b06c37
https://www.illumos.org/issues/8264 Oddly there is a lzc_clone function, but no lzc_promote function.
Reviewed by: Andriy Gapon <avg@FreeBSD.org> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Dan McDonald <danmcd@kebe.com> Approved by: Dan McDonald <danmcd@kebe.com> Author: Andrew Stormont <astormont@racktopsystems.com> |
321576 |
26-Jul-2017 |
mav |
MFC r319751: MFV r319740: 8168 NULL pointer dereference in zfs_create()
illumos/illumos-gate@690031d326342fa4ea28b5e80f1ad6a16281519d https://github.com/illumos/illumos-gate/commit/690031d326342fa4ea28b5e80f1ad6a16281519d
https://www.illumos.org/issues/8168 If we manage to export the pool on which we are creating a dataset (filesystem or zvol) between entering libzfs`zfs_create() and libzfs`zpool_open() call (for which we never check the return value) we end up dereferencing a NULL pointer in libzfs`zpool_close(). This was discovered on ZFS on Linux. The same issue can be reproduced on Illumos running in parallel: while :; do zpool import -d /tmp testpool ; zpool export testpool ; done while :; do zfs create testpool/fs; zfs destroy testpool/fs ; done Eventually this will result in several core dumps like this one: [root@52-54-00-d3-7a-01 /cores]# mdb core.zfs.4244 Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 ld.so.1 ] > ::stack libzfs.so.1`zpool_close+0x17(0, 0, 0, 8047450) libzfs.so.1`zfs_create+0x1bb(8090548, 8047e6f, 1, 808cba8) zfs_do_create+0x545(2, 8047d74, 80778a0, 801, 0, 3) main+0x22c(8047d2c, fef5c6e8, 8047d64, 8055a17, 3, 8047d70) _start+0x83(3, 8047e64, 8047e68, 8047e6f, 0, 8047e7b) > Fix and reproducer (systemtap): https://github.com/zfsonlinux/zfs/pull/6096
Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com> |
321555 |
26-Jul-2017 |
mav |
MFC r318831: MFV r316922: 5380 receive of a send -p stream doesn't need to try renaming snapshots
illumos/illumos-gate@471a88e499c660844f4590487ce7c4d5a7090294 https://github.com/illumos/illumos-gate/commit/471a88e499c660844f4590487ce7c4d5a7090294
https://www.illumos.org/issues/5380 A stream created with zfs send -p -I contains properties of all snapshots of a given dataset as opposed to only properties of snapshots in a given range. Not only this is suboptimal but the receive code also does not filter properties by the range. So, properties of earlier snapshots would be updated even though the snapshots themselves are not in the stream (just their properties). Given that modifying the snapshot properties requires a TXG sync and that the snapshots are updated one by one the described behavior may lead to a sever performance penalty.
Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Andriy Gapon <avg@FreeBSD.org> |
321546 |
26-Jul-2017 |
mav |
MFC r318819: MFV r316908: 7541 zpool import/tryimport ioctl returns ENOMEM because provided buffer is too small for config
illumos/illumos-gate@8b65a70b763232c90a91f31eb2010314c02ed338 https://github.com/illumos/illumos-gate/commit/8b65a70b763232c90a91f31eb2010314c02ed338
https://www.illumos.org/issues/7541 When calling zpool import, zpool does a few ioctls to ZFS. zpool allocates a buffer in userland and passes it to the kernel so that ZFS can copy info into it. ZFS will use it to put the nvlist that describes the pool configuration. If the allocated buffer is too small, ZFS will return ENOMEM and the call will have to be redone. This wastes CPU time and slows down the import process. This happens very often for the ZFS_IOC_POOL_TRYIMPORT call.
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Pavel Zakharov <pavel.zakharov@delphix.com> |
321535 |
26-Jul-2017 |
mav |
MFC r317414: MFV 316894
7252 7628 compressed zfs send / receive
illumos/illumos-gate@5602294fda888d923d57a78bafdaf48ae6223dea https://github.com/illumos/illumos-gate/commit/5602294fda888d923d57a78bafdaf48ae6223dea
https://www.illumos.org/issues/7252 This feature includes code to allow a system with compressed ARC enabled to send data in its compressed form straight out of the ARC, and receive data in its compressed form directly into the ARC.
https://www.illumos.org/issues/7628 We should have longer, more readable versions of the ZFS send / recv options.
7628 create long versions of ZFS send / receive options
Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Sebastien Roy <sebastien.roy@delphix.com> Reviewed by: David Quigley <dpquigl@davequigley.com> Reviewed by: Thomas Caputi <tcaputi@datto.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Dan Kimmel <dan.kimmel@delphix.com> |
321534 |
26-Jul-2017 |
mav |
MFC r317267: MFV 316891
7386 zfs get does not work properly with bookmarks
illumos/illumos-gate@edb901aab9c738b5eb15aa55933e82b0f2f9d9a2 https://github.com/illumos/illumos-gate/commit/edb901aab9c738b5eb15aa55933e82b0f2f9d9a2
https://www.illumos.org/issues/7386 The zfs get command does not work with the bookmark parameter while it works properly with both filesystem and snapshot: # zfs get -t all -r creation rpool/test NAME PROPERTY VALUE SOURCE rpool/test creation Fri Sep 16 15:00 2016 - rpool/test@snap creation Fri Sep 16 15:00 2016 - rpool/test#bkmark creation Fri Sep 16 15:00 2016 - # zfs get -t all -r creation rpool/test@snap NAME PROPERTY VALUE SOURCE rpool/test@snap creation Fri Sep 16 15:00 2016 - # zfs get -t all -r creation rpool/test#bkmark cannot open 'rpool/test#bkmark': invalid dataset name # The zfs get command should be modified to work properly with bookmarks too.
Reviewed by: Simon Klinkert <simon.klinkert@gmail.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Approved by: Matthew Ahrens <mahrens@delphix.com> Author: Marcel Telka <marcel@telka.sk> |
321522 |
26-Jul-2017 |
mav |
MFC r309096 (by avg): MFV r308989: 6428 set canmount=off on unmounted filesystem tries to unmount children
This is a cosmetic and bookkeeping change as the actual change is already in FreeBSD. See r297521, r304520, r308985. |
316762 |
13-Apr-2017 |
pfg |
MFC r316695, MFV r316693: 8046 Let calloc() do the multiplication in libzfs_fru_refresh
https://github.com/illumos/illumos-gate/commit/5697e03e6e3e2697f56ae341c7c8ce79680d6a2e
https://www.illumos.org/issues/8046
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Pedro Giffuni <pfg@freebsd.org> |
310068 |
14-Dec-2016 |
avg |
MFC r308985: revert r304520, set canmount=on is not supposed to mount the filesystem |
308914 |
21-Nov-2016 |
avg |
MFC r308089: zfsbootcfg: a simple tool to set next boot (one time) options for zfsboot |
307108 |
12-Oct-2016 |
mav |
MFC r305209: MFV r302660: 6314 buffer overflow in dsl_dataset_name
illumos/illumos-gate@9adfa60d484ce2435f5af77cc99dcd4e692b6660 https://github.com/illumos/illumos-gate/commit/9adfa60d484ce2435f5af77cc99dcd4e6 92b6660
https://www.illumos.org/issues/6314 Callers of dsl_dataset_name pass a buffer of size ZFS_MAXNAMELEN, but dsl_dataset_name copies the datasets' name PLUS the snapshot name to it, resulting in a max of 2 * ZFS_MAXNAMELEN + '@'.
Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Matthew Ahrens <mahrens@delphix.com> |
307107 |
12-Oct-2016 |
mav |
MFC r305206: MFV r302658: 6872 zfs libraries should not allow uninitialized variables
illumos/illumos-gate@f83b46baf98d276f5f84fa84c8b461f412ac1f5e https://github.com/illumos/illumos-gate/commit/f83b46baf98d276f5f84fa84c8b461f41 2ac1f5e
https://www.illumos.org/issues/6872 We compile the zfs libraries with -Wno-uninitialized. We should remove this. Change makefiles, fix new warnings, fix pbchk errors.
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Paul Dagnelie <pcd@delphix.com> |
307106 |
12-Oct-2016 |
mav |
MFC r305205: MFV r302657: 4521 zfstest is trying to execute evil "zfs unmount -a"
illumos/illumos-gate@8808ac5dae118369991f158b6ab736cb2691ecde https://github.com/illumos/illumos-gate/commit/8808ac5dae118369991f158b6ab736cb2 691ecde
https://www.illumos.org/issues/4521 zfstest is trying to execute evil "zfs unmount -a", which fails (fortunately, as it would otherwise leave me with my ~ missing): 03:44:11.86 cannot unmount '/export/home/yuri': Device busy cannot unmount '/ export/home': Device busy 03:44:11.86 ERROR: /usr/sbin/zfs unmount -a exited 1 This affects, at least, zfs_mount_009_neg and zfs_mount_all_001_pos, both failing on that step. The pool containing the /export/home hierarchy is included in KEEP variable, but it doesn't seem to affect anything here.
Reviewed by: Andriy Gapon <avg@FreeBSD.org> Reviewed by: Dan McDonald <danmcd@omniti.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Yuri Pankov <yuri.pankov@nexenta.com> |
307105 |
12-Oct-2016 |
mav |
MFC r305203: MFV r302655: 6873 zfs_destroy_snaps_nvl leaks errlist
illumos/illumos-gate@4cde22c29999ffb907ca39d2ebd512812f7e5168 https://github.com/illumos/illumos-gate/commit/4cde22c29999ffb907ca39d2ebd512812 f7e5168
https://www.illumos.org/issues/6873 lzc_destroy_snaps() returns an nvlist in errlist. zfs_destroy_snaps_nvl() should nvlist_free() it before returning.
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Chris Williamson <chris.williamson@delphix.com> |
307104 |
12-Oct-2016 |
mav |
MFC r305202: MFV r302654: 6879 incorrect endianness swap for drr_spill.drr_length in libzfs_sendrecv.c
illumos/illumos-gate@20fea7a47472aceb64d3ed48cc2a3ea268bc4795 https://github.com/illumos/illumos-gate/commit/20fea7a47472aceb64d3ed48cc2a3ea26 8bc4795
https://www.illumos.org/issues/6879 In libzfs_sendrecv, there's a typo: case DRR_SPILL: if (byteswap) { drr->drr_u.drr_write.drr_length = BSWAP_64(drr->drr_u.drr_spill.drr_length); } Instead of drr_write.drr_length, we should be assigning the result of the byteswap to drr_spill.drr_length.
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Dan Kimmel <dan.kimmel@delphix.com> |
307102 |
12-Oct-2016 |
mav |
MFC r305201: MFV r302653: 6111 zfs send should ignore datasets created after the ending snapshot
illumos/illumos-gate@4a20c933b148de8a1c1d3538391c64284e636653 https://github.com/illumos/illumos-gate/commit/4a20c933b148de8a1c1d3538391c64284 e636653
https://www.illumos.org/issues/6111 If you create a zfs child folder, zfs send returns an error when a recursive incremental send is done between two snapshots made prior to the folder creation. The problem can be reproduced with the following steps. root@zfs:/# zfs create pool/test root@zfs:/# zfs snapshot pool/test@snap1 root@zfs:/# zfs snapshot pool/test@snap2 root@zfs:/# zfs create pool/test/child root@zfs:/# zfs send -R -I pool/test@snap1 pool/test@snap2 > /dev/null WARNING: could not send pool/test/child@snap2: does not exist WARNING: could not send pool/test/child@snap2: does not exist root@zfs:/# echo $? 1 root@zfs:/# zfs snapshot -r pool/test@snap3 root@zfs:/# zfs send -R -I pool/test@snap1 pool/test@snap3 > /dev/null root@zfs:/# echo $? 0 root@zfs:/# zfs send -R -I pool/test@snap2 pool/test@snap3 > /dev/null root@zfs:/# echo $? 0 Since pool/test/child was created after snap2, zfs send should not expect snap2 to be in pool/test/child when doing a recursive send. It should examine the compare the creation time of the snapshot and each child folder to decide if the folder will be sent. The next incremental send between snap2 and snap3 would properly create the child folder and snap3 which first appears in the child folder. The problem is identical if '-i' is used instead of '-I'.
Reviewed by: Alex Aizman alex.aizman@nexenta.com Reviewed by: Alek Pinchuk alek.pinchuk@nexenta.com Reviewed by: Roman Strashkin roman.strashkin@nexenta.com Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Approved by: Garrett D'Amore <garrett@damore.org> Author: Alex Deiter <alex.deiter@nexenta.com> |
307100 |
12-Oct-2016 |
mav |
MFC r305194: MFV r302642: 6876 Stack corruption after importing a pool with a too-long name
illumos/illumos-gate@c971037baa5d64dfecf6d87ed602fc3116ebec41 https://github.com/illumos/illumos-gate/commit/c971037baa5d64dfecf6d87ed602fc3116ebec41
https://www.illumos.org/issues/6876 Calling dsl_dataset_name on a dataset with a 256 byte buffer is asking for trouble. We should check every dataset on import, using a 1024 byte buffer and checking each time to see if the dataset's new name is longer than 256 bytes.
Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Paul Dagnelie <pcd@delphix.com> |
307050 |
11-Oct-2016 |
mav |
MFC r305207: MFV r302659: 6931 lib/libzfs: cleanup gcc warnings
illumos/illumos-gate@88f61dee20b358671b1b643e9d1dbf220a1d69be https://github.com/illumos/illumos-gate/commit/88f61dee20b358671b1b643e9d1dbf220a1d69be
https://www.illumos.org/issues/6931 need cleanup: CERRWARN += -_gcc=-Wno-switch CERRWARN += -_gcc=-Wno-parentheses CERRWARN += -_gcc=-Wno-unused-function
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Igor Kozhukhov <ikozhukhov@gmail.com> |
307046 |
11-Oct-2016 |
mav |
MFC r305195: MFV r302643: 6902 speed up listing of snapshots if requesting name only and sorting by name
This was our change from the beginning, so just reduce the upstream diff. |
305460 |
06-Sep-2016 |
avg |
MFC r304520: fix bug introduced in r297521, set canmount=on doesn't mount filesystem |
302408 |
08-Jul-2016 |
gjb |
Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here.
Additional commits post-branch will follow.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
298472 |
22-Apr-2016 |
avg |
MFV r298471: 6052 decouple lzc_create() from the implementation details
illumos/illumos-gate@26455f9efcf9b1e44937d4d86d1ce37b006f25a9 https://github.com/illumos/illumos-gate/commit/26455f9efcf9b1e44937d4d86d1ce37b006f25a9
https://www.illumos.org/issues/6052 At the moment type parameter of lzc_create() is of dmu_objset_type_t type. That exposes an implementation detail and requires sys/fs/zfs.h to be included in libzfs_core.h creating unnecessary coupling between libzfs_core interface and ZFS internals. I think that dmu_objset_type_t should be replaced with a libzfs_core enumeration of supported dataset types. For ABI reasons the new enumeration could be bit-compatible with dmu_objset_type_t. For example: typedef enum { LZC_DST_ZFS = 2, LZC_DST_ZVOL } lzc_dataset_type_t;
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Andriy Gapon <andriy.gapon@clusterhq.com>
MFC after: 2 weeks Sponsored by: ClusterHQ
|
297763 |
09-Apr-2016 |
mav |
MFV r297760: 6418 zpool should have a label clearing command
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Author: Will Andrews <will@firepipe.net>
Closes #83 Closes #32
openzfs/openzfs@9663688425131744221ea99f9e66b9ed964492ae
FreeBSD already had `zpool labelclear` functionality, so this is mostly just a diff reduction.
MFC after: 1 month
|
297521 |
03-Apr-2016 |
avg |
fix zfs set canmount=off on an unmounted filesystem
Previously this operation tried to unmount and remount children. Also see https://www.illumos.org/issues/6428.
MFC after: 2 weeks X-Needs-Upstreaming: illumos
|
297520 |
03-Apr-2016 |
avg |
zfs receive: -u can be ignored sometimes
When force-receiving a filesystem that was already mounted the re-created filesystem is mounted despite -u flag.
Also see https://www.illumos.org/issues/6412.
PR: 204705 Tested by: Vladimir Krstulja <vlad-fbsd@acheronmedia.com> MFC after: 2 weeks X-Needs-Upstreaming: illumos
|
296567 |
09-Mar-2016 |
mav |
Missed addition to r296563 to fix newer tools to work with older kernel.
|
296541 |
08-Mar-2016 |
mav |
MFV r296540: 4448 zfs diff misprints unicode characters
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Reviewed by: Toomas Soome <tsoome@me.com> Approved by: Matthew Ahrens <mahrens@delphix.com> Author: Joshua M. Clulow <jmc@joyent.com>
illumos/illumos-gate@b211eb9181f99c20acbf4c528f94cb44b4ca8c31
|
296539 |
08-Mar-2016 |
mav |
MFV r296538: 6544 incorrect comment in libzfs.h about offline status
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Gerhard Roethlin <git@the-color-black.net>
illumos/illumos-gate@cb605c4d8ab24b5a900b8b4ca85db65c22d05fad
|
296528 |
08-Mar-2016 |
mav |
MFV r296527: 6659 nvlist_free(NULL) is a no-op
Reviewed by: Toomas Soome <tsoome@me.com> Reviewed by: Marcel Telka <marcel@telka.sk> Approved by: Robert Mustacchi <rm@joyent.com> Author: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
illumos/illumos-gate@aab83bb83be7342f6cfccaed8d5fe0b2f404855d
|
296519 |
08-Mar-2016 |
mav |
MFV r296518: 5027 zfs large block support (add copyright)
Author: Matthew Ahrens <matt@mahrens.org>
illumos/illumos-gate@c3d26abc9ee97b4f60233556aadeb57e0bd30bb9
|
295047 |
29-Jan-2016 |
mav |
MFV 295046: 6358 A faulted pool with only unavailable vdevs triggers assertion failure in libzfs
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Andrew Stormont <andyjstormont@gmail.com> Reviewed by: Serban Maduta <serban.maduta@gmail.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Dan Vatca <dan.vatca@gmail.com>
illumos/illumos-gate@b289d045e084af53efcc025255af8242e41f28fa
|
294817 |
26-Jan-2016 |
mav |
MFV r294816: 4986 receiving replication stream fails if any snapshot exceeds refquota
Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Gordon Ross <gordon.ross@nexenta.com> Author: Dan McDonald <danmcd@omniti.com>
illumos/illumos-gate@5878fad70d76d8711f6608c1f80b0447601261c6
|
289562 |
19-Oct-2015 |
mav |
MFV r289561: 6328 Fix cstyle errors in zfs codebase
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Alex Reece <alex@delphix.com> Reviewed by: Richard Elling <Richard.Elling@RichardElling.com> Reviewed by: Jorgen Lundman <lundman@lundman.net> Approved by: Robert Mustacchi <rm@joyent.com> Author: Paul Dagnelie <pcd@delphix.com>
illumos/illumos-gate@9a686fbc186e8e2a64e9a5094d44c7d6fa0ea167
|
289531 |
18-Oct-2015 |
mav |
MFV r289530: 5847 libzfs_diff should check zfs_prop_get() return
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Albert Lee <trisk@omniti.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Alexander Eremin <a.eremin@nexenta.com>
illumos/illumos-gate@8430278980a48338e04c7dd52b495b7f1551367a
|
289528 |
18-Oct-2015 |
mav |
Reduce diff from upstream.
Should be no functional change.
|
289527 |
18-Oct-2015 |
mav |
MFV r289526: 5561 support root pools on EFI/GPT partitioned disks 5125 update zpool/libzfs to manage bootable whole disk pools (EFI/GPT labeled disks)
Reviewed by: Jean McCormack <jean.mccormack@nexenta.com> Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
illumos/illumos-gate@1a902ef8628b0dffd6df5442354ab59bb8530962
This is NOP changes for FreeBSD.
|
289500 |
18-Oct-2015 |
mav |
MFC r289498: 6298 zfs_create_008_neg and zpool_create_023_neg need to be updated for large block support.
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Joe Stein <joe.stein@delphix.com>
illumos/illumos-gate@e9316f7696401f3e5e263a5939031cb8d5641a88
|
289499 |
18-Oct-2015 |
mav |
MFV r247180: Update vendor/illumos/dist and vendor-sys/illumos/dist to illumos-gate 13967:92bec6d87f59
Illumos ZFS issues: 3557 dumpvp_size is not updated correctly when a dump zvol's size is changed 3558 setting the volsize on a dump device does not return back ENOSPC 3559 setting a volsize larger than the space available sometimes succeeds
|
289497 |
18-Oct-2015 |
mav |
MFV r289493: 5745 zfs set allows only one dataset property to be set at a time
Reviewed by: Christopher Siden <christopher.siden@delphix.com> Reviewed by: George Wilson <george@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Bayard Bell <buffer.g.overflow@gmail.com> Reviewed by: Richard PALO <richard@NetBSD.org> Reviewed by: Steven Hartland <killing@multiplay.co.uk> Approved by: Rich Lowe <richlowe@richlowe.net> Author: Chris Williamson <chris.williamson@delphix.com>
illumos/illumos-gate@30925561c223021e91d15899cbe75f80e54d8889
|
289422 |
16-Oct-2015 |
mav |
MFV r289310: 4185 add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R
Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com> Reviewed by: Richard Lowe <richlowe@richlowe.net> Approved by: Garrett D'Amore <garrett@damore.org> Author: Matthew Ahrens <mahrens@delphix.com>
illumos/illumos-gate@45818ee124adeaaf947698996b4f4c722afc6d1f
This is only a partial merge of respective ZFS infrastructure changes. At this moment FreeBSD kernel has no those crypto algorithms, so the parts of the code to enable them are commented out. When they are implemented, it will be trivial to plug them in.
|
289362 |
15-Oct-2015 |
mav |
MFV r289312: 2605 want to resume interrupted zfs send
Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Richard Elling <Richard.Elling@RichardElling.com> Reviewed by: Xin Li <delphij@freebsd.org> Reviewed by: Arne Jansen <sensille@gmx.net> Approved by: Dan McDonald <danmcd@omniti.com> Author: Matthew Ahrens <mahrens@delphix.com>
illumos/illumos-gate@9c3fd1216fa7fb02cfbc78a2518a686d54b48ab8
For more info, see: - slides http://www.slideshare.net/MatthewAhrens/openzfs-send-and-receive - video https://www.youtube.com/watch?v=iY44jPMvxog - manpage changes (for zfs resume -s and zfs send -t) - upcoming talk at the OpenZFS Developer Summit
The TL;DR is: Use "zfs receive -s" to save the partially received state on failure. On failure, get the receive token with "zfs get receive_resume_token <fs>" Resume the send with "zfs send -t <token_value>"
Relnotes: yes
|
289313 |
14-Oct-2015 |
mav |
MFV r289311: 5764 "zfs send -nv" directs output to stderr
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com> Reviewed by: Basil Crow <basil.crow@delphix.com> Reviewed by: Steven Hartland <killing@multiplay.co.uk> Reviewed by: Bayard Bell <buffer.g.overflow@gmail.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Manoj Joseph <manoj.joseph@delphix.com>
illumos/illumos-gate@dc5f28a3c341db7c241bba77ddc109c141072f27
|
288340 |
28-Sep-2015 |
avg |
define aok in libnvpair which is linked to all zfs libraries that need aok
This removes the circular dependency of libnvpair on libzfs / libzpool.
PR: 199811 Obtained from: bapt MFC after: 23 days
|
286705 |
12-Aug-2015 |
mav |
MFV r286704: 5960 zfs recv should prefetch indirect blocks 5925 zfs receive -o origin=
Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Author: Paul Dagnelie <pcd@delphix.com>
While running 'zfs recv' we noticed that every 128th 8K block required a read. We were seeing that restore_write() was calling dmu_tx_hold_write() and the indirect block was not cached. We should prefetch upcoming indirect blocks to avoid having to go to disk and blocking the restore_write().
Allow an incremental send stream to be received as a clone, even if the stream does not mark it as a clone.
|
286587 |
10-Aug-2015 |
mav |
MFV 286586: 5746 more checksumming in zfs send
Reviewed by: Christopher Siden <christopher.siden@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Bayard Bell <buffer.g.overflow@gmail.com> Approved by: Albert Lee <trisk@omniti.com> Author: Matthew Ahrens <mahrens@delphix.com>
illumos/illumos-gate@98110f08fa182032082d98be2ddb9391fcd62bf1
|
284308 |
12-Jun-2015 |
avg |
MFV r284042: 1778 Assertion failed: rn->rn_nozpool == B_FALSE, file ../common/libzfs_import.c, line 1077, function zpool_open_func
illumos/illumos-gate@bd0f709169e67f4bd34526e186a7c34f595f0d9b
Author: Andrew Stormont <andyjstormont@gmail.com> MFC after: 13 days
|
277433 |
20-Jan-2015 |
delphij |
MFV r277432:
Plug various memory leaks in libzfs import implementation.
Illumos issue: 5518 Memory leaks in libzfs import implementation
MFC after: 2 weeks
|
277300 |
17-Jan-2015 |
smh |
Mechanically convert cddl sun #ifdef's to illumos
Since the upstream for cddl code is now illumos not sun, mechanically convert all sun #ifdef's to illumos #ifdef's which have been used in all newer code for some time.
Also do a manual pass to correct the use if #ifdef comments as per style(9) as well as few uses of #if defined(__FreeBSD__) vs #ifndef illumos.
MFC after: 1 month Sponsored by: Multiplay
|
277239 |
16-Jan-2015 |
smh |
Eliminate illumos whole disk special case when searching for a ZFS vdev
This special case prevented locating vdevs which start with c[0-9] e.g. gptid/c6cde092-504b-11e4-ba52-c45444453598 hence it was impossible to online a vdev via its path.
Submitted by: Peter Xu <xzpeter@gmail.com> MFC after: 2 weeks Sponsored by: Multiplay
|
276446 |
31-Dec-2014 |
smh |
Use the correct state name for unavailable pools in zpool list
This corrects inconsitencies between zpool list and zpool status which are both described as displaying the pool <state> however zpool list would use this hardcoded FAULTED instead of the correct UNAVAIL.
MFC after: 1 month
|
275812 |
15-Dec-2014 |
delphij |
MFV r275784:
Plug a memory leak in libzfs. In zfs_iter_bookmarks, an nvlist is allocated before calling lzc_get_bookmarks, which allocates the nvlist again (and overwrites the pointer to previously allocated list).
Illumos issue: 5427 memory leak in libzfs when doing rollback
MFC after: 2 weeks
|
275579 |
07-Dec-2014 |
delphij |
MFV r275537:
Illumos issue: 5316 allow smbadm join to use RPC
(Due to our lack of smbsrv this is mostly no-op on FreeBSD)
MFC after: 2 weeks
|
274337 |
10-Nov-2014 |
delphij |
MFV r274273:
ZFS large block support.
Please note that booting from datasets that have recordsize greater than 128KB is not supported (but it's Okay to enable the feature on the pool). This *may* remain unchanged because of memory constraint.
Limited safety belt is provided for mounted root filesystem but use caution is advised.
Illumos issue: 5027 zfs large block support
MFC after: 1 month
|
272502 |
04-Oct-2014 |
delphij |
MFV r272493:
Show individual disk capacity when doing zpool list -v.
Illumos issue: 5147 zpool list -v should show individual disk capacity
MFC after: 1 week
|
271764 |
18-Sep-2014 |
will |
zfs_setprop_error(): Handle errno value E2BIG.
This errno value is emitted by dsl_props_set_check() in sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_prop.c, and is used to mean that the property value is too long. For the record, the maximum length is ZAP_MAXVALUELEN, which is 8*1024 bytes.
Instead of claiming an unknown error (and abort()ing), provide something more specific to the scenario involved. As far as I can tell, E2BIG is not emitted for any other scenario.
MFC after: 1 week Sponsored by: Spectra Logic Affects: All ZFS versions starting 27 Feb 2009 (illumos ccba0801) This change modified the value returned by dsl_props_set_check(), so that it can distinguish between a name that's too long and a value that's too long, but libzfs was not updated accordingly. MFSpectraBSD: r1051499 on 2014/03/28 11:07:59
|
271527 |
13-Sep-2014 |
delphij |
MFV r271511:
Use fnvlist_* to make code more readable.
Illumos issue: 5135 zpool_find_import_cached() can use fnvlist_*
MFC after: 2 weeks
|
269118 |
26-Jul-2014 |
delphij |
MFV r269010:
Import Illumos changes to address the following Illumos issues: 4976 zfs should only avoid writing to a failing non-redundant top-level vdev 4978 ztest fails in get_metaslab_refcount() 4979 extend free space histogram to device and pool 4980 metaslabs should have a fragmentation metric 4981 remove fragmented ops vector from block allocator 4982 space_map object should proactively upgrade when feature is enabled 4984 device selection should use fragmentation metric
MFC after: 2 weeks
|
268469 |
09-Jul-2014 |
delphij |
MFV r268453:
Diff reduction against Illumos.
MFC after: 2 weeks
|
268123 |
01-Jul-2014 |
delphij |
MFV r268119:
4914 zfs on-disk bookmark structure should be named *_phys_t
illumos/illumos-gate@7802d7bf98dec568dadf72286893b1fe5abd8602
MFC after: 2 weeks
|
268116 |
01-Jul-2014 |
delphij |
- Fix handling of "new" style of ioctl in compatiblity mode [1]; - Reorganize code and reduce diff from upstream; - Improve forward compatibility shims for previous kernel;
Reported by: sbruno [1] X-MFC-With: r268075
|
268079 |
01-Jul-2014 |
delphij |
MFV r267566:
4390 i/o errors when deleting filesystem/zvol can lead to space map corruption
MFC after: 2 weeks
|
268075 |
01-Jul-2014 |
delphij |
MFV r267565:
4757 ZFS embedded-data block pointers ("zero block compression") 4913 zfs release should not be subject to space checks
MFC after: 2 weeks
|
265821 |
10-May-2014 |
mav |
Comment out some pointless device open/close around reading device IDs.
FreeBSD ZFS port unlike OpenSolaris does not use device IDs, and does not implement respective devid_*() fuctions. It is pointless to open devices just to close them back immediately.
MFC after: 2 weeks Sponsored by: iXsystems, Inc.
|
264852 |
24-Apr-2014 |
smh |
Silence compiler warning due to missing return in idmap_id_to_numeric_domain_rid
|
264835 |
23-Apr-2014 |
delphij |
MFV r264829:
3897 zfs filesystem and snapshot limits
MFC after: 2 weeks
|
264467 |
14-Apr-2014 |
delphij |
Take into account when zpool history block grows exceeding 128KB in zpool(8) and zdb(8) by growing the buffer on demand with a cap of 1GB (specified in spa_history_create_obj()).
PR: bin/186574 Submitted by: Andrew Childs <lorne cons org nz> (with changes) MFC after: 2 weeks
|
263889 |
28-Mar-2014 |
delphij |
MFV r263887:
3993 zpool(1M) and zfs(1M) should support -p for "list" and "get" 4700 "zpool get" doesn't support -H or -o options
MFC after: 2 weeks
|
262577 |
27-Feb-2014 |
delphij |
MFV r262570:
4626 libzfs memleak in zpool_in_use()
illumos/illumos-gate@fb13f48f1d9593453b94cd1c7277553b56f493c8
MFC after: 2 weeks
|
260183 |
02-Jan-2014 |
delphij |
MFV r260154 + 260182:
4369 implement zfs bookmarks 4368 zfs send filesystems from readonly pools
Illumos/illumos-gate@78f171005391b928aaf1642b3206c534ed644332
MFC after: 2 weeks
|
259850 |
25-Dec-2013 |
delphij |
MFV r258384:
2583 Add -p (parsable) option to zfs list
illumos/illumos-gate@43d68d68c1ce08fb35026bebfb141af422e7082e
MFC after: 2 weeks
|
259813 |
24-Dec-2013 |
delphij |
MFV r258374:
4171 clean up spa_feature_*() interfaces
4172 implement extensible_dataset feature for use by other zpool features
illumos/illumos-gate@2acef22db7808606888f8f92715629ff3ba555b9
MFC after: 2 weeks
|
259168 |
10-Dec-2013 |
mav |
Don't even try to read vdev labels from devices smaller then SPA_MINDEVSIZE (64MB). Even if we would find one somehow, ZFS kernel code rejects such devices. It is funny to look on attempts to read 4 256K vdev labels from 1.44MB floppy, though it is not very practical and quite slow.
|
255750 |
21-Sep-2013 |
delphij |
MFV r254750:
Add support of Illumos dumps on zvol over RAID-Z.
Note that this only adds the features. FreeBSD would still need more work to support dumping on zvols.
Illumos ZFS issues: 2932 support crash dumps to raidz, etc. pools
MFC after: 1 month Approved by: re (ZFS blanket)
|
254755 |
24-Aug-2013 |
delphij |
MFV r254748:
Fix memory leak in libzfs's iter_dependents_cb().
Illumos ZFS issues: 4061 libzfs: memory leak in iter_dependents_cb()
|
254591 |
21-Aug-2013 |
gibbs |
Enhance the ZFS vdev layer to maintain both a logical and a physical minimum allocation size for devices. Use this information to automatically increase ZFS's minimum allocation size for new top-level vdevs to a value that more closely matches the optimum device allocation size.
Use GEOM's stripesize attribute, if set, as the physical sector size of the GEOM.
Calculate the minimum blocksize of each metaslab class. Use the calculated value instead of SPA_MINBLOCKSIZE (512b) when determining the likelyhood of compression yeilding a reduction in physical space usage.
Report devices with sub-optimal block size configuration in "zpool status". Also properly fail attempts to attach devices with a logical block size greater than 8kB, since this will cause corruption to ZFS's label area.
Sponsored by: Spectra Logic Corporaion MFC after: 2 weeks
Background ========== Many modern devices use physical allocation units that are much larger than the minimum logical allocation size accessible by external commands. Two prevalent examples of this are 512e disk drives (512b logical sector, 4K physical sector) and flash devices (512b logical sector, 4K or larger allocation block size, and 128k or larger erase block size). Operations that modify less than the physical sector size result in a costly read-modify-write or garbage collection sequence on these devices.
Simply exporting the true physical sector of the device to ZFS would yield optimal performance, but has two serious drawbacks:
1) Existing pools created with devices that have different logical and physical block sizes, but were configured to use the logical block size (e.g. because the OS version used for pool construction reported the logical block size instead of the physical block size) will suddenly find that the vdev allocation size has increased. This can be easily tolerated for active members of the array, but ZFS would prevent replacement of a vdev with another identical device because it now appears that the smaller allocation size required by the pool is not supported by the new device.
2) The device's physical block size may be too large to be supported by ZFS. The optimal allocation size for the vdev may be quite large. For example, a RAID controller may export a vdev that requires read-modify-write cycles unless accessed using 64k aligned/sized requests. ZFS currently has an 8k minimum block size limit.
Reporting both the logical and physical allocation sizes for vdevs solves these problems. A device may be used so long as the logical block size is compatible with the configuration. By comparing the logical and physical block sizes, new configurations can be optimized and administrators can be notified of any existing pools that are sub-optimal.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h: Add the SPA_ASHIFT constant. ZFS currently has a hard upper limit of 13 (8k) for ashift and this constant is used to both document and enforce this limit.
sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h: Add the VDEV_AUX_ASHIFT_TOO_BIG error code.
Add fields for exporting the configured, logical, and physical ashift to the vdev_stat_t structure.
Add VDEV_STAT_VALID() macro which can be used to verify the presence of required vdev_stat_t fields in nvlist data.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c: Provide a SYSCTL_PROC handler for "max_auto_ashift". Since the limit is only referenced long after boot when a create operation occurs, there's no compelling need for it to be a boot time configurable tunable. This also allows the validation code for the max_auto_ashift value to be contained within the sysctl handler.
Populate the new fields in the vdev_stat_t structure.
Fail vdev opens if the vdev reports an ashift larger than SPA_MAXASHIFT.
Propogate vdev_logical_ashift and vdev_physical_ashift between child and parent vdevs as is done for vdev_ashift.
In vdev_open(), restore code that fails opens for devices where vdev_ashift grows. This can only happen now if the device's logical ashift grows, which means it really isn't safe to use the device.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_missing.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_root.c: Update the vdev_open() API so that both logical (what was just ashift before) and physical ashift are reported.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h: Add two new fields, vdev_physical_ashift and vdev_logical_ashift, to vdev_t.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c: Add vdev_ashift_optimize(). Call it anytime a new top-level vdev is allocated.
cddl/contrib/opensolaris/cmd/zpool/zpool_main.c: Add text for the VDEV_AUX_ASHIFT_TOO_BIG error.
For each sub-optimally configured leaf vdev, report configured and native block sizes.
cddl/contrib/opensolaris/cmd/zpool/zpool_main.c: cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h: cddl/contrib/opensolaris/lib/libzfs/common/libzfs_status.c: Introduce a new zpool status: ZPOOL_STATUS_NON_NATIVE_ASHIFT. This status is reported on healthy pools containing vdevs configured to use a block size smaller than their reported physical block size.
cddl/contrib/opensolaris/lib/libzfs/common/libzfs_status.c: Update find_vdev_problem() and supporting functions to provide the full vdev_stat_t structure to problem checking routines, and to allow decent into replacing vdevs.
Add a vdev_non_native_ashift() validator which is used on the full vdev tree to check for ZPOOL_STATUS_NON_NATIVE_ASHIFT.
cddl/contrib/opensolaris/lib/libzpool/common/kernel.c: cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h: Enhance sysctl userland stubs now that a SYSCTL_PROC handler is used in vdev.c.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h: When the group membership of a metaslab class changes (i.e. when a vdev is added or removed from a pool), walk the group list to determine the smallest block size currently available and record this in the metaslab class.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c: Add the metaslab_class_get_minblocksize() accessor.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio_compress.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio_compress.c: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c: In zio_compress_data(), take the minimum blocksize as an input parameter instead of assuming SPA_MINBLOCKSIZE.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c: In l2arc_compress_buf(), pass SPA_MINBLOCKSIZE as the minimum blocksize of the device. The l2arc code performs has it's own code for deciding if compression is worth while, so this effectively disables zio_compress_data() from second guessing the original decision.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c: In zio_write_bp_init(), use the minimum blocksize of the normal metaslab class when compressing data.
|
254587 |
21-Aug-2013 |
delphij |
MFV r254421:
Illumos ZFS issues: 3996 want a libzfs_core API to rollback to latest snapshot
|
253819 |
30-Jul-2013 |
delphij |
MFV r253781 + r253871:
Illumos ZFS issues: 3894 zfs should not allow snapshot of inconsistent dataset
MFC after: 2 weeks
|
253818 |
30-Jul-2013 |
smh |
MFV r253784:
Fix zfs send -D hang after processing requiring a CTRL+C to interrupt due to pthread_join prior to fd close.
This was introduced by r251646 (MFV r251644)
Illumos ZFS issue: 3909 "zfs send -D" does not work
MFC after: 1 day
|
252219 |
25-Jun-2013 |
delphij |
MFV r252215:
Restore a previous behavior before r251646, where when destructing ZFS snapshot, the ioctl would return ENOENT when it hit any of them in the errlist (the new behavior was only return ENOENT when all returns error).
Illumos ZFS issues: 3829 fix for 3740 changed behavior of zfs destroy/hold/release ioctl
MFC after: 1 week
|
252218 |
25-Jun-2013 |
delphij |
Diff reduction against Illumos, no real change to code itself.
This marks vendor branch revision 252213 as merged, the actual code was committed in r245479.
MFC after: 1 week
|
251646 |
12-Jun-2013 |
delphij |
MFV r251644:
Poor ZFS send / receive performance due to snapshot hold / release processing (by smh@)
Illumos ZFS issues: 3740 Poor ZFS send / receive performance due to snapshot hold / release processing
MFC after: 2 weeks
|
251634 |
11-Jun-2013 |
delphij |
MFV r251623:
zpool create should treat -O mountpoint and -m the same
Illumos ZFS issues: 3745 zpool create should treat -O mountpoint and -m the same
MFC after: 2 weeks
|
251629 |
11-Jun-2013 |
delphij |
MFV r251619:
ZFS needs better comments.
Illumos ZFS issues: 3741 zfs needs better comments
MFC after: 2 weeks
|
249883 |
25-Apr-2013 |
mm |
Respect the enoent_ok flag if reporting error for holding an non-existing snapshot.
Related illumos ZFS issue: 3699 zfs hold or release of a non-existent snapshot does not output error
Reported by: Steven Hartland <smh@FreeBSD.org> MFC after: 3 days
|
249547 |
16-Apr-2013 |
pjd |
Correct error message.
Reported by: Dirk Engling <erdgeist@erdgeist.org>
|
249357 |
11-Apr-2013 |
mm |
Fix libzfs to report error instead of returning zero if trying to hold or release a non-existing snapshot of a existing dataset. In recursive case error is reported if no snapshots with the requested name have been found.
Problem and proposed solution reported to illumos: 3699 zfs hold or release of a non-existent snapshot does not output error
MFC after: 8 days
|
249319 |
09-Apr-2013 |
mm |
ZFS expects a copyout of zfs_cmd_t on an ioctl error. Our sys_ioctl() doesn't copyout in this case.
To solve this issue a new struct zfs_iocparm_t is introduced consisting of: - zfs_ioctl_version (future backwards compatibility purposes) - user space pointer to zfs_cmd_t (copyin and copyout) - size of zfs_cmd_t (verification purposes)
The copyin and copyout of zfs_cmd_t is now done the illumos (vendor) way what makes porting of new changes easier and ensures correct behavior if returning an error.
MFC after: 10 days
|
248571 |
21-Mar-2013 |
mm |
Merge libzfs_core branch: includes MFV 238590, 238592, 247580
MFV 238590, 238592: In the first zfs ioctl restructuring phase, the libzfs_core library was introduced. It is a new thin library that wraps around kernel ioctl's. The idea is to provide a forward-compatible way of dealing with new features. Arguments are passed in nvlists and not random zfs_cmd fields, new-style ioctls are logged to pool history using a new method of history logging.
http://blog.delphix.com/matt/2012/01/17/the-future-of-libzfs/
MFV 247580 [1]: To address issues of several deadlocks and race conditions the locking code around dsl_dataset was rewritten and the interface to synctasks was changed.
User-Visible Changes: "zfs snapshot" can create more arbitrary snapshots at once (atomically) "zfs destroy" destroys multiple snapshots at once "zfs recv" has improved performance
Backward Compatibility: I have extended the compatibility layer to support full backward compatibility by remapping or rewriting the responsible ioctl arguments. Old utilities are fully supported by the new kernel module.
Forward Compatibility: New utilities work with old kernels with the following restrictions: - creating, destroying, holding and releasing of multiple snapshots at once is not supported, this includes recursive (-r) commands
Illumos ZFS issues: 2882 implement libzfs_core 2900 "zfs snapshot" should be able to create multiple, arbitrary snapshots at once 3464 zfs synctask code needs restructuring
References: https://www.illumos.org/issues/2882 https://www.illumos.org/issues/2900 https://www.illumos.org/issues/3464 [1]
MFC after: 1 month Sponsored by: Hybrid Logic Inc. [1]
|
247540 |
01-Mar-2013 |
mm |
Fix the zfs_ioctl compat layer to support zfs_cmd size change introduced in r247265 (ZFS deadman thread). Both new utilities now support the old kernel and new kernel properly detects old utilities.
For future backwards compatibility, the vfs.zfs.version.ioctl read-only sysctl has been introduced. With this sysctl zfs utilities will be able to detect the ioctl interface version of the currently loaded zfs module.
As a side effect, the zfs utilities between r247265 and this revision don't support the old kernel module. If you are using HEAD newer or equal than r247265, install the new kernel module (or whole kernel) first.
MFC after: 10 days
|
246631 |
10-Feb-2013 |
mm |
MFV r246388:
Import vendor bugfixes
Illumos ZFS issues: 3422 zpool create/syseventd race yield non-importable pool 3425 first write to a new zvol can fail with EFBIG
References: https://www.illumos.org/issues/3422 https://www.illumos.org/issues/3425
MFC after: 2 weeks
|
245479 |
15-Jan-2013 |
smh |
Reports pools which have a removed l2cache disk under -x as this is what happens when a cache device is dropped for any reason.
Reviewed by: pjd Approved by: pjd (mentor) MFC after: 2 weeks
|
244194 |
13-Dec-2012 |
smh |
Fixes zfs receive errors caused by snapshot replication being processed in a random order instead of creation order.
Eliminates needless filesystem renames caused by removed parent snapshots which subsequently causes many more errors.
PR: kern/172259 Submitted by: Steven Hartland Reviewed by: pjd (mentor) Approved by: pjd (mentor) MFC after: 2 weeks
|
241655 |
17-Oct-2012 |
mm |
Add missing initialization for do_prefix. Corrects porting error in r238391
Vendor issue and changeset reference: 2883 changing "canmount" property to "on" should not always remount dataset https://www.illumos.org/issues/2883 Changeset 13743:95aba6e49b9f
Reported by: Guido Falsi <mad@madpilot.net>, avg Obtained from: illumos (issue #2883) MFC after: 1 week
|
241021 |
28-Sep-2012 |
kevlo |
Make sure that each va_start has one and only one matching va_end, especially in error cases.
|
240870 |
23-Sep-2012 |
pjd |
It is possible to recursively destroy snapshots even if the snapshot doesn't exist on a dataset we are starting from. For example if we have the following configuration:
tank tank/foo tank/foo@snap tank/bar tank/bar@snap
We can execute:
# zfs destroy -t tank@snap
eventhough tank@snap doesn't exit.
Unfortunately it is not possible to do the same with recursive rename:
# zfs rename -r tank@snap tank@pans cannot open 'tank@snap': dataset does not exist
...until now. This change allows to recursively rename snapshots even if snapshot doesn't exist on the starting dataset.
Sponsored by: rsync.net MFC after: 2 weeks
|
240415 |
12-Sep-2012 |
mm |
Merge recent zfs vendor changes, sync code and adjust userland DEBUG.
Illumos issued covered: 1884 Empty "used" field for zfs *space commands 3006 VERIFY[S,U,P] and ASSERT[S,U,P] frequently check if first argument is zero 3028 zfs {group,user}space -n prints (null) instead of numeric GID/UID 3048 zfs {user,group}space [-s|-S] is broken 3049 zfs {user,group}space -t doesn't really filter the results 3060 zfs {user,group}space -H output isn't tab-delimited 3061 zfs {user,group}space -o doesn't use specified fields order 3064 usr/src/cmd/zpool/zpool_main.c misspells "successful" 3093 zfs {user,group}space's -i is noop 3098 zfs userspace/groupspace fail without saying why when run as non-root
References: https://www.illumos.org/issues/ + [issue_id]
Obtained from: illumos (vendor/illumos, vendor/illumos-sys) MFC after: 2 weeks
|
239774 |
28-Aug-2012 |
mm |
Merge recent vendor changes: 3100 zvol rename fails with EBUSY when dirty 3104 eliminate empty bpobjs 3120 zinject hangs in zfsdev_ioctl() due to uninitialized zc
References: https://www.illumos.org/issues/3100 https://www.illumos.org/issues/3104 https://www.illumos.org/issues/3120
Obtained from: illumos (vendor/illumos, vendor/illumos-sys) MFC after: 2 weeks
|
239620 |
23-Aug-2012 |
mm |
Merge recent vendor changes: 3086 unnecessarily setting DS_FLAG_INCONSISTENT on async destroyed datasets 3090 vdev_reopen() during reguid causes vdev to be treated as corrupt 3102 vdev_uberblock_load() and vdev_validate() may read the wrong label
Referenes: https://www.illumos.org/issues/3086 https://www.illumos.org/issues/3090 https://www.illumos.org/issues/3102
PR: kern/170912, kern/170914 Obtained from: illumos (changeset #13776, #13777) MFC after: 2 weeks
|
238926 |
30-Jul-2012 |
mm |
Partial MFV (illumos-gate 13753:2aba784c276b) 2762 zpool command should have better support for feature flags
References: https://www.illumos.org/issues/2762
MFC after: 2 weeks
|
238422 |
13-Jul-2012 |
mm |
Merge illumos commit 13749:df4cd82e2b60
1796 "ZFS HOLD" should not be used when doing "ZFS SEND" froma read-only pool 2871 support for __ZFS_POOL_RESTRICT used by ZFS test suite 2903 zfs destroy -d does not work 2957 zfs destroy -R/r sometimes fails when removing defer-destroyed snapshot
References: https://www.illumos.org/issues/1796 https://www.illumos.org/issues/2871 https://www.illumos.org/issues/2903 https://www.illumos.org/issues/2957
MFC after: 1 week
|
238391 |
12-Jul-2012 |
mm |
Change behavior introduced in r237119 to vendor solution
References: https://www.illumos.org/issues/2883
PR: 167905 Obtained from: illumos (issue #2883) MFC after: 2 weeks
|
237119 |
15-Jun-2012 |
mm |
Do not remount ZFS dataset if changing canmount property to "on" and dataset is already mounted.
PR: 167905 Submitted by: Bryan Drewery <bryan@shatow.net> MFC after: 1 week
|
236884 |
11-Jun-2012 |
mm |
Introduce "feature flags" for ZFS pools (bump SPA version to 5000). Add first feature "com.delphix:async_destroy" (asynchronous destroy of ZFS datasets). Implement features support in ZFS boot code.
Illumos revisions merged: 13700:2889e2596bd6 13701:1949b688d5fb 2619 asynchronous destruction of ZFS file systems 2747 SPA versioning with zfs feature flags
References: https://www.illumos.org/issues/2619 https://www.illumos.org/issues/2747
Obtained from: illumos (issue #2619, #2747) MFC after: 1 month
|
236705 |
07-Jun-2012 |
mm |
Import Illumos revision 13715:351036203e4b 2803 zfs get guid pretty-prints the output
References: https://www.illumos.org/issues/2803
Obtained from: illumos (issue #2803) MFC after: 3 days
|
236155 |
27-May-2012 |
mm |
Import illumos changeset 13570:3411fd5f1589 1948 zpool list should show more detailed pool information
Display per-vdev information with "zpool list -v". The added expandsize property has currently no value on FreeBSD. This changeset allows adding expansion support to individual vdevs in the future.
References: https://www.illumos.org/issues/1948
Obtained from: illumos (issue #1948) MFC after: 2 weeks
|
235479 |
15-May-2012 |
avg |
zpool_find_import_impl: another /dev/dsk -> /dev fix
This seems to fix zdb -e behavior.
PR: bin/155104 Submitted by: swell.k@gmail.com MFC after: 2 weeks
|
235222 |
10-May-2012 |
mm |
Import illumos changeset 13686:4bc0783f6064 2703 add mechanism to report ZFS send progress
If the zfs send command is used with the -v flag, the amount of bytes transmitted is reported in per second updates.
References: https://www.illumos.org/issues/2703
Obtained from: illumos (issue #2703) MFC after: 2 weeks
|
235216 |
10-May-2012 |
mm |
Add support for force unmounting ZFS filesystems during "zfs rename" with the -f flag.
Reimplementation of the illumos changeset 13677:a0cbef703c12 2635 'zfs rename -f' to perform force unmount
References: https://www.illumos.org/issues/2635
PR: kern/164447 Suggested by: Marcelo Araujo <araujo@FreeBSD.org> Obtained from: illumos (issue #2635) MFC after: 1 week
|
230514 |
24-Jan-2012 |
mm |
Merge illumos revisions 13572, 13573, 13574:
Rev. 13572: disk sync write perf regression when slog is used post oi_148 [1]
Rev. 13573: crash during reguid causes stale config [2] allow and unallow missing from zpool history since removal of pyzfs [5]
Rev. 13574: leaking a vdev when removing an l2cache device [3] memory leak when adding a file-based l2arc device [4] leak in ZFS from metaslab_group_create and zfs_ereport_checksum [6]
References: https://www.illumos.org/issues/1909 [1] https://www.illumos.org/issues/1949 [2] https://www.illumos.org/issues/1951 [3] https://www.illumos.org/issues/1952 [4] https://www.illumos.org/issues/1953 [5] https://www.illumos.org/issues/1954 [6]
Obtained from: illumos (issues #1909, #1949, #1951, #1952, #1953, #1954) MFC after: 2 weeks
|
230438 |
21-Jan-2012 |
pjd |
Dramatically optimize listing snapshots when user requests only snapshot names and wants to sort them by name, ie. when executes:
# zfs list -t snapshot -o name -s name
Because only name is needed we don't have to read all snapshot properties.
Below you can find how long does it take to list 34509 snapshots from a single disk pool before and after this change with cold and warm cache:
before:
# time zfs list -t snapshot -o name -s name > /dev/null cold cache: 525s warm cache: 218s
after:
# time zfs list -t snapshot -o name -s name > /dev/null cold cache: 1.7s warm cache: 1.1s
MFC after: 1 week
|
230404 |
20-Jan-2012 |
mm |
Add one more copyright line accidentially removed in r228103
MFC after: 3 days
|
230402 |
20-Jan-2012 |
mm |
Add accidentially removed copyright lines in r228103
Reported by: pjd MFC after: 3 days
|
228103 |
28-Nov-2011 |
mm |
Merge new ZFS features from illumos:
1644 add ZFS "clones" property https://www.illumos.org/issues/1644
1645 add ZFS "written" and "written@..." properties https://www.illumos.org/issues/1645
1646 "zfs send" should estimate size of stream https://www.illumos.org/issues/1646
1647 "zfs destroy" should determine space reclaimed by destroying multiple snapshots https://www.illumos.org/issues/1647
1693 persistent 'comment' field for a zpool https://www.illumos.org/issues/1693
1708 adjust size of zpool history data https://www.illumos.org/issues/1708
1748 desire support for reguid in zfs https://www.illumos.org/issues/1748
Obtained from: illumos (changesets 13514, 13524, 13525) MFC after: 1 month
|
226706 |
24-Oct-2011 |
pjd |
Update copyright to include myself.
MFC after: 2 weeks
|
226705 |
24-Oct-2011 |
pjd |
Extend r226676 to allow rename without unmount even for file systems with non-legacy mountpoints. It is better to be able to rename such file systems and let them be mounted in old places until next reboot than using live CD, etc. to rename with remount.
This is implemented by adding -u option to 'zfs rename'. If file system's mountpoint property is set to 'legacy' or 'none', there is no need to specify -u.
Update zfs(8) manual page to reflect this addition.
MFC after: 2 weeks
|
226676 |
24-Oct-2011 |
pjd |
Allow to rename file systems without remounting if it is possible. It is possible for file systems with 'mountpoint' preperty set to 'legacy' or 'none' - we don't have to change mount directory for them. Currently such file systems are unmounted on rename and not even mounted back.
This introduces layering violation, as we need to update 'f_mntfromname' field in statfs structure related to mountpoint (for the dataset we are renaming and all its children).
In my opinion it is worth it, as it allow to update FreeBSD in even cleaner way - in ZFS-only configuration root file system is ZFS file system with 'mountpoint' property set to 'legacy'. If root dataset is named system/rootfs, we can snapshot it (system/rootfs@upgrade), clone it (system/oldrootfs), update FreeBSD and if it doesn't boot we can boot back from system/oldrootfs and rename it back to system/rootfs while it is mounted as /. Before it was not possible, because unmounting / was not possible.
MFC after: 2 weeks
|
225828 |
28-Sep-2011 |
mm |
Remove assertion that prevents zfs rename of datasets with mountpoint=none or mountpoint=legacy that have children datasets. This also fixes dataset rename when receiving incremental snapshots as reported on freebsd-fs@
This assertion was made triggerable by opensolaris change #10196.
PR: bin/160400 Reviewed by: pjd MFC after: 1 week
|
224525 |
30-Jul-2011 |
mm |
Fix wrong initialization of "cmd" for calling the jail/unjail ioctl.
Reviewed by: pjd@, delphij@ Approved by: re (kib) MFC after: 3 days
|
224171 |
18-Jul-2011 |
gibbs |
cddl/contrib/opensolaris/cmd/zpool/zpool_main.c: cddl/contrib/opensolaris/cmd/zpool/zpool.8: cddl/contrib/opensolaris/lib/libzfs/common/libzfs_import.c: Add the "zpool labelclear" command. This command can be used to wipe the label data from a drive that is not active in a pool. The optional "-f" argument can be used to treat an exported or foreign vdev as "inactive" thus allowing its label information to be cleared.
|
224170 |
18-Jul-2011 |
gibbs |
Correct reporting of missing leaf vdevs so that the GUID required to perform pool actions is always displayed.
cddl/contrib/opensolaris/cmd/zpool/zpool_main.c: The "zpool status" command reports the "last seen at" device node path when the vdev name is being reported by GUID. Augment this code to assume a GUID is reported when a device goes missing after initial boot in addition to the previous behavior of doing this for devices that aren't seen at boot.
cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c: In zpool_vdev_name(), report recently missing devices by GUID. There is no guarantee they will return at their previous location.
|
224169 |
18-Jul-2011 |
gibbs |
cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h: cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c: o Add zpool_pool_state_to_name() API to libzfs which converts a pool_state_t into a user consumable string. o While here, correct constness of make zpool_state_to_name() and zpool_label_disk().
MFD after: 1 week
|
223623 |
28-Jun-2011 |
mm |
Add a new "REFCOMPRESSRATIO" property.
For snapshots, this is the same as COMPRESSRATIO, but for filesystems/volumes, the COMPRESSRATIO is based on the data "USED" (ie, includes blocks in children, but not blocks shared with the origin).
This is needed to figure out how much space a filesystem would use if it were not compressed (ignoring snapshots).
Illumos-gate revision: 13387
Obtained from: Illumos (Feature #1092) MFC after: 2 weeks
|
220575 |
12-Apr-2011 |
pjd |
Fix 'zfs list <path>' handling. If the path was found, the 'ret' variable was uninitialized.
PR: kern/155940 Submitted by: KOIE Hidetaka <koie@suri.co.jp> MFC after: 1 week
|
219959 |
24-Mar-2011 |
pjd |
Properly print characters larger than 127.
Submitted by: noordsij <noordsij@cs.helsinki.fi> Reviewed by: Eric Schrock <eric.schrock@delphix.com> MFC after: 1 month
|
219089 |
27-Feb-2011 |
pjd |
Finally... Import the latest open-source ZFS version - (SPA) 28.
Few new things available from now on:
- Data deduplication. - Triple parity RAIDZ (RAIDZ3). - zfs diff. - zpool split. - Snapshot holds. - zpool import -F. Allows to rewind corrupted pool to earlier transaction group. - Possibility to import pool in read-only mode.
MFC after: 1 month
|
216293 |
08-Dec-2010 |
mm |
Print message with information about updating the boot code if a new vdev is attached to a root pool (e.g. when creating a mirrored boot pool).
Reviewed by: pav Approved by: delphij (mentor) MFC after: 3 days
|
216291 |
08-Dec-2010 |
mm |
Do not print OpenSolaris hint to use (non-existing) installgrub(1) command if creating a mirror by attaching a new vdev to a root pool.
Reported by: James R. Van Artsdalen (on freebsd-fs@freebsd.org) Approved by: delphij (mentor) MFC after: 3 days
|
213197 |
27-Sep-2010 |
mm |
Enable offlining of log devices.
OpenSolaris revision and Bug IDs:
9701:cc5b64682e64 6803605 should be able to offline log devices 6726045 vdev_deflate_ratio is not set when offlining a log device 6599442 zpool import has faults in the display
Approved by: delphij (mentor) Obtained from: OpenSolaris (Bug ID 6803605, 6726045, 6599442) MFC after: 3 weeks
|
212791 |
17-Sep-2010 |
mm |
Remove duplicate include of <strings.h>
Approved by: delphij (mentor) MFC after: 3 days
|
210398 |
22-Jul-2010 |
mm |
Enable fake resolving of SMB RIDs by using nulldomain and UID_NOBODY - fixes panics when Solaris/OpenSolaris pools that contain files uploaded with the SMB protocol are accessed
Enable seting/unsetting the sharesmb property (dummy action) - allows users who import pools from Solaris/Opensolaris to unset the sharesmb property and get rid of annoying messages
PR: kern/145778, kern/148709 Approved by: pjd, delphij (mentor) MFC after: 7 weeks
|
209962 |
13-Jul-2010 |
mm |
Merge ZFS version 15 and almost all OpenSolaris bugfixes referenced in Solaris 10 updates 141445-09 and 142901-14.
Detailed information: (OpenSolaris revisions and Bug IDs, Solaris 10 patch numbers)
7844:effed23820ae 6755435 zfs_open() and zfs_close() needs to use ZFS_ENTER/ZFS_VERIFY_ZP (141445-01)
7897:e520d8258820 6748436 inconsistent zpool.cache in boot_archive could panic a zfs root filesystem upon boot-up (141445-01)
7965:b795da521357 6740164 zpool attach can create an illegal root pool (141909-02)
8084:b811cc60d650 6769612 zpool_import() will continue to write to cachefile even if altroot is set (N/A)
8121:7fd09d4ebd9c 6757430 want an option for zdb to disable space map loading and leak tracking (141445-01)
8129:e4f45a0bfbb0 6542860 ASSERT: reason != VDEV_LABEL_REMOVE||vdev_inuse(vd, crtxg, reason, 0) (141445-01)
8188:fd00c0a81e80 6761100 want zdb option to select older uberblocks (141445-01)
8190:6eeea43ced42 6774886 zfs_setattr() won't allow ndmp to restore SUNWattr_rw (141445-01)
8225:59a9961c2aeb 6737463 panic while trying to write out config file if root pool import fails (141445-01)
8227:f7d7be9b1f56 6765294 Refactor replay (141445-01)
8228:51e9ca9ee3a5 6572357 libzfs should do more to avoid mnttab lookups (141909-01) 6572376 zfs_iter_filesystems and zfs_iter_snapshots get objset stats twice (141909-01)
8241:5a60f16123ba 6328632 zpool offline is a bit too conservative (141445-01) 6739487 ASSERT: txg <= spa_final_txg due to scrub/export race (141445-01) 6767129 ASSERT: cvd->vdev_isspare, in spa_vdev_detach() (141445-01) 6747698 checksum failures after offline -t / export / import / scrub (141445-01) 6745863 ZFS writes to disk after it has been offlined (141445-01) 6722540 50% slowdown on scrub/resilver with certain vdev configurations (141445-01) 6759999 resilver logic rewrites ditto blocks on both source and destination (141445-01) 6758107 I/O should never suspend during spa_load() (141445-01) 6776548 codereview(1) runs off the page when faced with multi-line comments (N/A) 6761406 AMD errata 91 workaround doesn't work on 64-bit systems (141445-01)
8242:e46e4b2f0a03 6770866 GRUB/ZFS should require physical path or devid, but not both (141445-01)
8269:03a7e9050cfd 6674216 "zfs share" doesn't work, but "zfs set sharenfs=on" does (141445-01) 6621164 $SRC/cmd/zfs/zfs_main.c seems to have a syntax error in the translation note (141445-01) 6635482 i18n problems in libzfs_dataset.c and zfs_main.c (141445-01) 6595194 "zfs get" VALUE column is as wide as NAME (141445-01) 6722991 vdev_disk.c: error checking for ddi_pathname_to_dev_t() must test for NODEV (141445-01) 6396518 ASSERT strings shouldn't be pre-processed (141445-01)
8274:846b39508aff 6713916 scrub/resilver needlessly decompress data (141445-01)
8343:655db2375fed 6739553 libzfs_status msgid table is out of sync (141445-01) 6784104 libzfs unfairly rejects numerical values greater than 2^63 (141445-01) 6784108 zfs_realloc() should not free original memory on failure (141445-01)
8525:e0e0e525d0f8 6788830 set large value to reservation cause core dump (141445-01) 6791064 want sysevents for ZFS scrub (141445-01) 6791066 need to be able to set cachefile on faulted pools (141445-01) 6791071 zpool_do_import() should not enable datasets on faulted pools (141445-01) 6792134 getting multiple properties on a faulted pool leads to confusion (141445-01)
8547:bcc7b46e5ff7 6792884 Vista clients cannot access .zfs (141445-01)
8632:36ef517870a3 6798384 It can take a village to raise a zio (141445-01)
8636:7e4ce9158df3 6551866 deadlock between zfs_write(), zfs_freesp(), and zfs_putapage() (141909-01) 6504953 zfs_getpage() misunderstands VOP_GETPAGE() interface (141909-01) 6702206 ZFS read/writer lock contention throttles sendfile() benchmark (141445-01) 6780491 Zone on a ZFS filesystem has poor fork/exec performance (141445-01) 6747596 assertion failed: DVA_EQUAL(BP_IDENTITY(&zio->io_bp_orig), BP_IDENTITY(zio->io_bp))); (141445-01)
8692:692d4668b40d 6801507 ZFS read aggregation should not mind the gap (141445-01)
8697:e62d2612c14d 6633095 creating a filesystem with many properties set is slow (141445-01)
8768:dfecfdbb27ed 6775697 oracle crashes when overwriting after hitting quota on zfs (141909-01)
8811:f8deccf701cf 6790687 libzfs mnttab caching ignores external changes (141445-01) 6791101 memory leak from libzfs_mnttab_init (141445-01)
8845:91af0d9c0790 6800942 smb_session_create() incorrectly stores IP addresses (N/A) 6582163 Access Control List (ACL) for shares (141445-01) 6804954 smb_search - shortname field should be space padded following the NULL terminator (N/A) 6800184 Panic at smb_oplock_conflict+0x35() (N/A)
8876:59d2e67b4b65 6803822 Reboot after replacement of system disk in a ZFS mirror drops to grub> prompt (141445-01)
8924:5af812f84759 6789318 coredump when issue zdb -uuuu poolname/ (141445-01) 6790345 zdb -dddd -e poolname coredump (141445-01) 6797109 zdb: 'zdb -dddddd pool_name/fs_name inode' coredump if the file with inode was deleted (141445-01) 6797118 zdb: 'zdb -dddddd poolname inum' coredump if I miss the fs name (141445-01) 6803343 shareiscsi=on failed, iscsitgtd failed request to share (141445-01)
9030:243fd360d81f 6815893 hang mounting a dataset after booting into a new boot environment (141445-01)
9056:826e1858a846 6809691 'zpool create -f' no longer overwrites ufs infomation (141445-01)
9179:d8fbd96b79b3 6790064 zfs needs to determine uid and gid earlier in create process (141445-01)
9214:8d350e5d04aa 6604992 forced unmount + being in .zfs/snapshot/<snap1> = not happy (141909-01) 6810367 assertion failed: dvp->v_flag & VROOT, file: ../../common/fs/gfs.c, line: 426 (141909-01)
9229:e3f8b41e5db4 6807765 ztest_dsl_dataset_promote_busy needs to clean up after ENOSPC (141445-01)
9230:e4561e3eb1ef 6821169 offlining a device results in checksum errors (141445-01) 6821170 ZFS should not increment error stats for unavailable devices (141445-01) 6824006 need to increase issue and interrupt taskqs threads in zfs (141445-01)
9234:bffdc4fc05c4 6792139 recovering from a suspended pool needs some work (141445-01) 6794830 reboot command hangs on a failed zfs pool (141445-01)
9246:67c03c93c071 6824062 System panicked in zfs_mount due to NULL pointer dereference when running btts and svvs tests (141909-01)
9276:a8a7fc849933 6816124 System crash running zpool destroy on broken zpool (141445-03)
9355:09928982c591 6818183 zfs snapshot -r is slow due to set_snap_props() doing txg_wait_synced() for each new snapshot (141445-03)
9391:413d0661ef33 6710376 log device can show incorrect status when other parts of pool are degraded (141445-03)
9396:f41cf682d0d3 (part already merged) 6501037 want user/group quotas on ZFS (141445-03) 6827260 assertion failed in arc_read(): hdr == pbuf->b_hdr (141445-03) 6815592 panic: No such hold X on refcount Y from zfs_znode_move (141445-03) 6759986 zfs list shows temporary %clone when doing online zfs recv (141445-03)
9404:319573cd93f8 6774713 zfs ignores canmount=noauto when sharenfs property != off (141445-03)
9412:4aefd8704ce0 6717022 ZFS DMU needs zero-copy support (141445-03)
9425:e7ffacaec3a8 6799895 spa_add_spares() needs to be protected by config lock (141445-03) 6826466 want to post sysevents on hot spare activation (141445-03) 6826468 spa 'allowfaulted' needs some work (141445-03) 6826469 kernel support for storing vdev FRU information (141445-03) 6826470 skip posting checksum errors from DTL regions of leaf vdevs (141445-03) 6826471 I/O errors after device remove probe can confuse FMA (141445-03) 6826472 spares should enjoy some of the benefits of cache devices (141445-03)
9443:2a96d8478e95 6833711 gang leaders shouldn't have to be logical (141445-03)
9463:d0bd231c7518 6764124 want zdb to be able to checksum metadata blocks only (141445-03)
9465:8372081b8019 6830237 zfs panic in zfs_groupmember() (141445-03)
9466:1fdfd1fed9c4 6833162 phantom log device in zpool status (141445-03)
9469:4f68f041ddcd 6824968 add ZFS userquota support to rquotad (141445-03)
9470:6d827468d7b5 6834217 godfather I/O should reexecute (141445-03)
9480:fcff33da767f 6596237 Stop looking and start ganging (141909-02)
9493:9933d599bc93 6623978 lwb->lwb_buf != NULL, file ../../../uts/common/fs/zfs/zil.c, line 787, function zil_lwb_commit (141445-06)
9512:64cafcbcc337 6801810 Commit of aligned streaming rewrites to ZIL device causes unwanted disk reads (N/A)
9515:d3b739d9d043 6586537 async zio taskqs can block out userland commands (142901-09)
9554:787363635b6a 6836768 zfs_userspace() callback has no way to indicate failure (N/A)
9574:1eb6a6ab2c57 6838062 zfs panics when an error is encountered in space_map_load() (141909-02)
9583:b0696cd037cc 6794136 Panic BAD TRAP: type=e when importing degraded zraid pool. (141909-03)
9630:e25a03f552e0 6776104 "zfs import" deadlock between spa_unload() and spa_async_thread() (141445-06)
9653:a70048a304d1 6664765 Unable to remove files when using fat-zap and quota exceeded on ZFS filesystem (141445-06)
9688:127be1845343 6841321 zfs userspace / zfs get userused@ doesn't work on mounted snapshot (N/A) 6843069 zfs get userused@S-1-... doesn't work (N/A)
9873:8ddc892eca6e 6847229 assertion failed: refcount_count(&tx->tx_space_written) + delta <= tx->tx_space_towrite in dmu_tx.c (141445-06)
9904:d260bd3fd47c 6838344 kernel heap corruption detected on zil while stress testing (141445-06)
9951:a4895b3dd543 6844900 zfs_ioc_userspace_upgrade leaks (N/A)
10040:38b25aeeaf7a 6857012 zfs panics on zpool import (141445-06)
10000:241a51d8720c 6848242 zdb -e no longer works as expected (N/A)
10100:4a6965f6bef8 6856634 snv_117 not booting: zfs_parse_bootfs: error2 (141445-07)
10160:a45b03783d44 6861983 zfs should use new name <-> SID interfaces (N/A) 6862984 userquota commands can hang (141445-06)
10299:80845694147f 6696858 zfs receive of incremental replication stream can dereference NULL pointer and crash (N/A)
10302:a9e3d1987706 6696858 zfs receive of incremental replication stream can dereference NULL pointer and crash (fix lint) (N/A)
10575:2a8816c5173b (partial merge) 6882227 spa_async_remove() shouldn't do a full clear (142901-14)
10800:469478b180d9 6880764 fsync on zfs is broken if writes are greater than 32kb on a hard crash and no log attached (142901-09) 6793430 zdb -ivvvv assertion failure: bp->blk_cksum.zc_word[2] == dmu_objset_id(zilog->zl_os) (N/A)
10801:e0bf032e8673 (partial merge) 6822816 assertion failed: zap_remove_int(ds_next_clones_obj) returns ENOENT (142901-09)
10810:b6b161a6ae4a 6892298 buf->b_hdr->b_state != arc_anon, file: ../../common/fs/zfs/arc.c, line: 2849 (142901-09)
10890:499786962772 6807339 spurious checksum errors when replacing a vdev (142901-13)
11249:6c30f7dfc97b 6906110 bad trap panic in zil_replay_log_record (142901-13) 6906946 zfs replay isn't handling uid/gid correctly (142901-13)
11454:6e69bacc1a5a 6898245 suspended zpool should not cause rest of the zfs/zpool commands to hang (142901-10)
11546:42ea6be8961b (partial merge) 6833999 3-way deadlock in dsl_dataset_hold_ref() and dsl_sync_task_group_sync() (142901-09)
Discussed with: pjd Approved by: delphij (mentor) Obtained from: OpenSolaris (multiple Bug IDs) MFC after: 2 months
|
208684 |
31-May-2010 |
pjd |
Allow to use 'jailed' property again.
Reported by: Eugene Mitrofanov <eugene@imedia.ru> MFC after: 3 days
|
208472 |
23-May-2010 |
mm |
Fix zfs receive temporarily changing unchanged stream properties. Fix possible panic with zfs_enable_datasets.
OpenSolaris onnv revision: 8536:33bd5de3260e
Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6748561, 6757075) MFC after: 3 days
|
207670 |
05-May-2010 |
mm |
Introduce hardforce export option (-F) for "zpool export". When exporting with this flag, zpool.cache remains untouched.
OpenSolaris onnv revision: 8211:32722be6ad3b
Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID: 6775357)
|
206199 |
05-Apr-2010 |
delphij |
Refine previous partial merge of OpenSolaris onnv revision 9396:f41cf682d0d3. This fixes a regression that zfs list would crash on zfs having user properties.
PR: kern/145377 Submitted by: mm Approved by: pjd Obtained from: OpenSolaris MFC after: 10 days
|
205198 |
16-Mar-2010 |
delphij |
Merge OpenSolaris revision 8802:010b31dd4c53:
6773366 "zfs list" memory consumption can be further reduced
PR: bin/144720 Submitted by: mm Approved by: pjd Obtained from: OpenSolaris MFC after: 1 month
|
200516 |
14-Dec-2009 |
delphij |
Add an option to specify that the received ZFS should not be automatically mounted (receive -u).
Obtained from: OpenSolaris (onnv revision 8584:327a1b6dd944) Approved by: pjd
|
197867 |
08-Oct-2009 |
trasz |
Properly mark ZFS properties which are not changeable under FreeBSD.
Reviewed by: pjd
|
197859 |
08-Oct-2009 |
trasz |
'aclmode' and 'aclinherit' properties should work as advertised; don't refuse to set them.
|
196950 |
07-Sep-2009 |
pjd |
Fix detection of file system being shared. After this change commands like:
# zfs unshare -a # zfs destroy foo/bar # zfs rename foo/bar foo/baz
should properly remove exported file systems.
MFC after: 3 days
|
196305 |
17-Aug-2009 |
pjd |
Fix receive when dataset has no / in its name.
Submitted by: James R. Van Artsdalen <james-freebsd-current@jrv.org> Approved by: re (kib)
|
186515 |
27-Dec-2008 |
rwatson |
Including mount.h requires including param.h.
MFC after: 3 weeks
|
185039 |
18-Nov-2008 |
pjd |
Fix a warning on amd64 caused by using int for request argument instead of unsigned long:
WARNING pid 12888 (zfs/zpool): ioctl sign-extension ioctl ffffffffcc285aXX
Reported by: kris
|
185029 |
17-Nov-2008 |
pjd |
Update ZFS from version 6 to 13 and bring some FreeBSD-specific changes.
This bring huge amount of changes, I'll enumerate only user-visible changes:
- Delegated Administration
Allows regular users to perform ZFS operations, like file system creation, snapshot creation, etc.
- L2ARC
Level 2 cache for ZFS - allows to use additional disks for cache. Huge performance improvements mostly for random read of mostly static content.
- slog
Allow to use additional disks for ZFS Intent Log to speed up operations like fsync(2).
- vfs.zfs.super_owner
Allows regular users to perform privileged operations on files stored on ZFS file systems owned by him. Very careful with this one.
- chflags(2)
Not all the flags are supported. This still needs work.
- ZFSBoot
Support to boot off of ZFS pool. Not finished, AFAIK.
Submitted by: dfr
- Snapshot properties
- New failure modes
Before if write requested failed, system paniced. Now one can select from one of three failure modes: - panic - panic on write error - wait - wait for disk to reappear - continue - serve read requests if possible, block write requests
- Refquota, refreservation properties
Just quota and reservation properties, but don't count space consumed by children file systems, clones and snapshots.
- Sparse volumes
ZVOLs that don't reserve space in the pool.
- External attributes
Compatible with extattr(2).
- NFSv4-ACLs
Not sure about the status, might not be complete yet.
Submitted by: trasz
- Creation-time properties
- Regression tests for zpool(8) command.
Obtained from: OpenSolaris
|
168926 |
21-Apr-2007 |
pjd |
MFp4:
@118370 Correct typo.
@118371 Integrate changes from vendor.
@118491 Show backtrace on unexpected code paths.
@118494 Integrate changes from vendor.
@118504 Fix sendfile(2). I had two ways of fixing it: 1. Fixing sendfile(2) itself to use VOP_GETPAGES() instead of hacking around with vn_rdwr(UIO_NOCOPY), which was suggested by ups. 2. Modify ZFS behaviour to handle this special case.
Although 1 is more correct, I've choosen 2, because hack from 1 have a side-effect of beeing faster - it reads ahead MAXBSIZE bytes instead of reading page by page. This is not easy to implement with VOP_GETPAGES(), at least not for me in this very moment.
Reported by: Andrey V. Elsukov <bu7cher@yandex.ru>
@118525 Reorganize the code to reduce diff.
@118526 This code path is expected. It is simply when file is opened with O_FSYNC flag.
Reported by: kris Reported by: Michal Suszko <dry@dry.pl>
|
168676 |
12-Apr-2007 |
pjd |
MFp4: Synchronize with vendor (mostly 'zfs rename -r').
|
168498 |
08-Apr-2007 |
pjd |
MFp4: Synchronize with recent OpenSolaris changes.
|
168484 |
08-Apr-2007 |
pjd |
If we cannot open /dev/zfs try to load zfs.ko automatically and reopen.
|
168404 |
06-Apr-2007 |
pjd |
Please welcome ZFS - The last word in file systems.
ZFS file system was ported from OpenSolaris operating system. The code in under CDDL license.
I'd like to thank all SUN developers that created this great piece of software.
Supported by: Wheel LTD (http://www.wheel.pl/) Supported by: The FreeBSD Foundation (http://www.freebsdfoundation.org/) Supported by: Sentex (http://www.sentex.net/)
|