#
355637 |
|
12-Dec-2019 |
mav |
MFC r355182: Fix use-after-free in case of L2ARC prefetch failure.
In case L2ARC read failed, l2arc_read_done() creates _different_ ZIO to read data from the original storage device. Unfortunately pointer to the failed ZIO remains in hdr->b_l1hdr.b_acb->acb_zio_head, and if some other read try to bump the ZIO priority, it will crash.
The problem is reproducible by corrupting L2ARC content and reading some data with prefetch if l2arc_noprefetch tunable is changed to 0. With the default setting the issue is probably not reproducible now.
|
#
349216 |
|
19-Jun-2019 |
avg |
MFC r348772: Restore ARC MFU/MRU pressure
Submitted by: Slawa Olhovchenkov <slw@zxy.spb.ru> Sponsored by: Integros [integros.com]
|
#
346686 |
|
25-Apr-2019 |
mav |
MFC r345200: MFV r336930: 9284 arc_reclaim_thread has 2 jobs
`arc_reclaim_thread()` calls `arc_adjust()` after calling `arc_kmem_reap_now()`; `arc_adjust()` signals `arc_get_data_buf()` to indicate that we may no longer be `arc_is_overflowing()`.
The problem is, `arc_kmem_reap_now()` can take several seconds to complete, has no impact on `arc_is_overflowing()`, but due to how the code is structured, can impact how long the ARC will remain in the `arc_is_overflowing()` state.
The fix is to use seperate threads to:
1. keep `arc_size` under `arc_c`, by calling `arc_adjust()`, which improves `arc_is_overflowing()`
2. keep enough free memory in the system, by calling `arc_kmem_reap_now()` plus `arc_shrink()`, which improves `arc_available_memory()`.
illumos/illumos-gate@de753e34f9c399037936e8bc547d823bba9d4b0d
Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Dan McDonald <danmcd@joyent.com> Reviewed by: Tim Kordas <tim.kordas@joyent.com> Approved by: Garrett D'Amore <garrett@damore.org> Author: Brad Lewis <brad.lewis@delphix.com>
|
#
346684 |
|
25-Apr-2019 |
mav |
MFC r340311: Do not ignore arc_adjust() return value.
This covers scenario when ARC may not shrink as fast as it could: 1. arc_size < arc_c and arc_adjust() does not evict anything, returning zero to arc_reclaim_thread(); 2. arc_available_memory() reports memory pressure, which can not be satisfied by arc_kmem_reap_now(); 3. arc_shrink() reduces arc_c and calls arc_adjust(), return of which is ignored; 4. even if the last arc_adjust() could not satisfy arc_size < arc_c, arc_reclaim_thread() will still go to sleep, since the first one returned zero.
|
#
345121 |
|
14-Mar-2019 |
mav |
MFC r344866: Add respective tunables to few ZFS sysctls.
|
#
339141 |
|
03-Oct-2018 |
mav |
MFC r337213: MFV r337212: 9465 ARC check for 'anon_size > arc_c/2' can stall the system
illumos/illumos-gate@abe1fd01ce5a83718c5a840daeab4abdaec1c104
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com> Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: Prashanth Sreenivasa <pks@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Don Brady <don.brady@delphix.com>
|
#
339114 |
|
03-Oct-2018 |
mav |
MFC r337025: MFV r337022: 9403 assertion failed in arc_buf_destroy() when concurrently reading block with checksum error
This assertion (VERIFY) failure was reported when reading a block. Turns out the problem is that if we get an i/o error (ECKSUM in this case), and there are multiple concurrent ARC reads of the same block (from different clones), then the ARC will put multiple buf's on the same ANON hdr, which isn't supposed to happen, and then causes a panic when we try to arc_buf_destroy() the buf.
illumos/illumos-gate@fa98e487a9619b7902f218663be219e787a57dad
Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Approved by: Matt Ahrens <mahrens@delphix.com> Author: Matthew Ahrens <mahrens@delphix.com>
|
#
339034 |
|
01-Oct-2018 |
sef |
MFC r334844, r336180, r336458
r334844
This originated from ZFS On Linux, as https://github.com/zfsonlinux/zfs/commit/d4a72f23863382bdf6d0ae33196f5b5decbc48fd
During scans (scrubs or resilvers), it sorts the blocks in each transaction group by block offset; the result can be a significant improvement. (On my test system just now, which I put some effort to introduce fragmentation into the pool since I set it up yesterday, a scrub went from 1h2m to 33.5m with the changes.) I've seen similar rations on production systems.
r336180
Fix up some missed and mis-merges from the sequential scan code (r334844). Most of the changes involve moving some code around to reduce conflicts with future merges. One of the missing changes included a notification on scrub cancellation.
r336458
Fix a couple of typos in r334844 noticed by Richard Kojedzinszky
Approved by: mav Sponsored by: iXsystems, Inc
|
#
338456 |
|
04-Sep-2018 |
markj |
MFC r338416: Re-compute the ARC size before computing the MFU target.
|
#
338346 |
|
28-Aug-2018 |
markj |
MFC r338142: Set arc_kmem_cache_reap_retry_ms to 0 and make it configurable.
|
#
332785 |
|
19-Apr-2018 |
mav |
MFC r332523: 9433 Fix ARC hit rate
When the compressed ARC feature was added in commit d3c2ae1 the method of reference counting in the ARC was modified. As part of this accounting change the arc_buf_add_ref() function was removed entirely.
This would have be fine but the arc_buf_add_ref() function served a second undocumented purpose of updating the ARC access information when taking a hold on a dbuf. Without this logic in place a cached dbuf would not migrate its associated arc_buf_hdr_t to the MFU list. This would negatively impact the ARC hit rate, particularly on systems with a small ARC.
This change reinstates the missing call to arc_access() from dbuf_hold() by implementing a new arc_buf_access() function.
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tim Chase <tim@chase2k.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
|
#
332551 |
|
16-Apr-2018 |
mav |
MFC r331709: MFV r331708: 9321 arc_loan_compressed_buf() can increment arc_loaned_bytes by the wrong value
illumos/illumos-gate@9be12bd737714550277bd02b0c693db560976990
arc_loan_compressed_buf() increments arc_loaned_bytes by psize unconditionally In the case of zfs_compressed_arc_enabled=0, when the buf is returned via arc_return_buf(), if ARC_BUF_COMPRESSED(buf) is false, then arc_loaned_bytes is decremented by lsize, not psize.
Switch to using arc_buf_size(buf), instead of psize, which will return psize or lsize, depending on the result of ARC_BUF_COMPRESSED(buf).
Reviewed by: Matt Ahrens <matt@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Garrett D'Amore <garrett@damore.org> Author: Allan Jude <allanjude@freebsd.org>
|
#
332540 |
|
16-Apr-2018 |
mav |
MFC r331404: MFV r331400: 8484 Implement aggregate sum and use for arc counters
In pursuit of improving performance on multi-core systems, we should implements fanned out counters and use them to improve the performance of some of the arc statistics. These stats are updated extremely frequently, and can consume a significant amount of CPU time.
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: Paul Dagnelie <pcd@delphix.com>
|
#
332528 |
|
16-Apr-2018 |
mav |
MFC r329759: 9018 Replace kmem_cache_reap_now() with kmem_cache_reap_soon()
illumos/illumos-gate@36a64e62848b51ac5a9a5216e894ec723cfef14e
To prevent kmem_cache reaping from blocking other system resources, turn kmem_cache_reap_now() (which blocks) into kmem_cache_reap_soon(). Callers to kmem_cache_reap_soon() should use kmem_cache_reap_active(), which exploits #9017's new taskq_empty().
Reviewed by: Bryan Cantrill <bryan@joyent.com> Reviewed by: Dan McDonald <danmcd@joyent.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Yuri Pankov <yuripv@yuripv.net> Author: Tim Kordas <tim.kordas@joyent.com>
FreeBSD does not use taskqueue for kmem caches reaping, so this change is less dramatic then it is on Illumos, just limiting reaping to 1 time per second. It may possibly be improved later, if needed.
|
#
332525 |
|
16-Apr-2018 |
mav |
MFC r329732: MFV r329502: 7614 zfs device evacuation/removal
illumos/illumos-gate@5cabbc6b49070407fb9610cfe73d4c0e0dea3e77
https://www.illumos.org/issues/7614: This project allows top-level vdevs to be removed from the storage pool with “zpool remove”, reducing the total amount of storage in the pool. This operation copies all allocated regions of the device to be removed onto other devices, recording the mapping from old to new location. After the removal is complete, read and free operations to the removed (now “indirect”) vdev must be remapped and performed at the new location on disk. The indirect mapping table is kept in memory whenever the pool is loaded, so there is minimal performance overhead when doing operations on the indirect vdev.
The size of the in-memory mapping table will be reduced when its entries become “obsolete” because they are no longer used by any block pointers in the pool. An entry becomes obsolete when all the blocks that use it are freed. An entry can also become obsolete when all the snapshots that reference it are deleted, and the block pointers that reference it have been “remapped” in all filesystems/zvols (and clones). Whenever an indirect block is written, all the block pointers in it will be “remapped” to their new (concrete) locations if possible. This process can be accelerated by using the “zfs remap” command to proactively rewrite all indirect blocks that reference indirect (removed) vdevs.
Note that when a device is removed, we do not verify the checksum of the data that is copied. This makes the process much faster, but if it were used on redundant vdevs (i.e. mirror or raidz vdevs), it would be possible to copy the wrong data, when we have the correct data on e.g. the other side of the mirror. Therefore, mirror and raidz devices can not be removed.
Reviewed by: Alex Reece <alex@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Richard Laager <rlaager@wiktel.com> Reviewed by: Tim Chase <tim@chase2k.com> Approved by: Garrett D'Amore <garrett@damore.org> Author: Prashanth Sreenivasa <pks@delphix.com>
|
#
331399 |
|
22-Mar-2018 |
mav |
MFC r329694: MFV r324198: 8081 Compiler warnings in zdb
illumos/illumos-gate@3f7978d02b206a6ebc5652c91aa9f42da6fbe00c https://github.com/illumos/illumos-gate/commit/3f7978d02b206a6ebc5652c91aa9f42da6fbe00c
https://www.illumos.org/issues/8081 zdb(8) is full of minor problems that generate compiler warnings. On FreeBSD, which uses -WError, the only way to build it is to disable all compiler warnings. This makes it much harder to detect newly introduced bugs. We should cleanup all the warnings.
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Alan Somers <asomers@gmail.com>
|
#
331383 |
|
22-Mar-2018 |
mav |
MFC r329623: MFV r302649: 7016 arc_available_memory is not 32-bit safe
illumos/illumos-gate@0dd053d7d890618ea1fc697b07de364e69eb4190 https://github.com/illumos/illumos-gate/commit/0dd053d7d890618ea1fc697b07de364e69eb4190
https://www.illumos.org/issues/7016 upstream DLPX-39446 arc_available_memory is not 32-bit safe https://github.com/delphix/delphix-os/commit/ 6b353ea3b8a1610be22e71e657d051743c64190b related to this upstream: DLPX-38547 delphix engine hang https://github.com/delphix/delphix-os/commit/ 3183a567b3e8c62a74a65885ca60c86f3d693783 DLPX-38547 delphix engine hang (fix static global) https://github.com/delphix/delphix-os/commit/ 22ac551d8ef085ad66cc8f65e51ac372b12993b9 DLPX-38882 system hung waiting on free segment https://github.com/delphix/delphix-os/commit/ cdd6beef7548cd3b12f0fc0328eeb3af540079c2
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Gordon Ross <gordon.ross@nexenta.com> Author: Prakash Surya <prakash.surya@delphix.com>
|
#
330061 |
|
27-Feb-2018 |
avg |
MFC r328776: ZFS ARC: restore illumos uses of 'needfree' that were removed in r325851
|
#
329490 |
|
18-Feb-2018 |
mav |
MFC r328246: MFV r328245: 8856 arc_cksum_is_equal() doesn't take into account ABD-logic
illumos/illumos-gate@01a059ee0cdece49f47fd4d70086dd5bc7d0b0ff
https://www.illumos.org/issues/8856: arc_cksum_is_equal() calls zio_push_transform() that requires abd_t* (second arg), but a void* is passed.
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Gordon Ross <gwr@nexenta.com> Author: Roman Strashkin <roman.strashkin@nexenta.com>
|
#
327491 |
|
02-Jan-2018 |
markj |
MFC r326919: Unregister the ARC lowmem event handler earlier in arc_fini().
|
#
326619 |
|
06-Dec-2017 |
bapt |
MFC r325851:
remove the poor emulation of the IllumOS needfree global variable to prevent the ARC reclaim thread running longer than needed.
Update the arc::needfree dtrace probe triggered in arc_lowmem() to also report the value we may want to free.
Submitted by: Nikita Kozlov <nikita.kozlov at blade-group.com> Reviewed by: avg Approved by: avg Sponsored by: blade Differential Revision: https://reviews.freebsd.org/D12163
|
#
323754 |
|
19-Sep-2017 |
avg |
MFC r322222: MFV r322221: 7910 l2arc_write_buffers() may write beyond target_sz
FreeBD note: the essence of this change was committed to FreeBSD in r314274. This commit catches up with differences between what was committed to FreeBSD and what was committed to OpenZFS, mainly more logical variable names.
illumos/illumos-gate@16a7e5ac116c85d965007a5f201104b564e82210 https://github.com/illumos/illumos-gate/commit/16a7e5ac116c85d965007a5f201104b564e82210
https://www.illumos.org/issues/7910 It seems that the change in issue #6950 resurrected the problem that was earlier fixed by the change in issue #5219. Please also see the following FreeBSD bug report: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216178
Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Andriy Gapon <avg@FreeBSD.org>
|
#
323752 |
|
19-Sep-2017 |
avg |
MFC r322239: MFV r322238: 7915 checks in l2arc_evict could use some cleaning up
illumos/illumos-gate@267ae6c3a88d2fc39276af66caafa978b0935b82 https://github.com/illumos/illumos-gate/commit/267ae6c3a88d2fc39276af66caafa978b0935b82
https://www.illumos.org/issues/7915 l2arc_evict() is strictly serialized with respect to l2arc_write_buffers() and l2arc_write_done(). Normally, l2arc_evict() and l2arc_write_buffers() are called from the same thread, so they can not be concurrent. Also, l2arc_write_buffers() uses zio_wait() on the parent zio of all cache zio-s. That ensures that l2arc_write_done() is completed before l2arc_write_buffers() returns. Finally, if a cache device is removed, then l2arc_evict() is called under SCL_ALL in the exclusive mode. That ensures that it can not be concurrent with the normal L2ARC accesses to the device (including writing and evicting buffers). Given the above, some checks and actions in l2arc_evict() do not make sense. For instance, it must never encounter the write head header let alone remove it from the buffer list.
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Approved by: Matthew Ahrens <mahrens@delphix.com> Author: Andriy Gapon <avg@FreeBSD.org>
|
#
323667 |
|
17-Sep-2017 |
bapt |
MFC r323051:
Add sysctls for arc shrinking and growing values
The default value for arc_no_grow_shift may not be optimal when using several GiB ARC. Expose it via sysctl allows users to tune it easily.
Also expose arc_grow_retry via sysctl for the same reason. The default value of 60s might, in case of intensive load, be too long.
Submitted by: Nikita Kozlov <nikita.kozlov@blade-group.com> Reviewed by: mav, manu, bapt Sponsored by: blade Differential Revision: https://reviews.freebsd.org/D12144
|
#
321613 |
|
27-Jul-2017 |
mav |
MFC r320239: MFV r319950: 5220 L2ARC does not support devices that do not provide 512B access
FreeBSD note: the actual change has been in FreeBSD since r297848. This commit accounts for integration of that change with subsequent changes, especially r320156 (MFV of r318946) and r314274.
illumos/illumos-gate@403a8da73c64ff9dfb6230ba045c765a242213fb https://github.com/illumos/illumos-gate/commit/403a8da73c64ff9dfb6230ba045c765a242213fb
https://www.illumos.org/issues/5220 There are disk devices that have logical sector size larger than 512B, for example 4KB. That is, their physical sector size is larger than 512B and they do not provide emulation for 512B sector sizes. For such devices both a data offset and a data size must be properly aligned. L2ARC should arrange that because it uses physical I/O. zio_vdev_io_start() performs a necessary transformation if io_size is not aligned to vdev_ashift, but that is done only for logical I/O. Something similar should be done in L2ARC code. * a temporary write buffer should be allocated if the original buffer is not going to be compressed and its size is not aligned * size of a temporary compression buffer should be ashift aligned * for the reads, if a size of a target buffer is not sufficiently large and it is not aligned then a temporary read buffer should be allocated
Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: Andriy Gapon <avg@FreeBSD.org>
|
#
321610 |
|
27-Jul-2017 |
mav |
MFC r320156, r320185, r320186, r320262, r320452, r321111: MFV r318946: 8021 ARC buf data scatter-ization
illumos/illumos-gate@770499e185d15678ccb0be57ebc626ad18d93383 https://github.com/illumos/illumos-gate/commit/770499e185d15678ccb0be57ebc626ad1 8d93383
https://www.illumos.org/issues/8021 The ARC buf data project (known simply as "ABD" since its genesis in the ZoL community) changes the way the ARC allocates `b_pdata` memory from using linea r `void *` buffers to using scatter/gather lists of fixed-size 1KB chunks. This improves ZFS's performance by helping to defragment the address space occupied by the ARC, in particular for cases where compressed ARC is enabled. It could also ease future work to allocate pages directly from `segkpm` for minimal- overhead memory allocations, bypassing the `kmem` subsystem. This is essentially the same change as the one which recently landed in ZFS on Linux, although they made some platform-specific changes while adapting this work to their codebase: 1. Implemented the equivalent of the `segkpm` suggestion for future work mentioned above to bypass issues that they've had with the Linux kernel memory allocator. 2. Changed the internal representation of the ABD's scatter/gather list so it could be used to pass I/O directly into Linux block device drivers. (This feature is not available in the illumos block device interface yet.)
FreeBSD notes: - the actual (default) chunk size is 4KB (despite the text above saying 1KB) - we can try to reimplement ABDs, so that they are not permanently mapped into the KVA unless explicitly requested, especially on platforms with scarce KVA - we can try to use unmapped I/O and avoid intermediate allocation of a linear, virtual memory mapped buffer - we can try to avoid extra data copying by referring to chunks / pages in the original ABD
Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Prashanth Sreenivasa <pks@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Chris Williamson <chris.williamson@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Dan Kimmel <dan.kimmel@delphix.com>
|
#
321573 |
|
26-Jul-2017 |
mav |
MFC r319748: MFV r319738: 8155 simplify dmu_write_policy handling of pre-compressed buffers
illumos/illumos-gate@adaec86ad212d9fd756bee322934fa54d1258605 https://github.com/illumos/illumos-gate/commit/adaec86ad212d9fd756bee322934fa54d1258605
https://www.illumos.org/issues/8155 When writing pre-compressed buffers, arc_write() requires that the compression algorithm used to compress the buffer matches the compression algorithm requested by the zio_prop_t, which is set by dmu_write_policy(). This makes dmu_write_policy() and its callers a bit more complicated. We can simplify this by making arc_write() trust the caller to supply the type of pre-compressed buffer that it wants to write, and override the compression setting in the zio_prop_t.
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com>
|
#
321563 |
|
26-Jul-2017 |
mav |
MFC r318925: MFV r316929: 6914 kernel virtual memory fragmentation leads to hang
illumos/illumos-gate@af868f46a5b794687741d5424de9e3a2d684a84a https://github.com/illumos/illumos-gate/commit/af868f46a5b794687741d5424de9e3a2d684a84a
https://www.illumos.org/issues/6914
FreeBSD note: only a ZFS part of the change is merged, changes to the VM subsystem are not ported (obviously). Also, now that FreeBSD has vmem(9) we don't have to ifdef-out the code that uses it.
|
#
321562 |
|
26-Jul-2017 |
mav |
MFC r318924 (by avg): arc_init: make code closer to upstream by introducing 'allmem' variable
All the differences in calculations are kept. A comment about arc_max being 1/2 of all memory is fixed to reflect the actual code that uses 5/8 as a factor.
|
#
321553 |
|
26-Jul-2017 |
mav |
MFC r318828: MFV r316917: 7968 multi-threaded spa_sync()
illumos/illumos-gate@94c2d0eb22e9624151ee84a7edbf7178e1bf4087 https://github.com/illumos/illumos-gate/commit/94c2d0eb22e9624151ee84a7edbf7178e1bf4087
https://www.illumos.org/issues/7968 spa_sync() iterates over all the dirty dnodes and processes each of them by calling dnode_sync(). If there are many dirty dnodes (e.g. because we created or removed a lot of files), the single thread of spa_sync() calling dnode_sync() can become a bottleneck. Additionally, if many dnodes are dirtied concurrently in open context (e.g. due to concurrent file creation), the os_lock will experience lock contention via dnode_setdirty(). The solution is to track dirty dnodes on a multilist_t, and for spa_sync() to use separate threads to process each of the sublists in the multilist. On the concurrent file creation microbenchmark, the performance improvement from dnode_setdirty() is up to 7%. Additionally, the wall clock time spent in spa_sync() is reduced to 15%-40% of the single-threaded case. In terms of cost/ reward, once the other bottlenecks are addressed, fixing this bug will provide a medium-large performance gain and require a medium amount of effort to implement.
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Brad Lewis <brad.lewis@delphix.com> Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Approved by: Dan McDonald <danmcd@omniti.com> Author: Matthew Ahrens <mahrens@delphix.com>
|
#
321552 |
|
26-Jul-2017 |
mav |
MFC r318827: MFV r316916: 7970 zfs_arc_num_sublists_per_state should be common to all multilists
illumos/illumos-gate@10fbdecb05f411234920f8d3c92c148d39106d7e https://github.com/illumos/illumos-gate/commit/10fbdecb05f411234920f8d3c92c148d39106d7e
https://www.illumos.org/issues/7970 The global tunable zfs_arc_num_sublists_per_state is used by the ARC and the dbuf cache, and other users are planned. We should change this tunable to be common to all multilists.
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Brad Lewis <brad.lewis@delphix.com> Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Approved by: Dan McDonald <danmcd@omniti.com> Author: Matthew Ahrens <mahrens@delphix.com>
|
#
321535 |
|
26-Jul-2017 |
mav |
MFC r317414: MFV 316894
7252 7628 compressed zfs send / receive
illumos/illumos-gate@5602294fda888d923d57a78bafdaf48ae6223dea https://github.com/illumos/illumos-gate/commit/5602294fda888d923d57a78bafdaf48ae6223dea
https://www.illumos.org/issues/7252 This feature includes code to allow a system with compressed ARC enabled to send data in its compressed form straight out of the ARC, and receive data in its compressed form directly into the ARC.
https://www.illumos.org/issues/7628 We should have longer, more readable versions of the ZFS send / recv options.
7628 create long versions of ZFS send / receive options
Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Sebastien Roy <sebastien.roy@delphix.com> Reviewed by: David Quigley <dpquigl@davequigley.com> Reviewed by: Thomas Caputi <tcaputi@datto.com> Approved by: Dan McDonald <danmcd@omniti.com> Author: Dan Kimmel <dan.kimmel@delphix.com>
|
#
317470 |
|
26-Apr-2017 |
smh |
MFC r315449:
Reduce ARC fragmentation threshold
Sponsored by: Multiplay
|
#
315834 |
|
23-Mar-2017 |
avg |
MFC r314913: MFV r314911: 7867 ARC space accounting leak
|
#
315072 |
|
11-Mar-2017 |
avg |
MFC r314274: l2arc: fix write size calculation broken by Compressed ARC commit
|
#
314873 |
|
07-Mar-2017 |
jpaetzel |
MFC 313879
MVF: 313876
7504 kmem_reap hangs spa_sync and administrative tasks
illumos/illumos-gate@405a5a0f5c3ab36cb76559467d1a62ba648bd809 https://github.com/illumos/illumos-gate/commit/405a5a0f5c3ab36cb76559467d1a62ba648bd80
https://www.illumos.org/issues/7504
We see long spa_sync(). We are waiting to hold dp_config_rwlock for writer. Some other thread holds dp_config_rwlock for reader, then calls arc_get_data_buf(), which finds that arc_is_overflowing()==B_TRUE. So it waits (while holding dp_config_rwlock for reader) for arc_reclaim_thread to signal arc_reclaim_waiters_cv. Before signaling, arc_reclaim_thread does arc_kmem_reap_now(), which takes ~seconds.
Author: Matthew Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Approved by: Dan McDonald <danmcd@omniti.com>
|
#
314031 |
|
21-Feb-2017 |
avg |
MFC r313687: remove l2_padding_needed statistic from zfs arc
|
#
307297 |
|
14-Oct-2016 |
mav |
MFC r305561: MFV r305560: 7278 tuning zfs_arc_max does not impact arc_c_min
When changing zfs_arc_max (e.g. as zdb does), it may be set to less than the default arc_c_min. arc_c_min should decrease to not be more than arc_c_max, but it doesn't; therefore tuning of arc_c_max is ineffective.
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Author: Matthew Ahrens <mahrens@delphix.com>
openzfs/openzfs@608764beadaf4bb71c5d8fe1818e8392ac66a61b
|
#
307265 |
|
14-Oct-2016 |
mav |
MFC r305323: MFV r302991: 6950 ARC should cache compressed data
illumos/illumos-gate@dcbf3bd6a1f1360fc1afcee9e22c6dcff7844bf2 https://github.com/illumos/illumos-gate/commit/dcbf3bd6a1f1360fc1afcee9e22c6dcff7844bf2
https://www.illumos.org/issues/6950 When reading compressed data from disk, the ARC should keep the compressed block cached and only decompress it when consumers access the block. The uncompressed data should be short-lived allowing the ARC to cache a much larger amount of data. The DMU would also maintain a smaller cache of uncompressed blocks to minimize the impact of decompressing frequently accessed blocks.
Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Don Brady <don.brady@intel.com> Reviewed by: Richard Elling <Richard.Elling@RichardElling.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: George Wilson <george.wilson@delphix.com>
|
#
304138 |
|
15-Aug-2016 |
avg |
MFC r302838: 6513 partially filled holes lose birth time
|