#
e879389f |
|
12-Apr-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Fix bch2_btree_node_fill() for !path We shouldn't be doing the unlock/relock dance when we're not using a path - this fixes an assertion pop when called from btree node scan. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
8cf2036e |
|
12-Apr-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: add safety checks in bch2_btree_node_fill() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
7b4c4ccf |
|
11-Apr-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: fix race in bch2_btree_node_evict() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
79032b07 |
|
23-Mar-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Improved topology repair checks Consolidate bch2_gc_check_topology() and btree_node_interior_verify(), and replace them with an improved version, bch2_btree_node_check_topology(). This checks that children of an interior node correctly span the full range of the parent node with no overlaps. Also, ensure that topology repairs at runtime are always a fatal error; in particular, this adds a check in btree_iter_down() - if we don't find a key while walking down the btree that's indicative of a topology error and should be flagged as such, not a null ptr deref. Some checks in btree_update_interior.c remaining BUG_ONS(), because we already checked the node for topology errors when starting the update, and the assertions indicate that we _just_ corrupted the btree node - i.e. the problem can't be that existing on disk corruption, they indicate an actual algorithmic bug. In the future, we'll be annotating the fsck errors list with which recovery pass corrects them; the open coded "run explicit recovery pass or fatal error" in bch2_btree_node_check_topology() will in the future be done for every fsck_err() call. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
aa6e130e |
|
24-Mar-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Add an assertion for trying to evict btree root Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
91dcad18 |
|
22-Jan-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Pin btree cache in ram for random access in fsck Various phases of fsck involve checking references from one btree to another: this means doing a sequential scan of one btree, and then mostly random access into the second. This is particularly painful for checking extents <-> backpointers; we can prefetch btree node access on the sequential scan, but not on the random access portion, and this is particularly painful on spinning rust, where we'd like to keep the pipeline fairly full of btree node reads so that the elevator can reduce seeking. This patch implements prefetching and pinning of the portion of the btree that we'll be doing random access to. We already calculate how much of the random access btree will fit in memory so it's a fairly straightforward change. This will put more pressure on system memory usage, so we introduce a new option, fsck_memory_usage_percent, which is the percentage of total system ram that fsck is allowed to pin. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
52946d82 |
|
06-Feb-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Kill more -EIO error codes This converts -EIOs related to btree node errors to private error codes, which will help with some ongoing debugging by giving us better error messages. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
cb6fc943 |
|
01-Feb-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: kill kvpmalloc() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
5f43b013 |
|
22-Jan-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: btree node prefetching in check_topology btree_and_journal_iter is old code that we want to get rid of, but we're not ready to yet. lack of btree node prefetching is, it turns out, a real performance issue for fsck on spinning rust, so - add it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
ec4edd7b |
|
16-Jan-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Prep work for variable size btree node buffers bcachefs btree nodes are big - typically 256k - and btree roots are pinned in memory. As we're now up to 18 btrees, we now have significant memory overhead in mostly empty btree roots. And in the future we're going to start enforcing that certain btree node boundaries exist, to solve lock contention issues - analagous to XFS's AGIs. Thus, we need to start allocating smaller btree node buffers when we can. This patch changes code that refers to the filesystem constant c->opts.btree_node_size to refer to the btree node buffer size - btree_buf_bytes() - where appropriate. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
b819f308 |
|
03-Jan-2024 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: don't clear accessed bit in btree node fill Seeing strange performance issues that might be caused by memory pressure causing prefetched nodes to be evicted before they're used. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
a564c9fa |
|
02-Dec-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Include btree_trans in more tracepoints This gives us more context information - e.g. which codepath is invoking btree node reads. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
0117591e |
|
30-Nov-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Don't drop journal pins in exit path There's no need to drop journal pins in our exit paths - the code was trying to have everything cleaned up on any shutdown, but better to just tweak the assertions a bit. This fixes a bug where calling into journal reclaim in the exit path would cass a null ptr deref. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
a1d97d84 |
|
19-Oct-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Fix shrinker names Shrinkers are now exported to debugfs, so the names can't have slashes in them. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
88dfe193 |
|
19-Oct-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: bch2_btree_id_str() Since we can run with unknown btree IDs, we can't directly index btree IDs into fixed size arrays. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
ecae0bd5 |
|
02-Nov-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Many singleton patches against the MM code. The patch series which are included in this merge do the following: - Kemeng Shi has contributed some compation maintenance work in the series 'Fixes and cleanups to compaction' - Joel Fernandes has a patchset ('Optimize mremap during mutual alignment within PMD') which fixes an obscure issue with mremap()'s pagetable handling during a subsequent exec(), based upon an implementation which Linus suggested - More DAMON/DAMOS maintenance and feature work from SeongJae Park i the following patch series: mm/damon: misc fixups for documents, comments and its tracepoint mm/damon: add a tracepoint for damos apply target regions mm/damon: provide pseudo-moving sum based access rate mm/damon: implement DAMOS apply intervals mm/damon/core-test: Fix memory leaks in core-test mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval - In the series 'Do not try to access unaccepted memory' Adrian Hunter provides some fixups for the recently-added 'unaccepted memory' feature. To increase the feature's checking coverage. 'Plug a few gaps where RAM is exposed without checking if it is unaccepted memory' - In the series 'cleanups for lockless slab shrink' Qi Zheng has done some maintenance work which is preparation for the lockless slab shrinking code - Qi Zheng has redone the earlier (and reverted) attempt to make slab shrinking lockless in the series 'use refcount+RCU method to implement lockless slab shrink' - David Hildenbrand contributes some maintenance work for the rmap code in the series 'Anon rmap cleanups' - Kefeng Wang does more folio conversions and some maintenance work in the migration code. Series 'mm: migrate: more folio conversion and unification' - Matthew Wilcox has fixed an issue in the buffer_head code which was causing long stalls under some heavy memory/IO loads. Some cleanups were added on the way. Series 'Add and use bdev_getblk()' - In the series 'Use nth_page() in place of direct struct page manipulation' Zi Yan has fixed a potential issue with the direct manipulation of hugetlb page frames - In the series 'mm: hugetlb: Skip initialization of gigantic tail struct pages if freed by HVO' has improved our handling of gigantic pages in the hugetlb vmmemmep optimizaton code. This provides significant boot time improvements when significant amounts of gigantic pages are in use - Matthew Wilcox has sent the series 'Small hugetlb cleanups' - code rationalization and folio conversions in the hugetlb code - Yin Fengwei has improved mlock()'s handling of large folios in the series 'support large folio for mlock' - In the series 'Expose swapcache stat for memcg v1' Liu Shixin has added statistics for memcg v1 users which are available (and useful) under memcg v2 - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable) prctl so that userspace may direct the kernel to not automatically propagate the denial to child processes. The series is named 'MDWE without inheritance' - Kefeng Wang has provided the series 'mm: convert numa balancing functions to use a folio' which does what it says - In the series 'mm/ksm: add fork-exec support for prctl' Stefan Roesch makes is possible for a process to propagate KSM treatment across exec() - Huang Ying has enhanced memory tiering's calculation of memory distances. This is used to permit the dax/kmem driver to use 'high bandwidth memory' in addition to Optane Data Center Persistent Memory Modules (DCPMM). The series is named 'memory tiering: calculate abstract distance based on ACPI HMAT' - In the series 'Smart scanning mode for KSM' Stefan Roesch has optimized KSM by teaching it to retain and use some historical information from previous scans - Yosry Ahmed has fixed some inconsistencies in memcg statistics in the series 'mm: memcg: fix tracking of pending stats updates values' - In the series 'Implement IOCTL to get and optionally clear info about PTEs' Peter Xu has added an ioctl to /proc/<pid>/pagemap which permits us to atomically read-then-clear page softdirty state. This is mainly used by CRIU - Hugh Dickins contributed the series 'shmem,tmpfs: general maintenance', a bunch of relatively minor maintenance tweaks to this code - Matthew Wilcox has increased the use of the VMA lock over file-backed page faults in the series 'Handle more faults under the VMA lock'. Some rationalizations of the fault path became possible as a result - In the series 'mm/rmap: convert page_move_anon_rmap() to folio_move_anon_rmap()' David Hildenbrand has implemented some cleanups and folio conversions - In the series 'various improvements to the GUP interface' Lorenzo Stoakes has simplified and improved the GUP interface with an eye to providing groundwork for future improvements - Andrey Konovalov has sent along the series 'kasan: assorted fixes and improvements' which does those things - Some page allocator maintenance work from Kemeng Shi in the series 'Two minor cleanups to break_down_buddy_pages' - In thes series 'New selftest for mm' Breno Leitao has developed another MM self test which tickles a race we had between madvise() and page faults - In the series 'Add folio_end_read' Matthew Wilcox provides cleanups and an optimization to the core pagecache code - Nhat Pham has added memcg accounting for hugetlb memory in the series 'hugetlb memcg accounting' - Cleanups and rationalizations to the pagemap code from Lorenzo Stoakes, in the series 'Abstract vma_merge() and split_vma()' - Audra Mitchell has fixed issues in the procfs page_owner code's new timestamping feature which was causing some misbehaviours. In the series 'Fix page_owner's use of free timestamps' - Lorenzo Stoakes has fixed the handling of new mappings of sealed files in the series 'permit write-sealed memfd read-only shared mappings' - Mike Kravetz has optimized the hugetlb vmemmap optimization in the series 'Batch hugetlb vmemmap modification operations' - Some buffer_head folio conversions and cleanups from Matthew Wilcox in the series 'Finish the create_empty_buffers() transition' - As a page allocator performance optimization Huang Ying has added automatic tuning to the allocator's per-cpu-pages feature, in the series 'mm: PCP high auto-tuning' - Roman Gushchin has contributed the patchset 'mm: improve performance of accounted kernel memory allocations' which improves their performance by ~30% as measured by a micro-benchmark - folio conversions from Kefeng Wang in the series 'mm: convert page cpupid functions to folios' - Some kmemleak fixups in Liu Shixin's series 'Some bugfix about kmemleak' - Qi Zheng has improved our handling of memoryless nodes by keeping them off the allocation fallback list. This is done in the series 'handle memoryless nodes more appropriately' - khugepaged conversions from Vishal Moola in the series 'Some khugepaged folio conversions'" [ bcachefs conflicts with the dynamically allocated shrinkers have been resolved as per Stephen Rothwell in https://lore.kernel.org/all/20230913093553.4290421e@canb.auug.org.au/ with help from Qi Zheng. The clone3 test filtering conflict was half-arsed by yours truly ] * tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (406 commits) mm/damon/sysfs: update monitoring target regions for online input commit mm/damon/sysfs: remove requested targets when online-commit inputs selftests: add a sanity check for zswap Documentation: maple_tree: fix word spelling error mm/vmalloc: fix the unchecked dereference warning in vread_iter() zswap: export compression failure stats Documentation: ubsan: drop "the" from article title mempolicy: migration attempt to match interleave nodes mempolicy: mmap_lock is not needed while migrating folios mempolicy: alloc_pages_mpol() for NUMA policy without vma mm: add page_rmappable_folio() wrapper mempolicy: remove confusing MPOL_MF_LAZY dead code mempolicy: mpol_shared_policy_init() without pseudo-vma mempolicy trivia: use pgoff_t in shared mempolicy tree mempolicy trivia: slightly more consistent naming mempolicy trivia: delete those ancient pr_debug()s mempolicy: fix migrate_pages(2) syscall return nr_failed kernfs: drop shared NUMA mempolicy hooks hugetlbfs: drop shared NUMA mempolicy pretence mm/damon/sysfs-test: add a unit test for damon_sysfs_set_targets() ...
|
#
51c801bc |
|
19-Sep-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Minor bch2_btree_node_get() smatch fixes - it's no longer possible for trans to be NULL - also, move "wait for read to complete" to the slowpath, __bch2_btree_node_get(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
96dea3d5 |
|
12-Sep-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Fix W=12 build errors Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
6c643965 |
|
03-Aug-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: bkey_format helper improvements - add a to_text() method for bkey_format - convert bch2_bkey_format_validate() to modern error message style, where we pass a printbuf for the error string instead of returning a static string Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
067d228b |
|
07-Jul-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Enumerate recovery passes Recovery and fsck have many different passes/jobs to do, which always run in the same order - but not all of them run all the time. Some are for fsck, some for unclean shutdown, some for version upgrades. This adds some new structure: a defined list of recovery passes that we can run in a loop, as well as consolidating the log messages. The main benefit is consolidating the "should run this recovery pass" logic, as well as cleaning up the "this recovery pass has finished" state; instead of having a bunch of ad-hoc state bits in c->flags, we've now got c->curr_recovery_pass. By consolidating the "should run this recovery pass" logic, in the future on disk format upgrades will be able to say "upgrading to this version requires x passes to run", instead of forcing all of fsck to run. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
c8b4534d |
|
07-Jul-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Delete redundant log messages Now that we have distinct error codes for different memory allocation failures, the early init log messages are no longer needed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
faa6cb6c |
|
28-Jun-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Allow for unknown btree IDs We need to allow filesystems with metadata from newer versions to be mountable and usable by older versions. This patch enables us to roll out new btrees without a new major version number; we can now handle btree roots for unknown btree types. The unknown btree roots will be retained, and fsck (including backpointers) will check them, the same as other btree types. We add a dynamic array for the extra, unknown btree roots, in addition to the fixed size btree root array, and add new helpers for looking up btree roots. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
b3591acc |
|
26-Jun-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: unregister_shrinker() now safe on not-registered shrinker Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
25aa8c21 |
|
18-Jun-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: bch2_trans_unlock_noassert() This fixes a spurious assert in the btree node read path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
e1d29c5f |
|
28-May-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Ensure bch2_btree_node_get() calls relock() after unlock() Fix a bug where bch2_btree_node_get() might call bch2_trans_unlock() (in fill) without calling bch2_trans_relock(); this is a bug when it's done in the core btree code. Also, twea bch2_btree_node_mem_alloc() to drop btree locks before doing a blocking memory allocation. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
1fb4fe63 |
|
20-May-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
six locks: Kill six_lock_state union As suggested by Linus, this drops the six_lock_state union in favor of raw bitmasks. On the one hand, bitfields give more type-level structure to the code. However, a significant amount of the code was working with six_lock_state as a u64/atomic64_t, and the conversions from the bitfields to the u64 were deemed a bit too out-there. More significantly, because bitfield order is poorly defined (#ifdef __LITTLE_ENDIAN_BITFIELD can be used, but is gross), incrementing the sequence number would overflow into the rest of the bitfield if the compiler didn't put the sequence number at the high end of the word. The new code is a bit saner when we're on an architecture without real atomic64_t support - all accesses to lock->state now go through atomic64_*() operations. On architectures with real atomic64_t support, we additionally use atomic bit ops for setting/clearing individual bits. Text size: 7467 bytes -> 4649 bytes - compilers still suck at bitfields. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
0d2234a7 |
|
20-May-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
six locks: Kill six_lock_pcpu_(alloc|free) six_lock_pcpu_alloc() is an unsafe interface: it's not safe to allocate or free the percpu reader count on an existing lock that's in use, the only safe time to allocate percpu readers is when the lock is first being initialized. This patch adds a flags parameter to six_lock_init(), and instead of six_lock_pcpu_free() we now expose six_lock_exit(), which does the same thing but is less likely to be misused. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
0b438c5b |
|
21-May-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Clear btree_node_just_written() when node reused or evicted This fixes the following bug: Journal reclaim attempts to flush a node, but races with the node being evicted from the btree node cache; when we lock the node, the data buffers have already been freed. We don't evict a node that's dirty, so calling btree_node_write() is fine - it's a noop - except that the btree_node_just_written bit causes bch2_btree_post_write_cleanup() to run (resorting the node), which then causes a null ptr deref. 00078 Unable to handle kernel NULL pointer dereference at virtual address 000000000000009e 00078 Mem abort info: 00078 ESR = 0x0000000096000005 00078 EC = 0x25: DABT (current EL), IL = 32 bits 00078 SET = 0, FnV = 0 00078 EA = 0, S1PTW = 0 00078 FSC = 0x05: level 1 translation fault 00078 Data abort info: 00078 ISV = 0, ISS = 0x00000005 00078 CM = 0, WnR = 0 00078 user pgtable: 4k pages, 39-bit VAs, pgdp=000000007ed64000 00078 [000000000000009e] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000 00078 Internal error: Oops: 0000000096000005 [#1] SMP 00078 Modules linked in: 00078 CPU: 75 PID: 1170 Comm: stress-ng-utime Not tainted 6.3.0-ktest-g5ef5b466e77e #2078 00078 Hardware name: linux,dummy-virt (DT) 00078 pstate: 80001005 (Nzcv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--) 00078 pc : btree_node_sort+0xc4/0x568 00078 lr : bch2_btree_post_write_cleanup+0x6c/0x1c0 00078 sp : ffffff803e30b350 00078 x29: ffffff803e30b350 x28: 0000000000000001 x27: ffffff80076e52a8 00078 x26: 0000000000000002 x25: 0000000000000000 x24: ffffffc00912e000 00078 x23: ffffff80076e52a8 x22: 0000000000000000 x21: ffffff80076e52bc 00078 x20: ffffff80076e5200 x19: 0000000000000000 x18: 0000000000000000 00078 x17: fffffffff8000000 x16: 0000000008000000 x15: 0000000008000000 00078 x14: 0000000000000002 x13: 0000000000000000 x12: 00000000000000a0 00078 x11: ffffff803e30b400 x10: ffffff803e30b408 x9 : 0000000000000001 00078 x8 : 0000000000000000 x7 : ffffff803e480000 x6 : 00000000000000a0 00078 x5 : 0000000000000088 x4 : 0000000000000000 x3 : 0000000000000010 00078 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffff80076e52a8 00078 Call trace: 00078 btree_node_sort+0xc4/0x568 00078 bch2_btree_post_write_cleanup+0x6c/0x1c0 00078 bch2_btree_node_write+0x108/0x148 00078 __btree_node_flush+0x104/0x160 00078 bch2_btree_node_flush0+0x1c/0x30 00078 journal_flush_pins.constprop.0+0x184/0x2d0 00078 __bch2_journal_reclaim+0x4d4/0x508 00078 bch2_journal_reclaim+0x1c/0x30 00078 __bch2_journal_preres_get+0x244/0x268 00078 bch2_trans_journal_preres_get_cold+0xa4/0x180 00078 __bch2_trans_commit+0x61c/0x1bb0 00078 bch2_setattr_nonsize+0x254/0x318 00078 bch2_setattr+0x5c/0x78 00078 notify_change+0x2bc/0x408 00078 vfs_utimes+0x11c/0x218 00078 do_utimes+0x84/0x140 00078 __arm64_sys_utimensat+0x68/0xa8 00078 invoke_syscall.constprop.0+0x54/0xf0 00078 do_el0_svc+0x48/0xd8 00078 el0_svc+0x14/0x48 00078 el0t_64_sync_handler+0xb0/0xb8 00078 el0t_64_sync+0x14c/0x150 00078 Code: 8b050265 910020c6 8b060266 910060ac (79402cad) 00078 ---[ end trace 0000000000000000 ]--- Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
65d48e35 |
|
14-Mar-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Private error codes: ENOMEM This adds private error codes for most (but not all) of our ENOMEM uses, which makes it easier to track down assorted allocation failures. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
a345b0f3 |
|
06-Mar-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: bch2_btree_node_to_text() const correctness This is for the Rust interface - Rust cares more about const than C does. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
3329cf1b |
|
02-Mar-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Centralize btree node lock initialization This fixes some confusion in the lockdep code due to initializing btree node/key cache locks with the same lockdep key, but different names. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
1306f87d |
|
02-Mar-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Plumb btree_trans through btree cache code Soon, __bch2_btree_node_write() is going to require a btree_trans: zoned device support is going to require a new allocation for every btree node write. This is a bit of prep work. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
94c69faf |
|
04-Feb-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Use six_lock_ip() This uses the new _ip() interface to six locks and hooks it up to btree_path->ip_allocated, when available. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
87ced107 |
|
13-Dec-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Convert EAGAIN errors to private error codes More error code cleanup, for better error messages and debugability. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
e88a75eb |
|
24-Nov-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: New bpos_cmp(), bkey_cmp() replacements This patch introduces - bpos_eq() - bpos_lt() - bpos_le() - bpos_gt() - bpos_ge() and equivalent replacements for bkey_cmp(). Looking at the generated assembly these could probably be improved further, but we already see a significant code size improvement with this patch. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
447e9227 |
|
25-Nov-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Don't set accessed bit on btree node fill Btree nodes shouldn't have their accessed bit set when entering the btree cache by being read in from disk - this fixes linear scans thrashing the cache. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
001783e2 |
|
22-Nov-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Split out __bch2_btree_node_get() Standard splitting out of the slow path from the fast path of a function. We may follow this up in another patch with inlining the fast path into btree_iter.c. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
42af0ad5 |
|
17-Nov-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Fix a race with b->write_type b->write_type needs to be set atomically with setting the btree_node_need_write flag, so move it into b->flags. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
a1019576 |
|
22-Oct-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: More style fixes Fixes for various checkpatch errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
46fee692 |
|
28-Oct-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Improved btree write statistics This replaces sysfs btree_avg_write_size with btree_write_stats, which now breaks out statistics by the source of the btree write. Btree writes that are too small are a source of inefficiency, and excessive btree resort overhead - this will let us see what's causing them. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
3e3e02e6 |
|
19-Oct-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Assorted checkpatch fixes checkpatch.pl gives lots of warnings that we don't want - suggested ignore list: ASSIGN_IN_IF UNSPECIFIED_INT - bcachefs coding style prefers single token type names NEW_TYPEDEFS - typedefs are occasionally good FUNCTION_ARGUMENTS - we prefer to look at functions in .c files (hopefully with docbook documentation), not .h file prototypes MULTISTATEMENT_MACRO_USE_DO_WHILE - we have _many_ x-macros and other macros where we can't do this Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
b5ac23c4 |
|
05-Oct-2022 |
Daniel Hill <daniel@gluo.nz> |
bcachefs: improve behaviour of btree_cache_scan() Appending new nodes to the end of the list means we're more likely to evict old entries when btree_cache_scan() is started. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
c36ff038 |
|
25-Sep-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: bch2_btree_cache_scan() improvement We're still seeing OOM issues caused by the btree node cache shrinker not sufficiently freeing memory: thus, this patch changes the shrinker to not exit if __GFP_FS was not supplied. Instead, tweak btree node memory allocation so that we never invoke memory reclaim while holding the btree node cache lock. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
0d7009d7 |
|
22-Aug-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Delete old deadlock avoidance code This deletes our old lock ordering based deadlock avoidance code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
ca7d8fca |
|
21-Aug-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: New locking functions In the future, with the new deadlock cycle detector, we won't be using bare six_lock_* anymore: lock wait entries will all be embedded in btree_trans, and we will need a btree_trans context whenever locking a btree node. This patch plumbs a btree_trans to the few places that need it, and adds two new locking functions - btree_node_lock_nopath, which may fail returning a transaction restart, and - btree_node_lock_nopath_nofail, to be used in places where we know we cannot deadlock (i.e. because we're holding no other locks). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
674cfc26 |
|
26-Aug-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Add persistent counters for all tracepoints Also, do some reorganizing/renaming, convert atomic counters in bch_fs to persistent counters, and add a few missing counters. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
14599cce |
|
22-Aug-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Switch btree locking code to struct btree_bkey_cached_common This is just some type safety cleanup. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
9f96568c |
|
09-Aug-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Tracepoint improvements Our types are exported to the tracepoint code, so it's not necessary to break things out individually when passing them to tracepoints - we can also call other functions from TP_fast_assign(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
549d173c |
|
17-Jul-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: EINTR -> BCH_ERR_transaction_restart Now that we have error codes, with subtypes, we can switch to our own error code for transaction restarts - and even better, a distinct error code for each transaction restart reason: clearer code and better debugging. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
8bfe14e8 |
|
14-Jul-2022 |
Daniel Hill <daniel@gluo.nz> |
bcachefs: lock time stats prep work. We need the caller name and a place to store our results, btree_trans provides this. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
401ec4db |
|
03-Feb-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Printbuf rework This converts bcachefs to the modern printbuf interface/implementation, synced with the version to be submitted upstream. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
1d8a2689 |
|
07-Apr-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve btree_bad_header() In the future printbufs will be mempool-ified, so we shouldn't be using more than one at a time if we don't have to. This also fixes an extra trailing newline. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
7c7e071d |
|
03-Apr-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Don't normalize to pages in btree cache shrinker This behavior dates from the early, early days of bcache, and upon further delving appears to not make any sense. The shrinker only works in terms of 'objects' of unknown size; normalizing to pages only had the effect of changing the batch size, which we could do directly - if we wanted; we probably don't. Normalizing to pages meant our batch size was very small, which seems to have been keeping us from doing as much shrinking as we should be under heavy memory pressure; this patch appears to alleviate some OOMs we've been seeing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
30985537 |
|
04-Mar-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix usage of six lock's percpu mode Six locks have a percpu mode, which we use for interior btree nodes, as well as btree key cache keys for the subvolumes btree. We've been switching locks back and forth between percpu and non percpu mode as needed, but it turns out this is racy - when we're reusing an existing node, other threads could be attempting to lock it while we're switching it between modes. This patch fixes this by never switching 'struct btree' between the two modes, and instead segragating them between two different freed lists. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
5b3f7805 |
|
04-Mar-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Refactor bch2_btree_node_mem_alloc() This is prep work for the next patch, which is going to fix our usage of the percpu mode of six locks by never switching struct btree between the two modes - which means we need separate freed lists. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
05a49d22 |
|
03-Mar-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Make bch2_btree_cache_scan() try harder Previously, when bch2_btree_cache_scan() attempted to reclaim a node but failed (because trylock failed, because it was dirty, etc.), it would count that against the number of nodes it was scanning and attempting to free. This patch changes that behaviour, so that now we only count nodes that we then don't free if they have the accessed bit (which we also clear). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
bf3efff5 |
|
27-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix race leading to btree node write getting stuck Checking btree_node_may_write() isn't atomic with the other btree flags, dirty and need_write in particular. There was a rare race where we'd unblock a node from writing while __btree_node_flush() was setting need_write, and no thread would notice that the node was now both able to write and needed to be written. Fix this by adding btree node flags for will_make_reachable and write_blocked that can be checked in the cmpxchg loop in __bch2_btree_node_write. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
82732ef5 |
|
26-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve btree_node_write_if_need() btree_node_write_if_need() kicks off a btree node write only if need_write is set; this makes the locking easier to reason about by moving the check into the cmpxchg loop in __bch2_btree_node_write(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
de517c95 |
|
26-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Use x-macros for btree node flags This is for adding an array of strings for btree node flag names. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
55334d78 |
|
26-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Kill BCH_FS_HOLD_BTREE_WRITES This was just dead code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
fa8e94fa |
|
25-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Heap allocate printbufs This patch changes printbufs dynamically allocate and reallocate a buffer as needed. Stack usage has become a bit of a problem, and a major cause of that has been static size string buffers on the stack. The most involved part of this refactoring is that printbufs must now be exited with printbuf_exit(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
8f9ad91a |
|
17-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix failure to allocate btree node in cache The error code when we fail to allocate a node in the btree node cache doesn't make it to bch2_btree_path_traverse_all(). Instead, we need to stash a flag in btree_trans so we know we have to take the cannibalize lock. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
bc82d08b |
|
08-Jan-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Tracepoint improvements This improves the transaction restart tracepoints - adding distinct tracepoints for all the locations and reasons a transaction might have been restarted, and ensures that there's a tracepoint for every transaction restart. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
669f87a5 |
|
03-Jan-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Switch to __func__for recording where btree_trans was initialized Symbol decoding, via %ps, isn't supported in userspace - this will also be faster when we're using trans->fn in the fast path, as with the new BCH_JSET_ENTRY_log journal messages. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
c7ce813f |
|
27-Dec-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add a tracepoint for the btree cache shrinker This is to help with diagnosing why the btree node can doesn't seem to be shrinking - we've had issues in the past with granularity/batch size, since btree nodes are so big. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
1aeed454 |
|
19-Dec-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Optimize memory accesses in bch2_btree_node_get() This puts a load behind some branches before where it's used, so that it can execute in parallel with other loads. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
54b2db3d |
|
11-Nov-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix infinite loop in bch2_btree_cache_scan() When attempting to free btree nodes, we might not be able to free all the nodes that were requested. But the code was looping until it had freed _all_ the nodes requested, when it should have only been attempting to free nr nodes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
aae4eea6 |
|
13-Sep-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve btree_node_mem_ptr optimization This patch checks b->hash_val before attempting to lock the node in the btree, which makes it more equivalent to the "lookup in hash table" path - and potentially avoids an unnecessary transaction restart if btree_node_mem_ptr(k) no longer points to the node we want. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
67e0dd8f |
|
30-Aug-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: btree_path This splits btree_iter into two components: btree_iter is now the externally visible componont, and it points to a btree_path which is now reference counted. This means we no longer have to clone iterators up front if they might be mutated - btree_path can be shared by multiple iterators, and cloned if an iterator would mutate a shared btree_path. This will help us use iterators more efficiently, as well as slimming down the main long lived state in btree_trans, and significantly cleans up the logic for iterator lifetimes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
cab8e233 |
|
31-Aug-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add an assertion for removing btree nodes from cache Chasing a bug that has something to do with the btree node cache. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
78cf784e |
|
30-Aug-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Further reduce iter->trans usage This is prep work for splitting btree_path out from btree_iter - btree_path will not have a pointer to btree_trans. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
e5af273f |
|
25-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: trans->restarted Start tracking when btree transactions have been restarted - and assert that we're always calling bch2_trans_begin() immediately after transaction restart. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
8b3e9bd6 |
|
24-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Always check for transaction restarts On transaction restart iterators won't be locked anymore - make sure we're always checking for errors. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
a32b9573 |
|
26-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add an option for btree node mem ptr optimization bch2_btree_node_ptr_v2 has a field for stashing a pointer to the in memory btree node; this is safe because we clear this field when reading in nodes from disk and we never free in memory btree nodes - but, we have bug reports that indicate something might be faulty with this optimization, so let's add an option for it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
6e075b54 |
|
24-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: bch2_btree_iter_relock_intent() This adds a new helper for btree_cache.c that does what we want where the iterator is still being traverse - and also eliminates some unnecessary transaction restarts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
f8f86c6a |
|
15-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve btree_bad_header() error message We should always print out the full btree node ptr. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
5aab6635 |
|
14-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Tighten up btree_iter locking assertions We weren't correctly verifying that we had interior node intent locks - this patch also fixes bugs uncovered by the new assertions. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
0a700890 |
|
11-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Kick off btree node writes from write completions This is a performance improvement by removing the need to wait for the in flight btree write to complete before kicking one off, which is going to be needed to avoid a performance regression with the upcoming patch to update btree ptrs after every btree write. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
19d54324 |
|
10-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Really don't hold btree locks while btree IOs are in flight This is something we've attempted to stick to for quite some time, as it helps guarantee filesystem latency - but there's a few remaining paths that this patch fixes. This is also necessary for an upcoming patch to update btree pointers after every btree write - since the btree write completion path will now be doing btree operations. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
c205321b |
|
08-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Drop all btree locks when submitting btree node reads As a rule we don't want to be holding btree locks while submitting IO - this will improve overall filesystem latency. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
531a0095 |
|
04-Jun-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve btree iterator tracepoints This patch adds some new tracepoints to the btree iterator code, and adds new fields to the existing tracepoints - primarily for the iterator position. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> |
#
e68031fb |
|
29-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Mark newly allocated btree nodes as accessed This was a major oversight - this means under memory pressure we can end up reading in a btree node, then having it evicted before we get to use it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
ceda1b9a |
|
25-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Evict btree nodes we're deleting There was a bug that led to duplicate btree node pointers being inserted at the wrong level. The new topology repair code can fix that, except that the btree cache code gets confused when we read in a btree node from the pointer that was at the wrong level. This patch evicts nodes that we're deleting to, which nicely solves the problem. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
65c0601a |
|
23-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Use mmap() instead of vmalloc_exec() in userspace Calling mmap() directly is much better than malloc() then mprotect(), we end up with much less address space fragmentation. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
537c32f5 |
|
23-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Don't BUG_ON() btree topology error This replaces an assertion in the btree merge path with a bch2_inconsistent_error() - fsck will fix it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
6adaac0b |
|
20-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Update bch2_btree_verify() bch2_btree_verify() verifies that the btree node on disk matches what we have in memory. This patch changes it to verify every replica, and also fixes it for interior btree nodes - there's a mem_ptr field which is used as a scratch space and needs to be zeroed out for comparing with what's on disk. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
2177147b |
|
06-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve bset compaction The previous patch that fixed btree nodes being written too aggressively now meant that we weren't sorting btree node bsets optimally - this patch fixes that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
54ca47e1 |
|
28-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Kill bch2_btree_node_get_sibling() This patch reworks the btree node merge path to use a second btree iterator to get the sibling node - which means bch2_btree_iter_get_sibling() can be deleted. Also, it uses bch2_btree_iter_traverse_all() if necessary - which means it should be more reliable. We don't currently even try to make it work when trans->nounlock is set - after a BTREE_INSERT_NOUNLOCK transaction commit, hopefully this will be a worthwhile tradeoff. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
e751c01a |
|
24-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Start using bpos.snapshot field This patch starts treating the bpos.snapshot field like part of the key in the btree code: * bpos_successor() and bpos_predecessor() now include the snapshot field * Keys in btrees that will be using snapshots (extents, inodes, dirents and xattrs) now always have their snapshot field set to U32_MAX The btree iterator code gets a new flag, BTREE_ITER_ALL_SNAPSHOTS, that determines whether we're iterating over keys in all snapshots or not - internally, this controlls whether bkey_(successor|predecessor) increment/decrement the snapshot field, or only the higher bits of the key. We add a new member to struct btree_iter, iter->snapshot: when BTREE_ITER_ALL_SNAPSHOTS is not set, iter->pos.snapshot should always equal iter->snapshot, which will be 0 for btrees that don't use snapshots, and alsways U32_MAX for btrees that will use snapshots (until we enable snapshot creation). This patch also introduces a new metadata version number, and compat code for reading from/writing to older versions - this isn't a forced upgrade (yet). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
4cf91b02 |
|
04-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Split out bpos_cmp() and bkey_cmp() With snapshots, we're going to need to differentiate between comparisons that should and shouldn't include the snapshot field. bpos_cmp is now the comparison function that does include the snapshot field, used by core btree code. Upper level filesystem code generally does _not_ want to compare against the snapshot field - that code wants keys to compare as equal even when one of them is in an ancestor snapshot. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
a9d79c6e |
|
23-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Use pcpu mode of six locks for interior nodes Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
f020bfcd |
|
04-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Use bch2_bpos_to_text() more consistently Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
41f8b09e |
|
20-Feb-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Rename BTREE_ID enums for consistency with other enums Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
c043a330 |
|
27-Dec-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix bch2_btree_cache_scan() It was counting nodes on the freed list that it skips - because we want to leave a few so that btree splits don't touch the allocator - as nodes that it touched, meaning that if it was called with <= 3 nodes to reclaim, and those nodes were on the freed list, it would never do any work. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
18a7b972 |
|
23-Feb-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix for bch2_btree_node_get_noiter() returning -ENOMEM bch2_btree_node_get_noiter() isn't used from the btree iterator code, which retries with the btree node cache cannibalize lock held on -ENOMEM, so we should do it ourself if necessary. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
a0b73c1c |
|
26-Jan-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add (partial) support for fixing btree topology When we walk the btrees during recovery, part of that is checking that btree topology is correct: for every interior btree node, its child nodes should exactly span the range the parent node covers. Previously, we had checks for this, but not repair code. Now that we have the ability to do btree updates during initial GC, this patch adds that repair code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
edfbba58 |
|
11-Jan-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add btree node prefetching to bch2_btree_and_journal_walk() bch2_btree_and_journal_walk() walks the btree overlaying keys from the journal; it was introduced so that we could read in the alloc btree prior to journal replay being done, when journalling of updates to interior btree nodes was introduced. But it didn't have btree node prefetching, which introduced a severe regression with mount times, particularly on spinning rust. This patch implements btree node prefetching for the btree + journal walk, hopefully fixing that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
b929bbef |
|
11-Jan-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add cannibalize lock to btree_cache_to_text() More debugging info is always a good thing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
07a1006a |
|
17-Dec-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Reduce/kill BKEY_PADDED use With various newer key types - stripe keys, inline data extents - the old approach of calculating the maximum size of the value is becoming more and more error prone. Better to switch to bkey_on_stack, which can dynamically allocate if necessary to handle any size bkey. In particular we also want to get rid of BKEY_EXTENT_VAL_U64s_MAX. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
d8ebed7d |
|
19-Nov-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add btree cache stats to sysfs Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
d8b46004 |
|
15-Nov-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Check for errors from register_shrinker() Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
e648448c |
|
11-Nov-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix missing memalloc_nofs_restore() Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
6a747c46 |
|
09-Nov-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add accounting for dirty btree nodes/keys This lets us improve journal reclaim, so that it now tries to make sure no more than 3/4s of the btree node cache and btree key cache are dirty - ensuring the shrinkers can free memory. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
29364f34 |
|
02-Nov-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Drop sysfs interface to debug parameters It's not used much anymore, the module paramter interface is better. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
a301dc38 |
|
28-Oct-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve tracing for transaction restarts We have a bug where we can get stuck with a process spinning in transaction restarts - need more information. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
645d72aa |
|
26-Oct-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix btree updates when mixing cached and non cached iterators There was a bug where bch2_trans_update() would incorrectly delete a pending update where the new update did not actually overwrite the existing update, because we were incorrectly using BTREE_ITER_TYPE when sorting pending btree updates. This affects the pending patch to use cached iterators for inode updates. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
97c0e195 |
|
15-Oct-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix another lockdep splat vfree() can allocate memory, so we need to call memalloc_nofs_save(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
5d0b7f90 |
|
11-Oct-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix a lockdep splat We can't allocate memory with GFP_FS while holding the btree cache lock, and vfree() can allocate memory. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
4580baec |
|
25-Jul-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Remove some uses of PAGE_SIZE in the btree code For portability to userspace, we should try to avoid working in kernel pages. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
760992aa |
|
25-Jul-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Ensure we wake up threads locking node when reusing it Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
47a5649a |
|
15-Jun-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix incorrect gfp check Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
bd2bb273 |
|
12-Jun-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Don't deadlock when btree node reuse changes lock ordering Btree node lock ordering is based on the logical key. However, 'struct btree' may be reused for a different btree node under memory pressure. This patch uses the new six lock callback to check if a btree node is no longer the node we wanted to lock before blocking. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
e38821f3 |
|
09-Jun-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Don't allocate memory under the btree cache lock The btree cache lock is needed for reclaiming from the btree node cache, and memory allocation can potentially spin and sleep (for 100 ms at a time), so.. don't do that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
8c9eef95 |
|
05-Jun-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Check gfp_flags correctly in bch2_btree_cache_scan() bch2_btree_node_mem_alloc() uses memalloc_nofs_save()/GFP_NOFS, but GFP_NOFS does include __GFP_IO - oops. We used to use GFP_NOIO, but as we're a filesystem now GFP_NOFS makes more sense now and is looser. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
f96c0df4 |
|
02-Jun-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix a deadlock in bch2_btree_node_get_sibling() There was a bad interaction with bch2_btree_iter_set_pos_same_leaf(), which can leave a btree node locked that is just outside iter->pos, breaking the lock ordering checks in __bch2_btree_node_lock(). Ideally we should get rid of this corner case, but for now fix it locally with verbose comments. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
692c3f06 |
|
27-May-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: fix memalloc_nofs_restore() usage Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
39fb2983 |
|
07-Jan-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Kill bkey_type_successor Previously, BTREE_ID_INODES was special - inodes were indexed by the inode field, which meant the offset field of struct bpos wasn't used, which led to special cases in e.g. the btree iterator code. Now, inodes in the inodes btree are indexed by the offset field. Also: prevously min_key was special for extents btrees, min_key for extents would equal max_key for the previous node. Now, min_key = bkey_successor() of the previous node, same as non extent btrees. This means we can completely get rid of btree_type_sucessor/predecessor. Also make some improvements to the metadata IO validate/compat code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
e62d65f2 |
|
15-Mar-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: trans_commit() path can now insert to interior nodes This will be needed for the upcoming patches to journal updates to interior btree nodes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
7f81d4cf |
|
26-Feb-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: fix setting btree_node_accessed() Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
72141e1f |
|
24-Feb-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Use btree_ptr_v2.mem_ptr to avoid hash table lookup Nice performance optimization Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
237e8048 |
|
18-Feb-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: introduce b->hash_val This is partly prep work for introducing bch_btree_ptr_v2, but it'll also be a bit of a performance boost by moving the full key out of the hot part of struct btree. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
c9bebae6 |
|
29-Nov-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Whiteout changes More prep work for snapshots: extents will soon be using KEY_TYPE_deleted for whiteouts, with 0 size. But we wen't be able to keep these whiteouts with the rest of the extents in the btree node, due to sorting invariants breaking. We can deal with this by immediately moving the new whiteouts to the unwritten whiteouts area - this just means those whiteouts won't be sorted, so we need new code to sort them prior to merging them with the rest of the keys to be written. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
58404bb2 |
|
23-Oct-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fall back to slowpath on exact comparison This is basically equivalent to the original strategy of falling back to checking against the original key when the original key and previous key didn't differ in the required bits - except, now we only fall back when the search key doesn't differ in the required bits, which ends up being a bit faster. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
1bdb67e8 |
|
06-Nov-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: kill BFLOAT_FAILED_PREV The assumption underlying BFLOAT_FAILED_PREV was wrong; the comparison we're doing in bset_search_tree() doesn't have to tell the pivot apart from the previous key, it just has to tell if search is definitely greater than or equal to the pivot. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
fb975d14 |
|
21-Sep-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Drop unnecessary rcu_read_lock() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
e0dfc08b |
|
11-Jun-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: use memalloc_nofs_save() for vmalloc allocation Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
61011ea2 |
|
14-May-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Rip out old hacky transaction restart tracing Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
20bceecb |
|
15-May-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: More work to avoid transaction restarts Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
58fbf808 |
|
15-May-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Delete duplicate code Also rename for consistency Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
60755344 |
|
10-May-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: kill BTREE_ITER_NOUNLOCK Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
b03b81df |
|
10-May-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Don't pass around may_drop_locks Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
c43a6ef9 |
|
05-Jun-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: btree_bkey_cached_common This is prep work for the btree key cache: btree iterators will point to either struct btree, or a new struct bkey_cached. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
ba5c6557 |
|
22-Apr-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add actual tracepoints for transaction restarts Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
0f238367 |
|
27-Mar-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: trans_for_each_iter() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
d0cc3def |
|
13-Jan-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: More allocator startup improvements Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
26609b61 |
|
01-Nov-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Make bkey types globally unique this lets us get rid of a lot of extra switch statements - in a lot of places we dispatch on the btree node type, and then the key type, so this is a nice cleanup across a lot of code. Also improve the on disk format versioning stuff. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
319f9ac3 |
|
08-Nov-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: revamp to_text methods Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
1c7a0adf |
|
12-Jul-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: trace transaction restarts exceptionally crappy "tracing", but it's a start at documenting the places restarts can be triggered Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
1c6fdbd8 |
|
17-Mar-2017 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Initial commit Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write filesystem with every feature you could possibly want. Website: https://bcachefs.org Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> |
#
51c801bc |
|
19-Sep-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Minor bch2_btree_node_get() smatch fixes - it's no longer possible for trans to be NULL - also, move "wait for read to complete" to the slowpath, __bch2_btree_node_get(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
96dea3d5 |
|
12-Sep-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Fix W=12 build errors Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
6c643965 |
|
03-Aug-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: bkey_format helper improvements - add a to_text() method for bkey_format - convert bch2_bkey_format_validate() to modern error message style, where we pass a printbuf for the error string instead of returning a static string Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
067d228b |
|
07-Jul-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Enumerate recovery passes Recovery and fsck have many different passes/jobs to do, which always run in the same order - but not all of them run all the time. Some are for fsck, some for unclean shutdown, some for version upgrades. This adds some new structure: a defined list of recovery passes that we can run in a loop, as well as consolidating the log messages. The main benefit is consolidating the "should run this recovery pass" logic, as well as cleaning up the "this recovery pass has finished" state; instead of having a bunch of ad-hoc state bits in c->flags, we've now got c->curr_recovery_pass. By consolidating the "should run this recovery pass" logic, in the future on disk format upgrades will be able to say "upgrading to this version requires x passes to run", instead of forcing all of fsck to run. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
c8b4534d |
|
07-Jul-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Delete redundant log messages Now that we have distinct error codes for different memory allocation failures, the early init log messages are no longer needed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
faa6cb6c |
|
28-Jun-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Allow for unknown btree IDs We need to allow filesystems with metadata from newer versions to be mountable and usable by older versions. This patch enables us to roll out new btrees without a new major version number; we can now handle btree roots for unknown btree types. The unknown btree roots will be retained, and fsck (including backpointers) will check them, the same as other btree types. We add a dynamic array for the extra, unknown btree roots, in addition to the fixed size btree root array, and add new helpers for looking up btree roots. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
b3591acc |
|
26-Jun-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: unregister_shrinker() now safe on not-registered shrinker Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
25aa8c21 |
|
18-Jun-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: bch2_trans_unlock_noassert() This fixes a spurious assert in the btree node read path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
e1d29c5f |
|
28-May-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Ensure bch2_btree_node_get() calls relock() after unlock() Fix a bug where bch2_btree_node_get() might call bch2_trans_unlock() (in fill) without calling bch2_trans_relock(); this is a bug when it's done in the core btree code. Also, twea bch2_btree_node_mem_alloc() to drop btree locks before doing a blocking memory allocation. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
1fb4fe63 |
|
20-May-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
six locks: Kill six_lock_state union As suggested by Linus, this drops the six_lock_state union in favor of raw bitmasks. On the one hand, bitfields give more type-level structure to the code. However, a significant amount of the code was working with six_lock_state as a u64/atomic64_t, and the conversions from the bitfields to the u64 were deemed a bit too out-there. More significantly, because bitfield order is poorly defined (#ifdef __LITTLE_ENDIAN_BITFIELD can be used, but is gross), incrementing the sequence number would overflow into the rest of the bitfield if the compiler didn't put the sequence number at the high end of the word. The new code is a bit saner when we're on an architecture without real atomic64_t support - all accesses to lock->state now go through atomic64_*() operations. On architectures with real atomic64_t support, we additionally use atomic bit ops for setting/clearing individual bits. Text size: 7467 bytes -> 4649 bytes - compilers still suck at bitfields. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
0d2234a7 |
|
20-May-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
six locks: Kill six_lock_pcpu_(alloc|free) six_lock_pcpu_alloc() is an unsafe interface: it's not safe to allocate or free the percpu reader count on an existing lock that's in use, the only safe time to allocate percpu readers is when the lock is first being initialized. This patch adds a flags parameter to six_lock_init(), and instead of six_lock_pcpu_free() we now expose six_lock_exit(), which does the same thing but is less likely to be misused. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
0b438c5b |
|
21-May-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Clear btree_node_just_written() when node reused or evicted This fixes the following bug: Journal reclaim attempts to flush a node, but races with the node being evicted from the btree node cache; when we lock the node, the data buffers have already been freed. We don't evict a node that's dirty, so calling btree_node_write() is fine - it's a noop - except that the btree_node_just_written bit causes bch2_btree_post_write_cleanup() to run (resorting the node), which then causes a null ptr deref. 00078 Unable to handle kernel NULL pointer dereference at virtual address 000000000000009e 00078 Mem abort info: 00078 ESR = 0x0000000096000005 00078 EC = 0x25: DABT (current EL), IL = 32 bits 00078 SET = 0, FnV = 0 00078 EA = 0, S1PTW = 0 00078 FSC = 0x05: level 1 translation fault 00078 Data abort info: 00078 ISV = 0, ISS = 0x00000005 00078 CM = 0, WnR = 0 00078 user pgtable: 4k pages, 39-bit VAs, pgdp=000000007ed64000 00078 [000000000000009e] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000 00078 Internal error: Oops: 0000000096000005 [#1] SMP 00078 Modules linked in: 00078 CPU: 75 PID: 1170 Comm: stress-ng-utime Not tainted 6.3.0-ktest-g5ef5b466e77e #2078 00078 Hardware name: linux,dummy-virt (DT) 00078 pstate: 80001005 (Nzcv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--) 00078 pc : btree_node_sort+0xc4/0x568 00078 lr : bch2_btree_post_write_cleanup+0x6c/0x1c0 00078 sp : ffffff803e30b350 00078 x29: ffffff803e30b350 x28: 0000000000000001 x27: ffffff80076e52a8 00078 x26: 0000000000000002 x25: 0000000000000000 x24: ffffffc00912e000 00078 x23: ffffff80076e52a8 x22: 0000000000000000 x21: ffffff80076e52bc 00078 x20: ffffff80076e5200 x19: 0000000000000000 x18: 0000000000000000 00078 x17: fffffffff8000000 x16: 0000000008000000 x15: 0000000008000000 00078 x14: 0000000000000002 x13: 0000000000000000 x12: 00000000000000a0 00078 x11: ffffff803e30b400 x10: ffffff803e30b408 x9 : 0000000000000001 00078 x8 : 0000000000000000 x7 : ffffff803e480000 x6 : 00000000000000a0 00078 x5 : 0000000000000088 x4 : 0000000000000000 x3 : 0000000000000010 00078 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffff80076e52a8 00078 Call trace: 00078 btree_node_sort+0xc4/0x568 00078 bch2_btree_post_write_cleanup+0x6c/0x1c0 00078 bch2_btree_node_write+0x108/0x148 00078 __btree_node_flush+0x104/0x160 00078 bch2_btree_node_flush0+0x1c/0x30 00078 journal_flush_pins.constprop.0+0x184/0x2d0 00078 __bch2_journal_reclaim+0x4d4/0x508 00078 bch2_journal_reclaim+0x1c/0x30 00078 __bch2_journal_preres_get+0x244/0x268 00078 bch2_trans_journal_preres_get_cold+0xa4/0x180 00078 __bch2_trans_commit+0x61c/0x1bb0 00078 bch2_setattr_nonsize+0x254/0x318 00078 bch2_setattr+0x5c/0x78 00078 notify_change+0x2bc/0x408 00078 vfs_utimes+0x11c/0x218 00078 do_utimes+0x84/0x140 00078 __arm64_sys_utimensat+0x68/0xa8 00078 invoke_syscall.constprop.0+0x54/0xf0 00078 do_el0_svc+0x48/0xd8 00078 el0_svc+0x14/0x48 00078 el0t_64_sync_handler+0xb0/0xb8 00078 el0t_64_sync+0x14c/0x150 00078 Code: 8b050265 910020c6 8b060266 910060ac (79402cad) 00078 ---[ end trace 0000000000000000 ]--- Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
65d48e35 |
|
14-Mar-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Private error codes: ENOMEM This adds private error codes for most (but not all) of our ENOMEM uses, which makes it easier to track down assorted allocation failures. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
a345b0f3 |
|
06-Mar-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: bch2_btree_node_to_text() const correctness This is for the Rust interface - Rust cares more about const than C does. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
3329cf1b |
|
02-Mar-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Centralize btree node lock initialization This fixes some confusion in the lockdep code due to initializing btree node/key cache locks with the same lockdep key, but different names. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
1306f87d |
|
02-Mar-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Plumb btree_trans through btree cache code Soon, __bch2_btree_node_write() is going to require a btree_trans: zoned device support is going to require a new allocation for every btree node write. This is a bit of prep work. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
94c69faf |
|
04-Feb-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Use six_lock_ip() This uses the new _ip() interface to six locks and hooks it up to btree_path->ip_allocated, when available. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
87ced107 |
|
13-Dec-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Convert EAGAIN errors to private error codes More error code cleanup, for better error messages and debugability. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
e88a75eb |
|
24-Nov-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: New bpos_cmp(), bkey_cmp() replacements This patch introduces - bpos_eq() - bpos_lt() - bpos_le() - bpos_gt() - bpos_ge() and equivalent replacements for bkey_cmp(). Looking at the generated assembly these could probably be improved further, but we already see a significant code size improvement with this patch. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
447e9227 |
|
25-Nov-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Don't set accessed bit on btree node fill Btree nodes shouldn't have their accessed bit set when entering the btree cache by being read in from disk - this fixes linear scans thrashing the cache. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
001783e2 |
|
22-Nov-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Split out __bch2_btree_node_get() Standard splitting out of the slow path from the fast path of a function. We may follow this up in another patch with inlining the fast path into btree_iter.c. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
42af0ad5 |
|
17-Nov-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Fix a race with b->write_type b->write_type needs to be set atomically with setting the btree_node_need_write flag, so move it into b->flags. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
a1019576 |
|
22-Oct-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: More style fixes Fixes for various checkpatch errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
46fee692 |
|
28-Oct-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Improved btree write statistics This replaces sysfs btree_avg_write_size with btree_write_stats, which now breaks out statistics by the source of the btree write. Btree writes that are too small are a source of inefficiency, and excessive btree resort overhead - this will let us see what's causing them. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
3e3e02e6 |
|
19-Oct-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Assorted checkpatch fixes checkpatch.pl gives lots of warnings that we don't want - suggested ignore list: ASSIGN_IN_IF UNSPECIFIED_INT - bcachefs coding style prefers single token type names NEW_TYPEDEFS - typedefs are occasionally good FUNCTION_ARGUMENTS - we prefer to look at functions in .c files (hopefully with docbook documentation), not .h file prototypes MULTISTATEMENT_MACRO_USE_DO_WHILE - we have _many_ x-macros and other macros where we can't do this Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
b5ac23c4 |
|
05-Oct-2022 |
Daniel Hill <daniel@gluo.nz> |
bcachefs: improve behaviour of btree_cache_scan() Appending new nodes to the end of the list means we're more likely to evict old entries when btree_cache_scan() is started. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
c36ff038 |
|
25-Sep-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: bch2_btree_cache_scan() improvement We're still seeing OOM issues caused by the btree node cache shrinker not sufficiently freeing memory: thus, this patch changes the shrinker to not exit if __GFP_FS was not supplied. Instead, tweak btree node memory allocation so that we never invoke memory reclaim while holding the btree node cache lock. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
0d7009d7 |
|
22-Aug-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Delete old deadlock avoidance code This deletes our old lock ordering based deadlock avoidance code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
ca7d8fca |
|
21-Aug-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: New locking functions In the future, with the new deadlock cycle detector, we won't be using bare six_lock_* anymore: lock wait entries will all be embedded in btree_trans, and we will need a btree_trans context whenever locking a btree node. This patch plumbs a btree_trans to the few places that need it, and adds two new locking functions - btree_node_lock_nopath, which may fail returning a transaction restart, and - btree_node_lock_nopath_nofail, to be used in places where we know we cannot deadlock (i.e. because we're holding no other locks). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
674cfc26 |
|
26-Aug-2022 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Add persistent counters for all tracepoints Also, do some reorganizing/renaming, convert atomic counters in bch_fs to persistent counters, and add a few missing counters. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
14599cce |
|
22-Aug-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Switch btree locking code to struct btree_bkey_cached_common This is just some type safety cleanup. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
9f96568c |
|
09-Aug-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Tracepoint improvements Our types are exported to the tracepoint code, so it's not necessary to break things out individually when passing them to tracepoints - we can also call other functions from TP_fast_assign(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
549d173c |
|
17-Jul-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: EINTR -> BCH_ERR_transaction_restart Now that we have error codes, with subtypes, we can switch to our own error code for transaction restarts - and even better, a distinct error code for each transaction restart reason: clearer code and better debugging. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
8bfe14e8 |
|
14-Jul-2022 |
Daniel Hill <daniel@gluo.nz> |
bcachefs: lock time stats prep work. We need the caller name and a place to store our results, btree_trans provides this. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
401ec4db |
|
03-Feb-2023 |
Kent Overstreet <kent.overstreet@linux.dev> |
bcachefs: Printbuf rework This converts bcachefs to the modern printbuf interface/implementation, synced with the version to be submitted upstream. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
1d8a2689 |
|
07-Apr-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve btree_bad_header() In the future printbufs will be mempool-ified, so we shouldn't be using more than one at a time if we don't have to. This also fixes an extra trailing newline. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
7c7e071d |
|
03-Apr-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Don't normalize to pages in btree cache shrinker This behavior dates from the early, early days of bcache, and upon further delving appears to not make any sense. The shrinker only works in terms of 'objects' of unknown size; normalizing to pages only had the effect of changing the batch size, which we could do directly - if we wanted; we probably don't. Normalizing to pages meant our batch size was very small, which seems to have been keeping us from doing as much shrinking as we should be under heavy memory pressure; this patch appears to alleviate some OOMs we've been seeing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
30985537 |
|
04-Mar-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix usage of six lock's percpu mode Six locks have a percpu mode, which we use for interior btree nodes, as well as btree key cache keys for the subvolumes btree. We've been switching locks back and forth between percpu and non percpu mode as needed, but it turns out this is racy - when we're reusing an existing node, other threads could be attempting to lock it while we're switching it between modes. This patch fixes this by never switching 'struct btree' between the two modes, and instead segragating them between two different freed lists. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
5b3f7805 |
|
04-Mar-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Refactor bch2_btree_node_mem_alloc() This is prep work for the next patch, which is going to fix our usage of the percpu mode of six locks by never switching struct btree between the two modes - which means we need separate freed lists. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
05a49d22 |
|
03-Mar-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Make bch2_btree_cache_scan() try harder Previously, when bch2_btree_cache_scan() attempted to reclaim a node but failed (because trylock failed, because it was dirty, etc.), it would count that against the number of nodes it was scanning and attempting to free. This patch changes that behaviour, so that now we only count nodes that we then don't free if they have the accessed bit (which we also clear). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
bf3efff5 |
|
27-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix race leading to btree node write getting stuck Checking btree_node_may_write() isn't atomic with the other btree flags, dirty and need_write in particular. There was a rare race where we'd unblock a node from writing while __btree_node_flush() was setting need_write, and no thread would notice that the node was now both able to write and needed to be written. Fix this by adding btree node flags for will_make_reachable and write_blocked that can be checked in the cmpxchg loop in __bch2_btree_node_write. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
82732ef5 |
|
26-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve btree_node_write_if_need() btree_node_write_if_need() kicks off a btree node write only if need_write is set; this makes the locking easier to reason about by moving the check into the cmpxchg loop in __bch2_btree_node_write(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
de517c95 |
|
26-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Use x-macros for btree node flags This is for adding an array of strings for btree node flag names. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
55334d78 |
|
26-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Kill BCH_FS_HOLD_BTREE_WRITES This was just dead code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
fa8e94fa |
|
25-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Heap allocate printbufs This patch changes printbufs dynamically allocate and reallocate a buffer as needed. Stack usage has become a bit of a problem, and a major cause of that has been static size string buffers on the stack. The most involved part of this refactoring is that printbufs must now be exited with printbuf_exit(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
8f9ad91a |
|
17-Feb-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix failure to allocate btree node in cache The error code when we fail to allocate a node in the btree node cache doesn't make it to bch2_btree_path_traverse_all(). Instead, we need to stash a flag in btree_trans so we know we have to take the cannibalize lock. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
bc82d08b |
|
08-Jan-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Tracepoint improvements This improves the transaction restart tracepoints - adding distinct tracepoints for all the locations and reasons a transaction might have been restarted, and ensures that there's a tracepoint for every transaction restart. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
669f87a5 |
|
03-Jan-2022 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Switch to __func__for recording where btree_trans was initialized Symbol decoding, via %ps, isn't supported in userspace - this will also be faster when we're using trans->fn in the fast path, as with the new BCH_JSET_ENTRY_log journal messages. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
c7ce813f |
|
27-Dec-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add a tracepoint for the btree cache shrinker This is to help with diagnosing why the btree node can doesn't seem to be shrinking - we've had issues in the past with granularity/batch size, since btree nodes are so big. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
1aeed454 |
|
19-Dec-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Optimize memory accesses in bch2_btree_node_get() This puts a load behind some branches before where it's used, so that it can execute in parallel with other loads. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
54b2db3d |
|
11-Nov-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix infinite loop in bch2_btree_cache_scan() When attempting to free btree nodes, we might not be able to free all the nodes that were requested. But the code was looping until it had freed _all_ the nodes requested, when it should have only been attempting to free nr nodes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
aae4eea6 |
|
13-Sep-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve btree_node_mem_ptr optimization This patch checks b->hash_val before attempting to lock the node in the btree, which makes it more equivalent to the "lookup in hash table" path - and potentially avoids an unnecessary transaction restart if btree_node_mem_ptr(k) no longer points to the node we want. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
67e0dd8f |
|
30-Aug-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: btree_path This splits btree_iter into two components: btree_iter is now the externally visible componont, and it points to a btree_path which is now reference counted. This means we no longer have to clone iterators up front if they might be mutated - btree_path can be shared by multiple iterators, and cloned if an iterator would mutate a shared btree_path. This will help us use iterators more efficiently, as well as slimming down the main long lived state in btree_trans, and significantly cleans up the logic for iterator lifetimes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
cab8e233 |
|
31-Aug-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add an assertion for removing btree nodes from cache Chasing a bug that has something to do with the btree node cache. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
78cf784e |
|
30-Aug-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Further reduce iter->trans usage This is prep work for splitting btree_path out from btree_iter - btree_path will not have a pointer to btree_trans. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
e5af273f |
|
25-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: trans->restarted Start tracking when btree transactions have been restarted - and assert that we're always calling bch2_trans_begin() immediately after transaction restart. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
8b3e9bd6 |
|
24-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Always check for transaction restarts On transaction restart iterators won't be locked anymore - make sure we're always checking for errors. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
a32b9573 |
|
26-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add an option for btree node mem ptr optimization bch2_btree_node_ptr_v2 has a field for stashing a pointer to the in memory btree node; this is safe because we clear this field when reading in nodes from disk and we never free in memory btree nodes - but, we have bug reports that indicate something might be faulty with this optimization, so let's add an option for it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
6e075b54 |
|
24-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: bch2_btree_iter_relock_intent() This adds a new helper for btree_cache.c that does what we want where the iterator is still being traverse - and also eliminates some unnecessary transaction restarts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
f8f86c6a |
|
15-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve btree_bad_header() error message We should always print out the full btree node ptr. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
5aab6635 |
|
14-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Tighten up btree_iter locking assertions We weren't correctly verifying that we had interior node intent locks - this patch also fixes bugs uncovered by the new assertions. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
0a700890 |
|
11-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Kick off btree node writes from write completions This is a performance improvement by removing the need to wait for the in flight btree write to complete before kicking one off, which is going to be needed to avoid a performance regression with the upcoming patch to update btree ptrs after every btree write. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
19d54324 |
|
10-Jul-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Really don't hold btree locks while btree IOs are in flight This is something we've attempted to stick to for quite some time, as it helps guarantee filesystem latency - but there's a few remaining paths that this patch fixes. This is also necessary for an upcoming patch to update btree pointers after every btree write - since the btree write completion path will now be doing btree operations. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
c205321b |
|
08-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Drop all btree locks when submitting btree node reads As a rule we don't want to be holding btree locks while submitting IO - this will improve overall filesystem latency. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
531a0095 |
|
04-Jun-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve btree iterator tracepoints This patch adds some new tracepoints to the btree iterator code, and adds new fields to the existing tracepoints - primarily for the iterator position. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
#
e68031fb |
|
29-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Mark newly allocated btree nodes as accessed This was a major oversight - this means under memory pressure we can end up reading in a btree node, then having it evicted before we get to use it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
ceda1b9a |
|
25-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Evict btree nodes we're deleting There was a bug that led to duplicate btree node pointers being inserted at the wrong level. The new topology repair code can fix that, except that the btree cache code gets confused when we read in a btree node from the pointer that was at the wrong level. This patch evicts nodes that we're deleting to, which nicely solves the problem. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
65c0601a |
|
23-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Use mmap() instead of vmalloc_exec() in userspace Calling mmap() directly is much better than malloc() then mprotect(), we end up with much less address space fragmentation. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
537c32f5 |
|
23-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Don't BUG_ON() btree topology error This replaces an assertion in the btree merge path with a bch2_inconsistent_error() - fsck will fix it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
6adaac0b |
|
20-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Update bch2_btree_verify() bch2_btree_verify() verifies that the btree node on disk matches what we have in memory. This patch changes it to verify every replica, and also fixes it for interior btree nodes - there's a mem_ptr field which is used as a scratch space and needs to be zeroed out for comparing with what's on disk. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
2177147b |
|
06-Apr-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve bset compaction The previous patch that fixed btree nodes being written too aggressively now meant that we weren't sorting btree node bsets optimally - this patch fixes that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
54ca47e1 |
|
28-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Kill bch2_btree_node_get_sibling() This patch reworks the btree node merge path to use a second btree iterator to get the sibling node - which means bch2_btree_iter_get_sibling() can be deleted. Also, it uses bch2_btree_iter_traverse_all() if necessary - which means it should be more reliable. We don't currently even try to make it work when trans->nounlock is set - after a BTREE_INSERT_NOUNLOCK transaction commit, hopefully this will be a worthwhile tradeoff. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
e751c01a |
|
24-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Start using bpos.snapshot field This patch starts treating the bpos.snapshot field like part of the key in the btree code: * bpos_successor() and bpos_predecessor() now include the snapshot field * Keys in btrees that will be using snapshots (extents, inodes, dirents and xattrs) now always have their snapshot field set to U32_MAX The btree iterator code gets a new flag, BTREE_ITER_ALL_SNAPSHOTS, that determines whether we're iterating over keys in all snapshots or not - internally, this controlls whether bkey_(successor|predecessor) increment/decrement the snapshot field, or only the higher bits of the key. We add a new member to struct btree_iter, iter->snapshot: when BTREE_ITER_ALL_SNAPSHOTS is not set, iter->pos.snapshot should always equal iter->snapshot, which will be 0 for btrees that don't use snapshots, and alsways U32_MAX for btrees that will use snapshots (until we enable snapshot creation). This patch also introduces a new metadata version number, and compat code for reading from/writing to older versions - this isn't a forced upgrade (yet). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
4cf91b02 |
|
04-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Split out bpos_cmp() and bkey_cmp() With snapshots, we're going to need to differentiate between comparisons that should and shouldn't include the snapshot field. bpos_cmp is now the comparison function that does include the snapshot field, used by core btree code. Upper level filesystem code generally does _not_ want to compare against the snapshot field - that code wants keys to compare as equal even when one of them is in an ancestor snapshot. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
a9d79c6e |
|
23-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Use pcpu mode of six locks for interior nodes Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
f020bfcd |
|
04-Mar-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Use bch2_bpos_to_text() more consistently Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
41f8b09e |
|
20-Feb-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Rename BTREE_ID enums for consistency with other enums Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
c043a330 |
|
27-Dec-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix bch2_btree_cache_scan() It was counting nodes on the freed list that it skips - because we want to leave a few so that btree splits don't touch the allocator - as nodes that it touched, meaning that if it was called with <= 3 nodes to reclaim, and those nodes were on the freed list, it would never do any work. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
18a7b972 |
|
23-Feb-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix for bch2_btree_node_get_noiter() returning -ENOMEM bch2_btree_node_get_noiter() isn't used from the btree iterator code, which retries with the btree node cache cannibalize lock held on -ENOMEM, so we should do it ourself if necessary. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
a0b73c1c |
|
26-Jan-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add (partial) support for fixing btree topology When we walk the btrees during recovery, part of that is checking that btree topology is correct: for every interior btree node, its child nodes should exactly span the range the parent node covers. Previously, we had checks for this, but not repair code. Now that we have the ability to do btree updates during initial GC, this patch adds that repair code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
edfbba58 |
|
11-Jan-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add btree node prefetching to bch2_btree_and_journal_walk() bch2_btree_and_journal_walk() walks the btree overlaying keys from the journal; it was introduced so that we could read in the alloc btree prior to journal replay being done, when journalling of updates to interior btree nodes was introduced. But it didn't have btree node prefetching, which introduced a severe regression with mount times, particularly on spinning rust. This patch implements btree node prefetching for the btree + journal walk, hopefully fixing that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
b929bbef |
|
11-Jan-2021 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add cannibalize lock to btree_cache_to_text() More debugging info is always a good thing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
07a1006a |
|
17-Dec-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Reduce/kill BKEY_PADDED use With various newer key types - stripe keys, inline data extents - the old approach of calculating the maximum size of the value is becoming more and more error prone. Better to switch to bkey_on_stack, which can dynamically allocate if necessary to handle any size bkey. In particular we also want to get rid of BKEY_EXTENT_VAL_U64s_MAX. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
d8ebed7d |
|
19-Nov-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add btree cache stats to sysfs Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
d8b46004 |
|
15-Nov-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Check for errors from register_shrinker() Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
e648448c |
|
11-Nov-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix missing memalloc_nofs_restore() Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
6a747c46 |
|
09-Nov-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add accounting for dirty btree nodes/keys This lets us improve journal reclaim, so that it now tries to make sure no more than 3/4s of the btree node cache and btree key cache are dirty - ensuring the shrinkers can free memory. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
29364f34 |
|
02-Nov-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Drop sysfs interface to debug parameters It's not used much anymore, the module paramter interface is better. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
a301dc38 |
|
28-Oct-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Improve tracing for transaction restarts We have a bug where we can get stuck with a process spinning in transaction restarts - need more information. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
645d72aa |
|
26-Oct-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix btree updates when mixing cached and non cached iterators There was a bug where bch2_trans_update() would incorrectly delete a pending update where the new update did not actually overwrite the existing update, because we were incorrectly using BTREE_ITER_TYPE when sorting pending btree updates. This affects the pending patch to use cached iterators for inode updates. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
97c0e195 |
|
15-Oct-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix another lockdep splat vfree() can allocate memory, so we need to call memalloc_nofs_save(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
5d0b7f90 |
|
11-Oct-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix a lockdep splat We can't allocate memory with GFP_FS while holding the btree cache lock, and vfree() can allocate memory. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
4580baec |
|
25-Jul-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Remove some uses of PAGE_SIZE in the btree code For portability to userspace, we should try to avoid working in kernel pages. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
760992aa |
|
25-Jul-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Ensure we wake up threads locking node when reusing it Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
47a5649a |
|
15-Jun-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix incorrect gfp check Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
bd2bb273 |
|
12-Jun-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Don't deadlock when btree node reuse changes lock ordering Btree node lock ordering is based on the logical key. However, 'struct btree' may be reused for a different btree node under memory pressure. This patch uses the new six lock callback to check if a btree node is no longer the node we wanted to lock before blocking. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
e38821f3 |
|
09-Jun-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Don't allocate memory under the btree cache lock The btree cache lock is needed for reclaiming from the btree node cache, and memory allocation can potentially spin and sleep (for 100 ms at a time), so.. don't do that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
8c9eef95 |
|
05-Jun-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Check gfp_flags correctly in bch2_btree_cache_scan() bch2_btree_node_mem_alloc() uses memalloc_nofs_save()/GFP_NOFS, but GFP_NOFS does include __GFP_IO - oops. We used to use GFP_NOIO, but as we're a filesystem now GFP_NOFS makes more sense now and is looser. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
f96c0df4 |
|
02-Jun-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fix a deadlock in bch2_btree_node_get_sibling() There was a bad interaction with bch2_btree_iter_set_pos_same_leaf(), which can leave a btree node locked that is just outside iter->pos, breaking the lock ordering checks in __bch2_btree_node_lock(). Ideally we should get rid of this corner case, but for now fix it locally with verbose comments. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
692c3f06 |
|
27-May-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: fix memalloc_nofs_restore() usage Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
39fb2983 |
|
07-Jan-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Kill bkey_type_successor Previously, BTREE_ID_INODES was special - inodes were indexed by the inode field, which meant the offset field of struct bpos wasn't used, which led to special cases in e.g. the btree iterator code. Now, inodes in the inodes btree are indexed by the offset field. Also: prevously min_key was special for extents btrees, min_key for extents would equal max_key for the previous node. Now, min_key = bkey_successor() of the previous node, same as non extent btrees. This means we can completely get rid of btree_type_sucessor/predecessor. Also make some improvements to the metadata IO validate/compat code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
e62d65f2 |
|
15-Mar-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: trans_commit() path can now insert to interior nodes This will be needed for the upcoming patches to journal updates to interior btree nodes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
7f81d4cf |
|
26-Feb-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: fix setting btree_node_accessed() Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
72141e1f |
|
24-Feb-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Use btree_ptr_v2.mem_ptr to avoid hash table lookup Nice performance optimization Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
237e8048 |
|
18-Feb-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: introduce b->hash_val This is partly prep work for introducing bch_btree_ptr_v2, but it'll also be a bit of a performance boost by moving the full key out of the hot part of struct btree. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
c9bebae6 |
|
29-Nov-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Whiteout changes More prep work for snapshots: extents will soon be using KEY_TYPE_deleted for whiteouts, with 0 size. But we wen't be able to keep these whiteouts with the rest of the extents in the btree node, due to sorting invariants breaking. We can deal with this by immediately moving the new whiteouts to the unwritten whiteouts area - this just means those whiteouts won't be sorted, so we need new code to sort them prior to merging them with the rest of the keys to be written. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
58404bb2 |
|
23-Oct-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Fall back to slowpath on exact comparison This is basically equivalent to the original strategy of falling back to checking against the original key when the original key and previous key didn't differ in the required bits - except, now we only fall back when the search key doesn't differ in the required bits, which ends up being a bit faster. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
1bdb67e8 |
|
06-Nov-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: kill BFLOAT_FAILED_PREV The assumption underlying BFLOAT_FAILED_PREV was wrong; the comparison we're doing in bset_search_tree() doesn't have to tell the pivot apart from the previous key, it just has to tell if search is definitely greater than or equal to the pivot. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
fb975d14 |
|
21-Sep-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Drop unnecessary rcu_read_lock() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
e0dfc08b |
|
11-Jun-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: use memalloc_nofs_save() for vmalloc allocation Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
61011ea2 |
|
14-May-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Rip out old hacky transaction restart tracing Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
20bceecb |
|
15-May-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: More work to avoid transaction restarts Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
58fbf808 |
|
15-May-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Delete duplicate code Also rename for consistency Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
60755344 |
|
10-May-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: kill BTREE_ITER_NOUNLOCK Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
b03b81df |
|
10-May-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Don't pass around may_drop_locks Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
c43a6ef9 |
|
05-Jun-2020 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: btree_bkey_cached_common This is prep work for the btree key cache: btree iterators will point to either struct btree, or a new struct bkey_cached. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
ba5c6557 |
|
22-Apr-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Add actual tracepoints for transaction restarts Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
0f238367 |
|
27-Mar-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: trans_for_each_iter() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
d0cc3def |
|
13-Jan-2019 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: More allocator startup improvements Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
26609b61 |
|
01-Nov-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Make bkey types globally unique this lets us get rid of a lot of extra switch statements - in a lot of places we dispatch on the btree node type, and then the key type, so this is a nice cleanup across a lot of code. Also improve the on disk format versioning stuff. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
319f9ac3 |
|
08-Nov-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: revamp to_text methods Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
1c7a0adf |
|
12-Jul-2018 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: trace transaction restarts exceptionally crappy "tracing", but it's a start at documenting the places restarts can be triggered Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
#
1c6fdbd8 |
|
17-Mar-2017 |
Kent Overstreet <kent.overstreet@gmail.com> |
bcachefs: Initial commit Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write filesystem with every feature you could possibly want. Website: https://bcachefs.org Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|