History log of /linux-master/lib/ref_tracker.c
Revision Date Author Comments
# acd8f0e5 01-Jun-2023 Andrzej Hajda <andrzej.hajda@intel.com>

lib/ref_tracker: remove warnings in case of allocation failure

Library can handle allocation failures. To avoid allocation warnings
__GFP_NOWARN has been added everywhere. Moreover GFP_ATOMIC has been
replaced with GFP_NOWAIT in case of stack allocation on tracker free
call.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 227c6c83 01-Jun-2023 Andrzej Hajda <andrzej.hajda@intel.com>

lib/ref_tracker: add printing to memory buffer

Similar to stack_(depot|trace)_snprint the patch
adds helper to printing stats to memory buffer.
It will be helpful in case of debugfs.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# b6d7c0eb 01-Jun-2023 Andrzej Hajda <andrzej.hajda@intel.com>

lib/ref_tracker: improve printing stats

In case the library is tracking busy subsystem, simply
printing stack for every active reference will spam log
with long, hard to read, redundant stack traces. To improve
readabilty following changes have been made:
- reports are printed per stack_handle - log is more compact,
- added display name for ref_tracker_dir - it will differentiate
multiple subsystems,
- stack trace is printed indented, in the same printk call,
- info about dropped references is printed as well.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 7a113ff6 01-Jun-2023 Andrzej Hajda <andrzej.hajda@intel.com>

lib/ref_tracker: add unlocked leak print helper

To have reliable detection of leaks, caller must be able to check under
the same lock both: tracked counter and the leaks. dir.lock is natural
candidate for such lock and unlocked print helper can be called with this
lock taken.
As a bonus we can reuse this helper in ref_tracker_dir_exit.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# c2d1e3df 05-Feb-2022 Eric Dumazet <edumazet@google.com>

ref_tracker: remove filter_irq_stacks() call

After commit e94006608949 ("lib/stackdepot: always do filter_irq_stacks()
in stack_depot_save()") it became unnecessary to filter the stack
before calling stack_depot_save().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 8fd5522f 04-Feb-2022 Eric Dumazet <edumazet@google.com>

ref_tracker: add a count of untracked references

We are still chasing a netdev refcount imbalance, and we suspect
we have one rogue dev_put() that is consuming a reference taken
from a dev_hold_track()

To detect this case, allow ref_tracker_alloc() and ref_tracker_free()
to be called with a NULL @trackerp parameter, and use a dedicated
refcount_t just for them.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# e3ececfe 04-Feb-2022 Eric Dumazet <edumazet@google.com>

ref_tracker: implement use-after-free detection

Whenever ref_tracker_dir_init() is called, mark the struct ref_tracker_dir
as dead.

Test the dead status from ref_tracker_alloc() and ref_tracker_free()

This should detect buggy dev_put()/dev_hold() happening too late
in netdevice dismantle process.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# c12837d1 12-Jan-2022 Eric Dumazet <edumazet@google.com>

ref_tracker: use __GFP_NOFAIL more carefully

syzbot was able to trigger this warning from new_slab()
/*
* All existing users of the __GFP_NOFAIL are blockable, so warn
* of any new users that actually require GFP_NOWAIT
*/
if (WARN_ON_ONCE(!can_direct_reclaim))
goto fail;

Indeed, we should use __GFP_NOFAIL if direct reclaim is possible.

Hopefully in the future we will be able to use SLAB_NOFAILSLAB
option so that syzbot can benefit from full ref_tracker
even in the presence of memory fault injections.

WARNING: CPU: 0 PID: 13 at mm/page_alloc.c:5081 __alloc_pages_slowpath.constprop.0+0x1b7b/0x20d0 mm/page_alloc.c:5081 mm/page_alloc.c:5081
Modules linked in:
CPU: 0 PID: 13 Comm: ksoftirqd/0 Not tainted 5.16.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__alloc_pages_slowpath.constprop.0+0x1b7b/0x20d0 mm/page_alloc.c:5081 mm/page_alloc.c:5081
Code: 90 08 00 00 48 81 c7 d8 04 00 00 48 89 f8 48 c1 e8 03 42 80 3c 30 00 0f 84 f0 ea ff ff e8 3d 82 09 00 e9 e6 ea ff ff 4d 89 fd <0f> 0b 48 b8 00 00 00 00 00 fc ff df 48 8b 54 24 30 48 c1 ea 03 80
RSP: 0018:ffffc90000d272b8 EFLAGS: 00010246

RAX: 0000000000000000 RBX: ffff88813fffc300 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff88813fffc348
RBP: ffff88813fffc300 R08: 00000000000013dc R09: 00000000000013c8
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffffc90000d274e8 R14: dffffc0000000000 R15: ffffc90000d274e8
FS: 0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffefe6000f8 CR3: 000000001d21e000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__alloc_pages+0x412/0x500 mm/page_alloc.c:5382 mm/page_alloc.c:5382
alloc_pages+0x1a7/0x300 mm/mempolicy.c:2191 mm/mempolicy.c:2191
alloc_slab_page mm/slub.c:1793 [inline]
allocate_slab mm/slub.c:1938 [inline]
alloc_slab_page mm/slub.c:1793 [inline] mm/slub.c:1993
allocate_slab mm/slub.c:1938 [inline] mm/slub.c:1993
new_slab+0x349/0x4a0 mm/slub.c:1993 mm/slub.c:1993
___slab_alloc+0x918/0xfe0 mm/slub.c:3022 mm/slub.c:3022
__slab_alloc.constprop.0+0x4d/0xa0 mm/slub.c:3109 mm/slub.c:3109
slab_alloc_node mm/slub.c:3200 [inline]
slab_alloc mm/slub.c:3242 [inline]
slab_alloc_node mm/slub.c:3200 [inline] mm/slub.c:3259
slab_alloc mm/slub.c:3242 [inline] mm/slub.c:3259
kmem_cache_alloc_trace+0x289/0x2c0 mm/slub.c:3259 mm/slub.c:3259
kmalloc include/linux/slab.h:590 [inline]
kzalloc include/linux/slab.h:724 [inline]
kmalloc include/linux/slab.h:590 [inline] lib/ref_tracker.c:74
kzalloc include/linux/slab.h:724 [inline] lib/ref_tracker.c:74
ref_tracker_alloc+0xe1/0x430 lib/ref_tracker.c:74 lib/ref_tracker.c:74
netdev_tracker_alloc include/linux/netdevice.h:3855 [inline]
dev_hold_track include/linux/netdevice.h:3872 [inline]
netdev_tracker_alloc include/linux/netdevice.h:3855 [inline] net/core/dst.c:52
dev_hold_track include/linux/netdevice.h:3872 [inline] net/core/dst.c:52
dst_init+0xe0/0x520 net/core/dst.c:52 net/core/dst.c:52
dst_alloc+0x16b/0x1f0 net/core/dst.c:96 net/core/dst.c:96
rt_dst_alloc+0x73/0x450 net/ipv4/route.c:1614 net/ipv4/route.c:1614
ip_route_input_mc net/ipv4/route.c:1720 [inline]
ip_route_input_mc net/ipv4/route.c:1720 [inline] net/ipv4/route.c:2465
ip_route_input_rcu.part.0+0x4fe/0xcc0 net/ipv4/route.c:2465 net/ipv4/route.c:2465
ip_route_input_rcu net/ipv4/route.c:2420 [inline]
ip_route_input_rcu net/ipv4/route.c:2420 [inline] net/ipv4/route.c:2416
ip_route_input_noref+0x1b8/0x2a0 net/ipv4/route.c:2416 net/ipv4/route.c:2416
ip_rcv_finish_core.constprop.0+0x288/0x1e90 net/ipv4/ip_input.c:354 net/ipv4/ip_input.c:354
ip_rcv_finish+0x135/0x2f0 net/ipv4/ip_input.c:427 net/ipv4/ip_input.c:427
NF_HOOK include/linux/netfilter.h:307 [inline]
NF_HOOK include/linux/netfilter.h:301 [inline]
NF_HOOK include/linux/netfilter.h:307 [inline] net/ipv4/ip_input.c:540
NF_HOOK include/linux/netfilter.h:301 [inline] net/ipv4/ip_input.c:540
ip_rcv+0xaa/0xd0 net/ipv4/ip_input.c:540 net/ipv4/ip_input.c:540
__netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5350 net/core/dev.c:5350
__netif_receive_skb+0x24/0x1b0 net/core/dev.c:5464 net/core/dev.c:5464
process_backlog+0x2a5/0x6c0 net/core/dev.c:5796 net/core/dev.c:5796
__napi_poll+0xaf/0x440 net/core/dev.c:6364 net/core/dev.c:6364
napi_poll net/core/dev.c:6431 [inline]
napi_poll net/core/dev.c:6431 [inline] net/core/dev.c:6518
net_rx_action+0x801/0xb40 net/core/dev.c:6518 net/core/dev.c:6518
__do_softirq+0x29b/0x9c2 kernel/softirq.c:558 kernel/softirq.c:558
run_ksoftirqd kernel/softirq.c:921 [inline]
run_ksoftirqd kernel/softirq.c:921 [inline] kernel/softirq.c:913
run_ksoftirqd+0x2d/0x60 kernel/softirq.c:913 kernel/softirq.c:913
smpboot_thread_fn+0x645/0x9c0 kernel/smpboot.c:164 kernel/smpboot.c:164
kthread+0x405/0x4f0 kernel/kthread.c:327 kernel/kthread.c:327
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 arch/x86/entry/entry_64.S:295

Fixes: 4e66934eaadc ("lib: add reference counting tracking infrastructure")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 4e66934e 04-Dec-2021 Eric Dumazet <edumazet@google.com>

lib: add reference counting tracking infrastructure

It can be hard to track where references are taken and released.

In networking, we have annoying issues at device or netns dismantles,
and we had various proposals to ease root causing them.

This patch adds new infrastructure pairing refcount increases
and decreases. This will self document code, because programmers
will have to associate increments/decrements.

This is controled by CONFIG_REF_TRACKER which can be selected
by users of this feature.

This adds both cpu and memory costs, and thus should probably be
used with care.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>