History log of /linux-master/mm/kmemleak.c
Revision Date Author Comments
# 39042079 01-Dec-2023 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: avoid RCU stalls when freeing metadata for per-CPU pointers

On systems with large number of CPUs, the following soft lockup splat
might sometimes happen:

[ 2656.001617] watchdog: BUG: soft lockup - CPU#364 stuck for 21s! [ksoftirqd/364:2206]
:
[ 2656.141194] RIP: 0010:_raw_spin_unlock_irqrestore+0x3d/0x70
:
2656.241214] Call Trace:
[ 2656.243971] <IRQ>
[ 2656.246237] ? show_trace_log_lvl+0x1c4/0x2df
[ 2656.251152] ? show_trace_log_lvl+0x1c4/0x2df
[ 2656.256066] ? kmemleak_free_percpu+0x11f/0x1f0
[ 2656.261173] ? watchdog_timer_fn+0x379/0x470
[ 2656.265984] ? __pfx_watchdog_timer_fn+0x10/0x10
[ 2656.271179] ? __hrtimer_run_queues+0x5f3/0xd00
[ 2656.276283] ? __pfx___hrtimer_run_queues+0x10/0x10
[ 2656.281783] ? ktime_get_update_offsets_now+0x95/0x2c0
[ 2656.287573] ? ktime_get_update_offsets_now+0xdd/0x2c0
[ 2656.293380] ? hrtimer_interrupt+0x2e9/0x780
[ 2656.298221] ? __sysvec_apic_timer_interrupt+0x184/0x640
[ 2656.304211] ? sysvec_apic_timer_interrupt+0x8e/0xc0
[ 2656.309807] </IRQ>
[ 2656.312169] <TASK>
[ 2656.326110] kmemleak_free_percpu+0x11f/0x1f0
[ 2656.331015] free_percpu.part.0+0x1b/0xe70
[ 2656.335635] free_vfsmnt+0xb9/0x100
[ 2656.339567] rcu_do_batch+0x3c8/0xe30
[ 2656.363693] rcu_core+0x3de/0x5a0
[ 2656.367433] __do_softirq+0x2d0/0x9a8
[ 2656.381119] run_ksoftirqd+0x36/0x60
[ 2656.385145] smpboot_thread_fn+0x556/0x910
[ 2656.394971] kthread+0x2a4/0x350
[ 2656.402826] ret_from_fork+0x29/0x50
[ 2656.406861] </TASK>

The issue is caused by kmemleak registering each per_cpu_ptr()
corresponding to the __percpu pointer. This is unnecessary since such
individual per-CPU pointers are not tracked anyway. Create a new
object_percpu_tree_root rbtree that stores a single __percpu pointer
together with an OBJECT_PERCPU flag for the kmemleak metadata. Scanning
needs to be done for all per_cpu_ptr() pointers with a cond_resched()
between each CPU iteration to avoid RCU stalls.

[catalin.marinas@arm.com: update comment]
Link: https://lkml.kernel.org/r/20231206114414.2085824-1-catalin.marinas@arm.com
Link: https://lore.kernel.org/r/20231127194153.289626-1-longman@redhat.comLink: https://lkml.kernel.org/r/20231201190829.825856-1-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Waiman Long <longman@redhat.com>
Closes: https://lore.kernel.org/r/20231127194153.289626-1-longman@redhat.com
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 52c5d2bc 16-Nov-2023 Jim Cromie <jim.cromie@gmail.com>

kmemleak: add checksum to backtrace report

Change /sys/kernel/debug/kmemleak report format slightly, adding
"(extra info)" to the backtrace header:

from: " backtrace:"
to: " backtrace (crc <cksum>):"

The <cksum> allows a user to see recurring backtraces without
detailed/careful reading of multiline stacks. So after cycling
kmemleak-test a few times, I know some leaks are repeating.

bash-5.2# grep backtrace /sys/kernel/debug/kmemleak | wc
62 186 1792
bash-5.2# grep backtrace /sys/kernel/debug/kmemleak | sort -u | wc
37 111 1067

syzkaller parses kmemleak for "unreferenced object" only, so is
unaffected by this change. Other github repos are moribund.

Link: https://lkml.kernel.org/r/20231116224318.124209-3-jim.cromie@gmail.com
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 88f9ee2b 16-Nov-2023 Jim Cromie <jim.cromie@gmail.com>

kmemleak: drop (age <increasing>) from leak record

Patch series "tweak kmemleak report format".

These 2 patches make minor changes to the report:

1st strips "age <increasing>" from output. This makes the output
idempotent; unchanging until a new leak is reported.

2nd adds the backtrace.checksum to the "backtrace:" line. This lets a
user see repeats without actually reading the whole backtrace. So now
the backtrace line looks like this:

backtrace (crc 603070071):

I surveyed for un-wanted effects upon users:

Syzkaller parses kmemleak in executor/common_linux.h:
static void check_leaks(char** frames, int nframes)

It just counts occurrences of "unreferenced object", specifically it
does not look for "age", nor would it choke on "crc" being added.

github has 3 repos with "kmemleak" mentioned, all are moribund.
gitlab has 0 hits on "kmemleak".


This patch (of 2):

Displaying age is pretty, but counter-productive; it changes with
current-time, so it surrenders idempotency of the output, which breaks
simple hash-based cataloging of the records by the user.

The trouble: sequential reads, wo new leaks, get new results:

:#> sum /sys/kernel/debug/kmemleak
53439 74 /sys/kernel/debug/kmemleak
:#> sum /sys/kernel/debug/kmemleak
59066 74 /sys/kernel/debug/kmemleak

and age is why (nothing else changes):

:#> grep -v age /sys/kernel/debug/kmemleak | sum
58894 67
:#> grep -v age /sys/kernel/debug/kmemleak | sum
58894 67

Since jiffies is already printed in the "comm" line, age adds nothing.

Notably, syzkaller reads kmemleak only for "unreferenced object", and
won't care about this reform of age-ism. A few moribund github repos
mention it, but don't compile.

Link: https://lkml.kernel.org/r/20231116224318.124209-1-jim.cromie@gmail.com
Link: https://lkml.kernel.org/r/20231116224318.124209-2-jim.cromie@gmail.com
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# d63385a7 15-Nov-2023 Liu Shixin <liushixin2@huawei.com>

mm/kmemleak: move set_track_prepare() outside raw_spinlocks

set_track_prepare() will call __alloc_pages() which attempts to acquire
zone->lock(spinlocks), so move it outside object->lock(raw_spinlocks)
because it's not right to acquire spinlocks while holding raw_spinlocks in
RT mode.

Link: https://lkml.kernel.org/r/20231115082138.2649870-3-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Patrick Wang <patrick.wang.shcn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 4eff7d62 15-Nov-2023 Liu Shixin <liushixin2@huawei.com>

Revert "mm/kmemleak: move the initialisation of object to __link_object"

Patch series "Fix invalid wait context of set_track_prepare()".

Geert reported an invalid wait context[1] which is resulted by moving
set_track_prepare() inside kmemleak_lock. This is not allowed because in
RT mode, the spinlocks can be preempted but raw_spinlocks can not, so it
is not allowd to acquire spinlocks while holding raw_spinlocks. The
second patch fix same problem in kmemleak_update_trace().


This patch (of 2):

Move the initialisation of object back to__alloc_object() because
set_track_prepare() attempt to acquire zone->lock(spinlocks) while
__link_object is holding kmemleak_lock(raw_spinlocks). This is not right
for RT mode.

This reverts commit 245245c2fffd00 ("mm/kmemleak: move the initialisation
of object to __link_object").

Link: https://lkml.kernel.org/r/20231115082138.2649870-1-liushixin2@huawei.com
Link: https://lkml.kernel.org/r/20231115082138.2649870-2-liushixin2@huawei.com
Fixes: 245245c2fffd ("mm/kmemleak: move the initialisation of object to __link_object")
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Closes: https://lore.kernel.org/linux-mm/CAMuHMdWj0UzwNaxUvcocTfh481qRJpOWwXxsJCTJfu1oCqvgdA@mail.gmail.com/ [1]
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Patrick Wang <patrick.wang.shcn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 245245c2 22-Oct-2023 Liu Shixin <liushixin2@huawei.com>

mm/kmemleak: move the initialisation of object to __link_object

In patch (mm: kmemleak: split __create_object into two functions), the
initialisation of object has been splited in two places. Catalin said it
feels a bit weird and error prone. So leave __alloc_object() to just do
the actual allocation and let __link_object() do the full initialisation.

Link: https://lkml.kernel.org/r/20231023025125.90972-1-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 5e4fc577 18-Oct-2023 Liu Shixin <liushixin2@huawei.com>

mm/kmemleak: fix partially freeing unknown object warning

delete_object_part() can be called by multiple callers in the same time.
If an object is found and removed by a caller, and then another caller try
to find it too, it failed and return directly. It still be recorded by
kmemleak even if it has already been freed to buddy. With DEBUG on,
kmemleak will report the following warning,

kmemleak: Partially freeing unknown object at 0xa1af86000 (size 4096)
CPU: 0 PID: 742 Comm: test_huge Not tainted 6.6.0-rc3kmemleak+ #54
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x37/0x50
kmemleak_free_part_phys+0x50/0x60
hugetlb_vmemmap_optimize+0x172/0x290
? __pfx_vmemmap_remap_pte+0x10/0x10
__prep_new_hugetlb_folio+0xe/0x30
prep_new_hugetlb_folio.isra.0+0xe/0x40
alloc_fresh_hugetlb_folio+0xc3/0xd0
alloc_surplus_hugetlb_folio.constprop.0+0x6e/0xd0
hugetlb_acct_memory.part.0+0xe6/0x2a0
hugetlb_reserve_pages+0x110/0x2c0
hugetlbfs_file_mmap+0x11d/0x1b0
mmap_region+0x248/0x9a0
? hugetlb_get_unmapped_area+0x15c/0x2d0
do_mmap+0x38b/0x580
vm_mmap_pgoff+0xe6/0x190
ksys_mmap_pgoff+0x18a/0x1f0
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Expand __create_object() and move __alloc_object() to the beginning. Then
use kmemleak_lock to protect __find_and_remove_object() and
__link_object() as a whole, which can guarantee all objects are processed
sequentialally.

Link: https://lkml.kernel.org/r/20231018102952.3339837-8-liushixin2@huawei.com
Fixes: 53238a60dd4a ("kmemleak: Allow partial freeing of memory blocks")
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Patrick Wang <patrick.wang.shcn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 858a195b 18-Oct-2023 Liu Shixin <liushixin2@huawei.com>

mm: kmemleak: add __find_and_remove_object()

Add new __find_and_remove_object() without kmemleak_lock protect, it is in
preparation for the next patch.

Link: https://lkml.kernel.org/r/20231018102952.3339837-7-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Patrick Wang <patrick.wang.shcn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 2e1d4738 18-Oct-2023 Liu Shixin <liushixin2@huawei.com>

mm: kmemleak: use mem_pool_free() to free object

The kmemleak object is allocated by mem_pool_alloc(), which could be from
slab or mem_pool[], so it's not suitable using __kmem_cache_free() to free
the object, use __mem_pool_free() instead.

Link: https://lkml.kernel.org/r/20231018102952.3339837-6-liushixin2@huawei.com
Fixes: 0647398a8c7b ("mm: kmemleak: simple memory allocation pool for kmemleak objects")
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Patrick Wang <patrick.wang.shcn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 0edd7b58 18-Oct-2023 Liu Shixin <liushixin2@huawei.com>

mm: kmemleak: split __create_object into two functions

__create_object() consists of two part, the first part allocate a kmemleak
object and initialize it, the second part insert it into object tree.
This function need kmemleak_lock but actually only the second part need
lock.

Split it into two functions, the first function __alloc_object only
allocate a kmemleak object, and the second function __link_object() will
initialize the object and insert it into object tree, use the
kmemleak_lock to protect __link_object() only.

[akpm@linux-foundation.org: coding-style cleanups]
Link: https://lkml.kernel.org/r/20231018102952.3339837-5-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Patrick Wang <patrick.wang.shcn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 62047e0f 18-Oct-2023 Liu Shixin <liushixin2@huawei.com>

mm/kmemleak: fix print format of pointer in pr_debug()

With 0x%p, the pointer will be hashed and print (____ptrval____) instead.
And with 0x%pa, the pointer can be successfully printed but with duplicate
prefixes, which looks like:

kmemleak: kmemleak_free(0x(____ptrval____))
kmemleak: kmemleak_free_percpu(0x(____ptrval____))
kmemleak: kmemleak_free_part_phys(0x0x0000000a1af86000)

Use 0x%px instead of 0x%p or 0x%pa to print the pointer. Then the print
will be like:

kmemleak: kmemleak_free(0xffff9111c145b020)
kmemleak: kmemleak_free_percpu(0x00000000000333b0)
kmemleak: kmemleak_free_part_phys(0x0000000a1af80000)

Link: https://lkml.kernel.org/r/20231018102952.3339837-4-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Patrick Wang <patrick.wang.shcn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# e68d343d 24-Aug-2023 Waiman Long <longman@redhat.com>

mm/kmemleak: move up cond_resched() call in page scanning loop

Commit bde5f6bc68db ("kmemleak: add scheduling point to kmemleak_scan()")
added a cond_resched() call to the struct page scanning loop to prevent
soft lockup from happening. However, soft lockup can still happen in that
loop in some corner cases when the pages that satisfy the "!(pfn & 63)"
check are skipped for some reasons.

Fix this corner case by moving up the cond_resched() check so that it will
be called every 64 pages unconditionally.

Link: https://lkml.kernel.org/r/20230825164947.1317981-1-longman@redhat.com
Fixes: bde5f6bc68db ("kmemleak: add scheduling point to kmemleak_scan()")
Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Yisheng Xie <xieyisheng1@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# d160ef71 15-Aug-2023 Xiaolei Wang <xiaolei.wang@windriver.com>

Rename kmemleak_initialized to kmemleak_late_initialized

The old name is confusing because it implies the completion of earlier
kmemleak_init(), the new name update to kmemleak_late_initial represents
the completion of kmemleak_late_init().

No functional changes.

Link: https://lkml.kernel.org/r/20230815144128.3623103-3-xiaolei.wang@windriver.com
Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 835bc157 15-Aug-2023 Xiaolei Wang <xiaolei.wang@windriver.com>

mm/kmemleak: use object_cache instead of kmemleak_initialized to check in set_track_prepare()

Patch series "mm/kmemleak: use object_cache instead of
kmemleak_initialized", v3.

Use object_cache instead of kmemleak_initialized to check in
set_track_prepare(), so that memory leaks after kmemleak_init() can be
recorded and Rename kmemleak_initialized to kmemleak_late_initialized

unreferenced object 0xc674ca80 (size 64):
comm "swapper/0", pid 1, jiffies 4294938337 (age 204.880s)
hex dump (first 32 bytes):
80 55 75 c6 80 54 75 c6 00 55 75 c6 80 52 75 c6 .Uu..Tu..Uu..Ru.
00 53 75 c6 00 00 00 00 00 00 00 00 00 00 00 00 .Su..........


This patch (of 2):

kmemleak_initialized is set in kmemleak_late_init(), which also means that
there is no call trace which object's memory leak is before
kmemleak_late_init(), so use object_cache instead of kmemleak_initialized
to check in set_track_prepare() to avoid no call trace records when there
is a memory leak in the code between kmemleak_init() and
kmemleak_late_init().

unreferenced object 0xc674ca80 (size 64):
comm "swapper/0", pid 1, jiffies 4294938337 (age 204.880s)
hex dump (first 32 bytes):
80 55 75 c6 80 54 75 c6 00 55 75 c6 80 52 75 c6 .Uu..Tu..Uu..Ru.
00 53 75 c6 00 00 00 00 00 00 00 00 00 00 00 00 .Su..........

Link: https://lkml.kernel.org/r/20230815144128.3623103-1-xiaolei.wang@windriver.com
Link: https://lkml.kernel.org/r/20230815144128.3623103-2-xiaolei.wang@windriver.com
Fixes: 56a61617dd22 ("mm: use stack_depot for recording kmemleak's backtrace")
Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 1c0310ad 10-Feb-2023 Andrey Konovalov <andreyknvl@gmail.com>

lib/stackdepot, mm: rename stack_depot_want_early_init

Rename stack_depot_want_early_init to stack_depot_request_early_init.

The old name is confusing, as it hints at returning some kind of intention
of stack depot. The new name reflects that this function requests an
action from stack depot instead.

No functional changes.

[akpm@linux-foundation.org: update mm/kmemleak.c]
Link: https://lkml.kernel.org/r/359f31bf67429a06e630b4395816a967214ef753.1676063693.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 782e4179 18-Jan-2023 Waiman Long <longman@redhat.com>

mm/kmemleak: fix UAF bug in kmemleak_scan()

Commit 6edda04ccc7c ("mm/kmemleak: prevent soft lockup in first object
iteration loop of kmemleak_scan()") fixes soft lockup problem in
kmemleak_scan() by periodically doing a cond_resched(). It does take a
reference of the current object before doing it. Unfortunately, if the
object has been deleted from the object_list, the next object pointed to
by its next pointer may no longer be valid after coming back from
cond_resched(). This can result in use-after-free and other nasty
problem.

Fix this problem by adding a del_state flag into kmemleak_object structure
to synchronize the object deletion process between kmemleak_cond_resched()
and __remove_object() to make sure that the object remained in the
object_list in the duration of the cond_resched() call.

Link: https://lkml.kernel.org/r/20230119040111.350923-3-longman@redhat.com
Fixes: 6edda04ccc7c ("mm/kmemleak: prevent soft lockup in first object iteration loop of kmemleak_scan()")
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 6061e740 18-Jan-2023 Waiman Long <longman@redhat.com>

mm/kmemleak: simplify kmemleak_cond_resched() usage

Patch series "mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF", v2.

It was found that a KASAN use-after-free error was reported in the
kmemleak_scan() function. After further examination, it is believe that
even though a reference is taken from the current object, it does not
prevent the object pointed to by the next pointer from going away after a
cond_resched().

To fix that, additional flags are added to make sure that the current
object won't be removed from the object_list during the duration of the
cond_resched() to ensure the validity of the next pointer.

While making the change, I also simplify the current usage of
kmemleak_cond_resched() to make it easier to understand.


This patch (of 2):

The presence of a pinned argument and the 64k loop count make
kmemleak_cond_resched() a bit more complex to read. The pinned argument
is used only by first kmemleak_scan() loop.

Simplify the usage of kmemleak_cond_resched() by removing the pinned
argument and always do a get_object()/put_object() sequence. In addition,
the 64k loop is removed by using need_resched() to decide if
kmemleak_cond_resched() should be called.

Link: https://lkml.kernel.org/r/20230119040111.350923-1-longman@redhat.com
Link: https://lkml.kernel.org/r/20230119040111.350923-2-longman@redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 993f57e0 18-Jan-2023 Zhaoyang Huang <zhaoyang.huang@unisoc.com>

mm: use stack_depot_early_init for kmemleak

Mirsad report the below error which is caused by stack_depot_init()
failure in kvcalloc. Solve this by having stackdepot use
stack_depot_early_init().

On 1/4/23 17:08, Mirsad Goran Todorovac wrote:
I hate to bring bad news again, but there seems to be a problem with the output of /sys/kernel/debug/kmemleak:

[root@pc-mtodorov ~]# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff951c118568b0 (size 16):
comm "kworker/u12:2", pid 56, jiffies 4294893952 (age 4356.548s)
hex dump (first 16 bytes):
6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00 memstick0.......
backtrace:
[root@pc-mtodorov ~]#

Apparently, backtrace of called functions on the stack is no longer
printed with the list of memory leaks. This appeared on Lenovo desktop
10TX000VCR, with AlmaLinux 8.7 and BIOS version M22KT49A (11/10/2022) and
6.2-rc1 and 6.2-rc2 builds. This worked on 6.1 with the same
CONFIG_KMEMLEAK=y and MGLRU enabled on a vanilla mainstream kernel from
Mr. Torvalds' tree. I don't know if this is deliberate feature for some
reason or a bug. Please find attached the config, lshw and kmemleak
output.

[vbabka@suse.cz: remove stack_depot_init() call]
Link: https://lore.kernel.org/all/5272a819-ef74-65ff-be61-4d2d567337de@alu.unizg.hr/
Link: https://lkml.kernel.org/r/1674091345-14799-2-git-send-email-zhaoyang.huang@unisoc.com
Fixes: 56a61617dd22 ("mm: use stack_depot for recording kmemleak's backtrace")
Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: ke.wang <ke.wang@unisoc.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 3a6f33d8 08-Nov-2022 Clément Léger <clement.leger@bootlin.com>

mm/kmemleak: use %pK to display kernel pointers in backtrace

Currently, %p is used to display kernel pointers in backtrace which result
in a hashed value that is not usable to correlate the address for debug.
Use %pK which will respect the kptr_restrict configuration value and thus
allow to extract meaningful information from the backtrace.

Link: https://lkml.kernel.org/r/20221108094322.73492-1-clement.leger@bootlin.com
Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 56a61617 27-Oct-2022 Zhaoyang Huang <zhaoyang.huang@unisoc.com>

mm: use stack_depot for recording kmemleak's backtrace

Using stack_depot to record kmemleak's backtrace which has been
implemented on slub for reducing redundant information.

[akpm@linux-foundation.org: fix build - remove now-unused __save_stack_trace()]
[zhaoyang.huang@unisoc.com: v3]
Link: https://lkml.kernel.org/r/1667101354-4669-1-git-send-email-zhaoyang.huang@unisoc.com
[akpm@linux-foundation.org: fix v3 layout oddities]
[akpm@linux-foundation.org: coding-style cleanups]
Link: https://lkml.kernel.org/r/1666864224-27541-1-git-send-email-zhaoyang.huang@unisoc.com
Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: ke.wang <ke.wang@unisoc.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zhaoyang Huang <huangzhaoyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 25e9fa22 14-Nov-2022 Yixuan Cao <caoyixuan2019@email.szu.edu.cn>

mm/kmemleak.c: fix a comment

I noticed a typo in a code comment and I fixed it.

Link: https://lkml.kernel.org/r/20221114171426.91745-1-caoyixuan2019@email.szu.edu.cn
Signed-off-by: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 984a6083 20-Oct-2022 Waiman Long <longman@redhat.com>

mm/kmemleak: prevent soft lockup in kmemleak_scan()'s object iteration loops

Commit 6edda04ccc7c ("mm/kmemleak: prevent soft lockup in first object
iteration loop of kmemleak_scan()") adds cond_resched() in the first
object iteration loop of kmemleak_scan(). However, it turns that the 2nd
objection iteration loop can still cause soft lockup to happen in some
cases. So add a cond_resched() call in the 2nd and 3rd loops as well to
prevent that and for completeness.

Link: https://lkml.kernel.org/r/20221020175619.366317-1-longman@redhat.com
Fixes: 6edda04ccc7c ("mm/kmemleak: prevent soft lockup in first object iteration loop of kmemleak_scan()")
Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# b955aa70 31-Aug-2022 Liu Shixin <liushixin2@huawei.com>

mm/kmemleak: make create_object return void

No caller cares about the return value of create_object(), so make it
return void.

Link: https://lkml.kernel.org/r/20220901023007.3471887-1-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 6edda04c 14-Jun-2022 Waiman Long <longman@redhat.com>

mm/kmemleak: prevent soft lockup in first object iteration loop of kmemleak_scan()

The first RCU-based object iteration loop has to modify the object count.
So we cannot skip taking the object lock.

One way to avoid soft lockup is to insert occasional cond_resched() call
into the loop. This cannot be done while holding the RCU read lock which
is to protect objects from being freed. However, taking a reference to
the object will prevent it from being freed. We can then do a
cond_resched() call after every 64k objects safely.

Link: https://lkml.kernel.org/r/20220614220359.59282-4-longman@redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 64977918 14-Jun-2022 Waiman Long <longman@redhat.com>

mm/kmemleak: skip unlikely objects in kmemleak_scan() without taking lock

There are 3 RCU-based object iteration loops in kmemleak_scan(). Because
of the need to take RCU read lock, we can't insert cond_resched() into the
loop like other parts of the function. As there can be millions of
objects to be scanned, it takes a while to iterate all of them. The
kmemleak functionality is usually enabled in a debug kernel which is much
slower than a non-debug kernel. With sufficient number of kmemleak
objects, the time to iterate them all may exceed 22s causing soft lockup.

watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kmemleak:625]

In this particular bug report, the soft lockup happen in the 2nd iteration
loop.

In the 2nd and 3rd loops, most of the objects are checked and then skipped
under the object lock. Only a selected fews are modified. Those objects
certainly need lock protection. However, the lock/unlock operation is
slow especially with interrupt disabling and enabling included.

We can actually do some basic check like color_white() without taking the
lock and skip the object accordingly. Of course, this kind of check is
racy and may miss objects that are being modified concurrently. The cost
of missed objects, however, is just that they will be discovered in the
next scan instead. The advantage of doing so is that iteration can be
done much faster especially with LOCKDEP enabled in a debug kernel.

With a debug kernel running on a 2-socket 96-thread x86-64 system
(HZ=1000), the 2nd and 3rd iteration loops speedup with this patch on the
first kmemleak_scan() call after bootup is shown in the table below.

Before patch After patch
Loop # # of objects Elapsed time # of objects Elapsed time
------ ------------ ------------ ------------ ------------
2 2,599,850 2.392s 2,596,364 0.266s
3 2,600,176 2.171s 2,597,061 0.260s

This patch reduces loop iteration times by about 88%. This will greatly
reduce the chance of a soft lockup happening in the 2nd or 3rd iteration
loops.

Even though the first loop runs a little bit faster, it can still be
problematic if many kmemleak objects are there. As the object count has
to be modified in every object, we cannot avoid taking the object lock.
So other way to prevent soft lockup will be needed.

Link: https://lkml.kernel.org/r/20220614220359.59282-3-longman@redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 00c15506 14-Jun-2022 Waiman Long <longman@redhat.com>

mm/kmemleak: use _irq lock/unlock variants in kmemleak_scan/_clear()

Patch series "mm/kmemleak: Avoid soft lockup in kmemleak_scan()", v2.

There are 3 RCU-based object iteration loops in kmemleak_scan(). Because
of the need to take RCU read lock, we can't insert cond_resched() into the
loop like other parts of the function. As there can be millions of
objects to be scanned, it takes a while to iterate all of them. The
kmemleak functionality is usually enabled in a debug kernel which is much
slower than a non-debug kernel. With sufficient number of kmemleak
objects, the time to iterate them all may exceed 22s causing soft lockup.

watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kmemleak:625]

This patch series make changes to the 3 object iteration loops in
kmemleak_scan() to prevent them from causing soft lockup.


This patch (of 3):

kmemleak_scan() is called only from the kmemleak scan thread or from write
to the kmemleak debugfs file. Both are in task context and so we can
directly use the simpler _irq() lock/unlock calls instead of the more
complex _irqsave/_irqrestore variants.

Similarly, kmemleak_clear() is called only from write to the kmemleak
debugfs file. The same change can be applied.

Link: https://lkml.kernel.org/r/20220614220359.59282-1-longman@redhat.com
Link: https://lkml.kernel.org/r/20220614220359.59282-2-longman@redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 84c32629 10-Jun-2022 Patrick Wang <patrick.wang.shcn@gmail.com>

mm: kmemleak: check physical address when scan

Check the physical address of objects for its boundary when scan instead
of in kmemleak_*_phys().

Link: https://lkml.kernel.org/r/20220611035551.1823303-5-patrick.wang.shcn@gmail.com
Fixes: 23c2d497de21 ("mm: kmemleak: take a full lowmem check in kmemleak_*_phys()")
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Yee Lee <yee.lee@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 0c24e061 10-Jun-2022 Patrick Wang <patrick.wang.shcn@gmail.com>

mm: kmemleak: add rbtree and store physical address for objects allocated with PA

Add object_phys_tree_root to store the objects allocated with physical
address. Distinguish it from object_tree_root by OBJECT_PHYS flag or
function argument. The physical address is stored directly in those
objects.

Link: https://lkml.kernel.org/r/20220611035551.1823303-4-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Yee Lee <yee.lee@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 8e0c4ab3 10-Jun-2022 Patrick Wang <patrick.wang.shcn@gmail.com>

mm: kmemleak: add OBJECT_PHYS flag for objects allocated with physical address

Add OBJECT_PHYS flag for object. This flag is used to identify the
objects allocated with physical address. The create_object_phys()
function is added as well to set that flag and is used by
kmemleak_alloc_phys().

Link: https://lkml.kernel.org/r/20220611035551.1823303-3-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Yee Lee <yee.lee@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# c200d900 10-Jun-2022 Patrick Wang <patrick.wang.shcn@gmail.com>

mm: kmemleak: remove kmemleak_not_leak_phys() and the min_count argument to kmemleak_alloc_phys()

Patch series "mm: kmemleak: store objects allocated with physical address
separately and check when scan", v4.

The kmemleak_*_phys() interface uses "min_low_pfn" and "max_low_pfn" to
check address. But on some architectures, kmemleak_*_phys() is called
before those two variables initialized. The following steps will be
taken:

1) Add OBJECT_PHYS flag and rbtree for the objects allocated
with physical address
2) Store physical address in objects if allocated with OBJECT_PHYS
3) Check the boundary when scan instead of in kmemleak_*_phys()

This patch set will solve:
https://lore.kernel.org/r/20220527032504.30341-1-yee.lee@mediatek.com
https://lore.kernel.org/r/9dd08bb5-f39e-53d8-f88d-bec598a08c93@gmail.com

v3: https://lore.kernel.org/r/20220609124950.1694394-1-patrick.wang.shcn@gmail.com
v2: https://lore.kernel.org/r/20220603035415.1243913-1-patrick.wang.shcn@gmail.com
v1: https://lore.kernel.org/r/20220531150823.1004101-1-patrick.wang.shcn@gmail.com


This patch (of 4):

Remove the unused kmemleak_not_leak_phys() function. And remove the
min_count argument to kmemleak_alloc_phys() function, assume it's 0.

Link: https://lkml.kernel.org/r/20220611035551.1823303-1-patrick.wang.shcn@gmail.com
Link: https://lkml.kernel.org/r/20220611035551.1823303-2-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Yee Lee <yee.lee@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# 23c2d497 14-Apr-2022 Patrick Wang <patrick.wang.shcn@gmail.com>

mm: kmemleak: take a full lowmem check in kmemleak_*_phys()

The kmemleak_*_phys() apis do not check the address for lowmem's min
boundary, while the caller may pass an address below lowmem, which will
trigger an oops:

# echo scan > /sys/kernel/debug/kmemleak
Unable to handle kernel paging request at virtual address ff5fffffffe00000
Oops [#1]
Modules linked in:
CPU: 2 PID: 134 Comm: bash Not tainted 5.18.0-rc1-next-20220407 #33
Hardware name: riscv-virtio,qemu (DT)
epc : scan_block+0x74/0x15c
ra : scan_block+0x72/0x15c
epc : ffffffff801e5806 ra : ffffffff801e5804 sp : ff200000104abc30
gp : ffffffff815cd4e8 tp : ff60000004cfa340 t0 : 0000000000000200
t1 : 00aaaaaac23954cc t2 : 00000000000003ff s0 : ff200000104abc90
s1 : ffffffff81b0ff28 a0 : 0000000000000000 a1 : ff5fffffffe01000
a2 : ffffffff81b0ff28 a3 : 0000000000000002 a4 : 0000000000000001
a5 : 0000000000000000 a6 : ff200000104abd7c a7 : 0000000000000005
s2 : ff5fffffffe00ff9 s3 : ffffffff815cd998 s4 : ffffffff815d0e90
s5 : ffffffff81b0ff28 s6 : 0000000000000020 s7 : ffffffff815d0eb0
s8 : ffffffffffffffff s9 : ff5fffffffe00000 s10: ff5fffffffe01000
s11: 0000000000000022 t3 : 00ffffffaa17db4c t4 : 000000000000000f
t5 : 0000000000000001 t6 : 0000000000000000
status: 0000000000000100 badaddr: ff5fffffffe00000 cause: 000000000000000d
scan_gray_list+0x12e/0x1a6
kmemleak_scan+0x2aa/0x57e
kmemleak_write+0x32a/0x40c
full_proxy_write+0x56/0x82
vfs_write+0xa6/0x2a6
ksys_write+0x6c/0xe2
sys_write+0x22/0x2a
ret_from_syscall+0x0/0x2

The callers may not quite know the actual address they pass(e.g. from
devicetree). So the kmemleak_*_phys() apis should guarantee the address
they finally use is in lowmem range, so check the address for lowmem's
min boundary.

Link: https://lkml.kernel.org/r/20220413122925.33856-1-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# bfc8089f 01-Apr-2022 Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>

mm/kmemleak: reset tag when compare object pointer

When we use HW-tag based kasan and enable vmalloc support, we hit the
following bug. It is due to comparison between tagged object and
non-tagged pointer.

We need to reset the kasan tag when we need to compare tagged object and
non-tagged pointer.

kmemleak: [name:kmemleak&]Scan area larger than object 0xffffffe77076f440
CPU: 4 PID: 1 Comm: init Tainted: G S W 5.15.25-android13-0-g5cacf919c2bc #1
Hardware name: MT6983(ENG) (DT)
Call trace:
add_scan_area+0xc4/0x244
kmemleak_scan_area+0x40/0x9c
layout_and_allocate+0x1e8/0x288
load_module+0x2c8/0xf00
__se_sys_finit_module+0x190/0x1d0
__arm64_sys_finit_module+0x20/0x30
invoke_syscall+0x60/0x170
el0_svc_common+0xc8/0x114
do_el0_svc+0x28/0xa0
el0_svc+0x60/0xf8
el0t_64_sync_handler+0x88/0xec
el0t_64_sync+0x1b4/0x1b8
kmemleak: [name:kmemleak&]Object 0xf5ffffe77076b000 (size 32768):
kmemleak: [name:kmemleak&] comm "init", pid 1, jiffies 4294894197
kmemleak: [name:kmemleak&] min_count = 0
kmemleak: [name:kmemleak&] count = 0
kmemleak: [name:kmemleak&] flags = 0x1
kmemleak: [name:kmemleak&] checksum = 0
kmemleak: [name:kmemleak&] backtrace:
module_alloc+0x9c/0x120
move_module+0x34/0x19c
layout_and_allocate+0x1c4/0x288
load_module+0x2c8/0xf00
__se_sys_finit_module+0x190/0x1d0
__arm64_sys_finit_module+0x20/0x30
invoke_syscall+0x60/0x170
el0_svc_common+0xc8/0x114
do_el0_svc+0x28/0xa0
el0_svc+0x60/0xf8
el0t_64_sync_handler+0x88/0xec
el0t_64_sync+0x1b4/0x1b8

Link: https://lkml.kernel.org/r/20220318034051.30687-1-Kuan-Ying.Lee@mediatek.com
Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Chinwen Chang <chinwen.chang@mediatek.com>
Cc: Nicholas Tang <nicholas.tang@mediatek.com>
Cc: Yee Lee <yee.lee@mediatek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# c10a0f87 03-Feb-2022 Lang Yu <lang.yu@amd.com>

mm/kmemleak: avoid scanning potential huge holes

When using devm_request_free_mem_region() and devm_memremap_pages() to
add ZONE_DEVICE memory, if requested free mem region's end pfn were
huge(e.g., 0x400000000), the node_end_pfn() will be also huge (see
move_pfn_range_to_zone()). Thus it creates a huge hole between
node_start_pfn() and node_end_pfn().

We found on some AMD APUs, amdkfd requested such a free mem region and
created a huge hole. In such a case, following code snippet was just
doing busy test_bit() looping on the huge hole.

for (pfn = start_pfn; pfn < end_pfn; pfn++) {
struct page *page = pfn_to_online_page(pfn);
if (!page)
continue;
...
}

So we got a soft lockup:

watchdog: BUG: soft lockup - CPU#6 stuck for 26s! [bash:1221]
CPU: 6 PID: 1221 Comm: bash Not tainted 5.15.0-custom #1
RIP: 0010:pfn_to_online_page+0x5/0xd0
Call Trace:
? kmemleak_scan+0x16a/0x440
kmemleak_write+0x306/0x3a0
? common_file_perm+0x72/0x170
full_proxy_write+0x5c/0x90
vfs_write+0xb9/0x260
ksys_write+0x67/0xe0
__x64_sys_write+0x1a/0x20
do_syscall_64+0x3b/0xc0
entry_SYSCALL_64_after_hwframe+0x44/0xae

I did some tests with the patch.

(1) amdgpu module unloaded

before the patch:

real 0m0.976s
user 0m0.000s
sys 0m0.968s

after the patch:

real 0m0.981s
user 0m0.000s
sys 0m0.973s

(2) amdgpu module loaded

before the patch:

real 0m35.365s
user 0m0.000s
sys 0m35.354s

after the patch:

real 0m1.049s
user 0m0.000s
sys 0m1.042s

Link: https://lkml.kernel.org/r/20211108140029.721144-1-lang.yu@amd.com
Signed-off-by: Lang Yu <lang.yu@amd.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# ad1a3e15 14-Jan-2022 Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>

kmemleak: fix kmemleak false positive report with HW tag-based kasan enable

With HW tag-based kasan enable, We will get the warning when we free
object whose address starts with 0xFF.

It is because kmemleak rbtree stores tagged object and this freeing
object's tag does not match with rbtree object.

In the example below, kmemleak rbtree stores the tagged object in the
kmalloc(), and kfree() gets the pointer with 0xFF tag.

Call sequence:
ptr = kmalloc(size, GFP_KERNEL);
page = virt_to_page(ptr);
offset = offset_in_page(ptr);
kfree(page_address(page) + offset);
ptr = kmalloc(size, GFP_KERNEL);

A sequence like that may cause the warning as following:

1) Freeing unknown object:

In kfree(), we will get free unknown object warning in
kmemleak_free(). Because object(0xFx) in kmemleak rbtree and
pointer(0xFF) in kfree() have different tag.

2) Overlap existing:

When we allocate that object with the same hw-tag again, we will
find the overlap in the kmemleak rbtree and kmemleak thread will be
killed.

kmemleak: Freeing unknown object at 0xffff000003f88000
CPU: 5 PID: 177 Comm: cat Not tainted 5.16.0-rc1-dirty #21
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x1ac
show_stack+0x1c/0x30
dump_stack_lvl+0x68/0x84
dump_stack+0x1c/0x38
kmemleak_free+0x6c/0x70
slab_free_freelist_hook+0x104/0x200
kmem_cache_free+0xa8/0x3d4
test_version_show+0x270/0x3a0
module_attr_show+0x28/0x40
sysfs_kf_seq_show+0xb0/0x130
kernfs_seq_show+0x30/0x40
seq_read_iter+0x1bc/0x4b0
seq_read_iter+0x1bc/0x4b0
kernfs_fop_read_iter+0x144/0x1c0
generic_file_splice_read+0xd0/0x184
do_splice_to+0x90/0xe0
splice_direct_to_actor+0xb8/0x250
do_splice_direct+0x88/0xd4
do_sendfile+0x2b0/0x344
__arm64_sys_sendfile64+0x164/0x16c
invoke_syscall+0x48/0x114
el0_svc_common.constprop.0+0x44/0xec
do_el0_svc+0x74/0x90
el0_svc+0x20/0x80
el0t_64_sync_handler+0x1a8/0x1b0
el0t_64_sync+0x1ac/0x1b0
...
kmemleak: Cannot insert 0xf2ff000003f88000 into the object search tree (overlaps existing)
CPU: 5 PID: 178 Comm: cat Not tainted 5.16.0-rc1-dirty #21
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x1ac
show_stack+0x1c/0x30
dump_stack_lvl+0x68/0x84
dump_stack+0x1c/0x38
create_object.isra.0+0x2d8/0x2fc
kmemleak_alloc+0x34/0x40
kmem_cache_alloc+0x23c/0x2f0
test_version_show+0x1fc/0x3a0
module_attr_show+0x28/0x40
sysfs_kf_seq_show+0xb0/0x130
kernfs_seq_show+0x30/0x40
seq_read_iter+0x1bc/0x4b0
kernfs_fop_read_iter+0x144/0x1c0
generic_file_splice_read+0xd0/0x184
do_splice_to+0x90/0xe0
splice_direct_to_actor+0xb8/0x250
do_splice_direct+0x88/0xd4
do_sendfile+0x2b0/0x344
__arm64_sys_sendfile64+0x164/0x16c
invoke_syscall+0x48/0x114
el0_svc_common.constprop.0+0x44/0xec
do_el0_svc+0x74/0x90
el0_svc+0x20/0x80
el0t_64_sync_handler+0x1a8/0x1b0
el0t_64_sync+0x1ac/0x1b0
kmemleak: Kernel memory leak detector disabled
kmemleak: Object 0xf2ff000003f88000 (size 128):
kmemleak: comm "cat", pid 177, jiffies 4294921177
kmemleak: min_count = 1
kmemleak: count = 0
kmemleak: flags = 0x1
kmemleak: checksum = 0
kmemleak: backtrace:
kmem_cache_alloc+0x23c/0x2f0
test_version_show+0x1fc/0x3a0
module_attr_show+0x28/0x40
sysfs_kf_seq_show+0xb0/0x130
kernfs_seq_show+0x30/0x40
seq_read_iter+0x1bc/0x4b0
kernfs_fop_read_iter+0x144/0x1c0
generic_file_splice_read+0xd0/0x184
do_splice_to+0x90/0xe0
splice_direct_to_actor+0xb8/0x250
do_splice_direct+0x88/0xd4
do_sendfile+0x2b0/0x344
__arm64_sys_sendfile64+0x164/0x16c
invoke_syscall+0x48/0x114
el0_svc_common.constprop.0+0x44/0xec
do_el0_svc+0x74/0x90
kmemleak: Automatic memory scanning thread ended

[akpm@linux-foundation.org: whitespace tweak]

Link: https://lkml.kernel.org/r/20211118054426.4123-1-Kuan-Ying.Lee@mediatek.com
Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Doug Berger <opendmb@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 79d37050 08-Sep-2021 Naohiro Aota <naohiro.aota@wdc.com>

mm/kmemleak: allow __GFP_NOLOCKDEP passed to kmemleak's gfp

In a memory pressure situation, I'm seeing the lockdep WARNING below.
Actually, this is similar to a known false positive which is already
addressed by commit 6dcde60efd94 ("xfs: more lockdep whackamole with
kmem_alloc*").

This warning still persists because it's not from kmalloc() itself but
from an allocation for kmemleak object. While kmalloc() itself suppress
the warning with __GFP_NOLOCKDEP, gfp_kmemleak_mask() is dropping the
flag for the kmemleak's allocation.

Allow __GFP_NOLOCKDEP to be passed to kmemleak's allocation, so that the
warning for it is also suppressed.

======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc7-BTRFS-ZNS+ #37 Not tainted
------------------------------------------------------
kswapd0/288 is trying to acquire lock:
ffff88825ab45df0 (&xfs_nondir_ilock_class){++++}-{3:3}, at: xfs_ilock+0x8a/0x250

but task is already holding lock:
ffffffff848cc1e0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (fs_reclaim){+.+.}-{0:0}:
fs_reclaim_acquire+0x112/0x160
kmem_cache_alloc+0x48/0x400
create_object.isra.0+0x42/0xb10
kmemleak_alloc+0x48/0x80
__kmalloc+0x228/0x440
kmem_alloc+0xd3/0x2b0
kmem_alloc_large+0x5a/0x1c0
xfs_attr_copy_value+0x112/0x190
xfs_attr_shortform_getvalue+0x1fc/0x300
xfs_attr_get_ilocked+0x125/0x170
xfs_attr_get+0x329/0x450
xfs_get_acl+0x18d/0x430
get_acl.part.0+0xb6/0x1e0
posix_acl_xattr_get+0x13a/0x230
vfs_getxattr+0x21d/0x270
getxattr+0x126/0x310
__x64_sys_fgetxattr+0x1a6/0x2a0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #0 (&xfs_nondir_ilock_class){++++}-{3:3}:
__lock_acquire+0x2c0f/0x5a00
lock_acquire+0x1a1/0x4b0
down_read_nested+0x50/0x90
xfs_ilock+0x8a/0x250
xfs_can_free_eofblocks+0x34f/0x570
xfs_inactive+0x411/0x520
xfs_fs_destroy_inode+0x2c8/0x710
destroy_inode+0xc5/0x1a0
evict+0x444/0x620
dispose_list+0xfe/0x1c0
prune_icache_sb+0xdc/0x160
super_cache_scan+0x31e/0x510
do_shrink_slab+0x337/0x8e0
shrink_slab+0x362/0x5c0
shrink_node+0x7a7/0x1a40
balance_pgdat+0x64e/0xfe0
kswapd+0x590/0xa80
kthread+0x38c/0x460
ret_from_fork+0x22/0x30

other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(fs_reclaim);
lock(&xfs_nondir_ilock_class);
lock(fs_reclaim);
lock(&xfs_nondir_ilock_class);

*** DEADLOCK ***
3 locks held by kswapd0/288:
#0: ffffffff848cc1e0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
#1: ffffffff848a08d8 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x269/0x5c0
#2: ffff8881a7a820e8 (&type->s_umount_key#60){++++}-{3:3}, at: super_cache_scan+0x5a/0x510

Link: https://lkml.kernel.org/r/20210907055659.3182992-1-naohiro.aota@wdc.com
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: "Darrick J . Wong" <djwong@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# ea0eafea 07-Sep-2021 Changbin Du <changbin.du@intel.com>

mm: in_irq() cleanup

Replace the obsolete and ambiguos macro in_irq() with new macro
in_hardirq().

Link: https://lkml.kernel.org/r/20210813145245.86070-1-changbin.du@gmail.com
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com> [kmemleak]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 6c7a00b8 13-Aug-2021 Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>

kasan, kmemleak: reset tags when scanning block

Patch series "kasan, slub: reset tag when printing address", v3.

With hardware tag-based kasan enabled, we reset the tag when we access
metadata to avoid from false alarm.

This patch (of 2):

Kmemleak needs to scan kernel memory to check memory leak. With hardware
tag-based kasan enabled, when it scans on the invalid slab and
dereference, the issue will occur as below.

Hardware tag-based KASAN doesn't use compiler instrumentation, we can not
use kasan_disable_current() to ignore tag check.

Based on the below report, there are 11 0xf7 granules, which amounts to
176 bytes, and the object is allocated from the kmalloc-256 cache. So
when kmemleak accesses the last 256-176 bytes, it causes faults, as those
are marked with KASAN_KMALLOC_REDZONE == KASAN_TAG_INVALID == 0xfe.

Thus, we reset tags before accessing metadata to avoid from false positives.

BUG: KASAN: out-of-bounds in scan_block+0x58/0x170
Read at addr f7ff0000c0074eb0 by task kmemleak/138
Pointer tag: [f7], memory tag: [fe]

CPU: 7 PID: 138 Comm: kmemleak Not tainted 5.14.0-rc2-00001-g8cae8cd89f05-dirty #134
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x1b0
show_stack+0x1c/0x30
dump_stack_lvl+0x68/0x84
print_address_description+0x7c/0x2b4
kasan_report+0x138/0x38c
__do_kernel_fault+0x190/0x1c4
do_tag_check_fault+0x78/0x90
do_mem_abort+0x44/0xb4
el1_abort+0x40/0x60
el1h_64_sync_handler+0xb4/0xd0
el1h_64_sync+0x78/0x7c
scan_block+0x58/0x170
scan_gray_list+0xdc/0x1a0
kmemleak_scan+0x2ac/0x560
kmemleak_scan_thread+0xb0/0xe0
kthread+0x154/0x160
ret_from_fork+0x10/0x18

Allocated by task 0:
kasan_save_stack+0x2c/0x60
__kasan_kmalloc+0xec/0x104
__kmalloc+0x224/0x3c4
__register_sysctl_paths+0x200/0x290
register_sysctl_table+0x2c/0x40
sysctl_init+0x20/0x34
proc_sys_init+0x3c/0x48
proc_root_init+0x80/0x9c
start_kernel+0x648/0x6a4
__primary_switched+0xc0/0xc8

Freed by task 0:
kasan_save_stack+0x2c/0x60
kasan_set_track+0x2c/0x40
kasan_set_free_info+0x44/0x54
____kasan_slab_free.constprop.0+0x150/0x1b0
__kasan_slab_free+0x14/0x20
slab_free_freelist_hook+0xa4/0x1fc
kfree+0x1e8/0x30c
put_fs_context+0x124/0x220
vfs_kern_mount.part.0+0x60/0xd4
kern_mount+0x24/0x4c
bdev_cache_init+0x70/0x9c
vfs_caches_init+0xdc/0xf4
start_kernel+0x638/0x6a4
__primary_switched+0xc0/0xc8

The buggy address belongs to the object at ffff0000c0074e00
which belongs to the cache kmalloc-256 of size 256
The buggy address is located 176 bytes inside of
256-byte region [ffff0000c0074e00, ffff0000c0074f00)
The buggy address belongs to the page:
page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x100074
head:(____ptrval____) order:2 compound_mapcount:0 compound_pincount:0
flags: 0xbfffc0000010200(slab|head|node=0|zone=2|lastcpupid=0xffff|kasantag=0x0)
raw: 0bfffc0000010200 0000000000000000 dead000000000122 f5ff0000c0002300
raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff0000c0074c00: f0 f0 f0 f0 f0 f0 f0 f0 f0 fe fe fe fe fe fe fe
ffff0000c0074d00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
>ffff0000c0074e00: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 fe fe fe fe fe
^
ffff0000c0074f00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
ffff0000c0075000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Disabling lock debugging due to kernel taint
kmemleak: 181 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

Link: https://lkml.kernel.org/r/20210804090957.12393-1-Kuan-Ying.Lee@mediatek.com
Link: https://lkml.kernel.org/r/20210804090957.12393-2-Kuan-Ying.Lee@mediatek.com
Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Marco Elver <elver@google.com>
Cc: Nicholas Tang <nicholas.tang@mediatek.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Chinwen Chang <chinwen.chang@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 54dd200c 28-Jun-2021 Yanfei Xu <yanfei.xu@windriver.com>

mm/kmemleak: fix possible wrong memory scanning period

This commit contains 3 modifications:

1. Convert the type of jiffies_scan_wait to "unsigned long".

2. Use READ/WRITE_ONCE() for accessing "jiffies_scan_wait".

3. Fix the possible wrong memory scanning period. If you set a large
memory scanning period like blow, then the "secs" variable will be
non-zero, however the value of "jiffies_scan_wait" will be zero.

echo "scan=0x10000000" > /sys/kernel/debug/kmemleak

It is because the type of the msecs_to_jiffies()'s parameter is "unsigned
int", and the "secs * 1000" is larger than its max value. This in turn
leads a unexpected jiffies_scan_wait, maybe zero. We corret it by
replacing kstrtoul() with kstrtouint(), and check the msecs to prevent it
larger than UINT_MAX.

Link: https://lkml.kernel.org/r/20210613174022.23044-1-yanfei.xu@windriver.com
Signed-off-by: Yanfei Xu <yanfei.xu@windriver.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 0b5121ef 29-Apr-2021 Bhaskar Chowdhury <unixbhaskar@gmail.com>

mm/kmemleak.c: fix a typo

s/interruptable/interruptible/

Link: https://lkml.kernel.org/r/20210319214140.23304-1-unixbhaskar@gmail.com
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 95511580 24-Mar-2021 Marco Elver <elver@google.com>

kfence: make compatible with kmemleak

Because memblock allocations are registered with kmemleak, the KFENCE
pool was seen by kmemleak as one large object. Later allocations
through kfence_alloc() that were registered with kmemleak via
slab_post_alloc_hook() would then overlap and trigger a warning.
Therefore, once the pool is initialized, we can remove (free) it from
kmemleak again, since it should be treated as allocator-internal and be
seen as "free memory".

The second problem is that kmemleak is passed the rounded size, and not
the originally requested size, which is also the size of KFENCE objects.
To avoid kmemleak scanning past the end of an object and trigger a
KFENCE out-of-bounds error, fix the size if it is a KFENCE object.

For simplicity, to avoid a call to kfence_ksize() in
slab_post_alloc_hook() (and avoid new IS_ENABLED(CONFIG_DEBUG_KMEMLEAK)
guard), just call kfence_ksize() in mm/kmemleak.c:create_object().

Link: https://lkml.kernel.org/r/20210317084740.3099921-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reported-by: Luis Henriques <lhenriques@suse.de>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Luis Henriques <lhenriques@suse.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# c4b28963 13-Oct-2020 Davidlohr Bueso <dave@stgolabs.net>

mm/kmemleak: rely on rcu for task stack scanning

kmemleak_scan() currently relies on the big tasklist_lock hammer to
stabilize iterating through the tasklist. Instead, this patch proposes
simply using rcu along with the rcu-safe for_each_process_thread flavor
(without changing scan semantics), which doesn't make use of
next_thread/p->thread_group and thus cannot race with exit. Furthermore,
any races with fork() and not seeing the new child should be benign as
it's not running yet and can also be detected by the next scan.

Avoiding the tasklist_lock could prove beneficial for performance
considering the scan operation is done periodically. I have seen
improvements of 30%-ish when doing similar replacements on very
pathological microbenchmarks (ie stressing get/setpriority(2)).

However my main motivation is that it's one less user of the global
lock, something that Linus has long time wanted to see gone eventually
(if ever) even if the traditional fairness issues has been dealt with
now with qrwlocks. Of course this is a very long ways ahead. This
patch also kills another user of the deprecated tsk->thread_group.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Qian Cai <cai@lca.pw>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lkml.kernel.org/r/20200820203902.11308-1-dave@stgolabs.net
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 69d0b54d 14-Aug-2020 Qian Cai <cai@lca.pw>

mm/kmemleak: silence KCSAN splats in checksum

Even if KCSAN is disabled for kmemleak, update_checksum() could still call
crc32() (which is outside of kmemleak.c) to dereference object->pointer.
Thus, the value of object->pointer could be accessed concurrently as
noticed by KCSAN,

BUG: KCSAN: data-race in crc32_le_base / do_raw_spin_lock

write to 0xffffb0ea683a7d50 of 4 bytes by task 23575 on cpu 12:
do_raw_spin_lock+0x114/0x200
debug_spin_lock_after at kernel/locking/spinlock_debug.c:91
(inlined by) do_raw_spin_lock at kernel/locking/spinlock_debug.c:115
_raw_spin_lock+0x40/0x50
__handle_mm_fault+0xa9e/0xd00
handle_mm_fault+0xfc/0x2f0
do_page_fault+0x263/0x6f9
page_fault+0x34/0x40

read to 0xffffb0ea683a7d50 of 4 bytes by task 839 on cpu 60:
crc32_le_base+0x67/0x350
crc32_le_base+0x67/0x350:
crc32_body at lib/crc32.c:106
(inlined by) crc32_le_generic at lib/crc32.c:179
(inlined by) crc32_le at lib/crc32.c:197
kmemleak_scan+0x528/0xd90
update_checksum at mm/kmemleak.c:1172
(inlined by) kmemleak_scan at mm/kmemleak.c:1497
kmemleak_scan_thread+0xcc/0xfa
kthread+0x1e0/0x200
ret_from_fork+0x27/0x50

If a shattered value was returned due to a data race, it will be corrected
in the next scan. Thus, let KCSAN ignore all reads in the region to
silence KCSAN in case the write side is non-atomic.

Suggested-by: Marco Elver <elver@google.com>
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Marco Elver <elver@google.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: http://lkml.kernel.org/r/20200317182754.2180-1-cai@lca.pw
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# b0d14fc4 01-Apr-2020 Nathan Chancellor <nathan@kernel.org>

mm/kmemleak.c: use address-of operator on section symbols

Clang warns:

mm/kmemleak.c:1955:28: warning: array comparison always evaluates to a constant [-Wtautological-compare]
if (__start_ro_after_init < _sdata || __end_ro_after_init > _edata)
^
mm/kmemleak.c:1955:60: warning: array comparison always evaluates to a constant [-Wtautological-compare]
if (__start_ro_after_init < _sdata || __end_ro_after_init > _edata)

These are not true arrays, they are linker defined symbols, which are just
addresses. Using the address of operator silences the warning and does
not change the resulting assembly with either clang/ld.lld or gcc/ld
(tested with diff + objdump -Dr).

Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/895
Link: http://lkml.kernel.org/r/20200220051551.44000-1-natechancellor@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 8c96f1bc 30-Jan-2020 He Zhe <zhe.he@windriver.com>

mm/kmemleak: turn kmemleak_lock and object->lock to raw_spinlock_t

kmemleak_lock as a rwlock on RT can possibly be acquired in atomic
context which does work.

Since the kmemleak operation is performed in atomic context make it a
raw_spinlock_t so it can also be acquired on RT. This is used for
debugging and is not enabled by default in a production like environment
(where performance/latency matters) so it makes sense to make it a
raw_spinlock_t instead trying to get rid of the atomic context. Turn
also the kmemleak_object->lock into raw_spinlock_t which is acquired
(nested) while the kmemleak_lock is held.

The time spent in "echo scan > kmemleak" slightly improved on 64core box
with this patch applied after boot.

[bigeasy@linutronix.de: redo the description, update comments. Merge the individual bits: He Zhe did the kmemleak_lock, Liu Haitao the ->lock and Yongxin Liu forwarded Liu's patch.]
Link: http://lkml.kernel.org/r/20191219170834.4tah3prf2gdothz4@linutronix.de
Link: https://lkml.kernel.org/r/20181218150744.GB20197@arrakis.emea.arm.com
Link: https://lkml.kernel.org/r/1542877459-144382-1-git-send-email-zhe.he@windriver.com
Link: https://lkml.kernel.org/r/20190927082230.34152-1-yongxin.liu@windriver.com
Signed-off-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Liu Haitao <haitao.liu@windriver.com>
Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 2abd839a 04-Oct-2019 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Do not corrupt the object_list during clean-up

In case of an error (e.g. memory pool too small), kmemleak disables
itself and cleans up the already allocated metadata objects. However, if
this happens early before the RCU callback mechanism is available,
put_object() skips call_rcu() and frees the object directly. This is not
safe with the RCU list traversal in __kmemleak_do_cleanup().

Change the list traversal in __kmemleak_do_cleanup() to
list_for_each_entry_safe() and remove the rcu_read_{lock,unlock} since
the kmemleak is already disabled at this point. In addition, avoid an
unnecessary metadata object rb-tree look-up since it already has the
struct kmemleak_object pointer.

Fixes: c5665868183f ("mm: kmemleak: use the memory pool for early allocations")
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reported-by: Marc Dionne <marc.c.dionne@gmail.com>
Reported-by: Ted Ts'o <tytso@mit.edu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 0e965a6b 23-Sep-2019 Qian Cai <cai@lca.pw>

mm/kmemleak.c: record the current memory pool size

The only way to obtain the current memory pool size for a running kernel
is to check the kernel config file which is inconvenient. Record it in
the kernel messages.

[akpm@linux-foundation.org: s/memory pool size/memory pool/available/, per Catalin]
Link: http://lkml.kernel.org/r/1565809631-28933-1-git-send-email-cai@lca.pw
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# c5665868 23-Sep-2019 Catalin Marinas <catalin.marinas@arm.com>

mm: kmemleak: use the memory pool for early allocations

Currently kmemleak uses a static early_log buffer to trace all memory
allocation/freeing before the slab allocator is initialised. Such early
log is replayed during kmemleak_init() to properly initialise the kmemleak
metadata for objects allocated up that point. With a memory pool that
does not rely on the slab allocator, it is possible to skip this early log
entirely.

In order to remove the early logging, consider kmemleak_enabled == 1 by
default while the kmem_cache availability is checked directly on the
object_cache and scan_area_cache variables. The RCU callback is only
invoked after object_cache has been initialised as we wouldn't have any
concurrent list traversal before this.

In order to reduce the number of callbacks before kmemleak is fully
initialised, move the kmemleak_init() call to mm_init().

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: remove WARN_ON(), per Catalin]
Link: http://lkml.kernel.org/r/20190812160642.52134-4-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 0647398a 23-Sep-2019 Catalin Marinas <catalin.marinas@arm.com>

mm: kmemleak: simple memory allocation pool for kmemleak objects

Add a memory pool for struct kmemleak_object in case the normal
kmem_cache_alloc() fails under the gfp constraints passed by the caller.
The mem_pool[] array size is currently fixed at 16000.

We are not using the existing mempool kernel API since this requires
the slab allocator to be available (for pool->elements allocation). A
subsequent kmemleak patch will replace the static early log buffer with
the pool allocation introduced here and this functionality is required
to be available before the slab was initialised.

Link: http://lkml.kernel.org/r/20190812160642.52134-3-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# dba82d94 23-Sep-2019 Catalin Marinas <catalin.marinas@arm.com>

mm: kmemleak: make the tool tolerant to struct scan_area allocation failures

Patch series "mm: kmemleak: Use a memory pool for kmemleak object
allocations", v3.

Following the discussions on v2 of this patch(set) [1], this series takes
slightly different approach:

- it implements its own simple memory pool that does not rely on the
slab allocator

- drops the early log buffer logic entirely since it can now allocate
metadata from the memory pool directly before kmemleak is fully
initialised

- CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option is renamed to
CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE

- moves the kmemleak_init() call earlier (mm_init())

- to avoid a separate memory pool for struct scan_area, it makes the
tool robust when such allocations fail as scan areas are rather an
optimisation

[1] http://lkml.kernel.org/r/20190727132334.9184-1-catalin.marinas@arm.com

This patch (of 3):

Object scan areas are an optimisation aimed to decrease the false
positives and slightly improve the scanning time of large objects known to
only have a few specific pointers. If a struct scan_area fails to
allocate, kmemleak can still function normally by scanning the full
object.

Introduce an OBJECT_FULL_SCAN flag and mark objects as such when scan_area
allocation fails.

Link: http://lkml.kernel.org/r/20190812160642.52134-2-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# fcf3a5b6 13-Aug-2019 Catalin Marinas <catalin.marinas@arm.com>

mm: kmemleak: disable early logging in case of error

If an error occurs during kmemleak_init() (e.g. kmem cache cannot be
created), kmemleak is disabled but kmemleak_early_log remains enabled.
Subsequently, when the .init.text section is freed, the log_early()
function no longer exists. To avoid a page fault in such scenario,
ensure that kmemleak_disable() also disables early logging.

Link: http://lkml.kernel.org/r/20190731152302.42073-1-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# df9576de 02-Aug-2019 Yang Shi <yang.shi@linux.alibaba.com>

Revert "kmemleak: allow to coexist with fault injection"

When running ltp's oom test with kmemleak enabled, the below warning was
triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
passed in:

WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50
Modules linked in: loop dax_pmem dax_pmem_core ip_tables x_tables xfs virtio_net net_failover virtio_blk failover ata_generic virtio_pci virtio_ring virtio libata
CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50
...
kmemleak_alloc+0x4e/0xb0
kmem_cache_alloc+0x2a7/0x3e0
mempool_alloc_slab+0x2d/0x40
mempool_alloc+0x118/0x2b0
bio_alloc_bioset+0x19d/0x350
get_swap_bio+0x80/0x230
__swap_writepage+0x5ff/0xb20

The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, however kmemleak
has __GFP_NOFAIL set all the time due to d9570ee3bd1d4f2 ("kmemleak:
allow to coexist with fault injection"). But, it doesn't make any sense
to have __GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM specified at the same
time.

According to the discussion on the mailing list, the commit should be
reverted for short term solution. Catalin Marinas would follow up with
a better solution for longer term.

The failure rate of kmemleak metadata allocation may increase in some
circumstances, but this should be expected side effect.

Link: http://lkml.kernel.org/r/1563299431-111710-1-git-send-email-yang.shi@linux.alibaba.com
Fixes: d9570ee3bd1d4f2 ("kmemleak: allow to coexist with fault injection")
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 4e4dfce2 11-Jul-2019 André Almeida <andrealmeid@collabora.com>

mm/kmemleak.c: change error at _write when kmemleak is disabled

According to POSIX, EBUSY means that the "device or resource is busy", and
this can lead to people thinking that the file
`/sys/kernel/debug/kmemleak/` is somehow locked or being used by other
process. Change this error code to a more appropriate one.

Link: http://lkml.kernel.org/r/20190612155231.19448-1-andrealmeid@collabora.com
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 6ef90569 11-Jul-2019 Dmitry Vyukov <dvyukov@google.com>

mm/kmemleak.c: fix check for softirq context

in_softirq() is a wrong predicate to check if we are in a softirq
context. It also returns true if we have BH disabled, so objects are
falsely stamped with "softirq" comm. The correct predicate is
in_serving_softirq().

If user does cat from /sys/kernel/debug/kmemleak previously they would
see this, which is clearly wrong, this is system call context (see the
comm):

unreferenced object 0xffff88805bd661c0 (size 64):
comm "softirq", pid 0, jiffies 4294942959 (age 12.400s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 ff ff ff ff 00 00 00 00 ................
00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................
backtrace:
[<0000000007dcb30c>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
[<0000000007dcb30c>] slab_post_alloc_hook mm/slab.h:439 [inline]
[<0000000007dcb30c>] slab_alloc mm/slab.c:3326 [inline]
[<0000000007dcb30c>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
[<00000000969722b7>] kmalloc include/linux/slab.h:547 [inline]
[<00000000969722b7>] kzalloc include/linux/slab.h:742 [inline]
[<00000000969722b7>] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline]
[<00000000969722b7>] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085
[<00000000a4134b5f>] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475
[<00000000d20248ad>] do_ip_setsockopt.isra.0+0x19fe/0x1c00 net/ipv4/ip_sockglue.c:957
[<000000003d367be7>] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246
[<000000003c7c76af>] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616
[<000000000c1aeb23>] sock_common_setsockopt+0x3e/0x50 net/core/sock.c:3130
[<000000000157b92b>] __sys_setsockopt+0x9e/0x120 net/socket.c:2078
[<00000000a9f3d058>] __do_sys_setsockopt net/socket.c:2089 [inline]
[<00000000a9f3d058>] __se_sys_setsockopt net/socket.c:2086 [inline]
[<00000000a9f3d058>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
[<000000001b8da885>] do_syscall_64+0x7c/0x1a0 arch/x86/entry/common.c:301
[<00000000ba770c62>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

now they will see this:

unreferenced object 0xffff88805413c800 (size 64):
comm "syz-executor.4", pid 8960, jiffies 4294994003 (age 14.350s)
hex dump (first 32 bytes):
00 7a 8a 57 80 88 ff ff e0 00 00 01 00 00 00 00 .z.W............
00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................
backtrace:
[<00000000c5d3be64>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
[<00000000c5d3be64>] slab_post_alloc_hook mm/slab.h:439 [inline]
[<00000000c5d3be64>] slab_alloc mm/slab.c:3326 [inline]
[<00000000c5d3be64>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
[<0000000023865be2>] kmalloc include/linux/slab.h:547 [inline]
[<0000000023865be2>] kzalloc include/linux/slab.h:742 [inline]
[<0000000023865be2>] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline]
[<0000000023865be2>] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085
[<000000003029a9d4>] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475
[<00000000ccd0a87c>] do_ip_setsockopt.isra.0+0x19fe/0x1c00 net/ipv4/ip_sockglue.c:957
[<00000000a85a3785>] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246
[<00000000ec13c18d>] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616
[<0000000052d748e3>] sock_common_setsockopt+0x3e/0x50 net/core/sock.c:3130
[<00000000512f1014>] __sys_setsockopt+0x9e/0x120 net/socket.c:2078
[<00000000181758bc>] __do_sys_setsockopt net/socket.c:2089 [inline]
[<00000000181758bc>] __se_sys_setsockopt net/socket.c:2086 [inline]
[<00000000181758bc>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
[<00000000d4b73623>] do_syscall_64+0x7c/0x1a0 arch/x86/entry/common.c:301
[<00000000c1098bec>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Link: http://lkml.kernel.org/r/20190517171507.96046-1-dvyukov@gmail.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 45051539 29-May-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation this program is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with this program if not write to the free
software foundation inc 59 temple place suite 330 boston ma 02111
1307 usa

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 136 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190530000436.384967451@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 282401df 22-Jan-2019 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: kmemleak: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-mm@kvack.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 07984aad 25-Apr-2019 Thomas Gleixner <tglx@linutronix.de>

mm/kmemleak: Simplify stacktrace handling

Replace the indirection through struct stack_trace by using the storage
array based interfaces.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: linux-mm@kvack.org
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: kasan-dev@googlegroups.com
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: iommu@lists.linux-foundation.org
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: David Sterba <dsterba@suse.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: linux-btrfs@vger.kernel.org
Cc: dm-devel@redhat.com
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: intel-gfx@lists.freedesktop.org
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: dri-devel@lists.freedesktop.org
Cc: David Airlie <airlied@linux.ie>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: linux-arch@vger.kernel.org
Link: https://lkml.kernel.org/r/20190425094801.863716911@linutronix.de


# dce5b0bd 18-Apr-2019 Arnd Bergmann <arnd@arndb.de>

mm/kmemleak.c: fix unused-function warning

The only references outside of the #ifdef have been removed, so now we
get a warning in non-SMP configurations:

mm/kmemleak.c:1404:13: error: unused function 'scan_large_block' [-Werror,-Wunused-function]

Add a new #ifdef around it.

Link: http://lkml.kernel.org/r/20190416123148.3502045-1-arnd@arndb.de
Fixes: 298a32b13208 ("kmemleak: powerpc: skip scanning holes in the .bss section")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 298a32b1 05-Apr-2019 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: powerpc: skip scanning holes in the .bss section

Commit 2d4f567103ff ("KVM: PPC: Introduce kvm_tmp framework") adds
kvm_tmp[] into the .bss section and then free the rest of unused spaces
back to the page allocator.

kernel_init
kvm_guest_init
kvm_free_tmp
free_reserved_area
free_unref_page
free_unref_page_prepare

With DEBUG_PAGEALLOC=y, it will unmap those pages from kernel. As the
result, kmemleak scan will trigger a panic when it scans the .bss
section with unmapped pages.

This patch creates dedicated kmemleak objects for the .data, .bss and
potentially .data..ro_after_init sections to allow partial freeing via
the kmemleak_free_part() in the powerpc kvm_free_tmp() function.

Link: http://lkml.kernel.org/r/20190321171917.62049-1-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Qian Cai <cai@lca.pw>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Tested-by: Qian Cai <cai@lca.pw>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# a2f77575 20-Feb-2019 Andrey Konovalov <andreyknvl@google.com>

kmemleak: account for tagged pointers when calculating pointer range

kmemleak keeps two global variables, min_addr and max_addr, which store
the range of valid (encountered by kmemleak) pointer values, which it
later uses to speed up pointer lookup when scanning blocks.

With tagged pointers this range will get bigger than it needs to be. This
patch makes kmemleak untag pointers before saving them to min_addr and
max_addr and when performing a lookup.

Link: http://lkml.kernel.org/r/16e887d442986ab87fe87a755815ad92fa431a5f.1550066133.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Qian Cai <cai@lca.pw>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgeniy Stepanov <eugenis@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# d53ce042 28-Dec-2018 Sri Krishna chowdary <schowdary@nvidia.com>

kmemleak: add config to select auto scan

Kmemleak scan can be cpu intensive and can stall user tasks at times. To
prevent this, add config DEBUG_KMEMLEAK_AUTO_SCAN to enable/disable auto
scan on boot up. Also protect first_run with DEBUG_KMEMLEAK_AUTO_SCAN as
this is meant for only first automatic scan.

Link: http://lkml.kernel.org/r/1540231723-7087-1-git-send-email-prpatel@nvidia.com
Signed-off-by: Sri Krishna chowdary <schowdary@nvidia.com>
Signed-off-by: Sachin Nikam <snikam@nvidia.com>
Signed-off-by: Prateek <prpatel@nvidia.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 9f1eb38e 28-Dec-2018 Oscar Salvador <osalvador@suse.de>

mm, kmemleak: little optimization while scanning

kmemleak_scan() goes through all online nodes and tries to scan all used
pages.

We can do better and use pfn_to_online_page(), so in case we have
CONFIG_MEMORY_HOTPLUG, offlined pages will be skiped automatically. For
boxes where CONFIG_MEMORY_HOTPLUG is not present, pfn_to_online_page()
will fallback to pfn_valid().

Another little optimization is to check if the page belongs to the node we
are currently checking, so in case we have nodes interleaved we will not
check the same pfn multiple times.

I ran some tests:

Add some memory to node1 and node2 making it interleaved:

(qemu) object_add memory-backend-ram,id=ram0,size=1G
(qemu) device_add pc-dimm,id=dimm0,memdev=ram0,node=1
(qemu) object_add memory-backend-ram,id=ram1,size=1G
(qemu) device_add pc-dimm,id=dimm1,memdev=ram1,node=2
(qemu) object_add memory-backend-ram,id=ram2,size=1G
(qemu) device_add pc-dimm,id=dimm2,memdev=ram2,node=1

Then, we offline that memory:
# for i in {32..39} ; do echo "offline" > /sys/devices/system/node/node1/memory$i/state;done
# for i in {48..55} ; do echo "offline" > /sys/devices/system/node/node1/memory$i/state;don
# for i in {40..47} ; do echo "offline" > /sys/devices/system/node/node2/memory$i/state;done

And we run kmemleak_scan:

# echo "scan" > /sys/kernel/debug/kmemleak

before the patch:

kmemleak: time spend: 41596 us

after the patch:

kmemleak: time spend: 34899 us

[akpm@linux-foundation.org: remove stray newline, per Oscar]
Link: http://lkml.kernel.org/r/20181206131918.25099-1-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 57c8a661 30-Oct-2018 Mike Rapoport <rppt@linux.vnet.ibm.com>

mm: remove include/linux/bootmem.h

Move remaining definitions and declarations from include/linux/bootmem.h
into include/linux/memblock.h and remove the redundant header.

The includes were replaced with the semantic patch below and then
semi-automated removal of duplicated '#include <linux/memblock.h>

@@
@@
- #include <linux/bootmem.h>
+ #include <linux/memblock.h>

[sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
[sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
[sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 154221c3 26-Oct-2018 Vincent Whitchurch <vincent.whitchurch@axis.com>

kmemleak: add module param to print warnings to dmesg

Currently, kmemleak only prints the number of suspected leaks to dmesg but
requires the user to read a debugfs file to get the actual stack traces of
the objects' allocation points. Add a module option to print the full
object information to dmesg too. It can be enabled with
kmemleak.verbose=1 on the kernel command line, or "echo 1 >
/sys/module/kmemleak/parameters/verbose":

This allows easier integration of kmemleak into test systems: We have
automated test infrastructure to test our Linux systems. With this
option, running our tests with kmemleak is as simple as enabling kmemleak
and passing this command line option; the test infrastructure knows how to
save kernel logs, which will now include kmemleak reports. Without this
option, the test infrastructure needs to be specifically taught to read
out the kmemleak debugfs file. Removing this need for special handling
makes kmemleak more similar to other kernel debug options (slab debugging,
debug objects, etc).

Link: http://lkml.kernel.org/r/20180903144046.21023-1-vincent.whitchurch@axis.com
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# b353756b 04-Sep-2018 Vincent Whitchurch <vincent.whitchurch@axis.com>

kmemleak: always register debugfs file

If kmemleak built in to the kernel, but is disabled by default, the
debugfs file is never registered. Because of this, it is not possible
to find out if the kernel is built with kmemleak support by checking for
the presence of this file. To allow this, always register the file.

After this patch, if the file doesn't exist, kmemleak is not available
in the kernel. If writing "scan" or any other value than "clear" to
this file results in EBUSY, then kmemleak is available but is disabled
by default and can be activated via the kernel command line.

Catalin: "that's also consistent with a late disabling of kmemleak when
the debugfs entry sticks around."

Link: http://lkml.kernel.org/r/20180824131220.19176-1-vincent.whitchurch@axis.com
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# e8b098fc 05-Apr-2018 Mike Rapoport <rppt@linux.vnet.ibm.com>

mm: kernel-doc: add missing parameter descriptions

Link: http://lkml.kernel.org/r/1519585191-10180-4-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 8bd30c10 05-Apr-2018 Dou Liyang <douly.fnst@cn.fujitsu.com>

mm/kmemleak.c: make kmemleak_boot_config() __init

The early_param() is only called during kernel initialization, So Linux
marks the functions of it with __init macro to save memory.

But it forgot to mark the kmemleak_boot_config(). So, Make it __init as
well.

Link: http://lkml.kernel.org/r/20180117034720.26897-1-douly.fnst@cn.fujitsu.com
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 914b6dff 28-Mar-2018 Vinayak Menon <vinmenon@codeaurora.org>

mm/kmemleak.c: wait for scan completion before disabling free

A crash is observed when kmemleak_scan accesses the object->pointer,
likely due to the following race.

TASK A TASK B TASK C
kmemleak_write
(with "scan" and
NOT "scan=on")
kmemleak_scan()
create_object
kmem_cache_alloc fails
kmemleak_disable
kmemleak_do_cleanup
kmemleak_free_enabled = 0
kfree
kmemleak_free bails out
(kmemleak_free_enabled is 0)
slub frees object->pointer
update_checksum
crash - object->pointer
freed (DEBUG_PAGEALLOC)

kmemleak_do_cleanup waits for the scan thread to complete, but not for
direct call to kmemleak_scan via kmemleak_write. So add a wait for
kmemleak_scan completion before disabling kmemleak_free, and while at it
fix the comment on stop_scan_thread.

[vinmenon@codeaurora.org: fix stop_scan_thread comment]
Link: http://lkml.kernel.org/r/1522219972-22809-1-git-send-email-vinmenon@codeaurora.org
Link: http://lkml.kernel.org/r/1522063429-18992-1-git-send-email-vinmenon@codeaurora.org
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 4a01768e 31-Jan-2018 Yang Shi <yang.s@alibaba-inc.com>

mm: kmemleak: remove unused hardirq.h

Preempt counter APIs have been split out, currently, hardirq.h just
includes irq_enter/exit APIs which are not used by kmemleak at all.

So, remove the unused hardirq.h.

Link: http://lkml.kernel.org/r/1510959741-31109-1-git-send-email-yang.s@alibaba-inc.com
Signed-off-by: Yang Shi <yang.s@alibaba-inc.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# d9570ee3 12-Jan-2018 Dmitry Vyukov <dvyukov@google.com>

kmemleak: allow to coexist with fault injection

kmemleak does one slab allocation per user allocation. So if slab fault
injection is enabled to any degree, kmemleak instantly fails to allocate
and turns itself off. However, it's useful to use kmemleak with fault
injection to find leaks on error paths. On the other hand, checking
kmemleak itself is not so useful because (1) it's a debugging tool and
(2) it has a very regular allocation pattern (basically a single
allocation site, so it either works or not).

Turn off fault injection for kmemleak allocations.

Link: http://lkml.kernel.org/r/20180109192243.19316-1-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 13ab183d 14-Dec-2017 Andrew Morton <akpm@linux-foundation.org>

mm/kmemleak.c: make cond_resched() rate-limiting more efficient

Commit bde5f6bc68db ("kmemleak: add scheduling point to
kmemleak_scan()") tries to rate-limit the frequency of cond_resched()
calls, but does it in a way which might incur an expensive division
operation in the inner loop. Simplify this.

Fixes: bde5f6bc68db5 ("kmemleak: add scheduling point to kmemleak_scan()")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Yisheng Xie <xieyisheng1@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# bde5f6bc 29-Nov-2017 Yisheng Xie <xieyisheng1@huawei.com>

kmemleak: add scheduling point to kmemleak_scan()

kmemleak_scan() will scan struct page for each node and it can be really
large and resulting in a soft lockup. We have seen a soft lockup when
do scan while compile kernel:

watchdog: BUG: soft lockup - CPU#53 stuck for 22s! [bash:10287]
[...]
Call Trace:
kmemleak_scan+0x21a/0x4c0
kmemleak_write+0x312/0x350
full_proxy_write+0x5a/0xa0
__vfs_write+0x33/0x150
vfs_write+0xad/0x1a0
SyS_write+0x52/0xc0
do_syscall_64+0x61/0x1a0
entry_SYSCALL64_slow_path+0x25/0x25

Fix this by adding cond_resched every MAX_SCAN_SIZE.

Link: http://lkml.kernel.org/r/1511439788-20099-1-git-send-email-xieyisheng1@huawei.com
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 49502766 15-Nov-2017 Levin, Alexander (Sasha Levin) <alexander.levin@verizon.com>

kmemcheck: remove annotations

Patch series "kmemcheck: kill kmemcheck", v2.

As discussed at LSF/MM, kill kmemcheck.

KASan is a replacement that is able to work without the limitation of
kmemcheck (single CPU, slow). KASan is already upstream.

We are also not aware of any users of kmemcheck (or users who don't
consider KASan as a suitable replacement).

The only objection was that since KASAN wasn't supported by all GCC
versions provided by distros at that time we should hold off for 2
years, and try again.

Now that 2 years have passed, and all distros provide gcc that supports
KASAN, kill kmemcheck again for the very same reasons.

This patch (of 4):

Remove kmemcheck annotations, and calls to kmemcheck from the kernel.

[alexander.levin@verizon.com: correctly remove kmemcheck call from dma_map_sg_attrs]
Link: http://lkml.kernel.org/r/20171012192151.26531-1-alexander.levin@verizon.com
Link: http://lkml.kernel.org/r/20171007030159.22241-2-alexander.levin@verizon.com
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Hansen <devtimhansen@gmail.com>
Cc: Vegard Nossum <vegardno@ifi.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 7d6c4dfa 15-Nov-2017 Konstantin Khlebnikov <koct9i@gmail.com>

kmemleak: change /sys/kernel/debug/kmemleak permissions from 0444 to 0644

Kmemleak can be tweaked at runtime by writing commands into debugfs
file. Root can use it anyway, but without the write-bit this interface
isn't obvious.

Link: http://lkml.kernel.org/r/150728996582.744328.11541332857988399411.stgit@buzz
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 94f4a161 06-Jul-2017 Catalin Marinas <catalin.marinas@arm.com>

mm: kmemleak: treat vm_struct as alternative reference to vmalloc'ed objects

Kmemleak requires that vmalloc'ed objects have a minimum reference count
of 2: one in the corresponding vm_struct object and the other owned by
the vmalloc() caller. There are cases, however, where the original
vmalloc() returned pointer is lost and, instead, a pointer to vm_struct
is stored (see free_thread_stack()). Kmemleak currently reports such
objects as leaks.

This patch adds support for treating any surplus references to an object
as additional references to a specified object. It introduces the
kmemleak_vmalloc() API function which takes a vm_struct pointer and sets
its surplus reference passing to the actual vmalloc() returned pointer.
The __vmalloc_node_range() calling site has been modified accordingly.

Link: http://lkml.kernel.org/r/1495726937-23557-4-git-send-email-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: "Luis R. Rodriguez" <mcgrof@kernel.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 04f70d13 06-Jul-2017 Catalin Marinas <catalin.marinas@arm.com>

mm: kmemleak: factor object reference updating out of scan_block()

scan_block() updates the number of references (pointers) to objects,
adding them to the gray_list when object->min_count is reached. The
patch factors out this functionality into a separate update_refs()
function.

Link: http://lkml.kernel.org/r/1495726937-23557-3-git-send-email-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# f66abf09 06-Jul-2017 Catalin Marinas <catalin.marinas@arm.com>

mm: kmemleak: slightly reduce the size of some structures on 64-bit architectures

Change the kmemleak_object.flags type to unsigned int and moves the
early_log.min_count (int) near early_log.op_type (int) to slightly
reduce the size of these structures on 64-bit architectures.

Link: http://lkml.kernel.org/r/1495726937-23557-2-git-send-email-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 906f2a51 31-Mar-2017 Kees Cook <keescook@chromium.org>

mm: fix section name for .data..ro_after_init

A section name for .data..ro_after_init was added by both:

commit d07a980c1b8d ("s390: add proper __ro_after_init support")

and

commit d7c19b066dcf ("mm: kmemleak: scan .data.ro_after_init")

The latter adds incorrect wrapping around the existing s390 section, and
came later. I'd prefer the s390 naming, so this moves the s390-specific
name up to the asm-generic/sections.h and renames the section as used by
kmemleak (and in the future, kernel/extable.c).

Link: http://lkml.kernel.org/r/20170327192213.GA129375@beast
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390 parts]
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Eddie Kovsky <ewk@edkovsky.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 68db0cf1 08-Feb-2017 Ingo Molnar <mingo@kernel.org>

sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h>

We are going to split <linux/sched/task_stack.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/task_stack.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 29930025 08-Feb-2017 Ingo Molnar <mingo@kernel.org>

sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h>

We are going to split <linux/sched/task.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/task.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 3f07c014 08-Feb-2017 Ingo Molnar <mingo@kernel.org>

sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h>

We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/signal.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 22901c6c 12-Dec-2016 Andreas Platschek <andreas.platschek@opentech.at>

kmemleak: fix reference to Documentation

Documentation/kmemleak.txt was moved to Documentation/dev-tools/kmemleak.rst,
this fixes the reference to the new location.

Link: http://lkml.kernel.org/r/1476544946-18804-1-git-send-email-andreas.platschek@opentech.at
Signed-off-by: Andreas Platschek <andreas.platschek@opentech.at>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# d7c19b06 10-Nov-2016 Jakub Kicinski <kuba@kernel.org>

mm: kmemleak: scan .data.ro_after_init

Limit the number of kmemleak false positives by including
.data.ro_after_init in memory scanning. To achieve this we need to add
symbols for start and end of the section to the linker scripts.

The problem was been uncovered by commit 56989f6d8568 ("genetlink: mark
families as __ro_after_init").

Link: http://lkml.kernel.org/r/1478274173-15218-1-git-send-email-jakub.kicinski@netronome.com
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 37df49f4 27-Oct-2016 Catalin Marinas <catalin.marinas@arm.com>

mm: kmemleak: ensure that the task stack is not freed during scanning

Commit 68f24b08ee89 ("sched/core: Free the stack early if
CONFIG_THREAD_INFO_IN_TASK") may cause the task->stack to be freed
during kmemleak_scan() execution, leading to either a NULL pointer fault
(if task->stack is NULL) or kmemleak accessing already freed memory.

This patch uses the new try_get_task_stack() API to ensure that the task
stack is not freed during kmemleak stack scanning.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=173901.

Fixes: 68f24b08ee89 ("sched/core: Free the stack early if CONFIG_THREAD_INFO_IN_TASK")
Link: http://lkml.kernel.org/r/1476266223-14325-1-git-send-email-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: CAI Qian <caiqian@redhat.com>
Tested-by: CAI Qian <caiqian@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: CAI Qian <caiqian@redhat.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 9099daed 11-Oct-2016 Catalin Marinas <catalin.marinas@arm.com>

mm: kmemleak: avoid using __va() on addresses that don't have a lowmem mapping

Some of the kmemleak_*() callbacks in memblock, bootmem, CMA convert a
physical address to a virtual one using __va(). However, such physical
addresses may sometimes be located in highmem and using __va() is
incorrect, leading to inconsistent object tracking in kmemleak.

The following functions have been added to the kmemleak API and they take
a physical address as the object pointer. They only perform the
corresponding action if the address has a lowmem mapping:

kmemleak_alloc_phys
kmemleak_free_part_phys
kmemleak_not_leak_phys
kmemleak_ignore_phys

The affected calling places have been updated to use the new kmemleak
API.

Link: http://lkml.kernel.org/r/1471531432-16503-1-git-send-email-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Vignesh R <vigneshr@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 98c42d94 28-Jul-2016 Vegard Nossum <vegard.nossum@oracle.com>

kmemleak: don't hang if user disables scanning early

If the user tries to disable automatic scanning early in the boot
process using e.g.:

echo scan=off > /sys/kernel/debug/kmemleak

then this command will hang until SECS_FIRST_SCAN (= 60) seconds have
elapsed, even though the system is fully initialised.

We can fix this using interruptible sleep and checking if we're supposed
to stop whenever we wake up (like the rest of the code does).

Link: http://lkml.kernel.org/r/1468835005-2873-1-git-send-email-vegard.nossum@oracle.com
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 5c335fe0 24-Jun-2016 Dmitry Vyukov <dvyukov@google.com>

mm: prevent KASAN false positives in kmemleak

When kmemleak dumps contents of leaked objects it reads whole objects
regardless of user-requested size. This upsets KASAN. Disable KASAN
checks around object dump.

Link: http://lkml.kernel.org/r/1466617631-68387-1-git-send-email-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 756a025f 17-Mar-2016 Joe Perches <joe@perches.com>

mm: coalesce split strings

Kernel style prefers a single string over split strings when the string is
'user-visible'.

Miscellanea:

- Add a missing newline
- Realign arguments

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Tejun Heo <tj@kernel.org> [percpu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 598d8091 17-Mar-2016 Joe Perches <joe@perches.com>

mm: convert pr_warning to pr_warn

There are a mixture of pr_warning and pr_warn uses in mm. Use pr_warn
consistently.

Miscellanea:

- Coalesce formats
- Realign arguments

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Tejun Heo <tj@kernel.org> [percpu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 20b5c303 14-Jan-2016 Vladimir Davydov <vdavydov.dev@gmail.com>

Revert "gfp: add __GFP_NOACCOUNT"

This reverts commit 8f4fc071b192 ("gfp: add __GFP_NOACCOUNT").

Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be
fragile and difficult to maintain, because there seem to be many more
allocations that should not be accounted than those that should be.
Besides, false accounting an allocation might result in much worse
consequences than not accounting at all, namely increased memory
consumption due to pinned dead kmem caches.

So it was decided to switch to the white-list policy. This patch
reverts bits introducing the black-list policy. The white-list policy
will be introduced later in the series.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 9fbed254 05-Nov-2015 Alexey Klimov <alexey.klimov@linaro.org>

mm/kmemleak.c: remove unneeded initialization of object to NULL

Few lines below object is reinitialized by lookup_object() so we don't
need to init it by NULL in the beginning of find_and_get_object().

Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 6fc37c49 09-Sep-2015 Andy Shevchenko <andriy.shevchenko@linux.intel.com>

kmemleak: use seq_hex_dump() to dump buffers

Instead of custom approach let's use recently introduced seq_hex_dump()
helper.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Joe Perches <joe@perches.com>
Cc: Tadeusz Struk <tadeusz.struk@intel.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Tuchscherer <ingo.tuchscherer@de.ibm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 21cd3a60 08-Sep-2015 Wang Kai <morgan.wang@huawei.com>

kmemleak: record accurate early log buffer count and report when exceeded

In log_early function, crt_early_log should also count once when
'crt_early_log >= ARRAY_SIZE(early_log)'. Otherwise the reported count
from kmemleak_init is one less than 'actual number'.

Then, in kmemleak_init, if early_log buffer size equal actual number,
kmemleak will init sucessful, so change warning condition to
'crt_early_log > ARRAY_SIZE(early_log)'.

Signed-off-by: Wang Kai <morgan.wang@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 8a8c35fa 24-Jun-2015 Larry Finger <Larry.Finger@lwfinger.net>

mm: kmemleak_alloc_percpu() should follow the gfp from per_alloc()

Beginning at commit d52d3997f843 ("ipv6: Create percpu rt6_info"), the
following INFO splat is logged:

===============================
[ INFO: suspicious RCU usage. ]
4.1.0-rc7-next-20150612 #1 Not tainted
-------------------------------
kernel/sched/core.c:7318 Illegal context switch in RCU-bh read-side critical section!
other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 0
3 locks held by systemd/1:
#0: (rtnl_mutex){+.+.+.}, at: [<ffffffff815f0c8f>] rtnetlink_rcv+0x1f/0x40
#1: (rcu_read_lock_bh){......}, at: [<ffffffff816a34e2>] ipv6_add_addr+0x62/0x540
#2: (addrconf_hash_lock){+...+.}, at: [<ffffffff816a3604>] ipv6_add_addr+0x184/0x540
stack backtrace:
CPU: 0 PID: 1 Comm: systemd Not tainted 4.1.0-rc7-next-20150612 #1
Hardware name: TOSHIBA TECRA A50-A/TECRA A50-A, BIOS Version 4.20 04/17/2014
Call Trace:
dump_stack+0x4c/0x6e
lockdep_rcu_suspicious+0xe7/0x120
___might_sleep+0x1d5/0x1f0
__might_sleep+0x4d/0x90
kmem_cache_alloc+0x47/0x250
create_object+0x39/0x2e0
kmemleak_alloc_percpu+0x61/0xe0
pcpu_alloc+0x370/0x630

Additional backtrace lines are truncated. In addition, the above splat
is followed by several "BUG: sleeping function called from invalid
context at mm/slub.c:1268" outputs. As suggested by Martin KaFai Lau,
these are the clue to the fix. Routine kmemleak_alloc_percpu() always
uses GFP_KERNEL for its allocations, whereas it should follow the gfp
from its callers.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: <stable@vger.kernel.org> [3.18+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 93ada579 24-Jun-2015 Catalin Marinas <catalin.marinas@arm.com>

mm: kmemleak: optimise kmemleak_lock acquiring during kmemleak_scan

The kmemleak memory scanning uses finer grained object->lock spinlocks
primarily to avoid races with the memory block freeing. However, the
pointer lookup in the rb tree requires the kmemleak_lock to be held.
This is currently done in the find_and_get_object() function for each
pointer-like location read during scanning. While this allows a low
latency on kmemleak_*() callbacks on other CPUs, the memory scanning is
slower.

This patch moves the kmemleak_lock outside the scan_block() loop,
acquiring/releasing it only once per scanned memory block. The
allow_resched logic is moved outside scan_block() and a new
scan_large_block() function is implemented which splits large blocks in
MAX_SCAN_SIZE chunks with cond_resched() calls in-between. A redundant
(object->flags & OBJECT_NO_SCAN) check is also removed from
scan_object().

With this patch, the kmemleak scanning performance is significantly
improved: at least 50% with lock debugging disabled and over an order of
magnitude with lock proving enabled (on an arm64 system).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 9d5a4c73 24-Jun-2015 Catalin Marinas <catalin.marinas@arm.com>

mm: kmemleak: avoid deadlock on the kmemleak object insertion error path

While very unlikely (usually kmemleak or sl*b bug), the create_object()
function in mm/kmemleak.c may fail to insert a newly allocated object into
the rb tree. When this happens, kmemleak disables itself and prints
additional information about the object already found in the rb tree.
Such printing is done with the parent->lock acquired, however the
kmemleak_lock is already held. This is a potential race with the scanning
thread which acquires object->lock and kmemleak_lock in a

This patch removes the locking around the 'parent' object information
printing. Such object cannot be freed or removed from object_tree_root
and object_list since kmemleak_lock is already held. There is a very
small risk that some of the object data is being modified on another CPU
but the only downside is inconsistent information printing.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 5f369f37 24-Jun-2015 Catalin Marinas <catalin.marinas@arm.com>

mm: kmemleak: do not acquire scan_mutex in kmemleak_do_cleanup()

The kmemleak_do_cleanup() work thread already waits for the kmemleak_scan
thread to finish via kthread_stop(). Waiting in kthread_stop() while
scan_mutex is held may lead to deadlock if kmemleak_scan_thread() also
waits to acquire for scan_mutex.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# e781a9ab 24-Jun-2015 Catalin Marinas <catalin.marinas@arm.com>

mm: kmemleak: fix delete_object_*() race when called on the same memory block

Calling delete_object_*() on the same pointer is not a standard use case
(unless there is a bug in the code calling kmemleak_free()). However,
during kmemleak disabling (error or user triggered via /sys), there is a
potential race between kmemleak_free() calls on a CPU and
__kmemleak_do_cleanup() on a different CPU.

The current delete_object_*() implementation first performs a look-up
holding kmemleak_lock, increments the object->use_count and then
re-acquires kmemleak_lock to remove the object from object_tree_root and
object_list.

This patch simplifies the delete_object_*() mechanism to both look up
and remove an object from the object_tree_root and object_list
atomically (guarded by kmemleak_lock). This allows safe concurrent
calls to delete_object_*() on the same pointer without additional
locking for synchronising the kmemleak_free_enabled flag.

A side effect is a slight improvement in the delete_object_*() performance
by avoiding acquiring kmemleak_lock twice and incrementing/decrementing
object->use_count.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# c5f3b1a5 24-Jun-2015 Catalin Marinas <catalin.marinas@arm.com>

mm: kmemleak: allow safe memory scanning during kmemleak disabling

The kmemleak scanning thread can run for minutes. Callbacks like
kmemleak_free() are allowed during this time, the race being taken care
of by the object->lock spinlock. Such lock also prevents a memory block
from being freed or unmapped while it is being scanned by blocking the
kmemleak_free() -> ... -> __delete_object() function until the lock is
released in scan_object().

When a kmemleak error occurs (e.g. it fails to allocate its metadata),
kmemleak_enabled is set and __delete_object() is no longer called on
freed objects. If kmemleak_scan is running at the same time,
kmemleak_free() no longer waits for the object scanning to complete,
allowing the corresponding memory block to be freed or unmapped (in the
case of vfree()). This leads to kmemleak_scan potentially triggering a
page fault.

This patch separates the kmemleak_free() enabling/disabling from the
overall kmemleak_enabled nob so that we can defer the disabling of the
object freeing tracking until the scanning thread completed. The
kmemleak_free_part() is deliberately ignored by this patch since this is
only called during boot before the scanning thread started.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Vignesh Radhakrishnan <vigneshr@codeaurora.org>
Tested-by: Vignesh Radhakrishnan <vigneshr@codeaurora.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 8f4fc071 14-May-2015 Vladimir Davydov <vdavydov.dev@gmail.com>

gfp: add __GFP_NOACCOUNT

Not all kmem allocations should be accounted to memcg. The following
patch gives an example when accounting of a certain type of allocations to
memcg can effectively result in a memory leak. This patch adds the
__GFP_NOACCOUNT flag which if passed to kmalloc and friends will force the
allocation to go through the root cgroup. It will be used by the next
patch.

Note, since in case of kmemleak enabled each kmalloc implies yet another
allocation from the kmemleak_object cache, we add __GFP_NOACCOUNT to
gfp_kmemleak_mask.

Alternatively, we could introduce a per kmem cache flag disabling
accounting for all allocations of a particular kind, but (a) we would not
be able to bypass accounting for kmalloc then and (b) a kmem cache with
this flag set could not be merged with a kmem cache without this flag,
which would increase the number of global caches and therefore
fragmentation even if the memory cgroup controller is not used.

Despite its generic name, currently __GFP_NOACCOUNT disables accounting
only for kmem allocations while user page allocations are always charged.
To catch abusing of this flag, a warning is issued on an attempt of
passing it to mem_cgroup_try_charge.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org> [4.0.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# e79ed2f1 13-Feb-2015 Andrey Ryabinin <ryabinin.a.a@gmail.com>

kmemleak: disable kasan instrumentation for kmemleak

kmalloc internally round up allocation size, and kmemleak uses rounded up
size as object's size. This makes kasan to complain while kmemleak scans
memory or calculates of object's checksum. The simplest solution here is
to disable kasan.

Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# ffe2c748 06-Jun-2014 Catalin Marinas <catalin.marinas@arm.com>

mm: introduce kmemleak_update_trace()

The memory allocation stack trace is not always useful for debugging a
memory leak (e.g. radix_tree_preload). This function, when called,
updates the stack trace for an already allocated object.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# aae0ad7a 06-Jun-2014 Jianpeng Ma <majianpeng@gmail.com>

mm/kmemleak.c: use %u to print ->checksum

Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# bfc8c901 04-Jun-2014 Vladimir Davydov <vdavydov.dev@gmail.com>

mem-hotplug: implement get/put_online_mems

kmem_cache_{create,destroy,shrink} need to get a stable value of
cpu/node online mask, because they init/destroy/access per-cpu/node
kmem_cache parts, which can be allocated or destroyed on cpu/mem
hotplug. To protect against cpu hotplug, these functions use
{get,put}_online_cpus. However, they do nothing to synchronize with
memory hotplug - taking the slab_mutex does not eliminate the
possibility of race as described in patch 2.

What we need there is something like get_online_cpus, but for memory.
We already have lock_memory_hotplug, which serves for the purpose, but
it's a bit of a hammer right now, because it's backed by a mutex. As a
result, it imposes some limitations to locking order, which are not
desirable, and can't be used just like get_online_cpus. That's why in
patch 1 I substitute it with get/put_online_mems, which work exactly
like get/put_online_cpus except they block not cpu, but memory hotplug.

[ v1 can be found at https://lkml.org/lkml/2014/4/6/68. I NAK'ed it by
myself, because it used an rw semaphore for get/put_online_mems,
making them dead lock prune. ]

This patch (of 2):

{un}lock_memory_hotplug, which is used to synchronize against memory
hotplug, is currently backed by a mutex, which makes it a bit of a
hammer - threads that only want to get a stable value of online nodes
mask won't be able to proceed concurrently. Also, it imposes some
strong locking ordering rules on it, which narrows down the set of its
usage scenarios.

This patch introduces get/put_online_mems, which are the same as
get/put_online_cpus, but for memory hotplug, i.e. executing a code
inside a get/put_online_mems section will guarantee a stable value of
online nodes, present pages, etc.

lock_memory_hotplug()/unlock_memory_hotplug() are removed altogether.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 3551a928 09-May-2014 Catalin Marinas <catalin.marinas@arm.com>

mm: postpone the disabling of kmemleak early logging

Commit 8910ae896c8c ("kmemleak: change some global variables to int"),
in addition to the atomic -> int conversion, moved the disabling of
kmemleak_early_log to the beginning of the kmemleak_init() function,
before the full kmemleak tracing is actually enabled. In this small
window, kmem_cache_create() is called by kmemleak which triggers
additional memory allocation that are not traced. This patch restores
the original logic with kmemleak_early_log disabling when kmemleak is
fully functional.

Fixes: 8910ae896c8c (kmemleak: change some global variables to int)

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Li Zefan <lizefan@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 8910ae89 03-Apr-2014 Li Zefan <lizefan@huawei.com>

kmemleak: change some global variables to int

They don't have to be atomic_t, because they are simple boolean toggles.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 5f3bf19a 03-Apr-2014 Li Zefan <lizefan@huawei.com>

kmemleak: remove redundant code

Remove kmemleak_padding() and kmemleak_release().

Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# c89da70c 03-Apr-2014 Li Zefan <lizefan@huawei.com>

kmemleak: allow freeing internal objects after kmemleak was disabled

Currently if kmemleak is disabled, the kmemleak objects can never be
freed, no matter if it's disabled by a user or due to fatal errors.

Those objects can be a big waste of memory.

OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
1200264 1197433 99% 0.30K 46164 26 369312K kmemleak_object

With this patch, after kmemleak was disabled you can reclaim memory
with:

# echo clear > /sys/kernel/debug/kmemleak

Also inform users about this with a printk.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# dc9b3f42 03-Apr-2014 Li Zefan <lizefan@huawei.com>

kmemleak: free internal objects only if there're no leaks to be reported

Currently if you stop kmemleak thread before disabling kmemleak,
kmemleak objects will be freed and so you won't be able to check
previously reported leaks.

With this patch, kmemleak objects won't be freed if there're leaks that
can be reported.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 7f88f88f 12-Nov-2013 Catalin Marinas <catalin.marinas@arm.com>

mm: kmemleak: avoid false negatives on vmalloc'ed objects

Commit 248ac0e1943a ("mm/vmalloc: remove guard page from between vmap
blocks") had the side effect of making vmap_area.va_end member point to
the next vmap_area.va_start. This was creating an artificial reference
to vmalloc'ed objects and kmemleak was rarely reporting vmalloc() leaks.

This patch marks the vmap_area containing pointers explicitly and
reduces the min ref_count to 2 as vm_struct still contains a reference
to the vmalloc'ed object. The kmemleak add_scan_area() function has
been improved to allow a SIZE_MAX argument covering the rest of the
object (for simpler calling sites).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 3dbb95f7 11-Sep-2013 Jingoo Han <jg1.han@samsung.com>

mm: replace strict_strtoul() with kstrtoul()

The use of strict_strtoul() is not preferred, because strict_strtoul() is
obsolete. Thus, kstrtoul() should be used.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# b67bfe0d 27-Feb-2013 Sasha Levin <sasha.levin@oracle.com>

hlist: drop the node parameter from iterators

I'm not sure why, but the hlist for each entry iterators were conceived

list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
<+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
...+>

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 108bcc96 22-Feb-2013 Cody P Schafer <cody@linux.vnet.ibm.com>

mm: add & use zone_end_pfn() and zone_spans_pfn()

Add 2 helpers (zone_end_pfn() and zone_spans_pfn()) to reduce code
duplication.

This also switches to using them in compaction (where an additional
variable needed to be renamed), page_alloc, vmstat, memory_hotplug, and
kmemleak.

Note that in compaction.c I avoid calling zone_end_pfn() repeatedly
because I expect at some point the sycronization issues with start_pfn &
spanned_pages will need fixing, either by actually using the seqlock or
clever memory barrier usage.

Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Cc: David Hansen <dave@linux.vnet.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# dc053733 18-Dec-2012 Abhijit Pawar <abhi.c.pawar@gmail.com>

mm/kmemleak.c: remove obsolete simple_strtoul

Replace the obsolete simple_strtoul() with kstrtoul().

Signed-off-by: Abhijit Pawar <abhi.c.pawar@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 85d3a316 08-Oct-2012 Michel Lespinasse <walken@google.com>

kmemleak: use rbtree instead of prio tree

kmemleak uses a tree where each node represents an allocated memory object
in order to quickly find out what object a given address is part of.
However, the objects don't overlap, so rbtrees are a better choice than
prio tree for this use. They are both faster and have lower memory
overhead.

Tested by booting a kernel with kmemleak enabled, loading the
kmemleak_test module, and looking for the expected messages.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 58fac095 16-Aug-2012 Michael Wang <wangyun@linux.vnet.ibm.com>

kmemleak: Replace list_for_each_continue_rcu with new interface

This patch replaces list_for_each_continue_rcu() with
list_for_each_entry_continue_rcu() to save a few lines
of code and allow removing list_for_each_continue_rcu().

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>


# b370d29e 20-Jan-2012 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Disable early logging when kmemleak is off by default

Commit b6693005 (kmemleak: When the early log buffer is exceeded, report
the actual number) deferred the disabling of the early logging to
kmemleak_init(). However, when CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=y, the
early logging was no longer disabled causing __init kmemleak functions
to be called even after the kernel freed the init memory. This patch
disables the early logging during kmemleak_init() if kmemleak is left
disabled.

Reported-by: Dirk Gouders <gouders@et.bocholt.fh-gelsenkirchen.de>
Tested-by: Dirk Gouders <gouders@et.bocholt.fh-gelsenkirchen.de>
Tested-by: Josh Boyer <jwboyer@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# b469d432 10-Jan-2012 Tiejun Chen <tiejun.chen@windriver.com>

kmemleak: Only scan non-zero-size areas

Kmemleak should only track valid scan areas with a non-zero size.
Otherwise, such area may reside just at the end of an object and
kmemleak would report "Adding scan area to unknown object".

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 029aeff5 15-Nov-2011 Laura Abbott <lauraa@codeaurora.org>

kmemleak: Add support for memory hotplug

Ensure that memory hotplug can co-exist with kmemleak
by taking the hotplug lock before scanning the memory
banks.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# f528f0b8 26-Sep-2011 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Handle percpu memory allocation

This patch adds kmemleak callbacks from the percpu allocator, reducing a
number of false positives caused by kmemleak not scanning such memory
blocks. The percpu chunks are never reported as leaks because of current
kmemleak limitations with the __percpu pointer not pointing directly to
the actual chunks.

Reported-by: Huajun Li <huajun.li.lee@gmail.com>
Acked-by: Christoph Lameter <cl@gentwo.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 74341703 29-Sep-2011 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Report previously found leaks even after an error

If an error fatal to kmemleak (like memory allocation failure) happens,
kmemleak disables itself but it also removes the access to any
previously found memory leaks. This patch allows read-only access to the
kmemleak debugfs interface but disables any other action.

Reported-by: Nick Bowler <nbowler@elliptictech.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# b6693005 28-Sep-2011 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: When the early log buffer is exceeded, report the actual number

Just telling that the early log buffer has been exceeded doesn't mean
much. This patch moves the error printing to the kmemleak_init()
function and displays the actual calls to the kmemleak API during early
logging.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 5f79020c 27-Sep-2011 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Show where early_log issues come from

Based on initial patch by Steven Rostedt.

Early kmemleak warnings did not show where the actual kmemleak API had
been called from but rather just a backtrace to the kmemleak_init()
function. By having all early kmemleak logs record the stack_trace, we
can have kmemleak_init() write exactly where the problem occurred. This
patch adds the setting of the kmemleak_warning variable every time a
kmemleak warning is issued. The kmemleak_init() function checks this
variable during early log replaying and prints the log trace if there
was any warning.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>


# b95f1b31 16-Oct-2011 Paul Gortmaker <paul.gortmaker@windriver.com>

mm: Map most files to use export.h instead of module.h

The files changed within are only using the EXPORT_SYMBOL
macro variants. They are not using core modular infrastructure
and hence don't need module.h but only the export.h header.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>


# 60063497 26-Jul-2011 Arun Sharma <asharma@fb.com>

atomic: use <linux/atomic.h>

This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 52c3ce4e 27-Apr-2011 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Do not return a pointer to an object that kmemleak did not get

The kmemleak_seq_next() function tries to get an object (and increment
its use count) before returning it. If it could not get the last object
during list traversal (because it may have been freed), the function
should return NULL rather than a pointer to such object that it did not
get.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Acked-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Cc: <stable@kernel.org>


# 25985edc 30-Mar-2011 Lucas De Marchi <lucas.demarchi@profusion.mobi>

Fix common misspellings

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>


# 6ae4bd1f 27-Jan-2011 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Allow kmemleak metadata allocations to fail

This patch adds __GFP_NORETRY and __GFP_NOMEMALLOC flags to the kmemleak
metadata allocations so that it has a smaller effect on the users of the
kernel slab allocator. Since kmemleak allocations can now fail more
often, this patch also reduces the verbosity by passing __GFP_NOWARN and
not dumping the stack trace when a kmemleak allocation fails.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Ted Ts'o <tytso@mit.edu>


# 145b64b9 22-Jul-2010 Holger Hans Peter Freyther <zecke@selfish.org>

kmemleak: Fix typo in the comment

Fix typo in comment.

Signed-off-by: Holger Hans Peter Freyther <zecke@selfish.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# a2b6bf63 19-Jul-2010 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Add DocBook style comments to kmemleak.c

The description and parameters of the kmemleak API weren't obvious. This
patch adds comments clarifying the API usage.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>


# ab0155a2 19-Jul-2010 Jason Baron <jbaron@redhat.com>

kmemleak: Introduce a default off mode for kmemleak

Introduce a new DEBUG_KMEMLEAK_DEFAULT_OFF config parameter that allows
kmemleak to be disabled by default, but enabled on the command line
via: kmemleak=on. Although a reboot is required to turn it on, its still
useful to not require a re-compile.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>


# a7686a45 19-Jul-2010 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Show more information for objects found by alias

There may be situations when an object is freed using a pointer inside
the memory block. Kmemleak should show more information to help with
debugging.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>


# 5a0e3ad6 24-Mar-2010 Tejun Heo <tj@kernel.org>

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>


# 21ae2956 07-Oct-2009 Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

tree-wide: fix typos "aquire" -> "acquire", "cumsumed" -> "consumed"

This patch was generated by

git grep -E -i -l '[Aa]quire' | xargs -r perl -p -i -e 's/([Aa])quire/$1cquire/'

and the cumsumed was found by checking the diff for aquire.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>


# 04609ccc 28-Oct-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Reduce the false positives by checking for modified objects

If an object was modified since it was previously suspected as leak, do
not report it. The modification check is done by calculating the
checksum (CRC32) of such object.

Several false positives are caused by objects being removed from linked
lists (e.g. allocation pools) and temporarily breaking the reference
chain since kmemleak runs concurrently with such list mutation
primitives.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# fefdd336 28-Oct-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Show the age of an unreferenced object

The jiffies shown for unreferenced objects isn't always meaningful to
people debugging kernel memory leaks. This patch adds the age as well to
the displayed information.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 0587da40 28-Oct-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Release the object lock before calling put_object()

The put_object() function may free the object if the use_count
dropped to 0. There shouldn't be further accesses to such object unless
it is known that the use_count is non-zero.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# c017b4be 28-Oct-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Simplify the kmemleak_scan_area() function prototype

This function was taking non-necessary arguments which can be determined
by kmemleak. The patch also modifies the calling sites.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>


# 0d5d1aad 09-Oct-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Check for NULL pointer returned by create_object()

This patch adds NULL pointer checking in the early_alloc() function.

Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# c1bcd6b3 09-Oct-2009 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>

kmemleak: Use GFP_ATOMIC for early_alloc().

We can't use GFP_KERNEL inside rcu_read_lock().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# addd72c1 11-Sep-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Improve the "Early log buffer exceeded" error message

Based on a suggestion from Jaswinder, clarify what the user would need
to do to avoid this error message from kmemleak.

Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 7eb0d5e5 08-Sep-2009 Luis R. Rodriguez <lrodriguez@Atheros.com>

kmemleak: fix sparse warning for static declarations

This fixes these sparse warnings:

mm/kmemleak.c:1179:6: warning: symbol 'start_scan_thread' was not declared. Should it be static?
mm/kmemleak.c:1194:6: warning: symbol 'stop_scan_thread' was not declared. Should it be static?

Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 0580a181 08-Sep-2009 Luis R. Rodriguez <lrodriguez@Atheros.com>

kmemleak: fix sparse warning over overshadowed flags

A secondary irq_save is not required as a locking before it was
already disabling irqs.

This fixes this sparse warning:
mm/kmemleak.c:512:31: warning: symbol 'flags' shadows an earlier one
mm/kmemleak.c:448:23: originally declared here

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# a1084c87 04-Sep-2009 Luis R. Rodriguez <lrodriguez@Atheros.com>

kmemleak: move common painting code together

When painting grey or black we do the same thing, bring
this together into a helper and identify coloring grey or
black explicitly with defines. This makes this a little
easier to read.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 30b37101 04-Sep-2009 Luis R. Rodriguez <lrodriguez@Atheros.com>

kmemleak: add clear command support

In an ideal world your kmemleak output will be small, when its
not (usually during initial bootup) you can use the clear command
to ingore previously reported and unreferenced kmemleak objects. We
do this by painting all currently reported unreferenced objects grey.
We paint them grey instead of black to allow future scans on the same
objects as such objects could still potentially reference newly
allocated objects in the future.

To test a critical section on demand with a clean
/sys/kernel/debug/kmemleak you can do:

echo clear > /sys/kernel/debug/kmemleak
test your kernel or modules
echo scan > /sys/kernel/debug/kmemleak

Then as usual to get your report with:

cat /sys/kernel/debug/kmemleak

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 4a558dd6 08-Sep-2009 Luis R. Rodriguez <lrodriguez@Atheros.com>

kmemleak: use bool for true/false questions

Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 179a8100 07-Sep-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Do no create the clean-up thread during kmemleak_disable()

The kmemleak_disable() function could be called from various contexts
including IRQ. It creates a clean-up thread but the kthread_create()
function has restrictions on which contexts it can be called from,
mainly because of the kthread_create_lock. The patch changes the
kmemleak clean-up thread to a workqueue.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Eric Paris <eparis@redhat.com>


# 43ed5d6e 01-Sep-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Scan all thread stacks

This patch changes the for_each_process() loop with the
do_each_thread()/while_each_thread() pair.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 8e019366 27-Aug-2009 Pekka Enberg <penberg@cs.helsinki.fi>

kmemleak: Don't scan uninitialized memory when kmemcheck is enabled

Ingo Molnar reported the following kmemcheck warning when running both
kmemleak and kmemcheck enabled:

PM: Adding info for No Bus:vcsa7
WARNING: kmemcheck: Caught 32-bit read from uninitialized memory
(f6f6e1a4)
d873f9f600000000c42ae4c1005c87f70000000070665f666978656400000000
i i i i u u u u i i i i i i i i i i i i i i i i i i i i i u u u
^

Pid: 3091, comm: kmemleak Not tainted (2.6.31-rc7-tip #1303) P4DC6
EIP: 0060:[<c110301f>] EFLAGS: 00010006 CPU: 0
EIP is at scan_block+0x3f/0xe0
EAX: f40bd700 EBX: f40bd780 ECX: f16b46c0 EDX: 00000001
ESI: f6f6e1a4 EDI: 00000000 EBP: f10f3f4c ESP: c2605fcc
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
CR0: 8005003b CR2: e89a4844 CR3: 30ff1000 CR4: 000006f0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff4ff0 DR7: 00000400
[<c110313c>] scan_object+0x7c/0xf0
[<c1103389>] kmemleak_scan+0x1d9/0x400
[<c1103a3c>] kmemleak_scan_thread+0x4c/0xb0
[<c10819d4>] kthread+0x74/0x80
[<c10257db>] kernel_thread_helper+0x7/0x3c
[<ffffffff>] 0xffffffff
kmemleak: 515 new suspected memory leaks (see
/sys/kernel/debug/kmemleak)
kmemleak: 42 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

The problem here is that kmemleak will scan partially initialized
objects that makes kmemcheck complain. Fix that up by skipping
uninitialized memory regions when kmemcheck is enabled.

Reported-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>


# 0494e082 27-Aug-2009 Sergey Senozhatsky <sergey.senozhatsky@mail.by>

kmemleak: Printing of the objects hex dump

Introducing printing of the objects hex dump to the seq file.
The number of lines to be printed is limited to HEX_MAX_LINES
to prevent seq file spamming. The actual number of printed
bytes is less than or equal to (HEX_MAX_LINES * HEX_ROW_SIZE).

(slight adjustments by Catalin Marinas)

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@mail.by>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# fd678967 27-Aug-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Save the stack trace for early allocations

Before slab is initialised, kmemleak save the allocations in an early
log buffer. They are later recorded as normal memory allocations. This
patch adds the stack trace saving to the early log buffer, otherwise the
information shown for such objects only refers to the kmemleak_init()
function.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# a6186d89 27-Aug-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Mark the early log buffer as __initdata

This buffer isn't needed after kmemleak was initialised so it can be
freed together with the .init.data section. This patch also marks
functions conditionally accessing the early log variables with __ref.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 189d84ed 27-Aug-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Dump object information on request

By writing dump=<addr> to the kmemleak file, kmemleak will look up an
object with that address and dump the information it has about it to
syslog. This is useful in debugging memory leaks.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# af98603d 27-Aug-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Allow rescheduling during an object scanning

If the object size is bigger than a predefined value (4K in this case),
release the object lock during scanning and call cond_resched().
Re-acquire the lock after rescheduling and test whether the object is
still valid.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# f5886c7f 29-Jul-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Protect the seq start/next/stop sequence by rcu_read_lock()

Objects passed to kmemleak_seq_next() have an incremented reference
count (hence not freed) but they may point via object_list.next to
other freed objects. To avoid this, the whole start/next/stop sequence
must be protected by rcu_read_lock().

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 53238a60 07-Jul-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Allow partial freeing of memory blocks

Functions like free_bootmem() are allowed to free only part of a memory
block. This patch adds support for this via the kmemleak_free_part()
callback which removes the original object and creates one or two
additional objects as a result of the memory block split.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>


# 2587362e 07-Jul-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Scan objects allocated during a scanning episode

Many of the false positives in kmemleak happen on busy systems where
objects are allocated during a kmemleak scanning episode. These objects
aren't scanned by default until the next memory scan. When such object
is added, for example, at the head of a list, it is possible that all
the other objects in the list become unreferenced until the next scan.

This patch adds checking for newly allocated objects at the end of the
scan and repeats the scanning on these objects. If Linux allocates
new objects at a higher rate than their scanning, it stops after a
predefined number of passes.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# b87324d0 07-Jul-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Do not acquire scan_mutex in kmemleak_open()

Initially, the scan_mutex was acquired in kmemleak_open() and released
in kmemleak_release() (corresponding to /sys/kernel/debug/kmemleak
operations). This was causing some lockdep reports when the file was
closed from a different task than the one opening it. This patch moves
the scan_mutex acquiring in kmemleak_write() or kmemleak_seq_start()
with releasing in kmemleak_seq_stop().

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 288c857d 07-Jul-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Remove the reported leaks number limitation

Since the leaks are no longer printed to the syslog, there is no point
in keeping this limitation. All the suspected leaks are shown on
/sys/kernel/debug/kmemleak file.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 4b8a9674 07-Jul-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Add more cond_resched() calls in the scanning thread

Following recent fix to no longer reschedule in the scan_block()
function, the system may become unresponsive with !PREEMPT. This patch
re-adds the cond_resched() call to scan_block() but conditioned by the
allow_resched parameter.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>


# bf2a76b3 07-Jul-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Renice the scanning thread to +10

This is a long-running thread but not high-priority. So it makes sense
to renice it to +10.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 57d81f6f 01-Jul-2009 Ingo Molnar <mingo@elte.hu>

kmemleak: Fix scheduling-while-atomic bug

One of the kmemleak changes caused the following
scheduling-while-holding-the-tasklist-lock regression on x86:

BUG: sleeping function called from invalid context at mm/kmemleak.c:795
in_atomic(): 1, irqs_disabled(): 0, pid: 1737, name: kmemleak
2 locks held by kmemleak/1737:
#0: (scan_mutex){......}, at: [<c10c4376>] kmemleak_scan_thread+0x45/0x86
#1: (tasklist_lock){......}, at: [<c10c3bb4>] kmemleak_scan+0x1a9/0x39c
Pid: 1737, comm: kmemleak Not tainted 2.6.31-rc1-tip #59266
Call Trace:
[<c105ac0f>] ? __debug_show_held_locks+0x1e/0x20
[<c102e490>] __might_sleep+0x10a/0x111
[<c10c38d5>] scan_yield+0x17/0x3b
[<c10c3970>] scan_block+0x39/0xd4
[<c10c3bc6>] kmemleak_scan+0x1bb/0x39c
[<c10c4331>] ? kmemleak_scan_thread+0x0/0x86
[<c10c437b>] kmemleak_scan_thread+0x4a/0x86
[<c104d73e>] kthread+0x6e/0x73
[<c104d6d0>] ? kthread+0x0/0x73
[<c100959f>] kernel_thread_helper+0x7/0x10
kmemleak: 834 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

The bit causing it is highly dubious:

static void scan_yield(void)
{
might_sleep();

if (time_is_before_eq_jiffies(next_scan_yield)) {
schedule();
next_scan_yield = jiffies + jiffies_scan_yield;
}
}

It called deep inside the codepath and in a conditional way,
and that is what crapped up when one of the new scan_block()
uses grew a tasklist_lock dependency.

This minimal patch removes that yielding stuff and adds the
proper cond_resched().

The background scanning thread could probably also be reniced
to +10.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# b6e68722 29-Jun-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Do not warn if an unknown object is freed

vmap'ed memory blocks are not tracked by kmemleak (yet) but they may be
released with vfree() which is tracked. The corresponding kmemleak
warning is only enabled in debug mode. Future patch will add support for
ioremap and vmap.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 17bb9e0d 29-Jun-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Do not report new leaked objects if the scanning was stopped

If the scanning was stopped with a signal, it is possible that some
objects are left with a white colour (potential leaks) and reported. Add
a check to avoid reporting such objects.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# acf4968e 26-Jun-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Slightly change the policy on newly allocated objects

Newly allocated objects are more likely to be reported as false
positives. Kmemleak ignores the reporting of objects younger than 5
seconds. However, this age was calculated after the memory scanning
completed which usually takes longer than 5 seconds. This patch
make the minimum object age calculation in relation to the start of the
memory scanning.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 4698c1f2 26-Jun-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Do not trigger a scan when reading the debug/kmemleak file

Since there is a kernel thread for automatically scanning the memory, it
makes sense for the debug/kmemleak file to only show its findings. This
patch also adds support for "echo scan > debug/kmemleak" to trigger an
intermediate memory scan and eliminates the kmemleak_mutex (scan_mutex
covers all the cases now).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# bab4a34a 26-Jun-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Simplify the reports logged by the scanning thread

Because of false positives, the memory scanning thread may print too
much information. This patch changes the scanning thread to only print
the number of newly suspected leaks. Further information can be read
from the /sys/kernel/debug/kmemleak file.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# e0a2a160 26-Jun-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Enable task stacks scanning by default

This is to reduce the number of false positives reported.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# a9d9058a 25-Jun-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Allow the early log buffer to be configurable.

(feature suggested by Sergey Senozhatsky)

Kmemleak needs to track all the memory allocations but some of these
happen before kmemleak is initialised. These are stored in an internal
buffer which may be exceeded in some kernel configurations. This patch
adds a configuration option with a default value of 400 and also removes
the stack dump when the early log buffer is exceeded.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@mail.by>


# ae281064 23-Jun-2009 Joe Perches <joe@perches.com>

kmemleak: use pr_fmt

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 2030117d 17-Jun-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Fix some typos in comments

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


# 000814f4 17-Jun-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Rename kmemleak_panic to kmemleak_stop

This is to avoid the confusion created by the "panic" word.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>


# 216c04b0 17-Jun-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Only use GFP_KERNEL|GFP_ATOMIC for the internal allocations

Kmemleak allocates memory for pointer tracking and it tries to avoid
using GFP_ATOMIC if the caller doesn't require it. However other gfp
flags may be passed by the caller which aren't required by kmemleak.
This patch filters the gfp flags so that only GFP_KERNEL | GFP_ATOMIC
are used.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>


# 3c7b4e6b 11-Jun-2009 Catalin Marinas <catalin.marinas@arm.com>

kmemleak: Add the base support

This patch adds the base support for the kernel memory leak
detector. It traces the memory allocation/freeing in a way similar to
the Boehm's conservative garbage collector, the difference being that
the unreferenced objects are not freed but only shown in
/sys/kernel/debug/kmemleak. Enabling this feature introduces an
overhead to memory allocations.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>