Cross Reference: /linux-master/lib/idr.c

History log of /linux-master/lib/idr.c
Revision	Date	Author	Comments
# af73483f	21-Dec-2023	Matthew Wilcox (Oracle) <willy@infradead.org>	ida: Fix crash in ida_free when the bitmap is empty The IDA usually detects double-frees, but that detection failed to consider the case when there are no nearby IDs allocated and so we have a NULL bitmap rather than simply having a clear bit. Add some tests to the test-suite to be sure we don't inadvertently reintroduce this problem. Unfortunately they're quite noisy so include a message to disregard the warnings. Reported-by: Zhenghan Wang <wzhmmmmm@gmail.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 2a15de80	26-Aug-2023	Ariel Marcovitch <arielmarcovitch@gmail.com>	idr: fix param name in idr_alloc_cyclic() doc The relevant parameter is 'start' and not 'nextid' Fixes: 460488c58ca8 ("idr: Remove idr_alloc_ext") Signed-off-by: Ariel Marcovitch <arielmarcovitch@gmail.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
# fc82bbf4	10-Jul-2022	Linus Torvalds <torvalds@linux-foundation.org>	ida: don't use BUG_ON() for debugging This is another old BUG_ON() that just shouldn't exist (see also commit a382f8fee42c: "signal handling: don't use BUG_ON() for debugging"). In fact, as Matthew Wilcox points out, this condition shouldn't really even result in a warning, since a negative id allocation result is just a normal allocation failure: "I wonder if we should even warn here -- sure, the caller is trying to free something that wasn't allocated, but we don't warn for kfree(NULL)" and goes on to point out how that current error check is only causing people to unnecessarily do their own index range checking before freeing it. This was noted by Itay Iellin, because the bluetooth HCI socket cookie code does not do that range checking, and ends up just freeing the error case too, triggering the BUG_ON(). The HCI code requires CAP_NET_RAW, and seems to just result in an ugly splat, but there really is no reason to BUG_ON() here, and we have generally striven for allocation models where it's always ok to just do free(alloc()); even if the allocation were to fail for some random reason (usually obviously that "random" reason being some resource limit). Fixes: 88eca0207cf1 ("ida: simplified functions for id allocation") Reported-by: Itay Iellin <ieitayie@gmail.com> Suggested-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 3b674261	15-Oct-2020	Stephen Boyd <swboyd@chromium.org>	lib/idr.c: document calling context for IDA APIs mustn't use locks The documentation for these functions indicates that callers don't need to hold a lock while calling them, but that documentation is only in one place under "IDA Usage". Let's state the same information on each IDA function so that it's clear what the calling context requires. Furthermore, let's document ida_simple_get() with the same information so that callers know how this API works. Signed-off-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tri Vo <trong@android.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox <willy@infradead.org> Link: https://lkml.kernel.org/r/20200910055246.2297797-1-swboyd@chromium.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# a219b856	02-Apr-2020	Matthew Wilcox (Oracle) <willy@infradead.org>	ida: Free allocated bitmap in error path If a bitmap needs to be allocated, and then by the time the thread is scheduled to be run again all the indices which would satisfy the allocation have been allocated then we would leak the allocation. Almost impossible to hit in practice, but a trivial fix. Found by Coverity. Fixes: f32f004cddf8 ("ida: Convert to XArray") Reported-by: coverity-bot <keescook+coverity-bot@chromium.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
# 5a74ac4c	01-Nov-2019	Matthew Wilcox (Oracle) <willy@infradead.org>	idr: Fix idr_get_next_ul race with idr_remove Commit 5c089fd0c734 ("idr: Fix idr_get_next race with idr_remove") neglected to fix idr_get_next_ul(). As far as I can tell, nobody's actually using this interface under the RCU read lock, but fix it now before anybody decides to use it. Fixes: 5c089fd0c734 ("idr: Fix idr_get_next race with idr_remove") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
# 5c089fd0	14-May-2019	Matthew Wilcox (Oracle) <willy@infradead.org>	idr: Fix idr_get_next race with idr_remove If the entry is deleted from the IDR between the call to radix_tree_iter_find() and rcu_dereference_raw(), idr_get_next() will return NULL, which will end the iteration prematurely. We should instead continue to the next entry in the IDR. This only happens if the iteration is protected by the RCU lock. Most IDR users use a spinlock or semaphore to exclude simultaneous modifications. It was noticed once the PID allocator was converted to use the IDR, as it uses the RCU lock, but there may be other users elsewhere in the kernel. We can't use the normal pattern of calling radix_tree_deref_retry() (which catches both a retry entry in a leaf node and a node entry in the root) as the IDR supports storing entries which are unaligned, which will trigger an infinite loop if they are encountered. Instead, we have to explicitly check whether the entry is a retry entry. Fixes: 0a835c4f090a ("Reimplement IDR and IDA using the radix tree") Reported-by: Brendan Gregg <bgregg@netflix.com> Tested-by: Brendan Gregg <bgregg@netflix.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
# 457c8996	19-May-2019	Thomas Gleixner <tglx@linutronix.de>	treewide: Add SPDX license identifier for missed files Add SPDX license identifiers to all files which: - Have no license information of any form - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
# 1cf56f9d	09-Apr-2018	Matthew Wilcox <willy@infradead.org>	radix tree: Remove radix_tree_update_node_t The only user of this functionality was the workingset code, and it's now been converted to the XArray. Remove __radix_tree_delete_node() entirely as it was also only used by the workingset code. Signed-off-by: Matthew Wilcox <willy@infradead.org>
# f32f004c	04-Jul-2018	Matthew Wilcox <willy@infradead.org>	ida: Convert to XArray Use the XA_TRACK_FREE ability to track which entries have a free bit, similarly to how it uses the radix tree's IDR_FREE tag. This eliminates the per-cpu ida_bitmap preload, and fixes the memory consumption regression I introduced when making the IDR able to store any pointer. Signed-off-by: Matthew Wilcox <willy@infradead.org>
# f8d5d0cc	07-Nov-2017	Matthew Wilcox <willy@infradead.org>	xarray: Add definition of struct xarray This is a direct replacement for struct radix_tree_root. Some of the struct members have changed name; convert those, and use a #define so that radix_tree users continue to work without change. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Josef Bacik <jbacik@fb.com>
# 3159f943	03-Nov-2017	Matthew Wilcox <willy@infradead.org>	xarray: Replace exceptional entries Introduce xarray value entries and tagged pointers to replace radix tree exceptional entries. This is a slight change in encoding to allow the use of an extra bit (we can now store BITS_PER_LONG - 1 bits in a value entry). It is also a change in emphasis; exceptional entries are intimidating and different. As the comment explains, you can choose to store values or pointers in the xarray and they are both first-class citizens. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Josef Bacik <jbacik@fb.com>
# 66ee620f	25-Jun-2018	Matthew Wilcox <willy@infradead.org>	idr: Permit any valid kernel pointer to be stored An upcoming change to the encoding of internal entries will set the bottom two bits to 0b10. Unfortunately, m68k only aligns some data structures to 2 bytes, so the IDR will interpret them as internal entries and things will go badly wrong. Change the radix tree so that it stops either when the node indicates that it's the bottom of the tree (shift == 0) or when the entry is not an internal entry. This means we cannot insert an arbitrary kernel pointer as a multiorder entry, but the IDR does not permit multiorder entries. Annoyingly, this means the IDR can no longer take advantage of the radix tree's ability to store a single entry at offset 0 without allocating memory. A pointer which is 2-byte aligned cannot be stored directly in the root as it would be indistinguishable from a node, so we must allocate a node in order to store a 2-byte pointer at index 0. The idr_replace() function does not take a GFP flags argument, so cannot allocate memory. If a user inserts a 4-byte aligned pointer at index 0 and then replaces it with a 2-byte aligned pointer, we must be able to store it. Arbitrary pointer values are still not permitted; pointers of the form 2 + (i * 4) for values of i between 0 and 1023 are reserved for the implementation. These are not valid kernel pointers as they would point into the zero page. This change does cause a runtime memory consumption regression for the IDA. I will recover that later. Signed-off-by: Matthew Wilcox <willy@infradead.org> Tested-by: Guenter Roeck <linux@roeck-us.net>
# 1df89519	18-Jun-2018	Matthew Wilcox <willy@infradead.org>	ida: Change ida_get_new_above to return the id This calling convention makes more sense for the implementation as well as the callers. It even shaves 32 bytes off the compiled code size. Signed-off-by: Matthew Wilcox <willy@infradead.org>
# b03f8e43	18-Jun-2018	Matthew Wilcox <willy@infradead.org>	ida: Remove old API Delete ida_pre_get(), ida_get_new(), ida_get_new_above() and ida_remove() from the public API. Some of these functions still exist as internal helpers, but they should not be called by consumers. Signed-off-by: Matthew Wilcox <willy@infradead.org>
# 5ade60dd	20-Mar-2018	Matthew Wilcox <willy@infradead.org>	ida: Add new API Add ida_alloc(), ida_alloc_min(), ida_alloc_max(), ida_alloc_range() and ida_free(). The ida_alloc_max() and ida_alloc_range() functions differ from ida_simple_get() in that they take an inclusive 'max' parameter instead of an exclusive 'end' parameter. Callers are about evenly split whether they'd like inclusive or exclusive parameters and 'max' is easier to document than 'end'. Change the IDA allocation to first attempt to allocate a bit using existing memory, and only allocate memory afterwards. Also change the behaviour of 'min' > INT_MAX from being a BUG() to returning -ENOSPC. Leave compatibility wrappers in place for ida_simple_get() and ida_simple_remove() to avoid changing all callers. Signed-off-by: Matthew Wilcox <willy@infradead.org>
# 50d97d50	21-Jun-2018	Matthew Wilcox <willy@infradead.org>	ida: Lock the IDA in ida_destroy The user has no need to handle locking between ida_simple_get() and ida_simple_remove(). They shouldn't be forced to think about whether ida_destroy() might be called at the same time as any of their other IDA manipulation calls. Improve the documnetation while I'm in here. Signed-off-by: Matthew Wilcox <willy@infradead.org>
# b94078e6	07-Jun-2018	Matthew Wilcox <willy@infradead.org>	lib/idr.c: remove simple_ida_lock Improve the scalability of the IDA by using the per-IDA xa_lock rather than the global simple_ida_lock. IDAs are not typically used in performance-sensitive locations, but since we have this lock anyway, we can use it. It is also a step towards converting the IDA from the radix tree to the XArray. [akpm@linux-foundation.org: idr.c needs xarray.h] Link: http://lkml.kernel.org/r/20180331125332.GF13332@bombadil.infradead.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 4b0ad076	26-Feb-2018	Matthew Wilcox <willy@infradead.org>	idr: Fix handling of IDs above INT_MAX Khalid reported that the kernel selftests are currently failing: selftests: test_bpf.sh ======================================== test_bpf: [FAIL] not ok 1..8 selftests: test_bpf.sh [FAIL] He bisected it to 6ce711f2750031d12cec91384ac5cfa0a485b60a ("idr: Make 1-based IDRs more efficient"). The root cause is doing a signed comparison in idr_alloc_u32() instead of an unsigned comparison. I went looking for any similar problems and found a couple (which would each result in the failure to warn in two situations that aren't supposed to happen). I knocked up a few test-cases to prove that I was right and added them to the test-suite. Reported-by: Khalid Aziz <khalid.aziz@oracle.com> Tested-by: Khalid Aziz <khalid.aziz@oracle.com> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
# b1a8a7a7	21-Feb-2018	Rasmus Villemoes <linux@rasmusvillemoes.dk>	ida: do zeroing in ida_pre_get() As far as I can tell, the only place the per-cpu ida_bitmap is populated is in ida_pre_get. The pre-allocated element is stolen in two places in ida_get_new_above, in both cases immediately followed by a memset(0). Since ida_get_new_above is called with locks held, do the zeroing in ida_pre_get, or rather let kmalloc() do it. Also, apparently gcc generates ~44 bytes of code to do a memset(, 0, 128): $ scripts/bloat-o-meter vmlinux.{0,1} add/remove: 0/0 grow/shrink: 2/1 up/down: 5/-88 (-83) Function old new delta ida_pre_get 115 119 +4 vermagic 27 28 +1 ida_get_new_above 715 627 -88 Link: http://lkml.kernel.org/r/20180108225634.15340-1-linux@rasmusvillemoes.dk Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Matthew Wilcox <mawilcox@microsoft.com> Cc: Eric Biggers <ebiggers@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 6ce711f2	30-Nov-2017	Matthew Wilcox <willy@infradead.org>	idr: Make 1-based IDRs more efficient About 20% of the IDR users in the kernel want the allocated IDs to start at 1. The implementation currently searches all the way down the left hand side of the tree, finds no free ID other than ID 0, walks all the way back up, and then all the way down again. This patch 'rebases' the ID so we fill the entire radix tree, rather than leave a gap at 0. Chris Wilson says: "I did the quick hack of allocating index 0 of the idr and that eradicated idr_get_free() from being at the top of the profiles for the many-object stress tests. This improvement will be much appreciated." Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
# 72fd6c7b	28-Nov-2017	Matthew Wilcox <willy@infradead.org>	idr: Warn if old iterators see large IDs Now that the IDR can be used to store large IDs, it is possible somebody might only partially convert their old code and use the iterators which can only handle IDs up to INT_MAX. It's probably unwise to show them a truncated ID, so settle for spewing warnings to dmesg, and terminating the iteration. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
# 7a457577	28-Nov-2017	Matthew Wilcox <willy@infradead.org>	idr: Rename idr_for_each_entry_ext Most places in the kernel that we need to distinguish functions by the type of their arguments, we use '_ul' as a suffix for the unsigned long variant, not '_ext'. Also add kernel-doc. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
# 460488c5	28-Nov-2017	Matthew Wilcox <willy@infradead.org>	idr: Remove idr_alloc_ext It has no more users, so remove it. Move idr_alloc() back into idr.c, move the guts of idr_alloc_cmn() into idr_alloc_u32(), remove the wrappers around idr_get_free_cmn() and rename it to idr_get_free(). While there is now no interface to allocate IDs larger than a u32, the IDR internals remain ready to handle a larger ID should a need arise. These changes make it possible to provide the guarantee that, if the nextid pointer points into the object, the object's ID will be initialised before a concurrent lookup can find the object. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
# e096f6a7	28-Nov-2017	Matthew Wilcox <willy@infradead.org>	idr: Add idr_alloc_u32 helper All current users of idr_alloc_ext() actually want to allocate a u32 and idr_alloc_u32() fits their needs better. Like idr_get_next(), it uses a 'nextid' argument which serves as both a pointer to the start ID and the assigned ID (instead of a separate minimum and pointer-to-assigned-ID argument). It uses a 'max' argument rather than 'end' because the semantics that idr_alloc has for 'end' don't work well for unsigned types. Since idr_alloc_u32() returns an errno instead of the allocated ID, mark it as __must_check to help callers use it correctly. Include copious kernel-doc. Chris Mi <chrism@mellanox.com> has promised to contribute test-cases for idr_alloc_u32. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
# 234a4624	28-Nov-2017	Matthew Wilcox <willy@infradead.org>	idr: Delete idr_replace_ext function Changing idr_replace's 'id' argument to 'unsigned long' works for all callers. Callers which passed a negative ID now get -ENOENT instead of -EINVAL. No callers relied on this error value. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
# c7df8ad2	15-Nov-2017	Mel Gorman <mgorman@techsingularity.net>	mm, truncate: do not check mapping for every page being truncated During truncation, the mapping has already been checked for shmem and dax so it's known that workingset_update_node is required. This patch avoids the checks on mapping for each page being truncated. In all other cases, a lookup helper is used to determine if workingset_update_node() needs to be called. The one danger is that the API is slightly harder to use as calling workingset_update_node directly without checking for dax or shmem mappings could lead to surprises. However, the API rarely needs to be used and hopefully the comment is enough to give people the hint. sparsetruncate (tiny) 4.14.0-rc4 4.14.0-rc4 oneirq-v1r1 pickhelper-v1r1 Min Time 141.00 ( 0.00%) 140.00 ( 0.71%) 1st-qrtle Time 142.00 ( 0.00%) 141.00 ( 0.70%) 2nd-qrtle Time 142.00 ( 0.00%) 142.00 ( 0.00%) 3rd-qrtle Time 143.00 ( 0.00%) 143.00 ( 0.00%) Max-90% Time 144.00 ( 0.00%) 144.00 ( 0.00%) Max-95% Time 147.00 ( 0.00%) 145.00 ( 1.36%) Max-99% Time 195.00 ( 0.00%) 191.00 ( 2.05%) Max Time 230.00 ( 0.00%) 205.00 ( 10.87%) Amean Time 144.37 ( 0.00%) 143.82 ( 0.38%) Stddev Time 10.44 ( 0.00%) 9.00 ( 13.74%) Coeff Time 7.23 ( 0.00%) 6.26 ( 13.41%) Best99%Amean Time 143.72 ( 0.00%) 143.34 ( 0.26%) Best95%Amean Time 142.37 ( 0.00%) 142.00 ( 0.26%) Best90%Amean Time 142.19 ( 0.00%) 141.85 ( 0.24%) Best75%Amean Time 141.92 ( 0.00%) 141.58 ( 0.24%) Best50%Amean Time 141.69 ( 0.00%) 141.31 ( 0.27%) Best25%Amean Time 141.38 ( 0.00%) 140.97 ( 0.29%) As you'd expect, the gain is marginal but it can be detected. The differences in bonnie are all within the noise which is not surprising given the impact on the microbenchmark. radix_tree_update_node_t is a callback for some radix operations that optionally passes in a private field. The only user of the callback is workingset_update_node and as it no longer requires a mapping, the private field is removed. Link: http://lkml.kernel.org/r/20171018075952.10627-3-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# a70e43a5	03-Oct-2017	Eric Biggers <ebiggers@google.com>	lib/idr.c: fix comment for idr_replace() idr_replace() returns the old value on success, not 0. Link: http://lkml.kernel.org/r/20170918162642.37511-1-ebiggers3@gmail.com Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# a47f68d6	13-Sep-2017	Eric Biggers <ebiggers@google.com>	idr: remove WARN_ON_ONCE() when trying to replace negative ID IDR only supports non-negative IDs. There used to be a 'WARN_ON_ONCE(id < 0)' in idr_replace(), but it was intentionally removed by commit 2e1c9b286765 ("idr: remove WARN_ON_ONCE() on negative IDs"). Then it was added back by commit 0a835c4f090a ("Reimplement IDR and IDA using the radix tree"). However it seems that adding it back was a mistake, given that some users such as drm_gem_handle_delete() (DRM_IOCTL_GEM_CLOSE) pass in a value from userspace to idr_replace(), allowing the WARN_ON_ONCE to be triggered. drm_gem_handle_delete() actually just wants idr_replace() to return an error code if the ID is not allocated, including in the case where the ID is invalid (negative). So once again remove the bogus WARN_ON_ONCE(). This bug was found by syzkaller, which encountered the following warning: WARNING: CPU: 3 PID: 3008 at lib/idr.c:157 idr_replace+0x1d8/0x240 lib/idr.c:157 Kernel panic - not syncing: panic_on_warn set ... CPU: 3 PID: 3008 Comm: syzkaller218828 Not tainted 4.13.0-rc4-next-20170811 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190 do_trap_no_signal arch/x86/kernel/traps.c:224 [inline] do_trap+0x260/0x390 arch/x86/kernel/traps.c:273 do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:310 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323 invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:930 RIP: 0010:idr_replace+0x1d8/0x240 lib/idr.c:157 RSP: 0018:ffff8800394bf9f8 EFLAGS: 00010297 RAX: ffff88003c6c60c0 RBX: 1ffff10007297f43 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800394bfa78 RBP: ffff8800394bfae0 R08: ffffffff82856487 R09: 0000000000000000 R10: ffff8800394bf9a8 R11: ffff88006c8bae28 R12: ffffffffffffffff R13: ffff8800394bfab8 R14: dffffc0000000000 R15: ffff8800394bfbc8 drm_gem_handle_delete+0x33/0xa0 drivers/gpu/drm/drm_gem.c:297 drm_gem_close_ioctl+0xa1/0xe0 drivers/gpu/drm/drm_gem.c:671 drm_ioctl_kernel+0x1e7/0x2e0 drivers/gpu/drm/drm_ioctl.c:729 drm_ioctl+0x72e/0xa50 drivers/gpu/drm/drm_ioctl.c:825 vfs_ioctl fs/ioctl.c:45 [inline] do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685 SYSC_ioctl fs/ioctl.c:700 [inline] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691 entry_SYSCALL_64_fastpath+0x1f/0xbe Here is a C reproducer: #include <fcntl.h> #include <stddef.h> #include <stdint.h> #include <sys/ioctl.h> #include <drm/drm.h> int main(void) { int cardfd = open("/dev/dri/card0", O_RDONLY); ioctl(cardfd, DRM_IOCTL_GEM_CLOSE, &(struct drm_gem_close) { .handle = -1 } ); } Link: http://lkml.kernel.org/r/20170906235306.20534-1-ebiggers3@gmail.com Fixes: 0a835c4f090a ("Reimplement IDR and IDA using the radix tree") Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: <stable@vger.kernel.org> [v4.11+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 388f79fd	30-Aug-2017	Chris Mi <chrism@mellanox.com>	idr: Add new APIs to support unsigned long The following new APIs are added: int idr_alloc_ext(struct idr idr, void ptr, unsigned long index, unsigned long start, unsigned long end, gfp_t gfp); void idr_remove_ext(struct idr idr, unsigned long id); void idr_find_ext(const struct idr idr, unsigned long id); void idr_replace_ext(struct idr idr, void ptr, unsigned long id); void idr_get_next_ext(struct idr idr, unsigned long *nextid); Signed-off-by: Chris Mi <chrism@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
# 7e73eb0b	13-Feb-2017	Matthew Wilcox <willy@infradead.org>	idr: Add missing __rcu annotations Where we use the radix tree iteration macros, we need to annotate 'slot' with __rcu. Make sure we don't forget any new places in the future with the same CFLAGS check used for radix-tree.c. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
# d37cacc5	17-Dec-2016	Matthew Wilcox <willy@infradead.org>	ida: Use exceptional entries for small IDAs We can use the root entry as a bitmap and save allocating a 128 byte bitmap for an IDA that contains only a few entries (30 on a 32-bit machine, 62 on a 64-bit machine). This costs about 300 bytes of kernel text on x86-64, so as long as 3 IDAs fall into this category, this is a net win for memory consumption. Thanks to Rasmus Villemoes for his work documenting the problem and collecting statistics on IDAs. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
# 7ad3d4d8	16-Dec-2016	Matthew Wilcox <willy@infradead.org>	ida: Move ida_bitmap to a percpu variable When we preload the IDA, we allocate an IDA bitmap. Instead of storing that preallocated bitmap in the IDA, we store it in a percpu variable. Generally there are more IDAs in the system than CPUs, so this cuts down on the number of preallocated bitmaps that are unused, and about half of the IDA users did not call ida_destroy() so they were leaking IDA bitmaps. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
# 0a835c4f	20-Dec-2016	Matthew Wilcox <willy@infradead.org>	Reimplement IDR and IDA using the radix tree The IDR is very similar to the radix tree. It has some functionality that the radix tree did not have (alloc next free, cyclic allocation, a callback-based for_each, destroy tree), which is readily implementable on top of the radix tree. A few small changes were needed in order to use a tag to represent nodes with free space below them. More extensive changes were needed to support storing NULL as a valid entry in an IDR. Plain radix trees still interpret NULL as a not-present entry. The IDA is reimplemented as a client of the newly enhanced radix tree. As in the current implementation, it uses a bitmap at the last level of the tree. Signed-off-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
# a2ef9471	12-Dec-2016	Daniel Vetter <daniel.vetter@ffwll.ch>	lib/ida: document locking requirements a bit better I wanted to wrap a bunch of ida_simple_get calls into their own locking, until I dug around and read the original commit message. Stuff like this should imo be added to the kernel doc, let's do that. Link: http://lkml.kernel.org/r/20161027072216.20411-1-daniel.vetter@ffwll.ch Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# d0164adc	06-Nov-2015	Mel Gorman <mgorman@techsingularity.net>	mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd __GFP_WAIT has been used to identify atomic context in callers that hold spinlocks or are in interrupts. They are expected to be high priority and have access one of two watermarks lower than "min" which can be referred to as the "atomic reserve". __GFP_HIGH users get access to the first lower watermark and can be called the "high priority reserve". Over time, callers had a requirement to not block when fallback options were available. Some have abused __GFP_WAIT leading to a situation where an optimisitic allocation with a fallback option can access atomic reserves. This patch uses __GFP_ATOMIC to identify callers that are truely atomic, cannot sleep and have no alternative. High priority users continue to use __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify callers that want to wake kswapd for background reclaim. __GFP_WAIT is redefined as a caller that is willing to enter direct reclaim and wake kswapd for background reclaim. This patch then converts a number of sites o __GFP_ATOMIC is used by callers that are high priority and have memory pools for those requests. GFP_ATOMIC uses this flag. o Callers that have a limited mempool to guarantee forward progress clear __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall into this category where kswapd will still be woken but atomic reserves are not used as there is a one-entry mempool to guarantee progress. o Callers that are checking if they are non-blocking should use the helper gfpflags_allow_blocking() where possible. This is because checking for __GFP_WAIT as was done historically now can trigger false positives. Some exceptions like dm-crypt.c exist where the code intent is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to flag manipulations. o Callers that built their own GFP flags instead of starting with GFP_KERNEL and friends now also need to specify __GFP_KSWAPD_RECLAIM. The first key hazard to watch out for is callers that removed __GFP_WAIT and was depending on access to atomic reserves for inconspicuous reasons. In some cases it may be appropriate for them to use __GFP_HIGH. The second key hazard is callers that assembled their own combination of GFP flags instead of starting with something like GFP_KERNEL. They may now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless if it's missed in most cases as other activity will wake kswapd. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 87d1d169	12-Feb-2015	Rasmus Villemoes <linux@rasmusvillemoes.dk>	lib/idr.c: remove redundant include idr.c doesn't seem to use anything from hardirq.h (or anything included from that). Removing it produces identical objdump -d output, and gives 44 fewer lines in the .idr.o.cmd dependency file. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# da3dae54	08-Sep-2014	Masanari Iida <standby24x7@gmail.com>	Documentation: Docbook: Fix generated DocBook/kernel-api.xml This patch fix spelling typo found in DocBook/kernel-api.xml. It is because the file is generated from the source comments, I have to fix the comments in source codes. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
# 93b7aca3	08-Aug-2014	Andrey Ryabinin <ryabinin.a.a@gmail.com>	lib/idr.c: fix out-of-bounds pointer dereference I'm working on address sanitizer project for kernel. Recently we started experiments with stack instrumentation, to detect out-of-bounds read/write bugs on stack. Just after booting I've hit out-of-bounds read on stack in idr_for_each (and in __idr_remove_all as well): struct idr_layer *paa = &pa[0]; while (id >= 0 && id <= max) { ... while (n < fls(id)) { n += IDR_BITS; p = --paa; <--- here we are reading pa[-1] value. } } Despite the fact that after this dereference we are exiting out of loop and never use p, such behaviour is undefined and should be avoided. Fix this by moving pointer derference to the beggining of the loop, right before we will use it. Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com> Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Alexey Preobrazhensky <preobr@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 15f3ec3f	06-Jun-2014	Lai Jiangshan <laijs@cn.fujitsu.com>	idr: reduce the unneeded check in free_layer() If "idr->hint == p" is true, it also implies "idr->hint" is true(not NULL). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# aefb7682	06-Jun-2014	Lai Jiangshan <laijs@cn.fujitsu.com>	idr: don't need to shink the free list when idr_remove() After idr subsystem is changed to RCU-awared, the free layer will not go to the free list. The free list will not be filled up when idr_remove(). So we don't need to shink it too. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# b93804b2	06-Jun-2014	Lai Jiangshan <laijs@cn.fujitsu.com>	idr: fix idr_replace()'s returned error code When the smaller id is not found, idr_replace() returns -ENOENT. But when the id is bigger enough, idr_replace() returns -EINVAL, actually there is no difference between these two kinds of ids. These are all unallocated id, the return values of the idr_replace() for these ids should be the same: -ENOENT. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# aef0f62e	06-Jun-2014	Lai Jiangshan <laijs@cn.fujitsu.com>	idr: fix NULL pointer dereference when ida_remove(unallocated_id) If the ida has at least one existing id, and when an unallocated ID which meets a certain condition is passed to the ida_remove(), the system will crash because it hits NULL pointer dereference. The condition is that the unallocated ID shares the same lowest idr layer with the existing ID, but the idr slot would be different if the unallocated ID were to be allocated. In this case the matching idr slot for the unallocated_id is NULL, causing @bitmap to be NULL which the function dereferences without checking crashing the kernel. See the test code: static void test3(void) { int id; DEFINE_IDA(test_ida); printk(KERN_INFO "Start test3\n"); if (ida_pre_get(&test_ida, GFP_KERNEL) < 0) return; if (ida_get_new(&test_ida, &id) < 0) return; ida_remove(&test_ida, 4000); /* bug: null deference here */ printk(KERN_INFO "End of test3\n"); } It happens only when the caller tries to free an unallocated ID which is the caller's fault. It is not a bug. But it is better to add the proper check and complain rather than crashing the kernel. [tj@kernel.org: updated patch description] Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 8f9f665a	06-Jun-2014	Lai Jiangshan <laijs@cn.fujitsu.com>	idr: fix unexpected ID-removal when idr_remove(unallocated_id) If unallocated_id = (ANY * idr_max(idp->layers) + existing_id) is passed to idr_remove(). The existing_id will be removed unexpectedly. The following test shows this unexpected id-removal: static void test4(void) { int id; DEFINE_IDR(test_idr); printk(KERN_INFO "Start test4\n"); id = idr_alloc(&test_idr, (void )1, 42, 43, GFP_KERNEL); BUG_ON(id != 42); idr_remove(&test_idr, 42 + IDR_SIZE); TEST_BUG_ON(idr_find(&test_idr, 42) != (void )1); idr_destroy(&test_idr); printk(KERN_INFO "End of test4\n"); } ida_remove() shares the similar problem. It happens only when the caller tries to free an unallocated ID which is the caller's fault. It is not a bug. But it is better to add the proper check and complain rather than removing an existing_id silently. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 3afb69cb	06-Jun-2014	Lai Jiangshan <laijs@cn.fujitsu.com>	idr: fix overflow bug during maximum ID calculation at maximum height idr_replace() open-codes the logic to calculate the maximum valid ID given the height of the idr tree; unfortunately, the open-coded logic doesn't account for the fact that the top layer may have unused slots and over-shifts the limit to zero when the tree is at its maximum height. The following test code shows it fails to replace the value for id=((1<<27)+42): static void test5(void) { int id; DEFINE_IDR(test_idr); #define TEST5_START ((1<<27)+42) /* use the highest layer / printk(KERN_INFO "Start test5\n"); id = idr_alloc(&test_idr, (void )1, TEST5_START, 0, GFP_KERNEL); BUG_ON(id != TEST5_START); TEST_BUG_ON(idr_replace(&test_idr, (void )2, TEST5_START) != (void )1); idr_destroy(&test_idr); printk(KERN_INFO "End of test5\n"); } Fix the bug by using idr_max() which correctly takes into account the maximum allowed shift. sub_alloc() shares the same problem and may incorrectly fail with -EAGAIN; however, this bug doesn't affect correct operation because idr_get_empty_slot(), which already uses idr_max(), retries with the increased @id in such cases. [tj@kernel.org: Updated patch description.] Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 3f59b067	07-Apr-2014	Monam Agarwal <monamagarwal123@gmail.com>	lib/idr.c: use RCU_INIT_POINTER(x, NULL) Replace rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL) The rcu_assign_pointer() ensures that the initialization of a structure is carried out before storing a pointer to that structure. And in the case of the NULL pointer, there is no structure to initialize. So, rcu_assign_pointer(p, NULL) can be safely converted to RCU_INIT_POINTER(p, NULL) Signed-off-by: Monam Agarwal <monamagarwal123@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 90ae3ae5	07-Apr-2014	Stephen Hemminger <stephen@networkplumber.org>	idr: remove dead code Remove no longer used deprecated code, and make local functions static. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Jean Delvare <jdelvare@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Cc: Jeff Layton <jlayton@redhat.com> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: George Spelvin <linux@horizon.com> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 05f7a7d6	08-Aug-2011	Andreas Gruenbacher <agruen@linbit.com>	idr: Add new function idr_is_empty() Signed-off-by: Andreas Gruenbacher <agruen@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
# dd04b452	03-Jul-2013	Jean Delvare <jdelvare@suse.de>	idr: print a stack dump after ida_remove warning We print a dump stack after idr_remove warning. This is useful to find the faulty piece of code. Let's do the same for ida_remove, as it would be equally useful there. [akpm@linux-foundation.org: convert the open-coded printk+dump_stack into WARN()] Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Tejun Heo <tj@kernel.org> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 3e6628c4	29-Apr-2013	Jeff Layton <jlayton@kernel.org>	idr: introduce idr_alloc_cyclic() As Tejun points out, there are several users of the IDR facility that attempt to use it in a cyclic fashion. These users are likely to see -ENOSPC errors after the counter wraps one or more times however. This patchset adds a new idr_alloc_cyclic routine and converts several of these users to it. Many of these users are in obscure parts of the kernel, and I don't have a good way to test some of them. The change is pretty straightforward though, so hopefully it won't be an issue. There is one other cyclic user of idr_alloc that I didn't touch in ipc/util.c. That one is doing some strange stuff that I didn't quite understand, but it looks like it should probably be converted later somehow. This patch: Thus spake Tejun Heo: Ooh, BTW, the cyclic allocation is broken. It's prone to -ENOSPC after the first wraparound. There are several cyclic users in the kernel and I think it probably would be best to implement cyclic support in idr. This patch does that by adding new idr_alloc_cyclic function that such users in the kernel can use. With this, there's no need for a caller to keep track of the last value used as that's now tracked internally. This should prevent the ENOSPC problems that can hit when the "last allocated" counter exceeds INT_MAX. Later patches will convert existing cyclic users to the new interface. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Tejun Heo <tj@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Eric Paris <eparis@parisplace.org> Cc: Jack Morgenstein <jackm@dev.mellanox.co.il> Cc: John McCutchan <john@johnmccutchan.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Robert Love <rlove@rlove.org> Cc: Roland Dreier <roland@purestorage.com> Cc: Sridhar Samudrala <sri@us.ibm.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Tom Tucker <tom@opengridcomputing.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 59bfbcf0	13-Mar-2013	Tejun Heo <tj@kernel.org>	idr: idr_alloc() shouldn't trigger lowmem warning when preloaded GFP_NOIO is often used for idr_alloc() inside preloaded section as the allocation mask doesn't really matter. If the idr tree needs to be expanded, idr_alloc() first tries to allocate using the specified allocation mask and if it fails falls back to the preloaded buffer. This order prevent non-preloading idr_alloc() users from taking advantage of preloading ones by using preload buffer without filling it shifting the burden of allocation to the preload users. Unfortunately, this allowed/expected-to-fail kmem_cache allocation ends up generating spurious slab lowmem warning before succeeding the request from the preload buffer. This patch makes idr_layer_alloc() add __GFP_NOWARN to the first kmem_cache attempt and try kmem_cache again w/o __GFP_NOWARN after allocation from preload_buffer fails so that lowmem warning is generated if not suppressed by the original @gfp_mask. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: David Teigland <teigland@redhat.com> Tested-by: David Teigland <teigland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# c8615d37	13-Mar-2013	Tejun Heo <tj@kernel.org>	idr: deprecate idr_pre_get() and idr_get_new[_above]() Now that all in-kernel users are converted to ues the new alloc interface, mark the old interface deprecated. We should be able to remove these in a few releases. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 5857f70c	04-Mar-2013	Randy Dunlap <rdunlap@infradead.org>	idr: fix new kernel-doc warnings Fix new kernel-doc warnings in idr: Warning(include/linux/idr.h:113): No description found for parameter 'idr' Warning(include/linux/idr.h:113): Excess function parameter 'idp' description in 'idr_find' Warning(lib/idr.c:232): Excess function parameter 'id' description in 'sub_alloc' Warning(lib/idr.c:232): Excess function parameter 'id' description in 'sub_alloc' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 2e1c9b28	08-Mar-2013	Tejun Heo <tj@kernel.org>	idr: remove WARN_ON_ONCE() on negative IDs idr_find(), idr_remove() and idr_replace() used to silently ignore the sign bit and perform lookup with the rest of the bits. The weird behavior has been changed such that negative IDs are treated as invalid. As the behavior change was subtle, WARN_ON_ONCE() was added in the hope of determining who's calling idr functions with negative IDs so that they can be examined for problems. Up until now, all two reported cases are ID number coming directly from userland and getting fed into idr_find() and the warnings seem to cause more problems than being helpful. Drop the WARN_ON_ONCE()s. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: <markus@trippelsdorf.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 7175c61c	27-Feb-2013	Tejun Heo <tj@kernel.org>	idr: explain WARN_ON_ONCE() on negative IDs out-of-range ID Until recently, when an negative ID is specified, idr functions used to ignore the sign bit and proceeded with the operation with the rest of bits, which is bizarre and error-prone. The behavior recently got changed so that negative IDs are treated as invalid but we're triggering WARN_ON_ONCE() on negative IDs just in case somebody was depending on the sign bit being ignored, so that those can be detected and fixed easily. We only need this for a while. Explain why WARN_ON_ONCE()s are there and that they can be removed later. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 0ffc2a9c	27-Feb-2013	Tejun Heo <tj@kernel.org>	idr: implement lookup hint While idr lookup isn't a particularly heavy operation, it still is too substantial to use in hot paths without worrying about the performance implications. With recent changes, each idr_layer covers 256 slots which should be enough to cover most use cases with single idr_layer making lookup hint very attractive. This patch adds idr->hint which points to the idr_layer which allocated an ID most recently and the fast path lookup becomes if (look up target's prefix matches that of the hinted layer) return hint->ary[ID's offset in the leaf layer]; which can be inlined. idr->hint is set to the leaf node on idr_fill_slot() and cleared from free_layer(). [andriy.shevchenko@linux.intel.com: always do slow path when hint is uninitialized] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 54616283	27-Feb-2013	Tejun Heo <tj@kernel.org>	idr: add idr_layer->prefix Add a field which carries the prefix of ID the idr_layer covers. This will be used to implement lookup hint. This patch doesn't make use of the new field and doesn't introduce any behavior difference. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 1d9b2e1e	27-Feb-2013	Tejun Heo <tj@kernel.org>	idr: remove length restriction from idr_layer->bitmap Currently, idr->bitmap is declared as an unsigned long which restricts the number of bits an idr_layer can contain. All bitops can handle arbitrary positive integer bit number and there's no reason for this restriction. Declare idr_layer->bitmap using DECLARE_BITMAP() instead of a single unsigned long. * idr_layer->bitmap is now an array. '&' dropped from params to bitops. * Replaced "== IDR_FULL" tests with bitmap_full() and removed IDR_FULL. * Replaced find_next_bit() on ~bitmap with find_next_zero_bit(). * Replaced "bitmap = 0" with bitmap_clear(). This patch doesn't (or at least shouldn't) introduce any behavior changes. [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# e8c8d1bc	27-Feb-2013	Tejun Heo <tj@kernel.org>	idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.c MAX_IDR_MASK is another weirdness in the idr interface. As idr covers whole positive integer range, it's defined as 0x7fffffff or INT_MAX. Its usage in idr_find(), idr_replace() and idr_remove() is bizarre. They basically mask off the sign bit and operate on the rest, so if the caller, by accident, passes in a negative number, the sign bit will be masked off and the remaining part will be used as if that was the input, which is worse than crashing. The constant is visible in idr.h and there are several users in the kernel. * drivers/i2c/i2c-core.c:i2c_add_numbered_adapter() Basically used to test if adap->nr is a negative number which isn't -1 and returns -EINVAL if so. idr_alloc() already has negative @start checking (w/ WARN_ON_ONCE), so this can go away. * drivers/infiniband/core/cm.c:cm_alloc_id() drivers/infiniband/hw/mlx4/cm.c:id_map_alloc() Used to wrap cyclic @start. Can be replaced with max(next, 0). Note that this type of cyclic allocation using idr is buggy. These are prone to spurious -ENOSPC failure after the first wraparound. * fs/super.c:get_anon_bdev() The ID allocated from ida is masked off before being tested whether it's inside valid range. ida allocated ID can never be a negative number and the masking is unnecessary. Update idr_() functions to fail with -EINVAL when negative @id is specified and update other MAX_IDR_MASK users as described above. This leaves MAX_IDR_MASK without any user, remove it and relocate other MAX_IDR_ constants to lib/idr.c. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jean Delvare <khali@linux-fr.org> Cc: Roland Dreier <roland@kernel.org> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: "Marciniszyn, Mike" <mike.marciniszyn@intel.com> Cc: Jack Morgenstein <jackm@dev.mellanox.co.il> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Wolfram Sang <wolfram@the-dreams.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 326cf0f0	27-Feb-2013	Tejun Heo <tj@kernel.org>	idr: fix top layer handling Most functions in idr fail to deal with the high bits when the idr tree grows to the maximum height. * idr_get_empty_slot() stops growing idr tree once the depth reaches MAX_IDR_LEVEL - 1, which is one depth shallower than necessary to cover the whole range. The function doesn't even notice that it didn't grow the tree enough and ends up allocating the wrong ID given sufficiently high @starting_id. For example, on 64 bit, if the starting id is 0x7fffff01, idr_get_empty_slot() will grow the tree 5 layer deep, which only covers the 30 bits and then proceed to allocate as if the bit 30 wasn't specified. It ends up allocating 0x3fffff01 without the bit 30 but still returns 0x7fffff01. * __idr_remove_all() will not remove anything if the tree is fully grown. * idr_find() can't find anything if the tree is fully grown. * idr_for_each() and idr_get_next() can't iterate anything if the tree is fully grown. Fix it by introducing idr_max() which returns the maximum possible ID given the depth of tree and replacing the id limit checks in all affected places. As the idr_layer pointer array pa[] needs to be 1 larger than the maximum depth, enlarge pa[] arrays by one. While this plugs the discovered issues, the whole code base is horrible and in desparate need of rewrite. It's fragile like hell, Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# d5c7409f	27-Feb-2013	Tejun Heo <tj@kernel.org>	idr: implement idr_preload[_end]() and idr_alloc() The current idr interface is very cumbersome. * For all allocations, two function calls - idr_pre_get() and idr_get_new() - should be made. idr_pre_get() doesn't guarantee that the following idr_get_new() will not fail from memory shortage. If idr_get_new() returns -EAGAIN, the caller is expected to retry pre_get and allocation. * idr_get_new() can't enforce upper limit. Upper limit can only be enforced by allocating and then freeing if above limit. idr_layer buffer is unnecessarily per-idr. Each idr ends up keeping around MAX_IDR_FREE idr_layers. The memory consumed per idr is under two pages but it makes it difficult to make idr_layer larger. This patch implements the following new set of allocation functions. * idr_preload[_end]() - Similar to radix preload but doesn't fail. The first idr_alloc() inside preload section can be treated as if it were called with @gfp_mask used for idr_preload(). * idr_alloc() - Allocate an ID w/ lower and upper limits. Takes @gfp_flags and can be used w/o preloading. When used inside preloaded section, the allocation mask of preloading can be assumed. If idr_alloc() can be called from a context which allows sufficiently relaxed @gfp_mask, it can be used by itself. If, for example, idr_alloc() is called inside spinlock protected region, preloading can be used like the following. idr_preload(GFP_KERNEL); spin_lock(lock); id = idr_alloc(idr, ptr, start, end, GFP_NOWAIT); spin_unlock(lock); idr_preload_end(); if (id < 0) error; which is much simpler and less error-prone than idr_pre_get and idr_get_new*() loop. The new interface uses per-pcu idr_layer buffer and thus the number of idr's in the system doesn't affect the amount of memory used for preloading. idr_layer_alloc() is introduced to handle idr_layer allocations for both old and new ID allocation paths. This is a bit hairy now but the new interface is expected to replace the old and the internal implementation eventually will become simpler. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 3594eb28	27-Feb-2013	Tejun Heo <tj@kernel.org>	idr: refactor idr_get_new_above() Move slot filling to idr_fill_slot() from idr_get_new_above_int() and make idr_get_new_above() directly call it. idr_get_new_above_int() is no longer needed and removed. This will be used to implement a new ID allocation interface. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 12d1b439	27-Feb-2013	Tejun Heo <tj@kernel.org>	idr: remove _idr_rc_to_errno() hack idr uses -1, IDR_NEED_TO_GROW and IDR_NOMORE_SPACE to communicate exception conditions internally. The return value is later translated to errno values using _idr_rc_to_errno(). This is confusing. Drop the custom ones and consistently use -EAGAIN for "tree needs to grow", -ENOMEM for "need more memory" and -ENOSPC for "ran out of ID space". Due to the weird memory preloading mechanism, [ra]_get_new*() return -EAGAIN on memory shortage, so we need to substitute -ENOMEM w/ -EAGAIN on those interface functions. They'll eventually be cleaned up and the translations will go away. This patch doesn't introduce any functional changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 49038ef4	27-Feb-2013	Tejun Heo <tj@kernel.org>	idr: relocate idr_for_each_entry() and reorganize id[r\|a]_get_new() * Move idr_for_each_entry() definition next to other idr related definitions. * Make id[r\|a]_get_new() inline wrappers of id[r\|a]_get_new_above(). This changes the implementation of idr_get_new() but the new implementation is trivial. This patch doesn't introduce any functional change. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# fe6e24ec	27-Feb-2013	Tejun Heo <tj@kernel.org>	idr: deprecate idr_remove_all() There was only one legitimate use of idr_remove_all() and a lot more of incorrect uses (or lack of it). Now that idr_destroy() implies idr_remove_all() and all the in-kernel users updated not to use it, there's no reason to keep it around. Mark it deprecated so that we can later unexport it. idr_remove_all() is made an inline function calling __idr_remove_all() to avoid triggering deprecated warning on EXPORT_SYMBOL(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 9bb26bc1	27-Feb-2013	Tejun Heo <tj@kernel.org>	idr: make idr_destroy() imply idr_remove_all() idr is silly in quite a few ways, one of which is how it's supposed to be destroyed - idr_destroy() doesn't release IDs and doesn't even whine if the idr isn't empty. If the caller forgets idr_remove_all(), it simply leaks memory. Even ida gets this wrong and leaks memory on destruction. There is absoltely no reason not to call idr_remove_all() from idr_destroy(). Nobody is abusing idr_destroy() for shrinking free layer buffer and continues to use idr after idr_destroy(), so it's safe to do remove_all from destroy. In the whole kernel, there is only one place where idr_remove_all() is legitimiately used without following idr_destroy() while there are quite a few places where the caller forgets either idr_remove_all() or idr_destroy() leaking memory. This patch makes idr_destroy() call idr_destroy_all() and updates the function description accordingly. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 6cdae741	27-Feb-2013	Tejun Heo <tj@kernel.org>	idr: fix a subtle bug in idr_get_next() The iteration logic of idr_get_next() is borrowed mostly verbatim from idr_for_each(). It walks down the tree looking for the slot matching the current ID. If the matching slot is not found, the ID is incremented by the distance of single slot at the given level and repeats. The implementation assumes that during the whole iteration id is aligned to the layer boundaries of the level closest to the leaf, which is true for all iterations starting from zero or an existing element and thus is fine for idr_for_each(). However, idr_get_next() may be given any point and if the starting id hits in the middle of a non-existent layer, increment to the next layer will end up skipping the same offset into it. For example, an IDR with IDs filled between [64, 127] would look like the following. [ 0 64 ... ] /----/ \| \| \| NULL [ 64 ... 127 ] If idr_get_next() is called with 63 as the starting point, it will try to follow down the pointer from 0. As it is NULL, it will then try to proceed to the next slot in the same level by adding the slot distance at that level which is 64 - making the next try 127. It goes around the loop and finds and returns 127 skipping [64, 126]. Note that this bug also triggers in idr_for_each_entry() loop which deletes during iteration as deletions can make layers go away leaving the iteration with unaligned ID into missing layers. Fix it by ensuring proceeding to the next slot doesn't carry over the unaligned offset - ie. use round_up(id + 1, slot_distance) instead of id += slot_distance. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: David Teigland <teigland@redhat.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 125c4c70	04-Oct-2012	Fengguang Wu <fengguang.wu@intel.com>	idr: rename MAX_LEVEL to MAX_IDR_LEVEL To avoid name conflicts: drivers/video/riva/fbdev.c:281:9: sparse: preprocessor token MAX_LEVEL redefined While at it, also make the other names more consistent and add parentheses. [akpm@linux-foundation.org: repair fallout] [sfr@canb.auug.org.au: IB/mlx4: fix for MAX_ID_MASK to MAX_IDR_MASK name change] Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at> Cc: walter harms <wharms@bfs.de> Cc: Glauber Costa <glommer@parallels.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Roland Dreier <roland@purestorage.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 9f7de827	21-Mar-2012	Hugh Dickins <hughd@google.com>	idr: make idr_get_next() good for rcu_read_lock() Make one small adjustment to idr_get_next(): take the height from the top layer (stable under RCU) instead of from the root (unprotected by RCU), as idr_find() does: so that it can be used with RCU locking. Copied comment on RCU locking from idr_find(). Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 8bc3bcc9	16-Nov-2011	Paul Gortmaker <paul.gortmaker@windriver.com>	lib: reduce the use of module.h wherever possible For files only using THIS_MODULE and/or EXPORT_SYMBOL, map them onto including export.h -- or if the file isn't even using those, then just delete the include. Fix up any implicit include dependencies that were being masked by module.h along the way. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
# 46cbc1d3	02-Nov-2011	Tejun Heo <tj@kernel.org>	ida: make ida_simple_get/put() IRQ safe It's often convenient to be able to release resource from IRQ context. Make ida_simple_*() use irqsave/restore spin ops so that they are IRQ safe. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# e3816c54	31-Oct-2011	Wang Sheng-Hui <shhuiw@gmail.com>	lib/idr.c: fix comment for ida_get_new_above() Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# f5c3dd71	03-Aug-2011	Paul Bolle <pebolle@tiscali.nl>	Fix kernel-doc comment typo '@id' Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
# 88eca020	03-Aug-2011	Rusty Russell <rusty@rustcorp.com.au>	ida: simplified functions for id allocation The current hyper-optimized functions are overkill if you simply want to allocate an id for a device. Create versions which use an internal lock. In followup patches, numerous drivers are converted to use this interface. Thanks to Tejun for feedback. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Jonathan Cameron <jic23@cam.ac.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 56083ab1	26-Oct-2010	Randy Dunlap <randy.dunlap@oracle.com>	docbook: add idr/ida to kernel-api docbook Add idr/ida to kernel-api docbook. Fix typos and kernel-doc notation. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Naohiro Aota <naota@elisp.net> Cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 066a9be6	26-Oct-2010	Naohiro Aota <naota@elisp.net>	idr: fix idr_pre_get() locking description Despite the idr_pre_get() kernel-doc, there are some cases where you can call idr_pre_get() from within locked regions. Add a description for such cases. See also: http://lkml.org/lkml/2010/9/16/462 [akpm@linux-foundation.org: cleanups, grammatical fixes] Signed-off-by: Naohiro Aota <naota@elisp.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 1458ce16	27-Aug-2010	Naohiro Aota <naota@elisp.net>	idr: describe how nextidp works in idr_get_next(). It was unclear in original kernel-doc how nextidp worked in idr_get_next(). Let's describe it. Signed-off-by: Naohiro Aota <naota@elisp.net> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
# ea24ea850	30-Aug-2010	Naohiro Aota <naota@elisp.net>	idr: fix kernel-doc warnings. Fix the following kernel-doc warnings. % perl scripts/kernel-doc lib/idr.c > /dev/null Warning(lib/idr.c:300): No description found for parameter 'starting_id' Warning(lib/idr.c:300): Excess function parameter 'start_id' description in 'idr_get_new_above' Warning(lib/idr.c:485): No description found for parameter 'idp' Warning(lib/idr.c:596): No description found for parameter 'nextidp' Warning(lib/idr.c:596): Excess function parameter 'id' description in 'idr_get_next' Warning(lib/idr.c:774): No description found for parameter 'starting_id' Warning(lib/idr.c:774): Excess function parameter 'staring_id' description in 'ida_get_new_above' Warning(lib/idr.c:918): No description found for parameter 'ida' Signed-off-by: Naohiro Aota <naota@elisp.net> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
# 94bfa3b6	07-Jun-2010	Paul E. McKenney <paulmck@kernel.org>	idr: fix RCU lockdep splat in idr_get_next() Convert to rcu_dereference_raw() given that many callers may have many different locking models. Located-by: Miles Lane <miles.lane@gmail.com> Tested-by: Miles Lane <miles.lane@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# 2dcb22b3	26-May-2010	Imre Deak <imre.deak@nokia.com>	idr: fix backtrack logic in idr_remove_all Currently idr_remove_all will fail with a use after free error if idr::layers is bigger than 2, which on 32 bit systems corresponds to items more than 1024. This is due to stepping back too many levels during backtracking. For simplicity let's assume that IDR_BITS=1 -> we have 2 nodes at each level below the root node and each leaf node stores two IDs. (In reality for 32 bit systems IDR_BITS=5, with 32 nodes at each sub-root level and 32 IDs in each leaf node). The sequence of freeing the nodes at the moment is as follows: layer 1 -> a(7) 2 -> b(3) c(5) 3 -> d(1) e(2) f(4) g(6) Until step 4 things go fine, but then node c is freed, whereas node g should be freed first. Since node c contains the pointer to node g we'll have a use after free error at step 6. How many levels we step back after visiting the leaf nodes is currently determined by the msb of the id we are currently visiting: Step 1. node d with IDs 0,1 is freed, current ID is advanced to 2. msb of the current ID bit 1. This means we need to step back 1 level to node b and take the next sibling, node e. 2-3. node e with IDs 2,3 is freed, current ID is 4, msb is bit 2. This means we need to step back 2 levels to node a, freeing node b on the way. 4-5. node f with IDs 4,5 is freed, current ID is 6, msb is still bit 2. This means we again need to step back 2 levels to node a and free c on the way. 6. We should visit node g, but its pointer is not available as node c was freed. The fix changes how we determine the number of levels to step back. Instead of deducting this merely from the msb of the current ID, we should really check if advancing the ID causes an overflow to a bit position corresponding to a given layer. In the above example overflow from bit 0 to bit 1 should mean stepping back 1 level. Overflow from bit 1 to bit 2 should mean stepping back 2 levels and so on. The fix was tested with IDs up to 1 << 20, which corresponds to 4 layers on 32 bit systems. Signed-off-by: Imre Deak <imre.deak@nokia.com> Reviewed-by: Tejun Heo <tj@kernel.org> Cc: Eric Paris <eparis@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: <stable@kernel.org> [2.6.34.1] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 4d1ee80f	29-Jan-2010	Ben Hutchings <bhutchings@solarflare.com>	idr: export idr_get_next() idr_get_next() was accidentally not exported when added. It is about to be used by mtdcore, which may be built as a module. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
# 96be753a	22-Feb-2010	Paul E. McKenney <paulmck@kernel.org>	idr: Apply lockdep-based diagnostics to rcu_dereference() uses Because idr can be used with any of a number of locks or with any flavor of RCU, just disable the lockdep-based diagnostics. If idr needs diagnostics, the check expression will need to be passed into the relevant idr primitives as an additional argument. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1266887105-1528-11-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
# d2e7276b	22-Feb-2010	Tejun Heo <tj@kernel.org>	idr: fix a critical misallocation bug, take#2 This is retry of reverted 859ddf09743a8cc680af33f7259ccd0fd36bfe9d ("idr: fix a critical misallocation bug") which contained two bugs. * pa[idp->layers] should be cleared even if it's not used by sub_alloc() because it's used by mark idr_mark_full(). * The original condition check also assigned pa[l] to p which the new code didn't do thus leaving p pointing at the wrong layer. Both problems have been fixed and the idr code has received good amount testing using userland testing setup where simple bitmap allocator is run parallel to verify the result of idr allocation. The bug this patch fixes is caused by sub_alloc() optimization path bypassing out-of-room condition check and restarting allocation loop with starting value higher than maximum allowed value. For detailed description, please read commit message of 859ddf09. Signed-off-by: Tejun Heo <tj@kernel.org> Based-on-patch-from: Eric Paris <eparis@redhat.com> Reported-by: Eric Paris <eparis@redhat.com> Tested-by: Stefan Lippers-Hollmann <s.l-h@gmx.de> Tested-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 6f14a668	04-Feb-2010	Tejun Heo <tj@kernel.org>	idr: revert misallocation bug fix Commit 859ddf09743a8cc680af33f7259ccd0fd36bfe9d tried to fix misallocation bug but broke full bit marking by not clearing pa[idp->layers] and also is causing X failures due to lookup failure in drm code. The cause of the latter hasn't been found yet. Revert the fix for now. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 859ddf09	02-Feb-2010	Tejun Heo <tj@kernel.org>	idr: fix a critical misallocation bug Eric Paris located a bug in idr. With IDR_BITS of 6, it grows to three layers when id 4096 is first allocated. When that happens, idr wraps incorrectly and searches the idr array ignoring the high bits. The following test code from Eric demonstrates the bug nicely. #include <linux/idr.h> #include <linux/kernel.h> #include <linux/module.h> static DEFINE_IDR(test_idr); int init_module(void) { int ret, forty95, forty96; void addr; / add 2 entries both with 4095 as the start address / again1: if (!idr_pre_get(&test_idr, GFP_KERNEL)) return -ENOMEM; ret = idr_get_new_above(&test_idr, (void )4095, 4095, &forty95); if (ret) { if (ret == -EAGAIN) goto again1; return ret; } if (forty95 != 4095) printk(KERN_ERR "hmmm, forty95=%d\n", forty95); again2: if (!idr_pre_get(&test_idr, GFP_KERNEL)) return -ENOMEM; ret = idr_get_new_above(&test_idr, (void )4096, 4095, &forty96); if (ret) { if (ret == -EAGAIN) goto again2; return ret; } if (forty96 != 4096) printk(KERN_ERR "hmmm, forty96=%d\n", forty96); / try to find the 2 entries, noticing that 4096 broke / addr = idr_find(&test_idr, forty95); if ((int)addr != forty95) printk(KERN_ERR "hmmm, after find forty95=%d addr=%d\n", forty95, (int)addr); addr = idr_find(&test_idr, forty96); if ((int)addr != forty96) printk(KERN_ERR "hmmm, after find forty96=%d addr=%d\n", forty96, (int)addr); / really weird, the entry which should be at 4096 is actually at 0!! */ addr = idr_find(&test_idr, 0); if ((int)addr) printk(KERN_ERR "found an entry at id=0 for addr=%d\n", (int)addr); idr_remove(&test_idr, forty95); idr_remove(&test_idr, forty96); return 0; } void cleanup_module(void) { } MODULE_AUTHOR("Eric Paris <eparis@redhat.com>"); MODULE_DESCRIPTION("Simple idr test"); MODULE_LICENSE("GPL"); This happens because when sub_alloc() back tracks it doesn't always do it step-by-step while the over-the-limit detection assumes step-by-step backtracking. The logic in sub_alloc() looks like the following. restart: clear pa[top level + 1] for end cond detection l = top level while (true) { search for empty slot at this level if (not found) { push id to the next possible value l++ A: if (pa[l] is clear) failed, return asking caller to grow the tree if (going up 1 level gives more slots to search) continue the while loop above with the incremented l else C: goto restart } adjust id accordingly to the found slot if (l == 0) return found id; create lower level if not there yet record pa[l] and l-- } Test A is the fail exit condition but this assumes that failure is propagated upwared one level at a time but the B optimization path breaks the assumption and restarts the whole thing with a start value which is above the possible limit with the current layers. sub_alloc() assumes the start id value is inside the limit when called and test A is the only exit condition check, so it ends up searching for empty slot while ignoring high set bit. So, for 4095->4096 test, level0 search fails but pa[1] contains a valid pointer. However, going up 1 level wouldn't give any more empty slot so it takes C and when the whole thing restarts nobody notices the high bit set beyond the top level. This patch fixes the bug by changing the fail exit condition check to full id limit check. Based-on-patch-from: Eric Paris <eparis@redhat.com> Reported-by: Eric Paris <eparis@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 94e2bd68	16-Oct-2009	Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>	tree-wide: fix some typos and punctuation in comments fix some typos and punctuation in comments Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
# 38460b48	02-Apr-2009	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>	cgroup: CSS ID support Patch for Per-CSS(Cgroup Subsys State) ID and private hierarchy code. This patch attaches unique ID to each css and provides following. - css_lookup(subsys, id) returns pointer to struct cgroup_subysys_state of id. - css_get_next(subsys, id, rootid, depth, foundid) returns the next css under "root" by scanning When cgroup_subsys->use_id is set, an id for css is maintained. The cgroup framework only parepares - css_id of root css for subsys - id is automatically attached at creation of css. - id is not freed automatically. Because the cgroup framework don't know lifetime of cgroup_subsys_state. free_css_id() function is provided. This must be called by subsys. There are several reasons to develop this. - Saving space .... For example, memcg's swap_cgroup is array of pointers to cgroup. But it is not necessary to be very fast. By replacing pointers(8bytes per ent) to ID (2byes per ent), we can reduce much amount of memory usage. - Scanning without lock. CSS_ID provides "scan id under this ROOT" function. By this, scanning css under root can be written without locks. ex) do { rcu_read_lock(); next = cgroup_get_next(subsys, id, root, &found); /* check sanity of next here */ css_tryget(); rcu_read_unlock(); id = found + 1 } while(...) Characteristics: - Each css has unique ID under subsys. - Lifetime of ID is controlled by subsys. - css ID contains "ID" and "Depth in hierarchy" and stack of hierarchy - Allowed ID is 1-65535, ID 0 is UNUSED ID. Design Choices: - scan-by-ID v.s. scan-by-tree-walk. As /proc's pid scan does, scan-by-ID is robust when scanning is done by following kind of routine. scan -> rest a while(release a lock) -> conitunue from interrupted memcg's hierarchical reclaim does this. - When subsys->use_id is set, # of css in the system is limited to 65535. [bharata@linux.vnet.ibm.com: remove rcu_read_lock() from css_get_next()] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 1b23336a	10-Mar-2009	Paul E. McKenney <paulmck@kernel.org>	idr: make idr_remove_all() do removal -before- free_layer() Fix a problem in the IDR system, where an idr_remove_all() hands a data element to call_rcu() (via free_layer()) before making that data element inaccessible to new readers. This is very bad, and results in readers still having a reference to this data element at the end of the grace period. Tests on large machines that concurrently map and unmap user-space memory within the same multithreaded process result in crashes within about five minutes. Applying this patch increases the kernel's longevity to the three-to-eight-hour range. There appear to be other similar problems in idr_get_empty_slot() and sub_remove(), but I fixed the easy one in idr_remove_all() first. It is therefore no surprise that failures still occur. Located-by: Milton Miller II <miltonm@austin.ibm.com> Tested-by: Milton Miller II <miltonm@austin.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 5b019e99	15-Jan-2009	Andrew Morton <akpm@linux-foundation.org>	lib/idr.c: use kmem_cache_zalloc() for the idr_layer cache David points out that the idr_remove_all() function returns unused slabs to the kmem cache, but needs to zero them first or else they will be uninitialized upon next use. This causes crashes which have been observed in the firewire subsystem. He fixed this by zeroing the object before freeing it in idr_remove_all(). But we agree that simply removing the constructor and zeroing the object at allocation time is simpler than relying upon slab constructor machinery and might even be faster. This problem was introduced by "idr: make idr_remove rcu-safe" (commit cf481c20c476ad2c0febdace9ce23f5a4db19582), which was first released in 2.6.27. There are no known codesites which trigger this bug in 2.6.27 or 2.6.28. The post-2.6.28 firewire changes are the only known triggerer. There might of course be not-yet-discovered triggerers in 2.6.27 and 2.6.28, and there might be out-of-tree triggerers which are added to those kernel versions. I'll let the -stable guys decide whether they want to backport this fix. Reported-by: David Moore <dcm@acm.org> Cc: Stefan Richter <stefanr@s5r6.in-berlin.de> Cc: Nadia Derbey <Nadia.Derbey@bull.net> Cc: Paul E. McKenney <paulmck@us.ibm.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Kristian Hgsberg <krh@redhat.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# b098161b	15-Jan-2009	Li Zefan <lizf@cn.fujitsu.com>	idr: fix wrong kernel-doc idr_get_new_above() and ida_get_new_above() return an id in the range of @staring_id ... 0x7fffffff, not 0 ... 0x7fffffff. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 711a49a0	10-Dec-2008	Manfred Spraul <manfred@colorfullife.com>	lib/idr.c: Fix bug introduced by RCU fix The last patch to lib/idr.c caused a bug if idr_get_new_above() was called on an empty idr. Usually, nodes stay on the same layer. New layers are added to the top of the tree. The exception is idr_get_new_above() on an empty tree: In this case, the new root node is first added on layer 0, then moved upwards. p->layer was not updated. As usual: You shall never rely on the source code comments, they will only mislead you. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 6ff2d39b	01-Dec-2008	Manfred Spraul <manfred@colorfullife.com>	lib/idr.c: fix rcu related race with idr_find 2nd part of the fixes needed for http://bugzilla.kernel.org/show_bug.cgi?id=11796. When the idr tree is either grown or shrunk, then the update to the number of layers and the top pointer were not atomic. This race caused crashes. The attached patch fixes that by replicating the layers counter in each layer, thus idr_find doesn't need idp->layers anymore. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Clement Calmels <cboulte@gmail.com> Cc: Nadia Derbey <Nadia.Derbey@bull.net> Cc: Pierre Peiffer <peifferp@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 51cc5068	25-Jul-2008	Alexey Dobriyan <adobriyan@gmail.com>	SL*B: drop kmem cache argument from constructor Kmem cache passed to constructor is only needed for constructors that are themselves multiplexeres. Nobody uses this "feature", nor does anybody uses passed kmem cache in non-trivial way, so pass only pointer to object. Non-trivial places are: arch/powerpc/mm/init_64.c arch/powerpc/mm/hugetlbpage.c This is flag day, yes. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: Jon Tollefson <kniht@linux.vnet.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Matt Mackall <mpm@selenic.com> [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c] [akpm@linux-foundation.org: fix mm/slab.c] [akpm@linux-foundation.org: fix ubifs] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# cf481c20	25-Jul-2008	Nadia Derbey <Nadia.Derbey@bull.net>	idr: make idr_remove rcu-safe Introduce the free_layer() routine: it is the one that actually frees memory after a grace period has elapsed. Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net> Reviewed-by: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Jim Houston <jim.houston@comcast.net> Cc: Pierre Peiffer <peifferp@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# f9c46d6e	25-Jul-2008	Nadia Derbey <Nadia.Derbey@bull.net>	idr: make idr_find rcu-safe Make idr_find rcu-safe: it can now be called inside an rcu_read critical section. Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net> Reviewed-by: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Jim Houston <jim.houston@comcast.net> Cc: Pierre Peiffer <peifferp@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 3219b3b7	25-Jul-2008	Nadia Derbey <Nadia.Derbey@bull.net>	idr: make idr_get_new* rcu-safe Make the idr_get_new* routines rcu-safe. Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net> Reviewed-by: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Jim Houston <jim.houston@comcast.net> Cc: Pierre Peiffer <peifferp@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 944ca05c	25-Jul-2008	Nadia Derbey <Nadia.Derbey@bull.net>	idr: error checking factorization Do some code factorization in the return code analysis. Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Jim Houston <jim.houston@comcast.net> Cc: Pierre Peiffer <peifferp@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# f098ad65	25-Jul-2008	Nadia Derbey <Nadia.Derbey@bull.net>	idr: fix a printk call Fix the incomplete printk call. Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net> Reviewed-by: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Jim Houston <jim.houston@comcast.net> Cc: Pierre Peiffer <peifferp@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 4ae53789	25-Jul-2008	Nadia Derbey <Nadia.Derbey@bull.net>	idr: rename some of the idr APIs internal routines This is a trivial patch that renames: . alloc_layer to get_from_free_list since it idr_pre_get that actually allocates memory. . free_layer to move_to_free_list since memory is not actually freed there. This makes things more clear for the next patches. Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net> Reviewed-by: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Jim Houston <jim.houston@comcast.net> Cc: Pierre Peiffer <peifferp@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# af8e2a4c	01-May-2008	Nadia Derbey <Nadia.Derbey@bull.net>	idr: fix idr_remove() The return inside the loop makes us free only a single layer. Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Jim Houston <jim.houston@comcast.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 199f0ca5	29-Apr-2008	Akinobu Mita <akinobu.mita@gmail.com>	idr: create idr_layer_cache at boot time Avoid a possible kmem_cache_create() failure by creating idr_layer_cache unconditionary at boot time rather than creating it on-demand when idr_init() is called the first time. This change also enables us to eliminate the check every time idr_init() is called. [akpm@linux-foundation.org: rename init_id_cache() to idr_init_cache()] [akpm@linux-foundation.org: fix alpha build] Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 4ba9b9d0	17-Oct-2007	Christoph Lameter <clameter@sgi.com>	Slab API: remove useless ctor parameter and reorder parameters Slab constructors currently have a flags parameter that is never used. And the order of the arguments is opposite to other slab functions. The object pointer is placed before the kmem_cache pointer. Convert ctor(void object, struct kmem_cache s, unsigned long flags) to ctor(struct kmem_cache s, void object) throughout the kernel [akpm@linux-foundation.org: coupla fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 5ba25331	14-Oct-2007	Al Viro <viro@ftp.linux.org.uk>	more low-hanging fruits - kernel, fs, lib signedness Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 6ace06dc	31-Jul-2007	Oleg Nesterov <oleg@tv-sign.ru>	idr_remove_all: kill unused variable "error" is always equal to 0. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 20c2df83	19-Jul-2007	Paul Mundt <lethal@linux-sh.org>	mm: Remove slab destructors from kmem_cache_create(). Slab destructors were no longer supported after Christoph's c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>
# 23936cc0	16-Jul-2007	Kristian Hoegsberg <krh@redhat.com>	lib: add idr_remove_all Remove all ids from the given idr tree. idr_destroy() only frees up unused, cached idp_layers, but this function will remove all id mappings and leave all idp_layers unused. A typical clean-up sequence for objects stored in an idr tree, will use idr_for_each() to free all objects, if necessay, then idr_remove_all() to remove all ids, and idr_destroy() to free up the cached idr_layers. Signed-off-by: Kristian Hoegsberg <krh@redhat.com> Cc: Tejun Heo <htejun@gmail.com> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 96d7fa42	16-Jul-2007	Kristian Hoegsberg <krh@redhat.com>	lib: add idr_for_each() This patch adds an iterator function for the idr data structure. Compared to just iterating through the idr with an integer and idr_find, this iterator is (almost, but not quite) linear in the number of elements, as opposed to the number of integers in the range covered by the idr. This makes a difference for sparse idrs, but more importantly, it's a nicer way to iterate through the elements. The drm subsystem is moving to idr for tracking contexts and drawables, and with this change, we can use the idr exclusively for tracking these resources. [akpm@linux-foundation.org: fix comment] Signed-off-by: Kristian Hoegsberg <krh@redhat.com> Cc: Tejun Heo <htejun@gmail.com> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# 72dba584	13-Jun-2007	Tejun Heo <htejun@gmail.com>	ida: implement idr based id allocator Implement idr based id allocator. ida is used the same way idr is used but lacks id -> ptr translation and thus consumes much less memory. struct ida_bitmap is attached as leaf nodes to idr tree which is managed by the idr code. Each ida_bitmap is 128bytes long and contains slightly less than a thousand slots. ida is more aggressive with releasing extra resources acquired using ida_pre_get(). After every successful id allocation, ida frees one reserved idr_layer if possible. Reserved ida_bitmap is not freed automatically but only one ida_bitmap is reserved and it's almost always used right away. Under most circumstances, ida won't hold on to memory for too long which isn't actively used. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
# e33ac8bd	13-Jun-2007	Tejun Heo <htejun@gmail.com>	idr: separate out idr_mark_full() Separate out idr_mark_full() from sub_alloc() and make marking the allocated slot full the responsibility of idr_get_new_above_int(). Allocation part of idr_get_new_above_int() is renamed to idr_get_empty_slot(). New idr_get_new_above_int() allocates a slot using the function, install the user pointer and marks it full using idr_mark_full(). This change doesn't introduce any behavior change. This will be used by ida. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
# 7aae6dd8	13-Jun-2007	Tejun Heo <htejun@gmail.com>	idr: fix obscure bug in allocation path In sub_alloc(), when bitmap search fails, it goes up one level to continue search. This is done by updating the id cursor and searching the upper level again. If the cursor was at the end of the upper level, we need to go further than that. This wasn't implemented and when that happens the part of the cursor which indexes into the upper level wraps and sub_alloc() ends up searching the wrong bitmap. It allocates id which doesn't match the actual slot. This patch fixes this by restarting from the top if the search needs to go higher than one level. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
# 72fd4a35	10-Feb-2007	Robert P. J. Day <rpjday@mindspring.com>	[PATCH] Numerous fixes to kernel-doc info in source files. A variety of (mostly) innocuous fixes to the embedded kernel-doc content in source files, including: * make multi-line initial descriptions single line * denote some function names, constants and structs as such * change erroneous opening '/' to '/' in a few places reword some text for clarity Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Cc: "Randy.Dunlap" <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
# e18b890b	06-Dec-2006	Christoph Lameter <clameter@sgi.com>	[PATCH] slab: remove kmem_cache_t Replace all uses of kmem_cache_t with struct kmem_cache. The patch was generated using the following script: #!/bin/sh # # Replace one string by another in all the kernel sources. # set -e for file in `find * -name ".c" -o -name ".h"\|xargs grep -l $1`; do quilt add $file sed -e "1,\$s/$1/$2/g" $file >/tmp/$$ mv /tmp/$$ $file quilt refresh done The script was run like this sh replace kmem_cache_t "struct kmem_cache" Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
# c259cc28	14-Jul-2006	Roland Dreier <rdreier@cisco.com>	[PATCH] Convert idr's internal locking to _irqsave variant Currently, the code in lib/idr.c uses a bare spin_lock(&idp->lock) to do internal locking. This is a nasty trap for code that might call idr functions from different contexts; for example, it seems perfectly reasonable to call idr_get_new() from process context and idr_remove() from interrupt context -- but with the current locking this would lead to a potential deadlock. The simplest fix for this is to just convert the idr locking to use spin_lock_irqsave(). In particular, this fixes a very complicated locking issue detected by lockdep, involving the ib_ipoib driver's priv->lock and dev->_xmit_lock, which get involved with the ib_sa module's query_idr.lock. Cc: Arjan van de Ven <arjan@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Zach Brown <zach.brown@oracle.com>, Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
# 5806f07c	26-Jun-2006	Jeff Mahoney <jeffm@suse.com>	[PATCH] lib: add idr_replace This patch adds idr_replace() to replace an existing pointer in a single operation. Device-mapper will use this to update the pointer it stored against a given id. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
# 1eec0056	25-Jun-2006	Sonny Rao <sonny@burdell.org>	[PATCH] fix race in idr code I ran into a bug where the kernel died in the idr code: cpu 0x1d: Vector: 300 (Data Access) at [c000000b7096f710] pc: c0000000001f8984: .idr_get_new_above_int+0x140/0x330 lr: c0000000001f89b4: .idr_get_new_above_int+0x170/0x330 sp: c000000b7096f990 msr: 800000000000b032 dar: 0 dsisr: 40010000 current = 0xc000000b70d43830 paca = 0xc000000000556900 pid = 2022, comm = hwup 1d:mon> t [c000000b7096f990] c0000000000d2ad8 .expand_files+0x2e8/0x364 (unreliable) [c000000b7096faa0] c0000000001f8bf8 .idr_get_new_above+0x18/0x68 [c000000b7096fb20] c00000000002a054 .init_new_context+0x5c/0xf0 [c000000b7096fbc0] c000000000049dc8 .copy_process+0x91c/0x1404 [c000000b7096fcd0] c00000000004a988 .do_fork+0xd8/0x224 [c000000b7096fdc0] c00000000000ebdc .sys_clone+0x5c/0x74 [c000000b7096fe30] c000000000008950 .ppc_clone+0x8/0xc
# e15ae2dd	30-Oct-2005	Jesper Juhl <jesper.juhl@gmail.com>	[PATCH] Whitespace and CodingStyle cleanup for lib/idr.c Cleanup trailing whitespace, blank lines, CodingStyle issues etc, for lib/idr.c Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
# fd4f2df2	21-Oct-2005	Al Viro <viro@zeniv.linux.org.uk>	[PATCH] gfp_t: lib/* Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
# 8d3b3591	23-Oct-2005	Andrew Morton <akpm@osdl.org>	[PATCH] inotify/idr leak fix Fix a bug which was reported and diagnosed by Stefan Jones <stefan.jones@churchillrandoms.co.uk> IDR trees include a cache of idr_layer objects. There's no way to destroy this cache, so when we discard an overall idr tree we end up leaking some memory. Add and use idr_destroy() for this. v9fs and infiniband also need to use idr_destroy() to avoid leaks. Or, we make the cache global, like radix_tree_preload(). Which is probably better. Later. Cc: Eric Van Hensbergen <ericvh@ericvh.myip.org> Cc: Roland Dreier <rolandd@cisco.com> Cc: Robert Love <rml@novell.com> Cc: John McCutchan <ttb@tentacle.dhs.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
# 7c657f2f	26-Aug-2005	John McCutchan <ttb@tentacle.dhs.org>	[PATCH] Document idr_get_new_above() semantics, update inotify There is an off by one problem with idr_get_new_above. The comment and function name suggest that it will return an id > starting_id, but it actually returned an id >= starting_id, and kernel callers other than inotify treated it as such. The patch below fixes the comment, and fixes inotifys usage. The function name still doesn't match the behaviour, but it never did. Signed-off-by: John McCutchan <ttb@tentacle.dhs.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
# 589777ea	21-Jun-2005	Zaur Kambarov <kambarov@berkeley.edu>	[PATCH] coverity: idr_get_new_above_int() overrun fix This patch fixes overrun of array pa: 92 struct idr_layer *pa[MAX_LEVEL]; in 98 l = idp->layers; 99 pa[l--] = NULL; by passing idp->layers, set in 202 idp->layers = layers; to function sub_alloc in 203 v = sub_alloc(idp, ptr, &id); Signed-off-by: Zaur Kambarov <zkambarov@coverity.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
# 1da177e4	16-Apr-2005	Linus Torvalds <torvalds@ppc970.osdl.org>	Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!