History log of /freebsd-11-stable/include/search.h
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 308090 29-Oct-2016 ed

MFC r307227 and r307343:

Improve typing of POSIX search tree functions.

Back in 2015 when I reimplemented these functions to use an AVL tree, I
was annoyed by the weakness of the typing of these functions. Both tree
nodes and keys are represented by 'void *', meaning that things like the
documentation for these functions are an absolute train wreck.

To make things worse, users of these functions need to cast the return
value of tfind()/tsearch() from 'void *' to 'type_of_key **' in order to
access the key. Technically speaking such casts violate aliasing rules.
I've observed actual breakages as a result of this by enabling features
like LTO.

I've filed a bug report at the Austin Group. Looking at the way the bug
got resolved, they made a pretty good step in the right direction. A new
type 'posix_tnode' has been added to correspond to tree nodes. It is
still defined as 'void' for source-level compatibility, but in the very
far future it could be replaced by a proper structure type containing a
key pointer.


# 302408 07-Jul-2016 gjb

Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle.
Prune svn:mergeinfo from the new branch, as nothing has been merged
here.

Additional commits post-branch will follow.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


/freebsd-11-stable/MAINTAINERS
/freebsd-11-stable/cddl
/freebsd-11-stable/cddl/contrib/opensolaris
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/dtrace/test/tst/common/print
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/zfs
/freebsd-11-stable/cddl/contrib/opensolaris/lib/libzfs
/freebsd-11-stable/contrib/amd
/freebsd-11-stable/contrib/apr
/freebsd-11-stable/contrib/apr-util
/freebsd-11-stable/contrib/atf
/freebsd-11-stable/contrib/binutils
/freebsd-11-stable/contrib/bmake
/freebsd-11-stable/contrib/byacc
/freebsd-11-stable/contrib/bzip2
/freebsd-11-stable/contrib/com_err
/freebsd-11-stable/contrib/compiler-rt
/freebsd-11-stable/contrib/dialog
/freebsd-11-stable/contrib/dma
/freebsd-11-stable/contrib/dtc
/freebsd-11-stable/contrib/ee
/freebsd-11-stable/contrib/elftoolchain
/freebsd-11-stable/contrib/elftoolchain/ar
/freebsd-11-stable/contrib/elftoolchain/brandelf
/freebsd-11-stable/contrib/elftoolchain/elfdump
/freebsd-11-stable/contrib/expat
/freebsd-11-stable/contrib/file
/freebsd-11-stable/contrib/gcc
/freebsd-11-stable/contrib/gcclibs/libgomp
/freebsd-11-stable/contrib/gdb
/freebsd-11-stable/contrib/gdtoa
/freebsd-11-stable/contrib/groff
/freebsd-11-stable/contrib/ipfilter
/freebsd-11-stable/contrib/ldns
/freebsd-11-stable/contrib/ldns-host
/freebsd-11-stable/contrib/less
/freebsd-11-stable/contrib/libarchive
/freebsd-11-stable/contrib/libarchive/cpio
/freebsd-11-stable/contrib/libarchive/libarchive
/freebsd-11-stable/contrib/libarchive/libarchive_fe
/freebsd-11-stable/contrib/libarchive/tar
/freebsd-11-stable/contrib/libc++
/freebsd-11-stable/contrib/libc-vis
/freebsd-11-stable/contrib/libcxxrt
/freebsd-11-stable/contrib/libexecinfo
/freebsd-11-stable/contrib/libpcap
/freebsd-11-stable/contrib/libstdc++
/freebsd-11-stable/contrib/libucl
/freebsd-11-stable/contrib/libxo
/freebsd-11-stable/contrib/llvm
/freebsd-11-stable/contrib/llvm/projects/libunwind
/freebsd-11-stable/contrib/llvm/tools/clang
/freebsd-11-stable/contrib/llvm/tools/lldb
/freebsd-11-stable/contrib/llvm/tools/llvm-dwarfdump
/freebsd-11-stable/contrib/llvm/tools/llvm-lto
/freebsd-11-stable/contrib/mdocml
/freebsd-11-stable/contrib/mtree
/freebsd-11-stable/contrib/ncurses
/freebsd-11-stable/contrib/netcat
/freebsd-11-stable/contrib/ntp
/freebsd-11-stable/contrib/nvi
/freebsd-11-stable/contrib/one-true-awk
/freebsd-11-stable/contrib/openbsm
/freebsd-11-stable/contrib/openpam
/freebsd-11-stable/contrib/openresolv
/freebsd-11-stable/contrib/pf
/freebsd-11-stable/contrib/sendmail
/freebsd-11-stable/contrib/serf
/freebsd-11-stable/contrib/sqlite3
/freebsd-11-stable/contrib/subversion
/freebsd-11-stable/contrib/tcpdump
/freebsd-11-stable/contrib/tcsh
/freebsd-11-stable/contrib/tnftp
/freebsd-11-stable/contrib/top
/freebsd-11-stable/contrib/top/install-sh
/freebsd-11-stable/contrib/tzcode/stdtime
/freebsd-11-stable/contrib/tzcode/zic
/freebsd-11-stable/contrib/tzdata
/freebsd-11-stable/contrib/unbound
/freebsd-11-stable/contrib/vis
/freebsd-11-stable/contrib/wpa
/freebsd-11-stable/contrib/xz
/freebsd-11-stable/crypto/heimdal
/freebsd-11-stable/crypto/openssh
/freebsd-11-stable/crypto/openssl
/freebsd-11-stable/gnu/lib
/freebsd-11-stable/gnu/usr.bin/binutils
/freebsd-11-stable/gnu/usr.bin/cc/cc_tools
/freebsd-11-stable/gnu/usr.bin/gdb
/freebsd-11-stable/lib/libc/locale/ascii.c
/freebsd-11-stable/sys/cddl/contrib/opensolaris
/freebsd-11-stable/sys/contrib/dev/acpica
/freebsd-11-stable/sys/contrib/ipfilter
/freebsd-11-stable/sys/contrib/libfdt
/freebsd-11-stable/sys/contrib/octeon-sdk
/freebsd-11-stable/sys/contrib/x86emu
/freebsd-11-stable/sys/contrib/xz-embedded
/freebsd-11-stable/usr.sbin/bhyve/atkbdc.h
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.c
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.h
/freebsd-11-stable/usr.sbin/bhyve/console.c
/freebsd-11-stable/usr.sbin/bhyve/console.h
/freebsd-11-stable/usr.sbin/bhyve/pci_fbuf.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.h
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.c
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.h
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.c
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.h
/freebsd-11-stable/usr.sbin/bhyve/rfb.c
/freebsd-11-stable/usr.sbin/bhyve/rfb.h
/freebsd-11-stable/usr.sbin/bhyve/sockstream.c
/freebsd-11-stable/usr.sbin/bhyve/sockstream.h
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.c
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.h
/freebsd-11-stable/usr.sbin/bhyve/usb_mouse.c
/freebsd-11-stable/usr.sbin/bhyve/vga.c
/freebsd-11-stable/usr.sbin/bhyve/vga.h
# 292767 27-Dec-2015 ed

Replace implementation of hsearch() by one that scales.

Traditionally the hcreate() function creates a hash table that uses
chaining, using a fixed user-provided size. The problem with this
approach is that this often either wastes memory (table too big) or
yields bad performance (table too small). For applications it may not
always be easy to estimate the right hash table size. A fixed number
only increases performance compared to a linked list by a constant
factor.

This problem can be solved easily by dynamically resizing the hash
table. If the size of the hash table is at least doubled, this has no
negative on the running time complexity. If a dynamically sized hash
table is used, we can also switch to using open addressing instead of
chaining, which has the advantage of just using a single allocation for
the entire table, instead of allocating many small objects.

Finally, a problem with the existing implementation is that its
deterministic algorithm for hashing makes it possible to come up with
fixed patterns to trigger an excessive number of collisions. We can
easily solve this by using FNV-1a as a hashing algorithm in combination
with a randomly generated offset basis.

Measurements have shown that this implementation is about 20-25% faster
than the existing implementation (even if the existing implementation is
given an excessive number of buckets). Though it allocates more memory
through malloc() than the old implementation (between 4-8 pointers per
used entry instead of 3), process memory use is similar to the old
implementation as if the estimated size was underestimated by a factor
10. This is due to the fact that malloc() needs to perform less
bookkeeping.

Reviewed by: jilles, pfg
Obtained from: https://github.com/NuxiNL/cloudlibc
Differential Revision: https://reviews.freebsd.org/D4644


# 292613 22-Dec-2015 ed

Let tsearch()/tdelete() use an AVL tree.

The existing implementations of POSIX tsearch() and tdelete() don't
attempt to perform any balancing at all. Testing reveals that inserting
100k nodes into a tree sequentially takes approximately one minute on my
system.

Though most other BSDs also don't use any balanced tree internally, C
libraries like glibc and musl do provide better implementations. glibc
uses a red-black tree and musl uses an AVL tree.

Red-black trees have the advantage over AVL trees that they only require
O(1) rotations after insertion and deletion, but have the disadvantage
that the tree has a maximum depth of 2*log2(n) instead of 1.44*log2(n).
My take is that it's better to focus on having a lower maximum depth,
for the reason that in the case of tsearch() the invocation of the
comparator likely dominates the running time.

This change replaces the tsearch() and tdelete() functions by versions
that create an AVL tree. Compared to musl's implementation, this version
is different in two different ways:

- We don't keep track of heights; just balances. This is sufficient.
This has the advantage that it reduces the number of nodes that are
being accessed. Storing heights requires us to also access all of the
siblings along the path.

- Don't use any recursion at all. We know that the tree cannot 2^64
elements in size, so the height of the tree can never be larger than
96. Use a 128-bit bitmask to keep track of the path that is computed.
This allows us to iterate over the same path twice, meaning we can
apply rotations from top to bottom.

Inserting 100k nodes into a tree now only takes 0.015 seconds. Insertion
seems to be twice as fast as glibc, whereas deletion has about the same
performance. Unlike glibc, it uses a fixed amount of memory.

I also experimented with both recursive and iterative bottom-up
implementations of the same algorithm. This iterative top-down version
performs similar to the recursive bottom-up version in terms of speed
and code size.

For some reason, the iterative bottom-up algorithm was actually 30%
faster for deletion, but has a quadratic memory complexity to keep track
of all the parent pointers.

Reviewed by: jilles
Obtained from: https://github.com/NuxiNL/cloudlibc
Differential Revision: https://reviews.freebsd.org/D4412


# 268943 21-Jul-2014 pfg

Add re-entrant versions of the hash functions based on the GNU api.

While testing this I found a conformance issue in hdestroy()
that will be fixed in a subsequent commit.

Obtained from: NetBSD (hcreate.c, CVS Rev. 1.7)


# 268846 18-Jul-2014 pfg

Revert r268826:
The current ordering of this header is a feature as it
is more consistent with POSIX.
Also adding gratuitous newlines is not elegant.

Pointed out by: bde


# 268826 18-Jul-2014 pfg

Minor sorting to match the NetBSD header

MFC after: 3 days
Obtained from: NetBSD


# 105250 16-Oct-2002 robert

- Remove the lsearch() and lfind() functions and their manpage from
the compatibility library libcompat.
- Add new implementations of lsearch() and lfind() which conform to
IEEE Std 1003.1-2001 to libc. Add a new manual page for them and
add them to the makefile.
- Add function prototypes for lsearch() and lfind() to the search.h
header.


# 105245 16-Oct-2002 robert

- Remove the old insque() and remque() functions and their manual
page from the compatibility library.
- Add new implementations of insque() and remque() which conform to
IEEE Std 1003.1-2001 to libc. Add a new manual page for them and
connect them to the build.
- Add the prototypes of insque() and remque() to the search.h
header.


# 104399 03-Oct-2002 mike

Fix various style(9) bugs:
o Source ID's in wrong location.
o Space used, instead of tab, after typedef.
o Unaligned function prototype for twalk().

Other changes:
o Add missing const qualifier in tfind().
o Add comment about missing functions.


# 103012 06-Sep-2002 tjr

Style: One space between "restrict" qualifier and "*".


# 102227 21-Aug-2002 mike

o Merge <machine/ansi.h> and <machine/types.h> into a new header
called <machine/_types.h>.
o <machine/ansi.h> will continue to live so it can define MD clock
macros, which are only MD because of gratuitous differences between
architectures.
o Change all headers to make use of this. This mainly involves
changing:
#ifdef _BSD_FOO_T_
typedef _BSD_FOO_T_ foo_t;
#undef _BSD_FOO_T_
#endif
to:
#ifndef _FOO_T_DECLARED
typedef __foo_t foo_t;
#define _FOO_T_DECLARED
#endif

Concept by: bde
Reviewed by: jake, obrien


# 101882 14-Aug-2002 robert

- Add the 'restrict' qualifier to match the IEEE Std 1003.1-2001
prototype of the tdelete(3) function.
- Remove duplicated space.
- Use an ANSI-C function definition for tdelete(3).
- Update the manual page.


# 93032 23-Mar-2002 imp

Breath deep and take __P out of the system include files.

# This appears to not break X11, but I'm having problems compiling the
# glide part of the server with or without this patch, so I can't tell
# for sure.


# 62780 07-Jul-2000 alfred

fix spelling errors.

Pointed out by: bde


# 62684 06-Jul-2000 alfred

cleanup the tsearch import.

remove (comment out) functions defined or depricated elsewhere:
bsearch, lfind, lsearch, insque, remque

change hcreate to take a size_t rather than uint (essentially the same)

since hcreate/hdestroy are now in <search.h>, remove private search.h
in lib/libc/db/hash/

add $FreeBSD tags to hsearch.c


# 62321 01-Jul-2000 alfred

bring in binary search tree code.

Obtained from: NetBSD