1<HTML>
2<HEAD>
3<!-- This HTML file has been created by texi2html 1.52b
4     from gperf.texi on 19 March 2013 -->
5
6<META HTTP-EQUIV="content-type" CONTENT="text/html; charset=UTF-8">
7<TITLE>Perfect Hash Function Generator - 3  Static search structures and GNU gperf</TITLE>
8</HEAD>
9<BODY>
10Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_3.html">previous</A>, <A HREF="gperf_5.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
11<P><HR><P>
12
13
14<H1><A NAME="SEC4" HREF="gperf_toc.html#TOC4">3  Static search structures and GNU <CODE>gperf</CODE></A></H1>
15<P>
16<A NAME="IDX2"></A>
17
18</P>
19<P>
20A <EM>static search structure</EM> is an Abstract Data Type with certain
21fundamental operations, e.g., <EM>initialize</EM>, <EM>insert</EM>,
22and <EM>retrieve</EM>.  Conceptually, all insertions occur before any
23retrievals.  In practice, <CODE>gperf</CODE> generates a <EM>static</EM> array
24containing search set keywords and any associated attributes specified
25by the user.  Thus, there is essentially no execution-time cost for the
26insertions.  It is a useful data structure for representing <EM>static
27search sets</EM>.  Static search sets occur frequently in software system
28applications.  Typical static search sets include compiler reserved
29words, assembler instruction opcodes, and built-in shell interpreter
30commands.  Search set members, called <EM>keywords</EM>, are inserted into
31the structure only once, usually during program initialization, and are
32not generally modified at run-time.
33
34</P>
35<P>
36Numerous static search structure implementations exist, e.g.,
37arrays, linked lists, binary search trees, digital search tries, and
38hash tables.  Different approaches offer trade-offs between space
39utilization and search time efficiency.  For example, an <VAR>n</VAR> element
40sorted array is space efficient, though the average-case time
41complexity for retrieval operations using binary search is
42proportional to log <VAR>n</VAR>.  Conversely, hash table implementations
43often locate a table entry in constant time, but typically impose
44additional memory overhead and exhibit poor worst case performance.
45
46</P>
47<P>
48<A NAME="IDX3"></A>
49<EM>Minimal perfect hash functions</EM> provide an optimal solution for a
50particular class of static search sets.  A minimal perfect hash
51function is defined by two properties:
52
53</P>
54
55<UL>
56<LI>
57
58It allows keyword recognition in a static search set using at most
59<EM>one</EM> probe into the hash table.  This represents the ���perfect���
60property.
61<LI>
62
63The actual memory allocated to store the keywords is precisely large
64enough for the keyword set, and <EM>no larger</EM>.  This is the
65���minimal��� property.
66</UL>
67
68<P>
69For most applications it is far easier to generate <EM>perfect</EM> hash
70functions than <EM>minimal perfect</EM> hash functions.  Moreover,
71non-minimal perfect hash functions frequently execute faster than
72minimal ones in practice.  This phenomena occurs since searching a
73sparse keyword table increases the probability of locating a ���null���
74entry, thereby reducing string comparisons.  <CODE>gperf</CODE>'s default
75behavior generates <EM>near-minimal</EM> perfect hash functions for
76keyword sets.  However, <CODE>gperf</CODE> provides many options that permit
77user control over the degree of minimality and perfection.
78
79</P>
80<P>
81Static search sets often exhibit relative stability over time.  For
82example, Ada's 63 reserved words have remained constant for nearly a
83decade.  It is therefore frequently worthwhile to expend concerted
84effort building an optimal search structure <EM>once</EM>, if it
85subsequently receives heavy use multiple times.  <CODE>gperf</CODE> removes
86the drudgery associated with constructing time- and space-efficient
87search structures by hand.  It has proven a useful and practical tool
88for serious programming projects.  Output from <CODE>gperf</CODE> is currently
89used in several production and research compilers, including GNU C, GNU
90C++, GNU Java, GNU Pascal, and GNU Modula 3.  The latter two compilers are
91not yet part of the official GNU distribution.  Each compiler utilizes
92<CODE>gperf</CODE> to automatically generate static search structures that
93efficiently identify their respective reserved keywords.
94
95</P>
96<P><HR><P>
97Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_3.html">previous</A>, <A HREF="gperf_5.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
98</BODY>
99</HTML>
100