1<HTML> 2<HEAD> 3<!-- This HTML file has been created by texi2html 1.52b 4 from gperf.texi on 19 March 2013 --> 5 6<META HTTP-EQUIV="content-type" CONTENT="text/html; charset=UTF-8"> 7<TITLE>Perfect Hash Function Generator - 3 Static search structures and GNU gperf</TITLE> 8</HEAD> 9<BODY> 10Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_3.html">previous</A>, <A HREF="gperf_5.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. 11<P><HR><P> 12 13 14<H1><A NAME="SEC4" HREF="gperf_toc.html#TOC4">3 Static search structures and GNU <CODE>gperf</CODE></A></H1> 15<P> 16<A NAME="IDX2"></A> 17 18</P> 19<P> 20A <EM>static search structure</EM> is an Abstract Data Type with certain 21fundamental operations, e.g., <EM>initialize</EM>, <EM>insert</EM>, 22and <EM>retrieve</EM>. Conceptually, all insertions occur before any 23retrievals. In practice, <CODE>gperf</CODE> generates a <EM>static</EM> array 24containing search set keywords and any associated attributes specified 25by the user. Thus, there is essentially no execution-time cost for the 26insertions. It is a useful data structure for representing <EM>static 27search sets</EM>. Static search sets occur frequently in software system 28applications. Typical static search sets include compiler reserved 29words, assembler instruction opcodes, and built-in shell interpreter 30commands. Search set members, called <EM>keywords</EM>, are inserted into 31the structure only once, usually during program initialization, and are 32not generally modified at run-time. 33 34</P> 35<P> 36Numerous static search structure implementations exist, e.g., 37arrays, linked lists, binary search trees, digital search tries, and 38hash tables. Different approaches offer trade-offs between space 39utilization and search time efficiency. For example, an <VAR>n</VAR> element 40sorted array is space efficient, though the average-case time 41complexity for retrieval operations using binary search is 42proportional to log <VAR>n</VAR>. Conversely, hash table implementations 43often locate a table entry in constant time, but typically impose 44additional memory overhead and exhibit poor worst case performance. 45 46</P> 47<P> 48<A NAME="IDX3"></A> 49<EM>Minimal perfect hash functions</EM> provide an optimal solution for a 50particular class of static search sets. A minimal perfect hash 51function is defined by two properties: 52 53</P> 54 55<UL> 56<LI> 57 58It allows keyword recognition in a static search set using at most 59<EM>one</EM> probe into the hash table. This represents the ���perfect��� 60property. 61<LI> 62 63The actual memory allocated to store the keywords is precisely large 64enough for the keyword set, and <EM>no larger</EM>. This is the 65���minimal��� property. 66</UL> 67 68<P> 69For most applications it is far easier to generate <EM>perfect</EM> hash 70functions than <EM>minimal perfect</EM> hash functions. Moreover, 71non-minimal perfect hash functions frequently execute faster than 72minimal ones in practice. This phenomena occurs since searching a 73sparse keyword table increases the probability of locating a ���null��� 74entry, thereby reducing string comparisons. <CODE>gperf</CODE>'s default 75behavior generates <EM>near-minimal</EM> perfect hash functions for 76keyword sets. However, <CODE>gperf</CODE> provides many options that permit 77user control over the degree of minimality and perfection. 78 79</P> 80<P> 81Static search sets often exhibit relative stability over time. For 82example, Ada's 63 reserved words have remained constant for nearly a 83decade. It is therefore frequently worthwhile to expend concerted 84effort building an optimal search structure <EM>once</EM>, if it 85subsequently receives heavy use multiple times. <CODE>gperf</CODE> removes 86the drudgery associated with constructing time- and space-efficient 87search structures by hand. It has proven a useful and practical tool 88for serious programming projects. Output from <CODE>gperf</CODE> is currently 89used in several production and research compilers, including GNU C, GNU 90C++, GNU Java, GNU Pascal, and GNU Modula 3. The latter two compilers are 91not yet part of the official GNU distribution. Each compiler utilizes 92<CODE>gperf</CODE> to automatically generate static search structures that 93efficiently identify their respective reserved keywords. 94 95</P> 96<P><HR><P> 97Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_3.html">previous</A>, <A HREF="gperf_5.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. 98</BODY> 99</HTML> 100