1228068Sbapt\input texinfo @c -*- texinfo -*-
2228068Sbapt@c %**start of header
3228068Sbapt@setfilename gperf.info
4228068Sbapt@settitle Perfect Hash Function Generator
5228068Sbapt@c @setchapternewpage odd
6228068Sbapt@c %**end of header
7228068Sbapt
8228068Sbapt@c some day we should @include version.texi instead of defining
9228068Sbapt@c these values at hand.
10228068Sbapt@set UPDATED 31 March 2007
11228068Sbapt@set EDITION 3.0.3
12228068Sbapt@set VERSION 3.0.3
13228068Sbapt@c ---------------------
14228068Sbapt
15228068Sbapt@c remove the black boxes generated in the GPL appendix.
16228068Sbapt@finalout
17228068Sbapt
18228068Sbapt@c Merge functions into the concept index
19228068Sbapt@syncodeindex fn cp
20228068Sbapt@c @synindex pg cp
21228068Sbapt
22228068Sbapt@dircategory Programming Tools
23228068Sbapt@direntry
24228068Sbapt* Gperf: (gperf).                Perfect Hash Function Generator.
25228068Sbapt@end direntry
26228068Sbapt
27228068Sbapt@ifinfo
28228068SbaptThis file documents the features of the GNU Perfect Hash Function
29228068SbaptGenerator @value{VERSION}.
30228068Sbapt
31228068SbaptCopyright @copyright{} 1989-2006 Free Software Foundation, Inc.
32228068Sbapt
33228068SbaptPermission is granted to make and distribute verbatim copies of this
34228068Sbaptmanual provided the copyright notice and this permission notice are
35228068Sbaptpreserved on all copies.
36228068Sbapt
37228068Sbapt@ignore
38228068SbaptPermission is granted to process this file through TeX and print the
39228068Sbaptresults, provided the printed document carries a copying permission
40228068Sbaptnotice identical to this one except for the removal of this paragraph
41228068Sbapt(this paragraph not being relevant to the printed manual).
42228068Sbapt
43228068Sbapt@end ignore
44228068Sbapt
45228068SbaptPermission is granted to copy and distribute modified versions of this
46228068Sbaptmanual under the conditions for verbatim copying, provided also that the
47228068Sbaptsection entitled ``GNU General Public License'' is included exactly as
48228068Sbaptin the original, and provided that the entire resulting derived work is
49228068Sbaptdistributed under the terms of a permission notice identical to this
50228068Sbaptone.
51228068Sbapt
52228068SbaptPermission is granted to copy and distribute translations of this manual
53228068Sbaptinto another language, under the above conditions for modified versions,
54228068Sbaptexcept that the section entitled ``GNU General Public License'' and this
55228068Sbaptpermission notice may be included in translations approved by the Free
56228068SbaptSoftware Foundation instead of in the original English.
57228068Sbapt
58228068Sbapt@end ifinfo
59228068Sbapt
60228068Sbapt@titlepage
61228068Sbapt@title User's Guide to @code{gperf} @value{VERSION}
62228068Sbapt@subtitle The GNU Perfect Hash Function Generator
63228068Sbapt@subtitle Edition @value{EDITION}, @value{UPDATED}
64228068Sbapt@author Douglas C. Schmidt
65228068Sbapt@author Bruno Haible
66228068Sbapt
67228068Sbapt@page
68228068Sbapt@vskip 0pt plus 1filll
69228068SbaptCopyright @copyright{} 1989-2007 Free Software Foundation, Inc.
70228068Sbapt
71228068Sbapt
72228068SbaptPermission is granted to make and distribute verbatim copies of
73228068Sbaptthis manual provided the copyright notice and this permission notice
74228068Sbaptare preserved on all copies.
75228068Sbapt
76228068SbaptPermission is granted to copy and distribute modified versions of this
77228068Sbaptmanual under the conditions for verbatim copying, provided also that the
78228068Sbaptsection entitled ``GNU General Public License'' is included
79228068Sbaptexactly as in the original, and provided that the entire resulting
80228068Sbaptderived work is distributed under the terms of a permission notice
81228068Sbaptidentical to this one.
82228068Sbapt
83228068SbaptPermission is granted to copy and distribute translations of this manual
84228068Sbaptinto another language, under the above conditions for modified versions,
85228068Sbaptexcept that the section entitled ``GNU General Public License'' may be
86228068Sbaptincluded in a translation approved by the author instead of in the
87228068Sbaptoriginal English.
88228068Sbapt@end titlepage
89228068Sbapt
90228068Sbapt@ifinfo
91228068Sbapt@node Top, Copying, (dir), (dir)
92228068Sbapt@top Introduction
93228068Sbapt
94228068SbaptThis manual documents the GNU @code{gperf} perfect hash function generator
95228068Sbaptutility, focusing on its features and how to use them, and how to report
96228068Sbaptbugs.
97228068Sbapt
98228068Sbapt@menu
99228068Sbapt* Copying::                     GNU @code{gperf} General Public License says
100228068Sbapt                                how you can copy and share @code{gperf}.
101228068Sbapt* Contributors::                People who have contributed to @code{gperf}.
102228068Sbapt* Motivation::                  The purpose of @code{gperf}.
103228068Sbapt* Search Structures::           Static search structures and GNU @code{gperf}
104228068Sbapt* Description::                 High-level discussion of how GPERF functions.
105228068Sbapt* Options::                     A description of options to the program.
106228068Sbapt* Bugs::                        Known bugs and limitations with GPERF.
107228068Sbapt* Projects::                    Things still left to do.
108228068Sbapt* Bibliography::                Material Referenced in this Report.
109228068Sbapt
110228068Sbapt* Concept Index::               
111228068Sbapt
112228068Sbapt@detailmenu --- The Detailed Node Listing ---
113228068Sbapt
114228068SbaptHigh-Level Description of GNU @code{gperf}
115228068Sbapt
116228068Sbapt* Input Format::                Input Format to @code{gperf}
117228068Sbapt* Output Format::               Output Format for Generated C Code with @code{gperf}
118228068Sbapt* Binary Strings::              Use of NUL bytes
119228068Sbapt
120228068SbaptInput Format to @code{gperf}
121228068Sbapt
122228068Sbapt* Declarations::                Declarations.
123228068Sbapt* Keywords::                    Format for Keyword Entries.
124228068Sbapt* Functions::                   Including Additional C Functions.
125228068Sbapt* Controls for GNU indent::     Where to place directives for GNU @code{indent}.
126228068Sbapt
127228068SbaptDeclarations
128228068Sbapt
129228068Sbapt* User-supplied Struct::        Specifying keywords with attributes.
130228068Sbapt* Gperf Declarations::          Embedding command line options in the input.
131228068Sbapt* C Code Inclusion::            Including C declarations and definitions.
132228068Sbapt
133228068SbaptInvoking @code{gperf}
134228068Sbapt
135228068Sbapt* Input Details::               Options that affect Interpretation of the Input File
136228068Sbapt* Output Language::             Specifying the Language for the Output Code
137228068Sbapt* Output Details::              Fine tuning Details in the Output Code
138228068Sbapt* Algorithmic Details::         Changing the Algorithms employed by @code{gperf}
139228068Sbapt* Verbosity::                   Informative Output
140228068Sbapt
141228068Sbapt@end detailmenu
142228068Sbapt@end menu
143228068Sbapt
144228068Sbapt@end ifinfo
145228068Sbapt
146228068Sbapt@node Copying, Contributors, Top, Top
147228068Sbapt@unnumbered GNU GENERAL PUBLIC LICENSE
148228068Sbapt@include gpl.texinfo
149228068Sbapt
150228068Sbapt@node Contributors, Motivation, Copying, Top
151228068Sbapt@unnumbered Contributors to GNU @code{gperf} Utility
152228068Sbapt
153228068Sbapt@itemize @bullet
154228068Sbapt@item
155228068Sbapt@cindex Bugs 
156228068SbaptThe GNU @code{gperf} perfect hash function generator utility was
157228068Sbaptwritten in GNU C++ by Douglas C. Schmidt.  The general
158228068Sbaptidea for the perfect hash function generator was inspired by Keith
159228068SbaptBostic's algorithm written in C, and distributed to net.sources around
160228068Sbapt1984.  The current program is a heavily modified, enhanced, and extended
161228068Sbaptimplementation of Keith's basic idea, created at the University of
162228068SbaptCalifornia, Irvine.  Bugs, patches, and suggestions should be reported
163228068Sbaptto @code{<bug-gnu-gperf@@gnu.org>}.
164228068Sbapt
165228068Sbapt@item
166228068SbaptSpecial thanks is extended to Michael Tiemann and Doug Lea, for
167228068Sbaptproviding a useful compiler, and for giving me a forum to exhibit my
168228068Sbaptcreation.
169228068Sbapt
170228068SbaptIn addition, Adam de Boor and Nels Olson provided many tips and insights
171228068Sbaptthat greatly helped improve the quality and functionality of @code{gperf}.
172228068Sbapt
173228068Sbapt@item
174228068SbaptBruno Haible enhanced and optimized the search algorithm.  He also rewrote
175228068Sbaptthe input routines and the output routines for better reliability, and
176228068Sbaptadded a testsuite.
177228068Sbapt@end itemize
178228068Sbapt
179228068Sbapt@node Motivation, Search Structures, Contributors, Top
180228068Sbapt@chapter Introduction
181228068Sbapt
182228068Sbapt@code{gperf} is a perfect hash function generator written in C++.  It
183228068Sbapttransforms an @var{n} element user-specified keyword set @var{W} into a
184228068Sbaptperfect hash function @var{F}.  @var{F} uniquely maps keywords in
185228068Sbapt@var{W} onto the range 0..@var{k}, where @var{k} >= @var{n-1}.  If @var{k}
186228068Sbapt= @var{n-1} then @var{F} is a @emph{minimal} perfect hash function.
187228068Sbapt@code{gperf} generates a 0..@var{k} element static lookup table and a
188228068Sbaptpair of C functions.  These functions determine whether a given
189228068Sbaptcharacter string @var{s} occurs in @var{W}, using at most one probe into
190228068Sbaptthe lookup table.
191228068Sbapt
192228068Sbapt@code{gperf} currently generates the reserved keyword recognizer for
193228068Sbaptlexical analyzers in several production and research compilers and
194228068Sbaptlanguage processing tools, including GNU C, GNU C++, GNU Java, GNU Pascal,
195228068SbaptGNU Modula 3, and GNU indent.  Complete C++ source code for @code{gperf} is
196228068Sbaptavailable from @code{http://ftp.gnu.org/pub/gnu/gperf/}.
197228068SbaptA paper describing @code{gperf}'s design and implementation in greater
198228068Sbaptdetail is available in the Second USENIX C++ Conference proceedings
199228068Sbaptor from @code{http://www.cs.wustl.edu/~schmidt/resume.html}.
200228068Sbapt
201228068Sbapt@node Search Structures, Description, Motivation, Top
202228068Sbapt@chapter Static search structures and GNU @code{gperf}
203228068Sbapt@cindex Static search structure
204228068Sbapt
205228068SbaptA @dfn{static search structure} is an Abstract Data Type with certain
206228068Sbaptfundamental operations, e.g., @emph{initialize}, @emph{insert},
207228068Sbaptand @emph{retrieve}.  Conceptually, all insertions occur before any
208228068Sbaptretrievals.  In practice, @code{gperf} generates a @emph{static} array
209228068Sbaptcontaining search set keywords and any associated attributes specified
210228068Sbaptby the user.  Thus, there is essentially no execution-time cost for the
211228068Sbaptinsertions.  It is a useful data structure for representing @emph{static
212228068Sbaptsearch sets}.  Static search sets occur frequently in software system
213228068Sbaptapplications.  Typical static search sets include compiler reserved
214228068Sbaptwords, assembler instruction opcodes, and built-in shell interpreter
215228068Sbaptcommands.  Search set members, called @dfn{keywords}, are inserted into
216228068Sbaptthe structure only once, usually during program initialization, and are
217228068Sbaptnot generally modified at run-time.
218228068Sbapt
219228068SbaptNumerous static search structure implementations exist, e.g.,
220228068Sbaptarrays, linked lists, binary search trees, digital search tries, and
221228068Sbapthash tables.  Different approaches offer trade-offs between space
222228068Sbaptutilization and search time efficiency.  For example, an @var{n} element
223228068Sbaptsorted array is space efficient, though the average-case time
224228068Sbaptcomplexity for retrieval operations using binary search is
225228068Sbaptproportional to log @var{n}.  Conversely, hash table implementations
226228068Sbaptoften locate a table entry in constant time, but typically impose
227228068Sbaptadditional memory overhead and exhibit poor worst case performance.
228228068Sbapt
229228068Sbapt@cindex Minimal perfect hash functions
230228068Sbapt@emph{Minimal perfect hash functions} provide an optimal solution for a
231228068Sbaptparticular class of static search sets.  A minimal perfect hash
232228068Sbaptfunction is defined by two properties:
233228068Sbapt
234228068Sbapt@itemize @bullet
235228068Sbapt@item 
236228068SbaptIt allows keyword recognition in a static search set using at most
237228068Sbapt@emph{one} probe into the hash table.  This represents the ``perfect''
238228068Sbaptproperty.
239228068Sbapt@item 
240228068SbaptThe actual memory allocated to store the keywords is precisely large
241228068Sbaptenough for the keyword set, and @emph{no larger}.  This is the
242228068Sbapt``minimal'' property.
243228068Sbapt@end itemize
244228068Sbapt
245228068SbaptFor most applications it is far easier to generate @emph{perfect} hash
246228068Sbaptfunctions than @emph{minimal perfect} hash functions.  Moreover,
247228068Sbaptnon-minimal perfect hash functions frequently execute faster than
248228068Sbaptminimal ones in practice.  This phenomena occurs since searching a
249228068Sbaptsparse keyword table increases the probability of locating a ``null''
250228068Sbaptentry, thereby reducing string comparisons.  @code{gperf}'s default
251228068Sbaptbehavior generates @emph{near-minimal} perfect hash functions for
252228068Sbaptkeyword sets.  However, @code{gperf} provides many options that permit
253228068Sbaptuser control over the degree of minimality and perfection.
254228068Sbapt
255228068SbaptStatic search sets often exhibit relative stability over time.  For
256228068Sbaptexample, Ada's 63 reserved words have remained constant for nearly a
257228068Sbaptdecade.  It is therefore frequently worthwhile to expend concerted
258228068Sbapteffort building an optimal search structure @emph{once}, if it
259228068Sbaptsubsequently receives heavy use multiple times.  @code{gperf} removes
260228068Sbaptthe drudgery associated with constructing time- and space-efficient
261228068Sbaptsearch structures by hand.  It has proven a useful and practical tool
262228068Sbaptfor serious programming projects.  Output from @code{gperf} is currently
263228068Sbaptused in several production and research compilers, including GNU C, GNU
264228068SbaptC++, GNU Java, GNU Pascal, and GNU Modula 3.  The latter two compilers are
265228068Sbaptnot yet part of the official GNU distribution.  Each compiler utilizes
266228068Sbapt@code{gperf} to automatically generate static search structures that
267228068Sbaptefficiently identify their respective reserved keywords.
268228068Sbapt
269228068Sbapt@node Description, Options, Search Structures, Top
270228068Sbapt@chapter High-Level Description of GNU @code{gperf}
271228068Sbapt
272228068Sbapt@menu
273228068Sbapt* Input Format::                Input Format to @code{gperf}
274228068Sbapt* Output Format::               Output Format for Generated C Code with @code{gperf}
275228068Sbapt* Binary Strings::              Use of NUL bytes
276228068Sbapt@end menu
277228068Sbapt
278228068SbaptThe perfect hash function generator @code{gperf} reads a set of
279228068Sbapt``keywords'' from an input file (or from the standard input by
280228068Sbaptdefault).  It attempts to derive a perfect hashing function that
281228068Sbaptrecognizes a member of the @dfn{static keyword set} with at most a
282228068Sbaptsingle probe into the lookup table.  If @code{gperf} succeeds in
283228068Sbaptgenerating such a function it produces a pair of C source code routines
284228068Sbaptthat perform hashing and table lookup recognition.  All generated C code
285228068Sbaptis directed to the standard output.  Command-line options described
286228068Sbaptbelow allow you to modify the input and output format to @code{gperf}.
287228068Sbapt
288228068SbaptBy default, @code{gperf} attempts to produce time-efficient code, with
289228068Sbaptless emphasis on efficient space utilization.  However, several options
290228068Sbaptexist that permit trading-off execution time for storage space and vice
291228068Sbaptversa.  In particular, expanding the generated table size produces a
292228068Sbaptsparse search structure, generally yielding faster searches.
293228068SbaptConversely, you can direct @code{gperf} to utilize a C @code{switch}
294228068Sbaptstatement scheme that minimizes data space storage size.  Furthermore,
295228068Sbaptusing a C @code{switch} may actually speed up the keyword retrieval time
296228068Sbaptsomewhat.  Actual results depend on your C compiler, of course.
297228068Sbapt
298228068SbaptIn general, @code{gperf} assigns values to the bytes it is using
299228068Sbaptfor hashing until some set of values gives each keyword a unique value.
300228068SbaptA helpful heuristic is that the larger the hash value range, the easier
301228068Sbaptit is for @code{gperf} to find and generate a perfect hash function.
302228068SbaptExperimentation is the key to getting the most from @code{gperf}.
303228068Sbapt
304228068Sbapt@node Input Format, Output Format, Description, Description
305228068Sbapt@section Input Format to @code{gperf}
306228068Sbapt@cindex Format
307228068Sbapt@cindex Declaration section
308228068Sbapt@cindex Keywords section
309228068Sbapt@cindex Functions section
310228068SbaptYou can control the input file format by varying certain command-line
311228068Sbaptarguments, in particular the @samp{-t} option.  The input's appearance
312228068Sbaptis similar to GNU utilities @code{flex} and @code{bison} (or UNIX
313228068Sbaptutilities @code{lex} and @code{yacc}).  Here's an outline of the general
314228068Sbaptformat:
315228068Sbapt
316228068Sbapt@example
317228068Sbapt@group
318228068Sbaptdeclarations
319228068Sbapt%%
320228068Sbaptkeywords
321228068Sbapt%%
322228068Sbaptfunctions
323228068Sbapt@end group
324228068Sbapt@end example
325228068Sbapt
326228068Sbapt@emph{Unlike} @code{flex} or @code{bison}, the declarations section and
327228068Sbaptthe functions section are optional.  The following sections describe the
328228068Sbaptinput format for each section.
329228068Sbapt
330228068Sbapt@menu
331228068Sbapt* Declarations::                Declarations.
332228068Sbapt* Keywords::                    Format for Keyword Entries.
333228068Sbapt* Functions::                   Including Additional C Functions.
334228068Sbapt* Controls for GNU indent::     Where to place directives for GNU @code{indent}.
335228068Sbapt@end menu
336228068Sbapt
337228068SbaptIt is possible to omit the declaration section entirely, if the @samp{-t}
338228068Sbaptoption is not given.  In this case the input file begins directly with the
339228068Sbaptfirst keyword line, e.g.:
340228068Sbapt
341228068Sbapt@example
342228068Sbapt@group
343228068Sbaptjanuary
344228068Sbaptfebruary
345228068Sbaptmarch
346228068Sbaptapril
347228068Sbapt...
348228068Sbapt@end group
349228068Sbapt@end example
350228068Sbapt
351228068Sbapt@node Declarations, Keywords, Input Format, Input Format
352228068Sbapt@subsection Declarations
353228068Sbapt
354228068SbaptThe keyword input file optionally contains a section for including
355228068Sbaptarbitrary C declarations and definitions, @code{gperf} declarations that
356228068Sbaptact like command-line options, as well as for providing a user-supplied
357228068Sbapt@code{struct}.
358228068Sbapt
359228068Sbapt@menu
360228068Sbapt* User-supplied Struct::        Specifying keywords with attributes.
361228068Sbapt* Gperf Declarations::          Embedding command line options in the input.
362228068Sbapt* C Code Inclusion::            Including C declarations and definitions.
363228068Sbapt@end menu
364228068Sbapt
365228068Sbapt@node User-supplied Struct, Gperf Declarations, Declarations, Declarations
366228068Sbapt@subsubsection User-supplied @code{struct}
367228068Sbapt
368228068SbaptIf the @samp{-t} option (or, equivalently, the @samp{%struct-type} declaration)
369228068Sbapt@emph{is} enabled, you @emph{must} provide a C @code{struct} as the last
370228068Sbaptcomponent in the declaration section from the input file.  The first
371228068Sbaptfield in this struct must be of type @code{char *} or @code{const char *}
372228068Sbaptif the @samp{-P} option is not given, or of type @code{int} if the option
373228068Sbapt@samp{-P} (or, equivalently, the @samp{%pic} declaration) is enabled.
374228068SbaptThis first field must be called @samp{name}, although it is possible to modify
375228068Sbaptits name with the @samp{-K} option (or, equivalently, the
376228068Sbapt@samp{%define slot-name} declaration) described below.
377228068Sbapt
378228068SbaptHere is a simple example, using months of the year and their attributes as
379228068Sbaptinput:
380228068Sbapt
381228068Sbapt@example
382228068Sbapt@group
383228068Sbaptstruct month @{ char *name; int number; int days; int leap_days; @};
384228068Sbapt%%
385228068Sbaptjanuary,   1, 31, 31
386228068Sbaptfebruary,  2, 28, 29
387228068Sbaptmarch,     3, 31, 31
388228068Sbaptapril,     4, 30, 30
389228068Sbaptmay,       5, 31, 31
390228068Sbaptjune,      6, 30, 30
391228068Sbaptjuly,      7, 31, 31
392228068Sbaptaugust,    8, 31, 31
393228068Sbaptseptember, 9, 30, 30
394228068Sbaptoctober,  10, 31, 31
395228068Sbaptnovember, 11, 30, 30
396228068Sbaptdecember, 12, 31, 31
397228068Sbapt@end group
398228068Sbapt@end example
399228068Sbapt
400228068Sbapt@cindex @samp{%%}
401228068SbaptSeparating the @code{struct} declaration from the list of keywords and
402228068Sbaptother fields are a pair of consecutive percent signs, @samp{%%},
403228068Sbaptappearing left justified in the first column, as in the UNIX utility
404228068Sbapt@code{lex}.
405228068Sbapt
406228068SbaptIf the @code{struct} has already been declared in an include file, it can
407228068Sbaptbe mentioned in an abbreviated form, like this:
408228068Sbapt
409228068Sbapt@example
410228068Sbapt@group
411228068Sbaptstruct month;
412228068Sbapt%%
413228068Sbaptjanuary,   1, 31, 31
414228068Sbapt...
415228068Sbapt@end group
416228068Sbapt@end example
417228068Sbapt
418228068Sbapt@node Gperf Declarations, C Code Inclusion, User-supplied Struct, Declarations
419228068Sbapt@subsubsection Gperf Declarations
420228068Sbapt
421228068SbaptThe declaration section can contain @code{gperf} declarations.  They
422228068Sbaptinfluence the way @code{gperf} works, like command line options do.
423228068SbaptIn fact, every such declaration is equivalent to a command line option.
424228068SbaptThere are three forms of declarations:
425228068Sbapt
426228068Sbapt@enumerate
427228068Sbapt@item
428228068SbaptDeclarations without argument, like @samp{%compare-lengths}.
429228068Sbapt
430228068Sbapt@item
431228068SbaptDeclarations with an argument, like @samp{%switch=@var{count}}.
432228068Sbapt
433228068Sbapt@item
434228068SbaptDeclarations of names of entities in the output file, like
435228068Sbapt@samp{%define lookup-function-name @var{name}}.
436228068Sbapt@end enumerate
437228068Sbapt
438228068SbaptWhen a declaration is given both in the input file and as a command line
439228068Sbaptoption, the command-line option's value prevails.
440228068Sbapt
441228068SbaptThe following @code{gperf} declarations are available.
442228068Sbapt
443228068Sbapt@table @samp
444228068Sbapt@item %delimiters=@var{delimiter-list}
445228068Sbapt@cindex @samp{%delimiters}
446228068SbaptAllows you to provide a string containing delimiters used to
447228068Sbaptseparate keywords from their attributes.  The default is ",".  This
448228068Sbaptoption is essential if you want to use keywords that have embedded
449228068Sbaptcommas or newlines.
450228068Sbapt
451228068Sbapt@item %struct-type
452228068Sbapt@cindex @samp{%struct-type}
453228068SbaptAllows you to include a @code{struct} type declaration for generated
454228068Sbaptcode; see above for an example.
455228068Sbapt
456228068Sbapt@item %ignore-case
457228068Sbapt@cindex @samp{%ignore-case}
458228068SbaptConsider upper and lower case ASCII characters as equivalent.  The string
459228068Sbaptcomparison will use a case insignificant character comparison.  Note that
460228068Sbaptlocale dependent case mappings are ignored.
461228068Sbapt
462228068Sbapt@item %language=@var{language-name}
463228068Sbapt@cindex @samp{%language}
464228068SbaptInstructs @code{gperf} to generate code in the language specified by the
465228068Sbaptoption's argument.  Languages handled are currently:
466228068Sbapt
467228068Sbapt@table @samp
468228068Sbapt@item KR-C
469228068SbaptOld-style K&R C.  This language is understood by old-style C compilers and
470228068SbaptANSI C compilers, but ANSI C compilers may flag warnings (or even errors)
471228068Sbaptbecause of lacking @samp{const}.
472228068Sbapt
473228068Sbapt@item C
474228068SbaptCommon C.  This language is understood by ANSI C compilers, and also by
475228068Sbaptold-style C compilers, provided that you @code{#define const} to empty
476228068Sbaptfor compilers which don't know about this keyword.
477228068Sbapt
478228068Sbapt@item ANSI-C
479228068SbaptANSI C.  This language is understood by ANSI C compilers and C++ compilers.
480228068Sbapt
481228068Sbapt@item C++
482228068SbaptC++.  This language is understood by C++ compilers.
483228068Sbapt@end table
484228068Sbapt
485228068SbaptThe default is C.
486228068Sbapt
487228068Sbapt@item %define slot-name @var{name}
488228068Sbapt@cindex @samp{%define slot-name}
489228068SbaptThis declaration is only useful when option @samp{-t} (or, equivalently, the
490228068Sbapt@samp{%struct-type} declaration) has been given.
491228068SbaptBy default, the program assumes the structure component identifier for
492228068Sbaptthe keyword is @samp{name}.  This option allows an arbitrary choice of
493228068Sbaptidentifier for this component, although it still must occur as the first
494228068Sbaptfield in your supplied @code{struct}.
495228068Sbapt
496228068Sbapt@item %define initializer-suffix @var{initializers}
497228068Sbapt@cindex @samp{%define initializer-suffix}
498228068SbaptThis declaration is only useful when option @samp{-t} (or, equivalently, the
499228068Sbapt@samp{%struct-type} declaration) has been given.
500228068SbaptIt permits to specify initializers for the structure members following
501228068Sbapt@var{slot-name} in empty hash table entries.  The list of initializers
502228068Sbaptshould start with a comma.  By default, the emitted code will
503228068Sbaptzero-initialize structure members following @var{slot-name}.
504228068Sbapt
505228068Sbapt@item %define hash-function-name @var{name}
506228068Sbapt@cindex @samp{%define hash-function-name}
507228068SbaptAllows you to specify the name for the generated hash function.  Default
508228068Sbaptname is @samp{hash}.  This option permits the use of two hash tables in
509228068Sbaptthe same file.
510228068Sbapt
511228068Sbapt@item %define lookup-function-name @var{name}
512228068Sbapt@cindex @samp{%define lookup-function-name}
513228068SbaptAllows you to specify the name for the generated lookup function.
514228068SbaptDefault name is @samp{in_word_set}.  This option permits multiple
515228068Sbaptgenerated hash functions to be used in the same application.
516228068Sbapt
517228068Sbapt@item %define class-name @var{name}
518228068Sbapt@cindex @samp{%define class-name}
519228068SbaptThis option is only useful when option @samp{-L C++} (or, equivalently,
520228068Sbaptthe @samp{%language=C++} declaration) has been given.  It
521228068Sbaptallows you to specify the name of generated C++ class.  Default name is
522228068Sbapt@code{Perfect_Hash}.
523228068Sbapt
524228068Sbapt@item %7bit
525228068Sbapt@cindex @samp{%7bit}
526228068SbaptThis option specifies that all strings that will be passed as arguments
527228068Sbaptto the generated hash function and the generated lookup function will
528228068Sbaptsolely consist of 7-bit ASCII characters (bytes in the range 0..127).
529228068Sbapt(Note that the ANSI C functions @code{isalnum} and @code{isgraph} do
530228068Sbapt@emph{not} guarantee that a byte is in this range.  Only an explicit
531228068Sbapttest like @samp{c >= 'A' && c <= 'Z'} guarantees this.)
532228068Sbapt
533228068Sbapt@item %compare-lengths
534228068Sbapt@cindex @samp{%compare-lengths}
535228068SbaptCompare keyword lengths before trying a string comparison.  This option
536228068Sbaptis mandatory for binary comparisons (@pxref{Binary Strings}).  It also might
537228068Sbaptcut down on the number of string comparisons made during the lookup, since
538228068Sbaptkeywords with different lengths are never compared via @code{strcmp}.
539228068SbaptHowever, using @samp{%compare-lengths} might greatly increase the size of the
540228068Sbaptgenerated C code if the lookup table range is large (which implies that
541228068Sbaptthe switch option @samp{-S} or @samp{%switch} is not enabled), since the length
542228068Sbapttable contains as many elements as there are entries in the lookup table.
543228068Sbapt
544228068Sbapt@item %compare-strncmp
545228068Sbapt@cindex @samp{%compare-strncmp}
546228068SbaptGenerates C code that uses the @code{strncmp} function to perform
547228068Sbaptstring comparisons.  The default action is to use @code{strcmp}.
548228068Sbapt
549228068Sbapt@item %readonly-tables
550228068Sbapt@cindex @samp{%readonly-tables}
551228068SbaptMakes the contents of all generated lookup tables constant, i.e.,
552228068Sbapt``readonly''.  Many compilers can generate more efficient code for this
553228068Sbaptby putting the tables in readonly memory.
554228068Sbapt
555228068Sbapt@item %enum
556228068Sbapt@cindex @samp{%enum}
557228068SbaptDefine constant values using an enum local to the lookup function rather
558228068Sbaptthan with #defines.  This also means that different lookup functions can
559228068Sbaptreside in the same file.  Thanks to James Clark @code{<jjc@@ai.mit.edu>}.
560228068Sbapt
561228068Sbapt@item %includes
562228068Sbapt@cindex @samp{%includes}
563228068SbaptInclude the necessary system include file, @code{<string.h>}, at the
564228068Sbaptbeginning of the code.  By default, this is not done; the user must
565228068Sbaptinclude this header file himself to allow compilation of the code.
566228068Sbapt
567228068Sbapt@item %global-table
568228068Sbapt@cindex @samp{%global-table}
569228068SbaptGenerate the static table of keywords as a static global variable,
570228068Sbaptrather than hiding it inside of the lookup function (which is the
571228068Sbaptdefault behavior).
572228068Sbapt
573228068Sbapt@item %pic
574228068Sbapt@cindex @samp{%pic}
575228068SbaptOptimize the generated table for inclusion in shared libraries.  This
576228068Sbaptreduces the startup time of programs using a shared library containing
577228068Sbaptthe generated code.  If the @samp{%struct-type} declaration (or,
578228068Sbaptequivalently, the option @samp{-t}) is also given, the first field of the
579228068Sbaptuser-defined struct must be of type @samp{int}, not @samp{char *}, because
580228068Sbaptit will contain offsets into the string pool instead of actual strings.
581228068SbaptTo convert such an offset to a string, you can use the expression
582228068Sbapt@samp{stringpool + @var{o}}, where @var{o} is the offset.  The string pool
583228068Sbaptname can be changed through the @samp{%define string-pool-name} declaration.
584228068Sbapt
585228068Sbapt@item %define string-pool-name @var{name}
586228068Sbapt@cindex @samp{%define string-pool-name}
587228068SbaptAllows you to specify the name of the generated string pool created by
588228068Sbaptthe declaration @samp{%pic} (or, equivalently, the option @samp{-P}).
589228068SbaptThe default name is @samp{stringpool}.  This declaration permits the use of
590228068Sbapttwo hash tables in the same file, with @samp{%pic} and even when the
591228068Sbapt@samp{%global-table} declaration (or, equivalently, the option @samp{-G})
592228068Sbaptis given.
593228068Sbapt
594228068Sbapt@item %null-strings
595228068Sbapt@cindex @samp{%null-strings}
596228068SbaptUse NULL strings instead of empty strings for empty keyword table entries.
597228068SbaptThis reduces the startup time of programs using a shared library containing
598228068Sbaptthe generated code (but not as much as the declaration @samp{%pic}), at the
599228068Sbaptexpense of one more test-and-branch instruction at run time.
600228068Sbapt
601228068Sbapt@item %define word-array-name @var{name}
602228068Sbapt@cindex @samp{%define word-array-name}
603228068SbaptAllows you to specify the name for the generated array containing the
604228068Sbapthash table.  Default name is @samp{wordlist}.  This option permits the
605228068Sbaptuse of two hash tables in the same file, even when the option @samp{-G}
606228068Sbapt(or, equivalently, the @samp{%global-table} declaration) is given.
607228068Sbapt
608228068Sbapt@item %define length-table-name @var{name}
609228068Sbapt@cindex @samp{%define length-table-name}
610228068SbaptAllows you to specify the name for the generated array containing the
611228068Sbaptlength table.  Default name is @samp{lengthtable}.  This option permits the
612228068Sbaptuse of two length tables in the same file, even when the option @samp{-G}
613228068Sbapt(or, equivalently, the @samp{%global-table} declaration) is given.
614228068Sbapt
615228068Sbapt@item %switch=@var{count}
616228068Sbapt@cindex @samp{%switch}
617228068SbaptCauses the generated C code to use a @code{switch} statement scheme,
618228068Sbaptrather than an array lookup table.  This can lead to a reduction in both
619228068Sbapttime and space requirements for some input files.  The argument to this
620228068Sbaptoption determines how many @code{switch} statements are generated.  A
621228068Sbaptvalue of 1 generates 1 @code{switch} containing all the elements, a
622228068Sbaptvalue of 2 generates 2 tables with 1/2 the elements in each
623228068Sbapt@code{switch}, etc.  This is useful since many C compilers cannot
624228068Sbaptcorrectly generate code for large @code{switch} statements.  This option
625228068Sbaptwas inspired in part by Keith Bostic's original C program.
626228068Sbapt
627228068Sbapt@item %omit-struct-type
628228068Sbapt@cindex @samp{%omit-struct-type}
629228068SbaptPrevents the transfer of the type declaration to the output file.  Use
630228068Sbaptthis option if the type is already defined elsewhere.
631228068Sbapt@end table
632228068Sbapt
633228068Sbapt@node C Code Inclusion,  , Gperf Declarations, Declarations
634228068Sbapt@subsubsection C Code Inclusion
635228068Sbapt
636228068Sbapt@cindex @samp{%@{}
637228068Sbapt@cindex @samp{%@}}
638228068SbaptUsing a syntax similar to GNU utilities @code{flex} and @code{bison}, it
639228068Sbaptis possible to directly include C source text and comments verbatim into
640228068Sbaptthe generated output file.  This is accomplished by enclosing the region
641228068Sbaptinside left-justified surrounding @samp{%@{}, @samp{%@}} pairs.  Here is
642228068Sbaptan input fragment based on the previous example that illustrates this
643228068Sbaptfeature:
644228068Sbapt
645228068Sbapt@example
646228068Sbapt@group
647228068Sbapt%@{
648228068Sbapt#include <assert.h>
649228068Sbapt/* This section of code is inserted directly into the output. */
650228068Sbaptint return_month_days (struct month *months, int is_leap_year);
651228068Sbapt%@}
652228068Sbaptstruct month @{ char *name; int number; int days; int leap_days; @};
653228068Sbapt%%
654228068Sbaptjanuary,   1, 31, 31
655228068Sbaptfebruary,  2, 28, 29
656228068Sbaptmarch,     3, 31, 31
657228068Sbapt...
658228068Sbapt@end group
659228068Sbapt@end example
660228068Sbapt
661228068Sbapt@node Keywords, Functions, Declarations, Input Format
662228068Sbapt@subsection Format for Keyword Entries
663228068Sbapt
664228068SbaptThe second input file format section contains lines of keywords and any
665228068Sbaptassociated attributes you might supply.  A line beginning with @samp{#}
666228068Sbaptin the first column is considered a comment.  Everything following the
667228068Sbapt@samp{#} is ignored, up to and including the following newline.  A line
668228068Sbaptbeginning with @samp{%} in the first column is an option declaration and
669228068Sbaptmust not occur within the keywords section.
670228068Sbapt
671228068SbaptThe first field of each non-comment line is always the keyword itself.  It
672228068Sbaptcan be given in two ways: as a simple name, i.e., without surrounding
673228068Sbaptstring quotation marks, or as a string enclosed in double-quotes, in
674228068SbaptC syntax, possibly with backslash escapes like @code{\"} or @code{\234}
675228068Sbaptor @code{\xa8}.  In either case, it must start right at the beginning
676228068Sbaptof the line, without leading whitespace.
677228068SbaptIn this context, a ``field'' is considered to extend up to, but
678228068Sbaptnot include, the first blank, comma, or newline.  Here is a simple
679228068Sbaptexample taken from a partial list of C reserved words:
680228068Sbapt
681228068Sbapt@example
682228068Sbapt@group
683228068Sbapt# These are a few C reserved words, see the c.gperf file 
684228068Sbapt# for a complete list of ANSI C reserved words.
685228068Sbaptunsigned
686228068Sbaptsizeof
687228068Sbaptswitch
688228068Sbaptsigned
689228068Sbaptif
690228068Sbaptdefault
691228068Sbaptfor
692228068Sbaptwhile
693228068Sbaptreturn
694228068Sbapt@end group
695228068Sbapt@end example
696228068Sbapt
697228068SbaptNote that unlike @code{flex} or @code{bison} the first @samp{%%} marker
698228068Sbaptmay be elided if the declaration section is empty.
699228068Sbapt
700228068SbaptAdditional fields may optionally follow the leading keyword.  Fields
701228068Sbaptshould be separated by commas, and terminate at the end of line.  What
702228068Sbaptthese fields mean is entirely up to you; they are used to initialize the
703228068Sbaptelements of the user-defined @code{struct} provided by you in the
704228068Sbaptdeclaration section.  If the @samp{-t} option (or, equivalently, the
705228068Sbapt@samp{%struct-type} declaration) is @emph{not} enabled
706228068Sbaptthese fields are simply ignored.  All previous examples except the last
707228068Sbaptone contain keyword attributes.
708228068Sbapt
709228068Sbapt@node Functions, Controls for GNU indent, Keywords, Input Format
710228068Sbapt@subsection Including Additional C Functions
711228068Sbapt
712228068SbaptThe optional third section also corresponds closely with conventions
713228068Sbaptfound in @code{flex} and @code{bison}.  All text in this section,
714228068Sbaptstarting at the final @samp{%%} and extending to the end of the input
715228068Sbaptfile, is included verbatim into the generated output file.  Naturally,
716228068Sbaptit is your responsibility to ensure that the code contained in this
717228068Sbaptsection is valid C.
718228068Sbapt
719228068Sbapt@node Controls for GNU indent,  , Functions, Input Format
720228068Sbapt@subsection Where to place directives for GNU @code{indent}.
721228068Sbapt
722228068SbaptIf you want to invoke GNU @code{indent} on a @code{gperf} input file,
723228068Sbaptyou will see that GNU @code{indent} doesn't understand the @samp{%%},
724228068Sbapt@samp{%@{} and @samp{%@}} directives that control @code{gperf}'s
725228068Sbaptinterpretation of the input file.  Therefore you have to insert some
726228068Sbaptdirectives for GNU @code{indent}.  More precisely, assuming the most
727228068Sbaptgeneral input file structure
728228068Sbapt
729228068Sbapt@example
730228068Sbapt@group
731228068Sbaptdeclarations part 1
732228068Sbapt%@{
733228068Sbaptverbatim code
734228068Sbapt%@}
735228068Sbaptdeclarations part 2
736228068Sbapt%%
737228068Sbaptkeywords
738228068Sbapt%%
739228068Sbaptfunctions
740228068Sbapt@end group
741228068Sbapt@end example
742228068Sbapt
743228068Sbapt@noindent
744228068Sbaptyou would insert @samp{*INDENT-OFF*} and @samp{*INDENT-ON*} comments
745228068Sbaptas follows:
746228068Sbapt
747228068Sbapt@example
748228068Sbapt@group
749228068Sbapt/* *INDENT-OFF* */
750228068Sbaptdeclarations part 1
751228068Sbapt%@{
752228068Sbapt/* *INDENT-ON* */
753228068Sbaptverbatim code
754228068Sbapt/* *INDENT-OFF* */
755228068Sbapt%@}
756228068Sbaptdeclarations part 2
757228068Sbapt%%
758228068Sbaptkeywords
759228068Sbapt%%
760228068Sbapt/* *INDENT-ON* */
761228068Sbaptfunctions
762228068Sbapt@end group
763228068Sbapt@end example
764228068Sbapt
765228068Sbapt@node Output Format, Binary Strings, Input Format, Description
766228068Sbapt@section Output Format for Generated C Code with @code{gperf}
767228068Sbapt@cindex hash table
768228068Sbapt
769228068SbaptSeveral options control how the generated C code appears on the standard 
770228068Sbaptoutput.  Two C functions are generated.  They are called @code{hash} and 
771228068Sbapt@code{in_word_set}, although you may modify their names with a command-line 
772228068Sbaptoption.  Both functions require two arguments, a string, @code{char *} 
773228068Sbapt@var{str}, and a length parameter, @code{int} @var{len}.  Their default 
774228068Sbaptfunction prototypes are as follows:
775228068Sbapt
776228068Sbapt@deftypefun {unsigned int} hash (const char * @var{str}, unsigned int @var{len})
777228068SbaptBy default, the generated @code{hash} function returns an integer value
778228068Sbaptcreated by adding @var{len} to several user-specified @var{str} byte
779228068Sbaptpositions indexed into an @dfn{associated values} table stored in a
780228068Sbaptlocal static array.  The associated values table is constructed
781228068Sbaptinternally by @code{gperf} and later output as a static local C array
782228068Sbaptcalled @samp{hash_table}.  The relevant selected positions (i.e. indices
783228068Sbaptinto @var{str}) are specified via the @samp{-k} option when running
784228068Sbapt@code{gperf}, as detailed in the @emph{Options} section below (@pxref{Options}).
785228068Sbapt@end deftypefun
786228068Sbapt
787228068Sbapt@deftypefun {} in_word_set (const char * @var{str}, unsigned int @var{len})
788228068SbaptIf @var{str} is in the keyword set, returns a pointer to that
789228068Sbaptkeyword.  More exactly, if the option @samp{-t} (or, equivalently, the
790228068Sbapt@samp{%struct-type} declaration) was given, it returns
791228068Sbapta pointer to the matching keyword's structure.  Otherwise it returns
792228068Sbapt@code{NULL}.
793228068Sbapt@end deftypefun
794228068Sbapt
795228068SbaptIf the option @samp{-c} (or, equivalently, the @samp{%compare-strncmp}
796228068Sbaptdeclaration) is not used, @var{str} must be a NUL terminated
797228068Sbaptstring of exactly length @var{len}.  If @samp{-c} (or, equivalently, the
798228068Sbapt@samp{%compare-strncmp} declaration) is used, @var{str} must
799228068Sbaptsimply be an array of @var{len} bytes and does not need to be NUL
800228068Sbaptterminated.
801228068Sbapt
802228068SbaptThe code generated for these two functions is affected by the following
803228068Sbaptoptions:
804228068Sbapt
805228068Sbapt@table @samp
806228068Sbapt@item -t
807228068Sbapt@itemx --struct-type
808228068SbaptMake use of the user-defined @code{struct}.
809228068Sbapt
810228068Sbapt@item -S @var{total-switch-statements}
811228068Sbapt@itemx --switch=@var{total-switch-statements}
812228068Sbapt@cindex @code{switch}
813228068SbaptGenerate 1 or more C @code{switch} statement rather than use a large,
814228068Sbapt(and potentially sparse) static array.  Although the exact time and
815228068Sbaptspace savings of this approach vary according to your C compiler's
816228068Sbaptdegree of optimization, this method often results in smaller and faster
817228068Sbaptcode.
818228068Sbapt@end table
819228068Sbapt
820228068SbaptIf the @samp{-t} and @samp{-S} options (or, equivalently, the
821228068Sbapt@samp{%struct-type} and @samp{%switch} declarations) are omitted, the default
822228068Sbaptaction
823228068Sbaptis to generate a @code{char *} array containing the keywords, together with
824228068Sbaptadditional empty strings used for padding the array.  By experimenting
825228068Sbaptwith the various input and output options, and timing the resulting C
826228068Sbaptcode, you can determine the best option choices for different keyword
827228068Sbaptset characteristics.
828228068Sbapt
829228068Sbapt@node Binary Strings,  , Output Format, Description
830228068Sbapt@section Use of NUL bytes
831228068Sbapt@cindex NUL
832228068Sbapt
833228068SbaptBy default, the code generated by @code{gperf} operates on zero
834228068Sbaptterminated strings, the usual representation of strings in C.  This means
835228068Sbaptthat the keywords in the input file must not contain NUL bytes,
836228068Sbaptand the @var{str} argument passed to @code{hash} or @code{in_word_set}
837228068Sbaptmust be NUL terminated and have exactly length @var{len}.
838228068Sbapt
839228068SbaptIf option @samp{-c} (or, equivalently, the @samp{%compare-strncmp}
840228068Sbaptdeclaration) is used, then the @var{str} argument does not need
841228068Sbaptto be NUL terminated.  The code generated by @code{gperf} will only
842228068Sbaptaccess the first @var{len}, not @var{len+1}, bytes starting at @var{str}.
843228068SbaptHowever, the keywords in the input file still must not contain NUL
844228068Sbaptbytes.
845228068Sbapt
846228068SbaptIf option @samp{-l} (or, equivalently, the @samp{%compare-lengths}
847228068Sbaptdeclaration) is used, then the hash table performs binary
848228068Sbaptcomparison.  The keywords in the input file may contain NUL bytes,
849228068Sbaptwritten in string syntax as @code{\000} or @code{\x00}, and the code
850228068Sbaptgenerated by @code{gperf} will treat NUL like any other byte.
851228068SbaptAlso, in this case the @samp{-c} option (or, equivalently, the
852228068Sbapt@samp{%compare-strncmp} declaration) is ignored.
853228068Sbapt
854228068Sbapt@node Options, Bugs, Description, Top
855228068Sbapt@chapter Invoking @code{gperf}
856228068Sbapt
857228068SbaptThere are @emph{many} options to @code{gperf}.  They were added to make
858228068Sbaptthe program more convenient for use with real applications.  ``On-line''
859228068Sbapthelp is readily available via the @samp{--help} option.  Here is the
860228068Sbaptcomplete list of options.
861228068Sbapt
862228068Sbapt@menu
863228068Sbapt* Output File::                 Specifying the Location of the Output File
864228068Sbapt* Input Details::               Options that affect Interpretation of the Input File
865228068Sbapt* Output Language::             Specifying the Language for the Output Code
866228068Sbapt* Output Details::              Fine tuning Details in the Output Code
867228068Sbapt* Algorithmic Details::         Changing the Algorithms employed by @code{gperf}
868228068Sbapt* Verbosity::                   Informative Output
869228068Sbapt@end menu
870228068Sbapt
871228068Sbapt@node Output File, Input Details, Options, Options
872228068Sbapt@section Specifying the Location of the Output File
873228068Sbapt
874228068Sbapt@table @samp
875228068Sbapt@item --output-file=@var{file}
876228068SbaptAllows you to specify the name of the file to which the output is written to.
877228068Sbapt@end table
878228068Sbapt
879228068SbaptThe results are written to standard output if no output file is specified
880228068Sbaptor if it is @samp{-}.
881228068Sbapt
882228068Sbapt@node Input Details, Output Language, Output File, Options
883228068Sbapt@section Options that affect Interpretation of the Input File
884228068Sbapt
885228068SbaptThese options are also available as declarations in the input file
886228068Sbapt(@pxref{Gperf Declarations}).
887228068Sbapt
888228068Sbapt@table @samp
889228068Sbapt@item -e @var{keyword-delimiter-list}
890228068Sbapt@itemx --delimiters=@var{keyword-delimiter-list}
891228068Sbapt@cindex Delimiters
892228068SbaptAllows you to provide a string containing delimiters used to
893228068Sbaptseparate keywords from their attributes.  The default is ",".  This
894228068Sbaptoption is essential if you want to use keywords that have embedded
895228068Sbaptcommas or newlines.  One useful trick is to use -e'TAB', where TAB is
896228068Sbaptthe literal tab character.
897228068Sbapt
898228068Sbapt@item -t
899228068Sbapt@itemx --struct-type
900228068SbaptAllows you to include a @code{struct} type declaration for generated
901228068Sbaptcode.  Any text before a pair of consecutive @samp{%%} is considered
902228068Sbaptpart of the type declaration.  Keywords and additional fields may follow
903228068Sbaptthis, one group of fields per line.  A set of examples for generating
904228068Sbaptperfect hash tables and functions for Ada, C, C++, Pascal, Modula 2,
905228068SbaptModula 3 and JavaScript reserved words are distributed with this release.
906228068Sbapt
907228068Sbapt@item --ignore-case
908228068SbaptConsider upper and lower case ASCII characters as equivalent.  The string
909228068Sbaptcomparison will use a case insignificant character comparison.  Note that
910228068Sbaptlocale dependent case mappings are ignored.  This option is therefore not
911228068Sbaptsuitable if a properly internationalized or locale aware case mapping
912228068Sbaptshould be used.  (For example, in a Turkish locale, the upper case equivalent
913228068Sbaptof the lowercase ASCII letter @samp{i} is the non-ASCII character
914228068Sbapt@samp{capital i with dot above}.)  For this case, it is better to apply
915228068Sbaptan uppercase or lowercase conversion on the string before passing it to
916228068Sbaptthe @code{gperf} generated function.
917228068Sbapt@end table
918228068Sbapt
919228068Sbapt@node Output Language, Output Details, Input Details, Options
920228068Sbapt@section Options to specify the Language for the Output Code
921228068Sbapt
922228068SbaptThese options are also available as declarations in the input file
923228068Sbapt(@pxref{Gperf Declarations}).
924228068Sbapt
925228068Sbapt@table @samp
926228068Sbapt@item -L @var{generated-language-name}
927228068Sbapt@itemx --language=@var{generated-language-name}
928228068SbaptInstructs @code{gperf} to generate code in the language specified by the
929228068Sbaptoption's argument.  Languages handled are currently:
930228068Sbapt
931228068Sbapt@table @samp
932228068Sbapt@item KR-C
933228068SbaptOld-style K&R C.  This language is understood by old-style C compilers and
934228068SbaptANSI C compilers, but ANSI C compilers may flag warnings (or even errors)
935228068Sbaptbecause of lacking @samp{const}.
936228068Sbapt
937228068Sbapt@item C
938228068SbaptCommon C.  This language is understood by ANSI C compilers, and also by
939228068Sbaptold-style C compilers, provided that you @code{#define const} to empty
940228068Sbaptfor compilers which don't know about this keyword.
941228068Sbapt
942228068Sbapt@item ANSI-C
943228068SbaptANSI C.  This language is understood by ANSI C compilers and C++ compilers.
944228068Sbapt
945228068Sbapt@item C++
946228068SbaptC++.  This language is understood by C++ compilers.
947228068Sbapt@end table
948228068Sbapt
949228068SbaptThe default is C.
950228068Sbapt
951228068Sbapt@item -a
952228068SbaptThis option is supported for compatibility with previous releases of
953228068Sbapt@code{gperf}.  It does not do anything.
954228068Sbapt
955228068Sbapt@item -g
956228068SbaptThis option is supported for compatibility with previous releases of
957228068Sbapt@code{gperf}.  It does not do anything.
958228068Sbapt@end table
959228068Sbapt
960228068Sbapt@node Output Details, Algorithmic Details, Output Language, Options
961228068Sbapt@section Options for fine tuning Details in the Output Code
962228068Sbapt
963228068SbaptMost of these options are also available as declarations in the input file
964228068Sbapt(@pxref{Gperf Declarations}).
965228068Sbapt
966228068Sbapt@table @samp
967228068Sbapt@item -K @var{slot-name}
968228068Sbapt@itemx --slot-name=@var{slot-name}
969228068Sbapt@cindex Slot name
970228068SbaptThis option is only useful when option @samp{-t} (or, equivalently, the
971228068Sbapt@samp{%struct-type} declaration) has been given.
972228068SbaptBy default, the program assumes the structure component identifier for
973228068Sbaptthe keyword is @samp{name}.  This option allows an arbitrary choice of
974228068Sbaptidentifier for this component, although it still must occur as the first
975228068Sbaptfield in your supplied @code{struct}.
976228068Sbapt
977228068Sbapt@item -F @var{initializers}
978228068Sbapt@itemx --initializer-suffix=@var{initializers}
979228068Sbapt@cindex Initializers
980228068SbaptThis option is only useful when option @samp{-t} (or, equivalently, the
981228068Sbapt@samp{%struct-type} declaration) has been given.
982228068SbaptIt permits to specify initializers for the structure members following
983228068Sbapt@var{slot-name} in empty hash table entries.  The list of initializers
984228068Sbaptshould start with a comma.  By default, the emitted code will
985228068Sbaptzero-initialize structure members following @var{slot-name}.
986228068Sbapt
987228068Sbapt@item -H @var{hash-function-name}
988228068Sbapt@itemx --hash-function-name=@var{hash-function-name}
989228068SbaptAllows you to specify the name for the generated hash function.  Default
990228068Sbaptname is @samp{hash}.  This option permits the use of two hash tables in
991228068Sbaptthe same file.
992228068Sbapt
993228068Sbapt@item -N @var{lookup-function-name}
994228068Sbapt@itemx --lookup-function-name=@var{lookup-function-name}
995228068SbaptAllows you to specify the name for the generated lookup function.
996228068SbaptDefault name is @samp{in_word_set}.  This option permits multiple
997228068Sbaptgenerated hash functions to be used in the same application.
998228068Sbapt
999228068Sbapt@item -Z @var{class-name}
1000228068Sbapt@itemx --class-name=@var{class-name}
1001228068Sbapt@cindex Class name
1002228068SbaptThis option is only useful when option @samp{-L C++} (or, equivalently,
1003228068Sbaptthe @samp{%language=C++} declaration) has been given.  It
1004228068Sbaptallows you to specify the name of generated C++ class.  Default name is
1005228068Sbapt@code{Perfect_Hash}.
1006228068Sbapt
1007228068Sbapt@item -7
1008228068Sbapt@itemx --seven-bit
1009228068SbaptThis option specifies that all strings that will be passed as arguments
1010228068Sbaptto the generated hash function and the generated lookup function will
1011228068Sbaptsolely consist of 7-bit ASCII characters (bytes in the range 0..127).
1012228068Sbapt(Note that the ANSI C functions @code{isalnum} and @code{isgraph} do
1013228068Sbapt@emph{not} guarantee that a byte is in this range.  Only an explicit
1014228068Sbapttest like @samp{c >= 'A' && c <= 'Z'} guarantees this.) This was the
1015228068Sbaptdefault in versions of @code{gperf} earlier than 2.7; now the default is
1016228068Sbaptto support 8-bit and multibyte characters.
1017228068Sbapt
1018228068Sbapt@item -l
1019228068Sbapt@itemx --compare-lengths
1020228068SbaptCompare keyword lengths before trying a string comparison.  This option
1021228068Sbaptis mandatory for binary comparisons (@pxref{Binary Strings}).  It also might
1022228068Sbaptcut down on the number of string comparisons made during the lookup, since
1023228068Sbaptkeywords with different lengths are never compared via @code{strcmp}.
1024228068SbaptHowever, using @samp{-l} might greatly increase the size of the
1025228068Sbaptgenerated C code if the lookup table range is large (which implies that
1026228068Sbaptthe switch option @samp{-S} or @samp{%switch} is not enabled), since the length
1027228068Sbapttable contains as many elements as there are entries in the lookup table.
1028228068Sbapt
1029228068Sbapt@item -c
1030228068Sbapt@itemx --compare-strncmp
1031228068SbaptGenerates C code that uses the @code{strncmp} function to perform
1032228068Sbaptstring comparisons.  The default action is to use @code{strcmp}.
1033228068Sbapt
1034228068Sbapt@item -C
1035228068Sbapt@itemx --readonly-tables
1036228068SbaptMakes the contents of all generated lookup tables constant, i.e.,
1037228068Sbapt``readonly''.  Many compilers can generate more efficient code for this
1038228068Sbaptby putting the tables in readonly memory.
1039228068Sbapt
1040228068Sbapt@item -E
1041228068Sbapt@itemx  --enum
1042228068SbaptDefine constant values using an enum local to the lookup function rather
1043228068Sbaptthan with #defines.  This also means that different lookup functions can
1044228068Sbaptreside in the same file.  Thanks to James Clark @code{<jjc@@ai.mit.edu>}.
1045228068Sbapt
1046228068Sbapt@item -I
1047228068Sbapt@itemx --includes
1048228068SbaptInclude the necessary system include file, @code{<string.h>}, at the
1049228068Sbaptbeginning of the code.  By default, this is not done; the user must
1050228068Sbaptinclude this header file himself to allow compilation of the code.
1051228068Sbapt
1052228068Sbapt@item -G
1053228068Sbapt@itemx --global-table
1054228068SbaptGenerate the static table of keywords as a static global variable,
1055228068Sbaptrather than hiding it inside of the lookup function (which is the
1056228068Sbaptdefault behavior).
1057228068Sbapt
1058228068Sbapt@item -P
1059228068Sbapt@itemx --pic
1060228068SbaptOptimize the generated table for inclusion in shared libraries.  This
1061228068Sbaptreduces the startup time of programs using a shared library containing
1062228068Sbaptthe generated code.  If the option @samp{-t} (or, equivalently, the
1063228068Sbapt@samp{%struct-type} declaration) is also given, the first field of the
1064228068Sbaptuser-defined struct must be of type @samp{int}, not @samp{char *}, because
1065228068Sbaptit will contain offsets into the string pool instead of actual strings.
1066228068SbaptTo convert such an offset to a string, you can use the expression
1067228068Sbapt@samp{stringpool + @var{o}}, where @var{o} is the offset.  The string pool
1068228068Sbaptname can be changed through the option @samp{--string-pool-name}.
1069228068Sbapt
1070228068Sbapt@item -Q @var{string-pool-name}
1071228068Sbapt@itemx --string-pool-name=@var{string-pool-name}
1072228068SbaptAllows you to specify the name of the generated string pool created by
1073228068Sbaptoption @samp{-P}.  The default name is @samp{stringpool}.  This option
1074228068Sbaptpermits the use of two hash tables in the same file, with @samp{-P} and
1075228068Sbapteven when the option @samp{-G} (or, equivalently, the @samp{%global-table}
1076228068Sbaptdeclaration) is given.
1077228068Sbapt
1078228068Sbapt@item --null-strings
1079228068SbaptUse NULL strings instead of empty strings for empty keyword table entries.
1080228068SbaptThis reduces the startup time of programs using a shared library containing
1081228068Sbaptthe generated code (but not as much as option @samp{-P}), at the expense
1082228068Sbaptof one more test-and-branch instruction at run time.
1083228068Sbapt
1084228068Sbapt@item -W @var{hash-table-array-name}
1085228068Sbapt@itemx --word-array-name=@var{hash-table-array-name}
1086228068Sbapt@cindex Array name
1087228068SbaptAllows you to specify the name for the generated array containing the
1088228068Sbapthash table.  Default name is @samp{wordlist}.  This option permits the
1089228068Sbaptuse of two hash tables in the same file, even when the option @samp{-G}
1090228068Sbapt(or, equivalently, the @samp{%global-table} declaration) is given.
1091228068Sbapt
1092228068Sbapt@itemx --length-table-name=@var{length-table-array-name}
1093228068Sbapt@cindex Array name
1094228068SbaptAllows you to specify the name for the generated array containing the
1095228068Sbaptlength table.  Default name is @samp{lengthtable}.  This option permits the
1096228068Sbaptuse of two length tables in the same file, even when the option @samp{-G}
1097228068Sbapt(or, equivalently, the @samp{%global-table} declaration) is given.
1098228068Sbapt
1099228068Sbapt@item -S @var{total-switch-statements}
1100228068Sbapt@itemx --switch=@var{total-switch-statements}
1101228068Sbapt@cindex @code{switch}
1102228068SbaptCauses the generated C code to use a @code{switch} statement scheme,
1103228068Sbaptrather than an array lookup table.  This can lead to a reduction in both
1104228068Sbapttime and space requirements for some input files.  The argument to this
1105228068Sbaptoption determines how many @code{switch} statements are generated.  A
1106228068Sbaptvalue of 1 generates 1 @code{switch} containing all the elements, a
1107228068Sbaptvalue of 2 generates 2 tables with 1/2 the elements in each
1108228068Sbapt@code{switch}, etc.  This is useful since many C compilers cannot
1109228068Sbaptcorrectly generate code for large @code{switch} statements.  This option
1110228068Sbaptwas inspired in part by Keith Bostic's original C program.
1111228068Sbapt
1112228068Sbapt@item -T
1113228068Sbapt@itemx --omit-struct-type
1114228068SbaptPrevents the transfer of the type declaration to the output file.  Use
1115228068Sbaptthis option if the type is already defined elsewhere.
1116228068Sbapt
1117228068Sbapt@item -p
1118228068SbaptThis option is supported for compatibility with previous releases of
1119228068Sbapt@code{gperf}.  It does not do anything.
1120228068Sbapt@end table
1121228068Sbapt
1122228068Sbapt@node Algorithmic Details, Verbosity, Output Details, Options
1123228068Sbapt@section Options for changing the Algorithms employed by @code{gperf}
1124228068Sbapt
1125228068Sbapt@table @samp
1126228068Sbapt@item -k @var{selected-byte-positions}
1127228068Sbapt@itemx --key-positions=@var{selected-byte-positions}
1128228068SbaptAllows selection of the byte positions used in the keywords'
1129228068Sbapthash function.  The allowable choices range between 1-255, inclusive.
1130228068SbaptThe positions are separated by commas, e.g., @samp{-k 9,4,13,14};
1131228068Sbaptranges may be used, e.g., @samp{-k 2-7}; and positions may occur
1132228068Sbaptin any order.  Furthermore, the wildcard '*' causes the generated
1133228068Sbapthash function to consider @strong{all} byte positions in each keyword,
1134228068Sbaptwhereas '$' instructs the hash function to use the ``final byte''
1135228068Sbaptof a keyword (this is the only way to use a byte position greater than
1136228068Sbapt255, incidentally).
1137228068Sbapt
1138228068SbaptFor instance, the option @samp{-k 1,2,4,6-10,'$'} generates a hash
1139228068Sbaptfunction that considers positions 1,2,4,6,7,8,9,10, plus the last
1140228068Sbaptbyte in each keyword (which may be at a different position for each
1141228068Sbaptkeyword, obviously).  Keywords
1142228068Sbaptwith length less than the indicated byte positions work properly, since
1143228068Sbaptselected byte positions exceeding the keyword length are simply not
1144228068Sbaptreferenced in the hash function.
1145228068Sbapt
1146228068SbaptThis option is not normally needed since version 2.8 of @code{gperf};
1147228068Sbaptthe default byte positions are computed depending on the keyword set,
1148228068Sbaptthrough a search that minimizes the number of byte positions.
1149228068Sbapt
1150228068Sbapt@item -D
1151228068Sbapt@itemx --duplicates
1152228068Sbapt@cindex Duplicates
1153228068SbaptHandle keywords whose selected byte sets hash to duplicate values.
1154228068SbaptDuplicate hash values can occur if a set of keywords has the same names, but
1155228068Sbaptpossesses different attributes, or if the selected byte positions are not well
1156228068Sbaptchosen.  With the -D option @code{gperf} treats all these keywords as
1157228068Sbaptpart of an equivalence class and generates a perfect hash function with
1158228068Sbaptmultiple comparisons for duplicate keywords.  It is up to you to completely
1159228068Sbaptdisambiguate the keywords by modifying the generated C code.  However,
1160228068Sbapt@code{gperf} helps you out by organizing the output.
1161228068Sbapt
1162228068SbaptUsing this option usually means that the generated hash function is no
1163228068Sbaptlonger perfect.  On the other hand, it permits @code{gperf} to work on
1164228068Sbaptkeyword sets that it otherwise could not handle.
1165228068Sbapt
1166228068Sbapt@item -m @var{iterations}
1167228068Sbapt@itemx --multiple-iterations=@var{iterations}
1168228068SbaptPerform multiple choices of the @samp{-i} and @samp{-j} values, and
1169228068Sbaptchoose the best results.  This increases the running time by a factor of
1170228068Sbapt@var{iterations} but does a good job minimizing the generated table size.
1171228068Sbapt
1172228068Sbapt@item -i @var{initial-value}
1173228068Sbapt@itemx --initial-asso=@var{initial-value}
1174228068SbaptProvides an initial @var{value} for the associate values array.  Default
1175228068Sbaptis 0.  Increasing the initial value helps inflate the final table size,
1176228068Sbaptpossibly leading to more time efficient keyword lookups.  Note that this
1177228068Sbaptoption is not particularly useful when @samp{-S} (or, equivalently,
1178228068Sbapt@samp{%switch}) is used.  Also,
1179228068Sbapt@samp{-i} is overridden when the @samp{-r} option is used.
1180228068Sbapt
1181228068Sbapt@item -j @var{jump-value}
1182228068Sbapt@itemx --jump=@var{jump-value}
1183228068Sbapt@cindex Jump value
1184228068SbaptAffects the ``jump value'', i.e., how far to advance the associated
1185228068Sbaptbyte value upon collisions.  @var{Jump-value} is rounded up to an
1186228068Sbaptodd number, the default is 5.  If the @var{jump-value} is 0 @code{gperf}
1187228068Sbaptjumps by random amounts.
1188228068Sbapt
1189228068Sbapt@item -n
1190228068Sbapt@itemx --no-strlen
1191228068SbaptInstructs the generator not to include the length of a keyword when
1192228068Sbaptcomputing its hash value.  This may save a few assembly instructions in
1193228068Sbaptthe generated lookup table.
1194228068Sbapt
1195228068Sbapt@item -r
1196228068Sbapt@itemx --random
1197228068SbaptUtilizes randomness to initialize the associated values table.  This
1198228068Sbaptfrequently generates solutions faster than using deterministic
1199228068Sbaptinitialization (which starts all associated values at 0).  Furthermore,
1200228068Sbaptusing the randomization option generally increases the size of the
1201228068Sbapttable.
1202228068Sbapt
1203228068Sbapt@item -s @var{size-multiple}
1204228068Sbapt@itemx --size-multiple=@var{size-multiple}
1205228068SbaptAffects the size of the generated hash table.  The numeric argument for
1206228068Sbaptthis option indicates ``how many times larger or smaller'' the maximum
1207228068Sbaptassociated value range should be, in relationship to the number of keywords.
1208228068SbaptIt can be written as an integer, a floating-point number or a fraction.
1209228068SbaptFor example, a value of 3 means ``allow the maximum associated value to be
1210228068Sbaptabout 3 times larger than the number of input keywords''.
1211228068SbaptConversely, a value of 1/3 means ``allow the maximum associated value to
1212228068Sbaptbe about 3 times smaller than the number of input keywords''.  Values
1213228068Sbaptsmaller than 1 are useful for limiting the overall size of the generated hash
1214228068Sbapttable, though the option @samp{-m} is better at this purpose.
1215228068Sbapt
1216228068SbaptIf `generate switch' option @samp{-S} (or, equivalently, @samp{%switch}) is
1217228068Sbapt@emph{not} enabled, the maximum
1218228068Sbaptassociated value influences the static array table size, and a larger
1219228068Sbapttable should decrease the time required for an unsuccessful search, at
1220228068Sbaptthe expense of extra table space.
1221228068Sbapt
1222228068SbaptThe default value is 1, thus the default maximum associated value about
1223228068Sbaptthe same size as the number of keywords (for efficiency, the maximum
1224228068Sbaptassociated value is always rounded up to a power of 2).  The actual
1225228068Sbapttable size may vary somewhat, since this technique is essentially a
1226228068Sbaptheuristic.
1227228068Sbapt@end table
1228228068Sbapt
1229228068Sbapt@node Verbosity,  , Algorithmic Details, Options
1230228068Sbapt@section Informative Output
1231228068Sbapt
1232228068Sbapt@table @samp
1233228068Sbapt@item -h
1234228068Sbapt@itemx --help
1235228068SbaptPrints a short summary on the meaning of each program option.  Aborts
1236228068Sbaptfurther program execution.
1237228068Sbapt
1238228068Sbapt@item -v
1239228068Sbapt@itemx --version
1240228068SbaptPrints out the current version number.
1241228068Sbapt
1242228068Sbapt@item -d
1243228068Sbapt@itemx --debug
1244228068SbaptEnables the debugging option.  This produces verbose diagnostics to
1245228068Sbapt``standard error'' when @code{gperf} is executing.  It is useful both for
1246228068Sbaptmaintaining the program and for determining whether a given set of
1247228068Sbaptoptions is actually speeding up the search for a solution.  Some useful
1248228068Sbaptinformation is dumped at the end of the program when the @samp{-d}
1249228068Sbaptoption is enabled.
1250228068Sbapt@end table
1251228068Sbapt
1252228068Sbapt@node Bugs, Projects, Options, Top
1253228068Sbapt@chapter Known Bugs and Limitations with @code{gperf}
1254228068Sbapt
1255228068SbaptThe following are some limitations with the current release of
1256228068Sbapt@code{gperf}:
1257228068Sbapt
1258228068Sbapt@itemize @bullet
1259228068Sbapt@item
1260228068SbaptThe @code{gperf} utility is tuned to execute quickly, and works quickly
1261228068Sbaptfor small to medium size data sets (around 1000 keywords).  It is
1262228068Sbaptextremely useful for maintaining perfect hash functions for compiler
1263228068Sbaptkeyword sets.  Several recent enhancements now enable @code{gperf} to
1264228068Sbaptwork efficiently on much larger keyword sets (over 15,000 keywords).
1265228068SbaptWhen processing large keyword sets it helps greatly to have over 8 megs
1266228068Sbaptof RAM.
1267228068Sbapt
1268228068Sbapt@item 
1269228068SbaptThe size of the generate static keyword array can get @emph{extremely}
1270228068Sbaptlarge if the input keyword file is large or if the keywords are quite
1271228068Sbaptsimilar.  This tends to slow down the compilation of the generated C
1272228068Sbaptcode, and @emph{greatly} inflates the object code size.  If this
1273228068Sbaptsituation occurs, consider using the @samp{-S} option to reduce data
1274228068Sbaptsize, potentially increasing keyword recognition time a negligible
1275228068Sbaptamount.  Since many C compilers cannot correctly generate code for
1276228068Sbaptlarge switch statements it is important to qualify the @var{-S} option
1277228068Sbaptwith an appropriate numerical argument that controls the number of
1278228068Sbaptswitch statements generated.
1279228068Sbapt
1280228068Sbapt@item 
1281228068SbaptThe maximum number of selected byte positions has an
1282228068Sbaptarbitrary limit of 255.  This restriction should be removed, and if
1283228068Sbaptanyone considers this a problem write me and let me know so I can remove
1284228068Sbaptthe constraint.
1285228068Sbapt@end itemize
1286228068Sbapt
1287228068Sbapt@node Projects, Bibliography, Bugs, Top
1288228068Sbapt@chapter Things Still Left to Do
1289228068Sbapt
1290228068SbaptIt should be ``relatively'' easy to replace the current perfect hash
1291228068Sbaptfunction algorithm with a more exhaustive approach; the perfect hash
1292228068Sbaptmodule is essential independent from other program modules.  Additional
1293228068Sbaptworthwhile improvements include:
1294228068Sbapt
1295228068Sbapt@itemize @bullet
1296228068Sbapt@item 
1297228068SbaptAnother useful extension involves modifying the program to generate
1298228068Sbapt``minimal'' perfect hash functions (under certain circumstances, the
1299228068Sbaptcurrent version can be rather extravagant in the generated table size).
1300228068SbaptThis is mostly of theoretical interest, since a sparse table
1301228068Sbaptoften produces faster lookups, and use of the @samp{-S} @code{switch}
1302228068Sbaptoption can minimize the data size, at the expense of slightly longer
1303228068Sbaptlookups (note that the gcc compiler generally produces good code for
1304228068Sbapt@code{switch} statements, reducing the need for more complex schemes).
1305228068Sbapt
1306228068Sbapt@item
1307228068SbaptIn addition to improving the algorithm, it would also be useful to
1308228068Sbaptgenerate an Ada package as the code output, in addition to the current
1309228068SbaptC and C++ routines.
1310228068Sbapt@end itemize
1311228068Sbapt
1312228068Sbapt@page
1313228068Sbapt
1314228068Sbapt@node Bibliography, Concept Index, Projects, Top
1315228068Sbapt@chapter Bibliography
1316228068Sbapt
1317228068Sbapt[1] Chang, C.C.: @i{A Scheme for Constructing Ordered Minimal Perfect
1318228068SbaptHashing Functions} Information Sciences 39(1986), 187-195.
1319228068Sbapt
1320228068Sbapt[2] Cichelli, Richard J. @i{Author's Response to ``On Cichelli's Minimal Perfect Hash
1321228068SbaptFunctions Method''} Communications of the ACM, 23, 12(December 1980), 729.
1322228068Sbapt
1323228068Sbapt[3] Cichelli, Richard J. @i{Minimal Perfect Hash Functions Made Simple}
1324228068SbaptCommunications of the ACM, 23, 1(January 1980), 17-19.
1325228068Sbapt
1326228068Sbapt[4] Cook, C. R. and Oldehoeft, R.R. @i{A Letter Oriented Minimal
1327228068SbaptPerfect Hashing Function} SIGPLAN Notices, 17, 9(September 1982), 18-27.
1328228068Sbapt
1329228068Sbapt[5] Cormack, G. V. and Horspool, R. N. S. and Kaiserwerth, M.
1330228068Sbapt@i{Practical Perfect Hashing} Computer Journal, 28, 1(January 1985), 54-58.
1331228068Sbapt
1332228068Sbapt[6] Jaeschke, G. @i{Reciprocal Hashing: A Method for Generating Minimal
1333228068SbaptPerfect Hashing Functions} Communications of the ACM, 24, 12(December
1334228068Sbapt1981), 829-833.
1335228068Sbapt
1336228068Sbapt[7] Jaeschke, G. and Osterburg, G. @i{On Cichelli's Minimal Perfect
1337228068SbaptHash Functions Method} Communications of the ACM, 23, 12(December 1980),
1338228068Sbapt728-729.
1339228068Sbapt
1340228068Sbapt[8] Sager, Thomas J. @i{A Polynomial Time Generator for Minimal Perfect
1341228068SbaptHash Functions} Communications of the ACM, 28, 5(December 1985), 523-532
1342228068Sbapt
1343228068Sbapt[9] Schmidt, Douglas C. @i{GPERF: A Perfect Hash Function Generator}
1344228068SbaptSecond USENIX C++ Conference Proceedings, April 1990.
1345228068Sbapt
1346228068Sbapt[10] Schmidt, Douglas C. @i{GPERF: A Perfect Hash Function Generator}
1347228068SbaptC++ Report, SIGS 10 10 (November/December 1998).
1348228068Sbapt
1349228068Sbapt[11] Sebesta, R.W. and Taylor, M.A. @i{Minimal Perfect Hash Functions
1350228068Sbaptfor Reserved Word Lists}  SIGPLAN Notices, 20, 12(September 1985), 47-53.
1351228068Sbapt
1352228068Sbapt[12] Sprugnoli, R. @i{Perfect Hashing Functions: A Single Probe
1353228068SbaptRetrieving Method for Static Sets} Communications of the ACM, 20
1354228068Sbapt11(November 1977), 841-850.
1355228068Sbapt
1356228068Sbapt[13] Stallman, Richard M. @i{Using and Porting GNU CC} Free Software Foundation,
1357228068Sbapt1988.
1358228068Sbapt
1359228068Sbapt[14] Stroustrup, Bjarne @i{The C++ Programming Language.} Addison-Wesley, 1986.
1360228068Sbapt
1361228068Sbapt[15] Tiemann, Michael D. @i{User's Guide to GNU C++} Free Software
1362228068SbaptFoundation, 1989.
1363228068Sbapt
1364228068Sbapt@node Concept Index,  , Bibliography, Top
1365228068Sbapt@unnumbered Concept Index
1366228068Sbapt
1367228068Sbapt@printindex cp
1368228068Sbapt
1369228068Sbapt@contents
1370228068Sbapt@bye
1371