<HTML>
<HEAD>
<!-- This HTML file has been created by texi2html 1.52b
     from gperf.texi on 19 March 2013 -->

<META HTTP-EQUIV="content-type" CONTENT="text/html; charset=UTF-8">
<TITLE>Perfect Hash Function Generator - 5 Invoking gperf</TITLE>
</HEAD>
<BODY>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_5.html">previous</A>, <A HREF="gperf_7.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
<P><HR><P>


<H1><A NAME="SEC17" HREF="gperf_toc.html#TOC17">5 Invoking <CODE>gperf</CODE></A></H1>

<P>
There are <EM>many</EM> options to <CODE>gperf</CODE>. They were added to make
the program more convenient for use with real applications. “On-line”
help is readily available via the <SAMP>‘--help’</SAMP> option. Here is the
complete list of options.

</P>



<H2><A NAME="SEC18" HREF="gperf_toc.html#TOC18">5.1 Specifying the Location of the Output File</A></H2>

<DL COMPACT>

<DT><SAMP>‘--output-file=<VAR>file</VAR>’</SAMP>
<DD>
Allows you to specify the name of the file to which the output is written.
</DL>

<P>
The results are written to standard output if no output file is specified
or if the specified file name is <SAMP>‘-’</SAMP>.

</P>


<H2><A NAME="SEC19" HREF="gperf_toc.html#TOC19">5.2 Options that affect Interpretation of the Input File</A></H2>

<P>
These options are also available as declarations in the input file
(see section <A HREF="gperf_5.html#SEC9">4.1.1.2 Gperf Declarations</A>).

</P>
<DL COMPACT>

<DT><SAMP>‘-e <VAR>keyword-delimiter-list</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--delimiters=<VAR>keyword-delimiter-list</VAR>’</SAMP>
<DD>
<A NAME="IDX40"></A>
Allows you to provide a string containing delimiters used to
separate keywords from their attributes. The default is ",". This
option is essential if you want to use keywords that have embedded
commas or newlines.
One useful trick is to use -e'TAB', where TAB is
the literal tab character.

<DT><SAMP>‘-t’</SAMP>
<DD>
<DT><SAMP>‘--struct-type’</SAMP>
<DD>
Allows you to include a <CODE>struct</CODE> type declaration for generated
code. Any text before a pair of consecutive <SAMP>‘%%’</SAMP> is considered
part of the type declaration. Keywords and additional fields may follow
this, one group of fields per line. A set of examples for generating
perfect hash tables and functions for Ada, C, C++, Pascal, Modula 2,
Modula 3 and JavaScript reserved words is distributed with this release.

<DT><SAMP>‘--ignore-case’</SAMP>
<DD>
Consider upper and lower case ASCII characters as equivalent. The string
comparison will use a case-insensitive character comparison. Note that
locale dependent case mappings are ignored. This option is therefore not
suitable if a properly internationalized or locale aware case mapping
should be used. (For example, in a Turkish locale, the upper case equivalent
of the lowercase ASCII letter <SAMP>‘i’</SAMP> is the non-ASCII character
<SAMP>‘capital i with dot above’</SAMP>.) In this case, it is better to apply
an uppercase or lowercase conversion on the string before passing it to
the <CODE>gperf</CODE> generated function.
</DL>



<H2><A NAME="SEC20" HREF="gperf_toc.html#TOC20">5.3 Options to specify the Language for the Output Code</A></H2>

<P>
These options are also available as declarations in the input file
(see section <A HREF="gperf_5.html#SEC9">4.1.1.2 Gperf Declarations</A>).

</P>
<DL COMPACT>

<DT><SAMP>‘-L <VAR>generated-language-name</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--language=<VAR>generated-language-name</VAR>’</SAMP>
<DD>
Instructs <CODE>gperf</CODE> to generate code in the language specified by the
option's argument. Languages handled are currently:

<DL COMPACT>

<DT><SAMP>‘KR-C’</SAMP>
<DD>
Old-style K&amp;R C.
This language is understood by old-style C compilers and
ANSI C compilers, but ANSI C compilers may flag warnings (or even errors)
because <SAMP>‘const’</SAMP> is missing.

<DT><SAMP>‘C’</SAMP>
<DD>
Common C. This language is understood by ANSI C compilers, and also by
old-style C compilers, provided that you <CODE>#define const</CODE> to empty
for compilers which don't know about this keyword.

<DT><SAMP>‘ANSI-C’</SAMP>
<DD>
ANSI C. This language is understood by ANSI C compilers and C++ compilers.

<DT><SAMP>‘C++’</SAMP>
<DD>
C++. This language is understood by C++ compilers.
</DL>

The default is C.

<DT><SAMP>‘-a’</SAMP>
<DD>
This option is supported for compatibility with previous releases of
<CODE>gperf</CODE>. It does not do anything.

<DT><SAMP>‘-g’</SAMP>
<DD>
This option is supported for compatibility with previous releases of
<CODE>gperf</CODE>. It does not do anything.
</DL>



<H2><A NAME="SEC21" HREF="gperf_toc.html#TOC21">5.4 Options for fine tuning Details in the Output Code</A></H2>

<P>
Most of these options are also available as declarations in the input file
(see section <A HREF="gperf_5.html#SEC9">4.1.1.2 Gperf Declarations</A>).

</P>
<DL COMPACT>

<DT><SAMP>‘-K <VAR>slot-name</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--slot-name=<VAR>slot-name</VAR>’</SAMP>
<DD>
<A NAME="IDX41"></A>
This option is only useful when option <SAMP>‘-t’</SAMP> (or, equivalently, the
<SAMP>‘%struct-type’</SAMP> declaration) has been given.
By default, the program assumes the structure component identifier for
the keyword is <SAMP>‘name’</SAMP>. This option allows an arbitrary choice of
identifier for this component, although it still must occur as the first
field in your supplied <CODE>struct</CODE>.

<DT><SAMP>‘-F <VAR>initializers</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--initializer-suffix=<VAR>initializers</VAR>’</SAMP>
<DD>
<A NAME="IDX42"></A>
This option is only useful when option <SAMP>‘-t’</SAMP> (or, equivalently, the
<SAMP>‘%struct-type’</SAMP> declaration) has been given.
It lets you specify initializers for the structure members following
<VAR>slot-name</VAR> in empty hash table entries. The list of initializers
should start with a comma. By default, the emitted code will
zero-initialize structure members following <VAR>slot-name</VAR>.

<DT><SAMP>‘-H <VAR>hash-function-name</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--hash-function-name=<VAR>hash-function-name</VAR>’</SAMP>
<DD>
Allows you to specify the name for the generated hash function. Default
name is <SAMP>‘hash’</SAMP>. This option permits the use of two hash tables in
the same file.

<DT><SAMP>‘-N <VAR>lookup-function-name</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--lookup-function-name=<VAR>lookup-function-name</VAR>’</SAMP>
<DD>
Allows you to specify the name for the generated lookup function.
Default name is <SAMP>‘in_word_set’</SAMP>. This option permits multiple
generated hash functions to be used in the same application.

<DT><SAMP>‘-Z <VAR>class-name</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--class-name=<VAR>class-name</VAR>’</SAMP>
<DD>
<A NAME="IDX43"></A>
This option is only useful when option <SAMP>‘-L C++’</SAMP> (or, equivalently,
the <SAMP>‘%language=C++’</SAMP> declaration) has been given. It
allows you to specify the name of the generated C++ class. Default name is
<CODE>Perfect_Hash</CODE>.

<DT><SAMP>‘-7’</SAMP>
<DD>
<DT><SAMP>‘--seven-bit’</SAMP>
<DD>
This option specifies that all strings that will be passed as arguments
to the generated hash function and the generated lookup function will
solely consist of 7-bit ASCII characters (bytes in the range 0..127).
(Note that the ANSI C functions <CODE>isalnum</CODE> and <CODE>isgraph</CODE> do
<EM>not</EM> guarantee that a byte is in this range. Only an explicit
test like <SAMP>‘c &gt;= 'A' &amp;&amp; c &lt;= 'Z'’</SAMP> guarantees this.) This was the
default in versions of <CODE>gperf</CODE> earlier than 2.7; now the default is
to support 8-bit and multibyte characters.

<DT><SAMP>‘-l’</SAMP>
<DD>
<DT><SAMP>‘--compare-lengths’</SAMP>
<DD>
Compare keyword lengths before trying a string comparison. This option
is mandatory for binary comparisons (see section <A HREF="gperf_5.html#SEC15">4.3 Use of NUL bytes</A>). It also might
cut down on the number of string comparisons made during the lookup, since
keywords with different lengths are never compared via <CODE>strcmp</CODE>.
However, using <SAMP>‘-l’</SAMP> might greatly increase the size of the
generated C code if the lookup table range is large (which implies that
the switch option <SAMP>‘-S’</SAMP> or <SAMP>‘%switch’</SAMP> is not enabled), since the length
table contains as many elements as there are entries in the lookup table.

<DT><SAMP>‘-c’</SAMP>
<DD>
<DT><SAMP>‘--compare-strncmp’</SAMP>
<DD>
Generates C code that uses the <CODE>strncmp</CODE> function to perform
string comparisons. The default action is to use <CODE>strcmp</CODE>.

<DT><SAMP>‘-C’</SAMP>
<DD>
<DT><SAMP>‘--readonly-tables’</SAMP>
<DD>
Makes the contents of all generated lookup tables constant, i.e.,
“readonly”. Many compilers can generate more efficient code for this
by putting the tables in readonly memory.

<DT><SAMP>‘-E’</SAMP>
<DD>
<DT><SAMP>‘--enum’</SAMP>
<DD>
Define constant values using an enum local to the lookup function rather
than with #defines. This also means that different lookup functions can
reside in the same file. Thanks to James Clark <CODE>&lt;jjc@ai.mit.edu&gt;</CODE>.
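<P>
As a rough illustration of why function-local enums let several lookup
functions share one file, the hand-written C sketch below keeps the size
constants inside each function. It is not actual <CODE>gperf</CODE> output:
the function names <CODE>in_color_set</CODE> and <CODE>in_shape_set</CODE>, the
keyword lists and the constant names are hypothetical, and a plain linear scan
stands in for the generated hash lookup.
</P>

```c
#include <string.h>

/* Hypothetical stand-in for one generated lookup function.  The constants
   live in a function-local enum, so their names are private to it. */
int in_color_set(const char *str, size_t len)
{
  enum { TOTAL_KEYWORDS = 2, MIN_WORD_LENGTH = 3, MAX_WORD_LENGTH = 4 };
  static const char *const words[TOTAL_KEYWORDS] = { "red", "blue" };
  int i;

  if (len < (size_t) MIN_WORD_LENGTH || len > (size_t) MAX_WORD_LENGTH)
    return 0;
  /* Linear scan used only to keep the sketch short (assumption). */
  for (i = 0; i < TOTAL_KEYWORDS; i++)
    if (strlen(words[i]) == len && memcmp(words[i], str, len) == 0)
      return 1;
  return 0;
}

/* A second lookup function in the same file reuses the same constant
   names with different values.  With file-scope #defines this would
   produce conflicting macro definitions. */
int in_shape_set(const char *str, size_t len)
{
  enum { TOTAL_KEYWORDS = 3, MIN_WORD_LENGTH = 6, MAX_WORD_LENGTH = 8 };
  static const char *const words[TOTAL_KEYWORDS] =
    { "circle", "square", "triangle" };
  int i;

  if (len < (size_t) MIN_WORD_LENGTH || len > (size_t) MAX_WORD_LENGTH)
    return 0;
  for (i = 0; i < TOTAL_KEYWORDS; i++)
    if (strlen(words[i]) == len && memcmp(words[i], str, len) == 0)
      return 1;
  return 0;
}
```

<P>
Both functions compile in one translation unit because each enum is scoped to
its own function body.
</P>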

<DT><SAMP>‘-I’</SAMP>
<DD>
<DT><SAMP>‘--includes’</SAMP>
<DD>
Include the necessary system include file, <CODE>&lt;string.h&gt;</CODE>, at the
beginning of the code. By default, this is not done; you must
include this header file yourself to allow compilation of the code.

<DT><SAMP>‘-G’</SAMP>
<DD>
<DT><SAMP>‘--global-table’</SAMP>
<DD>
Generate the static table of keywords as a static global variable,
rather than hiding it inside of the lookup function (which is the
default behavior).

<DT><SAMP>‘-P’</SAMP>
<DD>
<DT><SAMP>‘--pic’</SAMP>
<DD>
Optimize the generated table for inclusion in shared libraries. This
reduces the startup time of programs using a shared library containing
the generated code. If the option <SAMP>‘-t’</SAMP> (or, equivalently, the
<SAMP>‘%struct-type’</SAMP> declaration) is also given, the first field of the
user-defined struct must be of type <SAMP>‘int’</SAMP>, not <SAMP>‘char *’</SAMP>, because
it will contain offsets into the string pool instead of actual strings.
To convert such an offset to a string, you can use the expression
<SAMP>‘stringpool + <VAR>o</VAR>’</SAMP>, where <VAR>o</VAR> is the offset. The string pool
name can be changed through the option <SAMP>‘--string-pool-name’</SAMP>.

<DT><SAMP>‘-Q <VAR>string-pool-name</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--string-pool-name=<VAR>string-pool-name</VAR>’</SAMP>
<DD>
Allows you to specify the name of the generated string pool created by
option <SAMP>‘-P’</SAMP>. The default name is <SAMP>‘stringpool’</SAMP>. This option
permits the use of two hash tables in the same file, with <SAMP>‘-P’</SAMP> and
even when the option <SAMP>‘-G’</SAMP> (or, equivalently, the <SAMP>‘%global-table’</SAMP>
declaration) is given.

<DT><SAMP>‘--null-strings’</SAMP>
<DD>
Use NULL strings instead of empty strings for empty keyword table entries.
This reduces the startup time of programs using a shared library containing
the generated code (but not as much as option <SAMP>‘-P’</SAMP>), at the expense
of one more test-and-branch instruction at run time.

<DT><SAMP>‘-W <VAR>hash-table-array-name</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--word-array-name=<VAR>hash-table-array-name</VAR>’</SAMP>
<DD>
<A NAME="IDX44"></A>
Allows you to specify the name for the generated array containing the
hash table. Default name is <SAMP>‘wordlist’</SAMP>. This option permits the
use of two hash tables in the same file, even when the option <SAMP>‘-G’</SAMP>
(or, equivalently, the <SAMP>‘%global-table’</SAMP> declaration) is given.

<DT><SAMP>‘--length-table-name=<VAR>length-table-array-name</VAR>’</SAMP>
<DD>
<A NAME="IDX45"></A>
Allows you to specify the name for the generated array containing the
length table. Default name is <SAMP>‘lengthtable’</SAMP>. This option permits the
use of two length tables in the same file, even when the option <SAMP>‘-G’</SAMP>
(or, equivalently, the <SAMP>‘%global-table’</SAMP> declaration) is given.

<DT><SAMP>‘-S <VAR>total-switch-statements</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--switch=<VAR>total-switch-statements</VAR>’</SAMP>
<DD>
<A NAME="IDX46"></A>
Causes the generated C code to use a <CODE>switch</CODE> statement scheme,
rather than an array lookup table. This can lead to a reduction in both
time and space requirements for some input files. The argument to this
option determines how many <CODE>switch</CODE> statements are generated. A
value of 1 generates 1 <CODE>switch</CODE> containing all the elements; a
value of 2 generates 2 <CODE>switch</CODE> statements, each containing half
of the elements; and so on. This is useful since many C compilers cannot
correctly generate code for large <CODE>switch</CODE> statements. This option
was inspired in part by Keith Bostic's original C program.
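<P>
The difference between the array scheme and the <CODE>switch</CODE> scheme can
be sketched by hand. The example below is not <CODE>gperf</CODE> output: the
three keywords, the length-only hash <CODE>toy_hash</CODE> and the function
names <CODE>lookup_array</CODE> and <CODE>lookup_switch</CODE> are assumptions
chosen to keep it short.
</P>

```c
#include <string.h>

/* Toy hash, assumed for illustration only: the keyword length.
   It happens to be collision-free for "if", "for" and "else". */
static unsigned int toy_hash(const char *str, size_t len)
{
  (void) str;
  return (unsigned int) len;
}

/* Array scheme: the hash value indexes a table that also holds
   empty slots for hash values no keyword maps to. */
const char *lookup_array(const char *str, size_t len)
{
  static const char *const wordlist[] = { "", "", "if", "for", "else" };
  unsigned int key = toy_hash(str, len);

  if (key >= 2 && key <= 4 && strcmp(str, wordlist[key]) == 0)
    return wordlist[key];
  return NULL;
}

/* Switch scheme: one case per used hash value, no empty slots stored. */
const char *lookup_switch(const char *str, size_t len)
{
  const char *resword;

  switch (toy_hash(str, len))
    {
    case 2: resword = "if"; break;
    case 3: resword = "for"; break;
    case 4: resword = "else"; break;
    default: return NULL;
    }
  /* The hash only selects a candidate; a comparison confirms it. */
  return strcmp(str, resword) == 0 ? resword : NULL;
}
```

<P>
Both functions accept and reject the same strings; they differ only in how the
candidate keyword is selected from the hash value.
</P>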

<DT><SAMP>‘-T’</SAMP>
<DD>
<DT><SAMP>‘--omit-struct-type’</SAMP>
<DD>
Prevents the transfer of the type declaration to the output file. Use
this option if the type is already defined elsewhere.

<DT><SAMP>‘-p’</SAMP>
<DD>
This option is supported for compatibility with previous releases of
<CODE>gperf</CODE>. It does not do anything.
</DL>



<H2><A NAME="SEC22" HREF="gperf_toc.html#TOC22">5.5 Options for changing the Algorithms employed by <CODE>gperf</CODE></A></H2>

<DL COMPACT>

<DT><SAMP>‘-k <VAR>selected-byte-positions</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--key-positions=<VAR>selected-byte-positions</VAR>’</SAMP>
<DD>
Allows selection of the byte positions used in the keywords'
hash function. The allowable choices range from 1 to 255, inclusive.
The positions are separated by commas, e.g., <SAMP>‘-k 9,4,13,14’</SAMP>;
ranges may be used, e.g., <SAMP>‘-k 2-7’</SAMP>; and positions may occur
in any order. Furthermore, the wildcard '*' causes the generated
hash function to consider <STRONG>all</STRONG> byte positions in each keyword,
whereas '$' instructs the hash function to use the “final byte”
of a keyword (this is the only way to use a byte position greater than
255, incidentally).

For instance, the option <SAMP>‘-k 1,2,4,6-10,'$'’</SAMP> generates a hash
function that considers positions 1,2,4,6,7,8,9,10, plus the last
byte in each keyword (which may be at a different position for each
keyword, obviously). Keywords
with length less than the indicated byte positions work properly, since
selected byte positions exceeding the keyword length are simply not
referenced in the hash function.

This option is not normally needed since version 2.8 of <CODE>gperf</CODE>;
the default byte positions are computed depending on the keyword set,
through a search that minimizes the number of byte positions.
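<P>
To make the role of the selected positions concrete, here is a minimal C
sketch of a hash function in the spirit of <SAMP>‘-k 1,3,'$'’</SAMP>. It is an
illustration under stated assumptions, not the code <CODE>gperf</CODE> emits:
the helper <CODE>asso</CODE> is a made-up formula standing in for the
associated values that <CODE>gperf</CODE> searches for, and the name
<CODE>sketch_hash</CODE> is hypothetical.
</P>

```c
#include <stddef.h>

/* Hypothetical associated value for a byte.  A real table is computed
   by gperf for a given keyword set; this formula is only a placeholder. */
static unsigned int asso(unsigned char c)
{
  return (unsigned int) (c % 16);
}

/* Hash over byte positions 1 and 3 plus the final byte.  Positions are
   1-based, so position 3 is str[2]. */
unsigned int sketch_hash(const char *str, size_t len)
{
  /* The keyword length contributes to the hash value by default. */
  unsigned int hval = (unsigned int) len;

  /* A selected position beyond the end of the keyword is skipped. */
  if (len >= 3)
    hval += asso((unsigned char) str[2]);
  if (len >= 1)
    {
      hval += asso((unsigned char) str[0]);
      /* The final byte sits at a different index for each length. */
      hval += asso((unsigned char) str[len - 1]);
    }
  return hval;
}
```

<P>
Because position 2 is not selected, two keywords of equal length that differ
only in their second byte receive the same value from this sketch, which is
the kind of collision a poorly chosen position list produces.
</P>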

<DT><SAMP>‘-D’</SAMP>
<DD>
<DT><SAMP>‘--duplicates’</SAMP>
<DD>
<A NAME="IDX47"></A>
Handle keywords whose selected byte sets hash to duplicate values.
Duplicate hash values can occur if a set of keywords has the same names, but
possesses different attributes, or if the selected byte positions are not well
chosen. With the -D option, <CODE>gperf</CODE> treats all these keywords as
part of an equivalence class and generates a perfect hash function with
multiple comparisons for duplicate keywords. It is up to you to completely
disambiguate the keywords by modifying the generated C code. However,
<CODE>gperf</CODE> helps you out by organizing the output.

Using this option usually means that the generated hash function is no
longer perfect. On the other hand, it permits <CODE>gperf</CODE> to work on
keyword sets that it otherwise could not handle.

<DT><SAMP>‘-m <VAR>iterations</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--multiple-iterations=<VAR>iterations</VAR>’</SAMP>
<DD>
Perform multiple choices of the <SAMP>‘-i’</SAMP> and <SAMP>‘-j’</SAMP> values, and
choose the best results. This increases the running time by a factor of
<VAR>iterations</VAR> but does a good job minimizing the generated table size.

<DT><SAMP>‘-i <VAR>initial-value</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--initial-asso=<VAR>initial-value</VAR>’</SAMP>
<DD>
Provides an initial <VAR>value</VAR> for the associated values array. Default
is 0. Increasing the initial value helps inflate the final table size,
possibly leading to more time efficient keyword lookups. Note that this
option is not particularly useful when <SAMP>‘-S’</SAMP> (or, equivalently,
<SAMP>‘%switch’</SAMP>) is used. Also,
<SAMP>‘-i’</SAMP> is overridden when the <SAMP>‘-r’</SAMP> option is used.

<DT><SAMP>‘-j <VAR>jump-value</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--jump=<VAR>jump-value</VAR>’</SAMP>
<DD>
<A NAME="IDX48"></A>
Affects the “jump value”, i.e., how far to advance the associated
byte value upon collisions. <VAR>Jump-value</VAR> is rounded up to an
odd number; the default is 5. If the <VAR>jump-value</VAR> is 0, <CODE>gperf</CODE>
jumps by random amounts.

<DT><SAMP>‘-n’</SAMP>
<DD>
<DT><SAMP>‘--no-strlen’</SAMP>
<DD>
Instructs the generator not to include the length of a keyword when
computing its hash value. This may save a few assembly instructions in
the generated lookup table.

<DT><SAMP>‘-r’</SAMP>
<DD>
<DT><SAMP>‘--random’</SAMP>
<DD>
Utilizes randomness to initialize the associated values table. This
frequently generates solutions faster than using deterministic
initialization (which starts all associated values at 0). Furthermore,
using the randomization option generally increases the size of the
table.

<DT><SAMP>‘-s <VAR>size-multiple</VAR>’</SAMP>
<DD>
<DT><SAMP>‘--size-multiple=<VAR>size-multiple</VAR>’</SAMP>
<DD>
Affects the size of the generated hash table. The numeric argument for
this option indicates “how many times larger or smaller” the maximum
associated value range should be, in relation to the number of keywords.
It can be written as an integer, a floating-point number or a fraction.
For example, a value of 3 means “allow the maximum associated value to be
about 3 times larger than the number of input keywords”.
Conversely, a value of 1/3 means “allow the maximum associated value to
be about 3 times smaller than the number of input keywords”. Values
smaller than 1 are useful for limiting the overall size of the generated hash
table, though the option <SAMP>‘-m’</SAMP> is better suited to this purpose.

If the “generate switch” option <SAMP>‘-S’</SAMP> (or, equivalently, <SAMP>‘%switch’</SAMP>) is
<EM>not</EM> enabled, the maximum
associated value influences the static array table size, and a larger
table should decrease the time required for an unsuccessful search, at
the expense of extra table space.

The default value is 1, thus the default maximum associated value is about
the same size as the number of keywords (for efficiency, the maximum
associated value is always rounded up to a power of 2). The actual
table size may vary somewhat, since this technique is essentially a
heuristic.
</DL>



<H2><A NAME="SEC23" HREF="gperf_toc.html#TOC23">5.6 Informative Output</A></H2>

<DL COMPACT>

<DT><SAMP>‘-h’</SAMP>
<DD>
<DT><SAMP>‘--help’</SAMP>
<DD>
Prints a short summary on the meaning of each program option. Aborts
further program execution.

<DT><SAMP>‘-v’</SAMP>
<DD>
<DT><SAMP>‘--version’</SAMP>
<DD>
Prints out the current version number.

<DT><SAMP>‘-d’</SAMP>
<DD>
<DT><SAMP>‘--debug’</SAMP>
<DD>
Enables the debugging option. This produces verbose diagnostics to
“standard error” when <CODE>gperf</CODE> is executing. It is useful both for
maintaining the program and for determining whether a given set of
options is actually speeding up the search for a solution. Some useful
information is dumped at the end of the program when the <SAMP>‘-d’</SAMP>
option is enabled.
</DL>

</BODY>
</HTML>