ChangeLog revision 18214
118214SpeterSat Nov 11 13:43:17 1989  Doug Schmidt  (schmidt at zola.ics.uci.edu)
218214Speter
318214Speter        * Modified options by changing macro TOTAL_POSITIONS to GET_CHARSET_SIZE
418214Speter          and SET_CHARSET_SIZE.  These two routines now either return
518214Speter          the total charset size *or* the length of the largest keyword
618214Speter          if the user specifies the -k'*' (ALLCHARS) option.  This change
718214Speter          cleans up client code.
818214Speter
918214SpeterFri Nov 10 15:21:19 1989  Doug Schmidt  (schmidt at zola.ics.uci.edu)
1018214Speter
1118214Speter        * Made sure to explicitly initialize perfect.fewest_collisions to
1218214Speter          0.
1318214Speter
1418214SpeterWed Nov  1 21:39:54 1989  Doug Schmidt  (schmidt at zola.ics.uci.edu)
1518214Speter
1618214Speter        * Upgraded the version number to 2.0, reflecting all the 
1718214Speter          major recent changes!
1818214Speter
1918214Speter        * Rearranged code so that fewer source lines are greater than 80 columns. 
2018214Speter
2118214Speter        * Cleaned up some loose ends noticed by Nels Olson.
2218214Speter          1.  Removed `if (collisions <= perfect.fewest_collisions)'
2318214Speter              from affects_prev () since it was superfluous.
2418214Speter          2.  Removed the fields best_char_value and best_asso_value
2518214Speter              from PERFECT.  There were also unnecessary.
2618214Speter          3.  Fixed a braino in boolarray.c's bool_array_reset ()
2718214Speter              function.  Since iteration numbers can never be zero
2818214Speter              the `if (bool_array.iteration_number++ == 0)' must be
2918214Speter              `if (++bool_array.iteration_number == 0).'
3018214Speter          4.  Broke the -h help options string into 3 pieces.  This
3118214Speter              should silence broken C compilers for a while!
3218214Speter          5.  Modified `report_error ()' so that it correctly handled
3318214Speter              "%%".
3418214Speter
3518214SpeterTue Oct 31 23:01:11 1989  Doug Schmidt  (schmidt at siam.ics.uci.edu)
3618214Speter
3718214Speter        * It is important to note that -D no longer enables -S.
3818214Speter          There is a good reason for this change, which will become
3918214Speter          manifested in the next release... (suspense!).
4018214Speter
4118214Speter        * Made some subtle changes to print_switch so that if finally
4218214Speter          seems to work correctly.  Needs more stress testing, however...
4318214Speter
4418214SpeterMon Oct 30 15:06:59 1989  Doug Schmidt  (schmidt at zola.ics.uci.edu)
4518214Speter
4618214Speter        * Made a major change to the keylist.c's print_switch () function.
4718214Speter          The user can now specify the number of switch statements to generate
4818214Speter          via an argument to the -S option, i.e., -S1 means `generate 1
4918214Speter          switch statement with all keywords in it,' -S2 means generate
5018214Speter          2 switch statements with 1/2 the elements in each one, etc.
5118214Speter          Hopefully this will fix the problem with C compilers not being
5218214Speter          able to generate code for giant switch statements (but don't
5318214Speter          hold your breath!)
5418214Speter
5518214Speter        * Changed keylist.c's length () function to keyword_list_length ().
5618214Speter          Also put an extern decl for this function and max_key_length ()
5718214Speter          into the keylist.h file (don't know why it wasn't already there...).
5818214Speter
5918214SpeterSun Oct 29 08:55:29 1989  Doug Schmidt  (schmidt at siam.ics.uci.edu)
6018214Speter
6118214Speter        * Added a feature to main.c that prints out the starting wall-clock
6218214Speter          time before the program begins and prints out the ending wall-clock
6318214Speter          time when the program is finished.
6418214Speter
6518214Speter        * Added the GATHER_STATISTICS code in hashtable.c so we can
6618214Speter          keep track of how well double hashing is doing.  Eventually,
6718214Speter          GATHER_STATISTICS will be added so that all instrumentation
6818214Speter          code can be conditionally compiled in.
6918214Speter
7018214Speter        * Modified xmalloc.c's buffered_malloc () routine so that it
7118214Speter          rounded the SIZE byte requests up to the correct alignment
7218214Speter          size for the target machine.
7318214Speter
7418214SpeterSat Oct 28 11:17:36 1989  Doug Schmidt  (schmidt at glacier.ics.uci.edu)
7518214Speter
7618214Speter        * Fixed a stupid bug in keylist.c's print_switch () routine.  This
7718214Speter          was necessary to make sure the generated switch statement worked
7818214Speter          correctly when *both* `natural,' i.e., static links and dynamic
7918214Speter          links, i.e., unresolved duplicates, hash to the same value.
8018214Speter
8118214Speter        * Modified LIST_NODE so that the char *char_set field is redeclared
8218214Speter          as char char_set[1] and placed at the bottom of the struct.  This
8318214Speter          enables us to play the old `variable-length string in the struct
8418214Speter          trick' (third time I fell for it this week, chief!).  The
8518214Speter          main purpose of this is to remove n calls to buffered_malloc,
8618214Speter          where n is the number of keywords.
8718214Speter
8818214Speter        * Added a new function to xmalloc.c called buffered_malloc.  This
8918214Speter          reduces the number of calls to malloc by grabbing large chunks
9018214Speter          of memory and doling them out in small pieces.  Almost all uses
9118214Speter          of xmalloc were replaced with calls to buffered_malloc ().
9218214Speter
9318214Speter        * Modified boolarray.c's bool_array_destroy () function so that
9418214Speter          it now frees the bool_array.storage_array when it is no longer
9518214Speter          needed.  Since this array is generally very large it makes sense
9618214Speter          to return the memory to the freelist when it is no longer in use.
9718214Speter
9818214Speter        * Changed the interface to hash_table_init.  This function is
9918214Speter          now passed a pointer to a power-of-two sized buffer that serves
10018214Speter          as storage for the hash table.  Although this weakens information
10118214Speter          hiding a little bit it greatly reduces dynamic memory fragmentation,
10218214Speter          since we can now obtain the memory via a call to alloca, rather
10318214Speter          than malloc.  This change modified keylist.c's read_keys () calling
10418214Speter          interface.
10518214Speter
10618214Speter        * Since alloca is now being used more aggressively a conditional
10718214Speter          compilation section was added to the init_all () routine in main.c.
10818214Speter          Taken from GNU GCC, this code gets rid of any avoidable limit
10918214Speter          on stack size so that alloca does not fail.  It is only used
11018214Speter          if the -DRLIMIT_STACK symbol is defined when gperf is compiled.
11118214Speter
11218214Speter        * Moved the `destructor' for bool_array from destroy_all () in 
11318214Speter          main.c into perfect_generate () in perfect.c.  This enables
11418214Speter          us to get free up the large dynamic memory as soon as it is no
11518214Speter          longer necessary.  Also moved the hash_table destructor from
11618214Speter          destroy_all into read_keys () from keylist.c.  This accomplishes
11718214Speter          the same purpose, i.e., we can free up the space immediately.
11818214Speter
11918214Speter        * Added warnings in option.c so that user's would be informed
12018214Speter          that -r superceeds -i on the command-line.
12118214Speter          
12218214Speter        * Rewrote affects_prev () from perfect.c.  First, the code structure
12318214Speter          was cleaned up considerably (removing the need for a dreaded
12418214Speter          goto!).  Secondly, a major change occurred so that affects_prev ()
12518214Speter          returns FALSE (success) when fewest_hits gets down to whatever
12618214Speter          it was after inserting the previous key (instead of waiting for
12718214Speter          it to reach 0).  In other words, it stops trying if it can
12818214Speter          resolve the new collisions added by a key, even if there are
12918214Speter          still other old, unresolved collisions.  This modification was
13018214Speter          suggested by Nels Olson and seems to *greatly* increase the
13118214Speter          speed of gperf for large keyfiles.  Thanks Nels!
13218214Speter
13318214Speter        * In a similar vein, inside the change () routine from perfect.c
13418214Speter          the variable `perfect.fewest_collisions is no longer initialized
13518214Speter          with the length of the keyword list.  Instead it starts out at
13618214Speter          0 and is incremented by 1 every time change () is called.
13718214Speter          The rationale for this behavior is that there are times when a
13818214Speter          collision causes the number of duplicates (collisions) to
13918214Speter          increase by a large amount when it would presumably just have
14018214Speter          gone up by 1 if none of the asso_values were changed.  That is,
14118214Speter          at the beginning of change(), you could initialize fewest_hits
14218214Speter          to 1+(previous value of fewest_hits) instead of to the number of
14318214Speter          keys.  Thanks again, Nels.
14418214Speter
14518214Speter        * Replaced alloca with xmalloc in perfect.c's change () function.
14618214Speter          This should eliminate some overhead at the expense of a little
14718214Speter          extra memory that is never reclaimed.
14818214Speter
14918214Speter        * Renamed perfect.c's merge_sets () to compute_disjoint_union ()
15018214Speter          to reflect the change in behavior.
15118214Speter
15218214SpeterFri Oct 27 10:12:27 1989  Doug Schmidt  (schmidt at crimee.ics.uci.edu)
15318214Speter
15418214Speter        * Added the -e option so users can supply a string containing
15518214Speter          the characters used to separate keywords from their attributes.
15618214Speter          The default behavior is ",\n".
15718214Speter
15818214Speter        * Removed the char *uniq_set field from LIST_NODE and modified
15918214Speter          uses of uniq_set in perfect.c and keylist.c.  Due to changes
16018214Speter          to perfect.c's merge_set () described below this field was
16118214Speter          no longer necessary, and its removal makes the program
16218214Speter          smaller and potentially faster.
16318214Speter          
16418214Speter        * Added lots of changes/fixes suggested by Nels Olson
16518214Speter          (umls.UUCP!olson@mis.ucsf.edu).  In particular:
16618214Speter          1.  Changed BOOL_ARRAY so that it would dynamically create
16718214Speter              an array of unsigned shorts rather than ints if the 
16818214Speter              LO_CAL symbol was defined during program compilation.
16918214Speter              This cuts the amount of dynamic memory usage in half,
17018214Speter              which is important for large keyfile input.
17118214Speter          2.  Added some additional debugging statements that print extra
17218214Speter              info to stderr when the -d option is enabled.
17318214Speter          3.  Fixed a really stupid bug in the print_switch () from keylist.c.
17418214Speter              A right paren was placed at the wrong location.  
17518214Speter          4.  Fixed a subtle problem with printing case values when keylinks
17618214Speter              appear.  The logic failed to account for the fact that there
17718214Speter              can be keylinks *and* regular node info also!
17818214Speter          5.  Finally split the huge help string into two parts.  This keeps
17918214Speter              breaking compilers with static limits on the length of tokens...
18018214Speter          6.  Modified the -j option so that -j 0 means `try random values
18118214Speter              when searching for a way to resolve collisions.'
18218214Speter          7.  Added a field `num_done' to the PERFECT struct.  This is used
18318214Speter              to report information collected when trying to resolve
18418214Speter              hash collisions.
18518214Speter          8.  Modified the merge_sets algorithm to perform a disjoint
18618214Speter              union of two multisets.  This ensures that subsequent
18718214Speter              processing in perfect.c function affect_prev () doesn't
18818214Speter              waste time trying to change an associated value that is
18918214Speter              shared between two conflicting keywords.
19018214Speter          9.  Modified affects_prev so that it doesn't try random jump
19118214Speter              values unless the -j 0 option is enabled.
19218214Speter          10. Fixed a silly bug in perfect.c change ().
19318214Speter              This problem caused gperf to seg fault when
19418214Speter              the -k* option was given and the keyfile file had long
19518214Speter              keywords.
19618214Speter          11. Changed the behavior of keylist.c's read_keys () routine
19718214Speter              so that it would honor -D unequivocally, i.e., it doesn't
19818214Speter              try to turn off dup handling if the user requests it, even
19918214Speter              if there are no immediate links in the keyfile input.
20018214Speter
20118214SpeterMon Oct 16 19:58:08 1989  Doug Schmidt  (schmidt at glacier.ics.uci.edu)
20218214Speter
20318214Speter        * Fixed a number of small bugs kindly brought to my attention by
20418214Speter          Adam de Boor (bsw!adam@uunet.UU.NET).  Thanks Adam!  In particular,
20518214Speter          changed the behavior for the -a (ANSI) option so that the
20618214Speter          generated prototypes use int rather than size_t for the LEN 
20718214Speter          parameter.  It was too ugly having to #include <stddef.h> all
20818214Speter          over the place...
20918214Speter
21018214Speter        * Added a majorly neat hack to Bool_Array, suggested by rfg.
21118214Speter          The basic idea was to throw away the Ullman array technique.
21218214Speter          The Ullman array was used to remove the need to reinitialize all 
21318214Speter          the Bool_Array elements to zero everytime we needed to determine
21418214Speter          whether there were duplicate hash values in the keyword list.  
21518214Speter          The current trick uses an `iteration number' scheme, which takes
21618214Speter          about 1/3 the space and reduces the overall program running a 
21718214Speter          time by about 20 percent for large input!  The hack works as 
21818214Speter          follows:
21918214Speter          
22018214Speter          1. Dynamically allocate 1 boolean array of size k.
22118214Speter          2. Initialize the boolean array to zeros, and consider the first
22218214Speter             iteration to be iteration 1.
22318214Speter          2. Then on all subsequent iterations we `reset' the bool array by
22418214Speter             kicking the iteration count by 1. 
22518214Speter          3. When it comes time to check whether a hash value is currently
22618214Speter             in the boolean array we simply check its index location.  If
22718214Speter             the value stored there is *not* equal to the current iteration
22818214Speter             number then the item is clearly *not* in the set.  In that
22918214Speter             case we assign the iteration number to that array's index
23018214Speter             location for future reference.  Otherwise, if the item at
23118214Speter             the index location *is* equal to the iteration number we've
23218214Speter             found a duplicate.  No muss, no fuss!
23318214Speter
23418214SpeterThu Oct 12 18:08:43 1989  Doug Schmidt  (schmidt at zola.ics.uci.edu)
23518214Speter
23618214Speter        * Updated the version number to 1.9.
23718214Speter
23818214Speter        * Added support for the -C option.  This makes the contents of
23918214Speter          all generated tables `readonly'.
24018214Speter
24118214Speter        * Changed the handling of generated switches so that there is
24218214Speter          only one call to str[n]?cmp.  This *greatly* reduces the size of
24318214Speter          the generated assembly code on all compilers I've seen.
24418214Speter
24518214Speter        * Fixed a subtle bug that occurred when the -l and -S option
24618214Speter          was given.  Code produced looked something like:
24718214Speter
24818214Speter          if (len != key_len || !strcmp (s1, resword->name)) return resword;
24918214Speter
25018214Speter          which doesn't make any sense.  Clearly, this should be:
25118214Speter
25218214Speter          if (len == key_len && !strcmp (s1, resword->name)) return resword;
25318214Speter
25418214SpeterSat Sep 30 12:55:24 1989  Doug Schmidt  (schmidt at glacier.ics.uci.edu)
25518214Speter
25618214Speter        * Fixed a stupid bug in Key_List::print_hash_function that manifested
25718214Speter          itself if the `-k$' option was given (i.e., only use the key[length]
25818214Speter          character in the hash function).
25918214Speter
26018214SpeterMon Jul 24 17:09:46 1989  Doug Schmidt  (schmidt at glacier.ics.uci.edu)
26118214Speter
26218214Speter        * Fixed a bug in PRINT_MIN_MAX that resulted in MAX_INT being printed
26318214Speter          for the MIN_KEY_LEN if there was only 1 keyword in the input file
26418214Speter          (yeah, that's a pretty unlikely occurrence, I agree!).
26518214Speter
26618214Speter        * Fixed PRINT_HASH_FUNCTION and PRINT_LOOKUP_FUNCTION in keylist.c
26718214Speter          so that the generated functions take an unsigned int length argument.
26818214Speter          If -a is enabled the prototype is (const char *str, size_t len).
26918214Speter
27018214SpeterFri Jul 21 13:06:15 1989  Doug Schmidt  (schmidt at zola.ics.uci.edu)
27118214Speter
27218214Speter        * Fixed a horrible typo in PRINT_KEYWORD_TABLE in keylist.cc
27318214Speter          that prevented links from being printed correctly.
27418214Speter
27518214SpeterSun Jul  9 17:53:28 1989  Doug Schmidt  (schmidt at glacier.ics.uci.edu)
27618214Speter
27718214Speter        * Changed the ./tests subdirectory Makefile so that it 
27818214Speter          uses $(CC) instead of gcc.
27918214Speter
28018214SpeterSun Jul  2 12:14:04 1989  Doug Schmidt  (schmidt at glacier.ics.uci.edu)
28118214Speter
28218214Speter        * Moved comment handling from keylist.c to readline.c.  This
28318214Speter          simplifies the code and reduces the number of malloc calls.
28418214Speter
28518214Speter        * Fixed a number of subtle bugs that occurred when -S was
28618214Speter          combined with various and sundry options.
28718214Speter
28818214Speter        * Added the -G option, that makes the generated keyword table
28918214Speter          a global static variable, rather than hiding it inside
29018214Speter          the lookup function.  This allows other functions to directly
29118214Speter          access the contents in this table.
29218214Speter
29318214SpeterSat Jul  1 10:12:21 1989  Doug Schmidt  (schmidt at crimee.ics.uci.edu)
29418214Speter
29518214Speter        * Added the "#" feature, that allows comments inside the keyword
29618214Speter          list from the input file.
29718214Speter          
29818214Speter        * Also added the -H option (user can give the name of the hash
29918214Speter          function) and the -T option (prevents the transfer of the type decl
30018214Speter          to the output file, which is useful if the type is already defined
30118214Speter          elsewhere).
30218214Speter
30318214SpeterFri Jun 30 18:22:35 1989  Doug Schmidt  (schmidt at crimee.ics.uci.edu)
30418214Speter
30518214Speter        * Added Adam de Boor's changes.  Created an UNSET_OPTION macro.
30618214Speter
30718214SpeterSat Jun 17 10:56:00 1989  Doug Schmidt  (schmidt at glacier.ics.uci.edu)
30818214Speter
30918214Speter        * Modified option.h and option.c so that all mixed operations
31018214Speter          between integers and enumerals are cast correctly to int.
31118214Speter          This prevents errors in some brain-damaged C compilers.
31218214Speter
31318214SpeterFri Jun 16 14:13:15 1989  Doug Schmidt  (schmidt at crimee.ics.uci.edu)
31418214Speter
31518214Speter        * Modified the -f (FAST) option.  This now takes an argument.
31618214Speter          The argument corresponds to the number of iterations used
31718214Speter          to resolve collisions.  -f 0 uses the length of the
31818214Speter          keyword list (which is what -f did before).  This makes
31918214Speter          life much easier when dealing with large keyword files.
32018214Speter
32118214SpeterWed Jun  7 23:07:13 1989  Doug Schmidt  (schmidt at zola.ics.uci.edu)
32218214Speter
32318214Speter        * Updated to version 1.8 in preparation to release to Doug Lea
32418214Speter          and FSF.
32518214Speter
32618214Speter        * Added the -c (comparison) option.  Enabling this
32718214Speter          will use the strncmp function for string comparisons.
32818214Speter          The default is to use strcmp.
32918214Speter
33018214SpeterTue Jun  6 16:32:09 1989  Doug Schmidt  (schmidt at zola.ics.uci.edu)
33118214Speter
33218214Speter        * Fixed another stupid typo in xmalloc.c (XMALLOC).  I accidentally
33318214Speter          left the ANSI-fied prototype in place.  This obviously
33418214Speter          fails on old-style C compilers.
33518214Speter
33618214Speter        * Fixed stupid typos in PRINT_SWITCH from the keylist.c.  This
33718214Speter          caused the -D option to produce incorrect output when used
33818214Speter          in conjunction with -p and -t.
33918214Speter          
34018214Speter        * Replaced the use of STRCMP with STRNCMP for the generated
34118214Speter          C output code.          
34218214Speter
34318214SpeterFri Jun  2 23:16:01 1989  Doug Schmidt  (schmidt at trinite.ics.uci.edu)
34418214Speter
34518214Speter        * Added a new function (XMALLOC) and file (xmalloc.c).  All
34618214Speter          calls to MALLOC were replaced by calls to XMALLOC.  This 
34718214Speter          will complain when virtual memory runs out (don't laugh, 
34818214Speter          this has happened!)
34918214Speter
35018214SpeterThu Jun  1 21:10:10 1989  Doug Schmidt  (schmidt at zola.ics.uci.edu)
35118214Speter
35218214Speter        * Fixed a typo in options.c that prevented the -f option
35318214Speter          from being given on the command-line.
35418214Speter
35518214SpeterWed May  3 17:48:02 1989  Doug Schmidt  (schmidt at zola.ics.uci.edu)
35618214Speter
35718214Speter        * Updated to version 1.7.  This reflects the recent major changes
35818214Speter          and the new C port.
35918214Speter
36018214Speter        * Fixed a typo in perfect.c perfect_destroy that prevented the actual 
36118214Speter          maximum hash table size from being printed.
36218214Speter
36318214Speter        * Added support for the -f option.  This generates the perfect
36418214Speter          hash function ``fast.''  It reduces the execution time of
36518214Speter          gperf, at the cost of minimizing the range of hash values.
36618214Speter
36718214SpeterTue May  2 16:23:29 1989  Doug Schmidt  (schmidt at crimee.ics.uci.edu)
36818214Speter
36918214Speter        * Enabled the diagnostics dump if the debugging option is enabled.
37018214Speter        
37118214Speter        * Removed all calls to FREE (silly to do this at program termination).
37218214Speter
37318214Speter        * Ported gperf to C.  From now on both K&R C and GNU G++ versions
37418214Speter          will be supported.
37518214Speter
376