Implementation notes
--------------------

Files
-----

The library tex2any.lib contains the generic LaTeX parser.
It comprises tex2any.cc, tex2any.h and texutils.cc.

The executable Tex2RTF is made up of tex2any.lib,
tex2rtf.cc (main driver and user interface), and specific
drivers for generating output: rtfutils.cc, htmlutil.cc
and xlputils.cc.

Data structures
---------------

Class declarations are found in tex2any.h.

TexMacroDef holds a macro (LaTeX command) definition: name, identifier,
number of arguments, whether it should be ignored, etc. Each LaTeX
command has an integer identifier so that output generation can test
commands efficiently. A hash table, MacroDefs, stores all the
TexMacroDefs, indexed on command name.

Each unit of a LaTeX file is stored in a TexChunk. A TexChunk can be
a macro, an argument or just a string: a macro chunk has child
chunks for its arguments, and each argument has one or more
children representing either another command or a simple string.
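
As a rough illustration of the two structures described above, the
following C++ sketch models a macro definition table and a chunk tree.
The type names, member names, the enum and the use of
std::unordered_map are assumptions made for this sketch only; the real
declarations are in tex2any.h and may differ.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for TexMacroDef; field names are illustrative only.
struct MacroDefSketch {
    std::string name;   // command name, e.g. "bf"
    int id;             // integer identifier tested during generation
    int argCount;       // number of arguments the command takes
    bool ignore;        // whether the command should be skipped
};

// Hypothetical stand-in for the MacroDefs hash table, indexed on command name.
using MacroTableSketch = std::unordered_map<std::string, MacroDefSketch>;

enum class ChunkKind { Macro, Argument, String };

// Hypothetical stand-in for TexChunk.
struct ChunkSketch {
    ChunkKind kind = ChunkKind::String;
    int macroId = 0;          // meaningful for Macro chunks
    std::string text;         // meaningful for String chunks
    std::vector<std::unique_ptr<ChunkSketch>> children;
};

// Builds the tree for {\bf thing}: macro -> argument -> string.
std::unique_ptr<ChunkSketch> MakeBoldThing(const MacroTableSketch& table) {
    auto str = std::make_unique<ChunkSketch>();
    str->kind = ChunkKind::String;
    str->text = "thing";

    auto arg = std::make_unique<ChunkSketch>();
    arg->kind = ChunkKind::Argument;
    arg->children.push_back(std::move(str));

    auto macro = std::make_unique<ChunkSketch>();
    macro->kind = ChunkKind::Macro;
    macro->macroId = table.at("bf").id;   // look up the id by command name
    macro->children.push_back(std::move(arg));
    return macro;
}
```

Storing the integer identifier in the chunk means later stages can
switch on a number instead of comparing command names.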

Parsing
-------

Parsing is relatively ad hoc. read_a_line reads in a line at a time,
doing some processing for file commands (e.g. input, verbatiminclude).
File handles are stored in a stack so file input commands may be nested.
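
The stack of file handles can be pictured with the sketch below. It is
not the code of read_a_line: the class NestedReaderSketch, its Opener
callback and the rule that only a line consisting of \input{name}
switches files are all assumptions chosen to keep the example small.

```cpp
#include <functional>
#include <istream>
#include <memory>
#include <sstream>
#include <stack>
#include <string>
#include <utility>

// Hypothetical line reader: keeps open inputs on a stack so that an
// \input{name} line found in one file can switch to another, then resume.
class NestedReaderSketch {
public:
    using Opener =
        std::function<std::unique_ptr<std::istream>(const std::string&)>;

    NestedReaderSketch(Opener opener, const std::string& rootName)
        : opener_(std::move(opener)) {
        Push(rootName);
    }

    // Returns false once every input on the stack is exhausted.
    bool ReadLine(std::string& line) {
        while (!inputs_.empty()) {
            if (!std::getline(*inputs_.top(), line)) {
                inputs_.pop();   // end of this file: resume the parent
                continue;
            }
            const std::string prefix = "\\input{";
            if (line.compare(0, prefix.size(), prefix) == 0 &&
                line.back() == '}') {
                Push(line.substr(prefix.size(),
                                 line.size() - prefix.size() - 1));
                continue;        // read from the nested file next
            }
            return true;
        }
        return false;
    }

private:
    // Opens the named input and makes it the current one, if it exists.
    void Push(const std::string& name) {
        if (auto in = opener_(name)) inputs_.push(std::move(in));
    }

    Opener opener_;
    std::stack<std::unique_ptr<std::istream>> inputs_;
};
```

Because the parent stream stays on the stack while the nested one is
read, reading resumes at the line after the file command once the
nested input ends.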

ParseArg parses an argument (which might be the whole LaTeX input,
which is treated as an argument), a single command, or a command
argument. The parsing gets a little hairy because an environment,
a normal command and bracketed commands (e.g. {\bf thing}) all get
parsed into the same format. An environment, for example,
is usually a one-argument command, as is {\bf thing}. ParseArg also
deals with user-defined macros.

Whilst parsing, the function MatchMacro gets called to
attempt to find a command following a backslash (or the
start of an environment). ParseMacroBody parses the
arguments of a command when one is found.

Generation
----------

The upshot of parsing is a hierarchy of TexChunks.
The 'client' converter application starts the generation
process by calling TraverseFromDocument, which calls the
recursive TraverseFromChunk. TraverseFromChunk calls the
two functions OnMacro and OnArgument twice for each chunk,
to allow for preprocessing and postprocessing of each
macro or argument.
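
The two-calls-per-chunk pattern can be sketched as follows, assuming
the first call happens before the children are traversed and the
second one after. NodeSketch, HandlersSketch and TraverseSketch are
hypothetical names, and the callback signature (identifier plus a
start flag) is an assumption rather than the signature of OnMacro or
OnArgument.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical chunk type; the real TexChunk is declared in tex2any.h.
struct NodeSketch {
    bool isMacro;                     // true: macro chunk, false: argument
    int id;                           // macro identifier or argument number
    std::vector<NodeSketch> children;
};

// Hypothetical callback set standing in for the client's handlers.
struct HandlersSketch {
    std::function<void(int id, bool start)> onMacro;
    std::function<void(int id, bool start)> onArgument;
};

// Calls the matching handler before (start = true) and after
// (start = false) descending into the children of each chunk.
void TraverseSketch(const NodeSketch& node, const HandlersSketch& h) {
    const auto& handler = node.isMacro ? h.onMacro : h.onArgument;
    handler(node.id, true);
    for (const NodeSketch& child : node.children) TraverseSketch(child, h);
    handler(node.id, false);
}
```

A client written against such an interface would typically emit an
opening construct when start is true and the matching closing
construct when it is false.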

The client defines OnMacro and OnArgument to test
the command identifier and output the appropriate
code. To help do this, the function TexOutput
outputs to the current stream(s), and
SetCurrentOutput(s) allows the setting of one
or two output streams for the output to be sent to.
Two outputs at a time are usually sufficient for
hypertext applications, where a title is likely
to appear both in an index and as a section header.
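
The idea of one or two current output streams can be illustrated with
a small sketch. DualOutputSketch and its members are hypothetical and
use standard C++ streams; they are not the signatures of TexOutput or
SetCurrentOutput(s).

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical holder for up to two current output streams.
class DualOutputSketch {
public:
    // Sets one stream, or two when a second one is supplied.
    void SetOutputs(std::ostream* first, std::ostream* second = nullptr) {
        first_ = first;
        second_ = second;
    }

    // Writes the same text to every stream that is currently set.
    void Write(const std::string& text) {
        if (first_) *first_ << text;
        if (second_) *second_ << text;
    }

private:
    std::ostream* first_ = nullptr;
    std::ostream* second_ = nullptr;
};
```

With both streams set while a section title is traversed, the title
reaches the section body and the index in a single pass; switching
back to one stream afterwards sends the following text to the body
only.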

There are support functions for getting the string
data for the current chunk (GetArgData) and the
current chunk itself (GetArgChunk). If you have a handle
on a chunk, you can output it several times by calling
TraverseChildrenFromChunk (not TraverseFromChunk, because
that causes infinite recursion).

The client (here, Tex2RTF) also defines OnError and OnInform output
functions appropriate to the desired user interface.

References
----------

Adding, finding and resolving references are supported
by functions from texutils.cc. WriteTexReferences
and ReadTexReferences allow saving and reading references
between conversion processes, rather like real LaTeX.

Bibliography
------------

Again, texutils.cc provides functions for reading in .bib files and
resolving references. The function OutputBibItem gives a generic way
of outputting bibliography items by 'faking' calls to OnMacro and
OnArgument, allowing the existing low-level client code to take care of
formatting.

Units
-----

Unit parsing code is in texutils.cc as ParseUnitArgument. It converts
units to points.
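
A conversion of this kind might look like the sketch below. The
function UnitToPointsSketch is hypothetical, and the factors (72
points per inch, 2.54 cm per inch) and the fallback to points for
unknown units are assumptions; ParseUnitArgument itself may use
different factors, units and return type.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical converter from a length such as "2.5cm" to points.
// Assumes 72 points per inch and 2.54 cm per inch. Unknown or missing
// units are treated as points.
double UnitToPointsSketch(const std::string& arg) {
    char* end = nullptr;
    const double value = std::strtod(arg.c_str(), &end);
    if (end == arg.c_str()) return 0.0;   // no leading number
    const std::string unit(end);          // whatever follows the number
    if (unit == "in") return value * 72.0;
    if (unit == "cm") return value * 72.0 / 2.54;
    if (unit == "mm") return value * 72.0 / 25.4;
    return value;                          // "pt" or unrecognised
}
```

For example, under these assumptions UnitToPointsSketch("1in") yields
72 and UnitToPointsSketch("12pt") yields 12.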

Common errors
-------------

1) Macro not found: \end{center} ...

Rewrite:

\begin{center}
{\large{\underline{A}}}
\end{center}

as:

\begin{center}
{\large \underline{A}}
\end{center}

2) Tables crash RTF. Set 'compatibility' to TRUE in the .ini file; also
check for \\ end-of-row characters on their own on a line, and insert
the correct number of ampersands for the number of columns. E.g.

hello & world\\
\\

becomes

hello & world\\
&\\

3) If list items indent erratically, try increasing
listItemIndent to give more space between label and following text.
A global replace of '\item [' with '\item[' may also be helpful to remove
unnecessary space before the item label.

4) Missing figure or section references: ensure all labels _directly_ follow captions
or sections (no intervening white space).
