<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>libxslt: An Extended Tutorial</title><meta name="generator" content="DocBook XSL Stylesheets V1.66.0"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="article" lang="en"><div class="titlepage"><div><div><h2 class="title"><a name="libxslt"></a>libxslt: An Extended Tutorial</h2></div><div><div class="author"><h3 class="author"><span class="firstname">Panos</span> <span class="surname">Louridas</span></h3></div></div><div><p class="copyright">Copyright &copy; 2004 Panagiotis Louridas</p></div><div><div class="legalnotice"><a name="id2839296"></a><p>Permission is hereby granted, free of charge, to
2  any person obtaining a copy of this software and associated
3  documentation files (the "Software"), to deal in the Software
4  without restriction, including without limitation the rights to use,
5  copy, modify, merge, publish, distribute, sublicense, and/or sell
6  copies of the Software, and to permit persons to whom the Software
7  is furnished to do so, subject to the following conditions:
8  </p><p>The above copyright notice and this permission notice shall be
9  included in all copies or substantial portions of the Software.
10  </p><p>THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
11  EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
12  MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
13  NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
14  LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
15  OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
16  WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.</p></div></div></div><hr></div><div class="toc"><p><b>Table of Contents</b></p><dl><dt><span class="sect1"><a href="#id2771767">Introduction</a></span></dt><dt><span class="sect1"><a href="#id2771862">Setting the Scene</a></span></dt><dt><span class="sect1"><a href="#id2799225">Program Start</a></span></dt><dt><span class="sect1"><a href="#id2799358">Arguments Collection</a></span></dt><dt><span class="sect1"><a href="#id2799396">Parsing</a></span></dt><dt><span class="sect1"><a href="#id2771038">File Processing</a></span></dt><dt><span class="sect1"><a href="#id2771153">*NIX Compiling and Linking</a></span></dt><dt><span class="sect1"><a href="#windows-build">MS-Windows Compiling and
17Linking</a></span></dt><dd><dl><dt><span class="sect2"><a href="#windows-ports-build">Building the Ports in
18MS-Windows</a></span></dt></dl></dd><dt><span class="sect1"><a href="#id2839739">zlib, iconv and All That</a></span></dt><dt><span class="sect1"><a href="#id2839841">The Complete Program</a></span></dt></dl></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id2771767"></a>Introduction</h2></div></div></div><p>The Extensible Stylesheet Language Transformations (XSLT)
19specification defines an XML template language for transforming XML
20documents. An XSLT engine reads an XSLT file and an XML document and
21transforms the document accordingly.</p><p>We want to perform a series of XSLT transformations to a series
22of documents. An obvious solution is to use the operating system's
23pipe mechanism and start a series of transformation processes, each
24one taking as input the output of the previous transformation. It
25would be interesting, though, and perhaps more efficient if we could
26do our job within a single process.</p><p>libxslt is a library for doing XSLT transformations. It is built
27on libxml, which is a library for handling XML documents. libxml and
28libxslt are used by the GNOME project. Although developed in the
29*NIX world, both libxml and libxslt have been
30ported to the MS-Windows platform. In principle an application using
31libxslt should be easily portable between the two systems. In
32practice, however, there arise various wrinkles. These do not have
33anything to do with libxml or libxslt per se, but rather with the
34different compilation and linking procedures of each system.</p><p>The presented solution is an extension of <a href="http://xmlsoft.org/XSLT/tutorial/libxslttutorial.html" target="_top">John
35Fleck's libxslt tutorial</a>, but the present tutorial tries to be
36self-contained. It develops a minimal libxslt application
37(libxslt_pipes) that can perform a series of transformations to a
38series of files in a pipe-like manner. An invocation might be:</p><p>
39  <b class="userinput"><tt>
40    libxslt_pipes --out results.xml foo.xsl bar.xsl doc1.xml doc2.xml
41  </tt></b>
</p><p>The <tt class="filename">foo.xsl</tt> stylesheet will be applied to
<tt class="filename">doc1.xml</tt> and the <tt class="filename">bar.xsl</tt>
stylesheet will be applied to the resulting document; then the two
stylesheets will be applied in the same sequence to
<tt class="filename">doc2.xml</tt>. The results are sent to
<tt class="filename">results.xml</tt> (if no output is specified they are
48sent to standard output).</p><p>The application is compiled in both *NIX
49systems and MS-Windows, where by *NIX systems we
50mean Linux, BSD, and other members of the
51family. The gcc suite is used in the *NIX platform
52and the Microsoft compiler and linker are used in the
53MS-Windows platform.</p></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id2771862"></a>Setting the Scene</h2></div></div></div><p>
We need to include the necessary header files:

</p><pre class="programlisting">
  
  #include &lt;stdio.h&gt;
  #include &lt;string.h&gt;
  #include &lt;stdlib.h&gt;
  
  #include &lt;libxslt/transform.h&gt;
  #include &lt;libxslt/xsltutils.h&gt;
  
</pre><p>
</p><p>The first group of include directives brings in headers of the
standard C library. The headers we need to make libxslt work are in the
second group. The <tt class="filename">transform.h</tt> header file
declares the API that does the bulk of the actual processing. The
<tt class="filename">xsltutils.h</tt> header file declares the API for some
generic utility functions of the XSLT engine, among them
saving to a file, which is what we need it for.</p><p>
73If our input files contain entities through external subsets, we need
74to tell libxslt to load them. The global variable
75<tt class="function">xmlLoadExtDtdDefaultValue</tt>, defined in
76<tt class="filename">libxml/globals.h</tt>, is responsible for that. As the
77variable is defined outside our program we must specify external
78linkage:
79  </p><pre class="programlisting">
    extern int xmlLoadExtDtdDefaultValue;
81  </pre><p>
82</p><p>
83The program is called from the command line. We anticipate that the
84user may not call it the right way, so we define a function for
85describing its usage:
86</p><pre class="programlisting">
  static void usage(const char *name) {
      printf("Usage: %s [options] stylesheet [stylesheet ...] file [file ...]\n",
          name);
      printf("      --out file: send output to file\n");
      printf("      --param name value: pass a (parameter,value) pair\n");
  }
93</pre><p>
94</p></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id2799225"></a>Program Start</h2></div></div></div><p>We need to define a few variables that are used throughout the
95program:
96</p><pre class="programlisting">
    int main(int argc, char **argv) {
        int arg_indx;
        const char *params[16 + 1];
        int params_indx = 0;
        int stylesheet_indx = 0;
        int file_indx = 0;
        int i, j, k;
        FILE *output_file = stdout;
        xsltStylesheetPtr *stylesheets =
            (xsltStylesheetPtr *) calloc(argc, sizeof(xsltStylesheetPtr));
        xmlDocPtr *files = (xmlDocPtr *) calloc(argc, sizeof(xmlDocPtr));
        int return_value = 0;
109</pre><p>
110</p><p>The <tt class="varname">arg_indx</tt> integer is an index used to
111iterate over the program arguments. The <tt class="varname">params</tt>
112string array is used to collect the XSLT parameters. In XSLT,
113additional information may be passed to the processor via
114parameters. The user of the program specifies these in key-value pairs
115in the command line following the <b class="userinput"><tt>--param</tt></b>
116command line argument. We accept up to 8 such key-value pairs, which
117we track with the <tt class="varname">params_indx</tt> integer. libxslt
118expects the parameters array to be null-terminated, so we have to
119allocate one extra place (16 + 1) for it. The
120<tt class="varname">file_indx</tt> is an index to iterate over the files to
121be processed. The <tt class="varname">i</tt>, <tt class="varname">j</tt>,
122<tt class="varname">k</tt> integers are additional indices for iteration
123purposes, and <tt class="varname">return_value</tt> is the value the program
124returns to the operating system. We expect the result of the
125transformation to be the standard output in most cases, but the user
126may wish otherwise via the <tt class="option">--out</tt> command line
127option, so we need to keep track of the situation with the
128<tt class="varname">output_file</tt> file pointer.</p><p>In libxslt, XSLT stylesheets are internally stored in
129<span class="structname">xsltStylesheet</span> structures; similarly, in
130libxml XML documents are stored in <span class="structname">xmlDoc</span>
131structures. <span class="type">xsltStylesheetPtr</span> and <span class="type">xmlDocPtr</span>
132are simply typedefs of pointers to them. The user may specify any
133number of stylesheets that will be applied to the documents one after
134the other. To save time we parse the stylesheets and the documents as
135we read them from the command line and keep the parsed representation
136of them. The parsed results are kept in arrays. These are dynamically
137allocated and sized to the number of arguments; this wastes some
space, but not much (the size of <span class="type">xsltStylesheetPtr</span> and
139<span class="type">xmlDocPtr</span> is the size of a pointer) and simplifies code
140later on. The array memory is allocated with
141<tt class="function">calloc</tt> to ensure contents are initialised to
142zero.
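</p><p>The sentinel convention this relies on can be illustrated without libxslt. The following is a minimal sketch, not part of the tutorial's program: <tt class="function">make_table</tt> and <tt class="function">count_entries</tt> are hypothetical helpers. Because <tt class="varname">argv[0]</tt> is the program name, at most <tt class="varname">argc - 1</tt> slots of an array of <tt class="varname">argc</tt> zeroed pointers can ever be filled, so at least one null entry always remains to stop a loop such as <tt class="varname">for (i = 0; files[i]; i++)</tt>. The sketch assumes, as is the case on mainstream platforms, that the all-zero bytes produced by <tt class="function">calloc</tt> read back as null pointers.</p>

```c
#include <stdio.h>
#include <stdlib.h>

/* Count the leading non-NULL entries of a pointer array that is
 * terminated by a NULL sentinel. */
static size_t count_entries(void **array) {
    size_t n = 0;
    while (array[n] != NULL)
        n++;
    return n;
}

/* Hypothetical helper: allocate `slots` zeroed pointers and copy the
 * first `used` items into them. Returns NULL if the allocation fails
 * or if no slot would be left over for the NULL sentinel. */
static void **make_table(size_t slots, void **items, size_t used) {
    void **table;
    size_t i;

    if (used >= slots)
        return NULL;                 /* no room for the sentinel */
    table = calloc(slots, sizeof *table);
    if (table == NULL)
        return NULL;
    for (i = 0; i < used; i++)
        table[i] = items[i];
    return table;                    /* table[used] is still zero */
}
```

<p>Calling <tt class="function">make_table</tt> with five slots and three items yields a table for which <tt class="function">count_entries</tt> returns 3, while a request that would leave no room for the sentinel is refused.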
143</p></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id2799358"></a>Arguments Collection</h2></div></div></div><p>If the program gets no arguments at all, we print the usage
144description, set the program return value to 1 and exit. Instead of
returning directly we go (literally) to the end of the program text
146where some housekeeping takes place.</p><p>
147</p><pre class="programlisting">
  
    if (argc &lt;= 1) {
        usage(argv[0]);
        return_value = 1;
        goto finish;
    }
        
    /* Collect arguments */
    for (arg_indx = 1; arg_indx &lt; argc; arg_indx++) {
        if (argv[arg_indx][0] != '-')
            break;
        if ((!strcmp(argv[arg_indx], "-param"))
                || (!strcmp(argv[arg_indx], "--param"))) {
            /* A name and a value must follow the option */
            if (arg_indx + 2 &gt;= argc) {
                usage(argv[0]);
                return_value = 1;
                goto finish;
            }
            /* Check for room before storing, so that 8 pairs fit */
            if (params_indx &gt;= 16) {
                fprintf(stderr, "too many params\n");
                return_value = 1;
                goto finish;
            }
            arg_indx++;
            params[params_indx++] = argv[arg_indx++];
            params[params_indx++] = argv[arg_indx];
        }  else if ((!strcmp(argv[arg_indx], "-o"))
                || (!strcmp(argv[arg_indx], "--out"))) {
            /* A file name must follow the option */
            if (arg_indx + 1 &gt;= argc) {
                usage(argv[0]);
                return_value = 1;
                goto finish;
            }
            arg_indx++;
            output_file = fopen(argv[arg_indx], "w");
            if (output_file == NULL) {
                fprintf(stderr, "cannot open %s\n", argv[arg_indx]);
                return_value = 1;
                goto finish;
            }
        } else {
            fprintf(stderr, "Unknown option %s\n", argv[arg_indx]);
            usage(argv[0]);
            return_value = 1;
            goto finish;
        }
    }
    params[params_indx] = 0;
    
182</pre><p>
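Opening the file named by <tt class="option">--out</tt> can fail, and a helper can make sure that standard output is never closed like an ordinary file. A small sketch covering both points follows; <tt class="function">open_output</tt> and <tt class="function">close_output</tt> are hypothetical names, not part of the tutorial's program.

```c
#include <stdio.h>

/* Hypothetical helper: a NULL path selects standard output.
 * Returns NULL if the named file cannot be opened for writing. */
static FILE *open_output(const char *path) {
    if (path == NULL)
        return stdout;
    return fopen(path, "w");
}

/* Close the stream unless it is NULL or standard output.
 * Returns 0 on success, EOF if fclose reports an error. */
static int close_output(FILE *stream) {
    if (stream == NULL || stream == stdout)
        return 0;
    return fclose(stream);
}
```

A caller would test the result of <tt class="function">open_output</tt> for null before transforming anything, and call <tt class="function">close_output</tt> once all results have been written.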
183</p><p>If the user passes arguments we have to collect them. This is a
184matter of iterating over the program argument list while we encounter
185arguments starting with a dash. The XSLT parameters are put into the
186<tt class="varname">params</tt> array and the <tt class="varname">output_file</tt>
187is set to the user request, if any. After processing all the parameter
188key-value pairs we set the last element of the <tt class="varname">params</tt>
189array to null.
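</p><p>The collection step can also be isolated in a function, which makes its error handling easy to exercise on its own. The sketch below is one possible arrangement, not the tutorial's code: <span class="structname">struct options</span> and <tt class="function">collect_options</tt> are hypothetical names. It checks that the values an option needs are actually present on the command line and that there is room for another pair before storing it.</p>

```c
#include <stdio.h>
#include <string.h>

#define MAX_PAIRS 8   /* the same limit the tutorial uses */

/* Hypothetical result record; not part of libxslt. */
struct options {
    const char *params[2 * MAX_PAIRS + 1]; /* name, value, ..., NULL */
    int params_count;                      /* entries used in params */
    const char *out_path;                  /* NULL: standard output */
    int first_file;                        /* index of first non-option */
};

/* Returns 0 on success, -1 on a malformed command line. */
static int collect_options(int argc, char **argv, struct options *opts) {
    int i;

    opts->params_count = 0;
    opts->out_path = NULL;

    for (i = 1; i < argc && argv[i][0] == '-'; i++) {
        if (!strcmp(argv[i], "-param") || !strcmp(argv[i], "--param")) {
            if (i + 2 >= argc)
                return -1;                 /* name or value missing */
            if (opts->params_count >= 2 * MAX_PAIRS)
                return -1;                 /* too many pairs */
            opts->params[opts->params_count++] = argv[i + 1];
            opts->params[opts->params_count++] = argv[i + 2];
            i += 2;
        } else if (!strcmp(argv[i], "-o") || !strcmp(argv[i], "--out")) {
            if (i + 1 >= argc)
                return -1;                 /* file name missing */
            opts->out_path = argv[++i];
        } else {
            return -1;                     /* unknown option */
        }
    }
    opts->params[opts->params_count] = NULL; /* terminator for libxslt */
    opts->first_file = i;
    return 0;
}
```

<p>On success, <tt class="varname">first_file</tt> is the index at which the stylesheets and documents begin, which is where the parsing loop of the next section would take over.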
190</p></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id2799396"></a>Parsing</h2></div></div></div><p>The rest of the argument list is taken to be stylesheets and
191files to be transformed. Stylesheets are identified by their suffix,
192which is expected to be xsl (case sensitive). All other files are
193assumed to be XML documents, regardless of suffix.</p><p>
194</p><pre class="programlisting">
  
    /* Collect and parse stylesheets and files to be transformed */
    for (; arg_indx &lt; argc; arg_indx++) {
        char *argument =
            (char *) malloc(sizeof(char) * (strlen(argv[arg_indx]) + 1));
        strcpy(argument, argv[arg_indx]);
        if (strtok(argument, ".")) {
            char *suffix = strtok(0, ".");
            if (suffix &amp;&amp; !strcmp(suffix, "xsl")) {
                stylesheets[stylesheet_indx++] =
                    xsltParseStylesheetFile((const xmlChar *)argv[arg_indx]);
            } else {
                files[file_indx++] = xmlParseFile(argv[arg_indx]);
            }
        } else {
            files[file_indx++] = xmlParseFile(argv[arg_indx]);
        }
        free(argument);
    }
  
215</pre><p>
216</p><p>Stylesheets are parsed using the
217<tt class="function">xsltParseStylesheetFile</tt>
218function. <tt class="function">xsltParseStylesheetFile</tt> takes as
219argument a pointer to an <span class="type">xmlChar</span>, a typedef of an
220unsigned char; in effect, the filename of the stylesheet. The
221resulting <span class="type">xsltStylesheetPtr</span> is placed in the
222<tt class="varname">stylesheets</tt> array. In the same vein, XML files are
223parsed using the <tt class="function">xmlParseFile</tt> function that takes
224as argument the file's name; the resulting <span class="type">xmlDocPtr</span> is
225placed in the <tt class="varname">files</tt> array.
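</p><p>An alternative way to recognise stylesheets, shown here only as a sketch, is to look at the text after the last dot with <tt class="function">strrchr</tt>. The <tt class="function">has_xsl_suffix</tt> helper is hypothetical. The <tt class="function">strtok</tt> approach above examines the text between the first and second dots, so it treats a name such as <tt class="filename">my.style.xsl</tt> as a document; this version classifies such a name as a stylesheet. The comparison is still case sensitive.</p>

```c
#include <string.h>

/* Returns 1 if `path` ends in the exact suffix ".xsl", 0 otherwise.
 * Only the text from the last dot onwards is examined, and no copy
 * of the string is needed. */
static int has_xsl_suffix(const char *path) {
    const char *dot = strrchr(path, '.');
    return dot != NULL && strcmp(dot, ".xsl") == 0;
}
```

<p>Since the helper does not modify its argument, the temporary copy made with <tt class="function">malloc</tt> for the benefit of <tt class="function">strtok</tt> would no longer be required.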
226</p></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id2771038"></a>File Processing</h2></div></div></div><p>All stylesheets are applied to each file one after the
227other. Stylesheets are applied with the
228<tt class="function">xsltApplyStylesheet</tt> function that takes as
229argument the stylesheet to be applied, the file to be transformed and
230any parameters we have collected. The in-memory representation of an
231XML document takes space, which we free using the
232<tt class="function">xmlFreeDoc</tt> function. The file is then saved to the
233specified output.</p><p>
234</p><pre class="programlisting">
  
    /* Process files */
    for (i = 0; files[i]; i++) {
        /* doc is the current input, res the most recent result */
        xmlDocPtr doc = files[i];
        xmlDocPtr res = doc;
        for (j = 0; stylesheets[j]; j++) {
            res = xsltApplyStylesheet(stylesheets[j], doc, params);
            xmlFreeDoc(doc);
            doc = res;
        }

        if (stylesheets[0]) {
            xsltSaveResultToFile(output_file, res, stylesheets[j-1]);
        } else {
            xmlDocDump(output_file, res);
        }
        xmlFreeDoc(res);
    }

    fclose(output_file);

    for (k = 0; stylesheets[k]; k++) {
        xsltFreeStylesheet(stylesheets[k]);
    }

    xsltCleanupGlobals();
    xmlCleanupParser();

 finish:
    free(stylesheets);
    free(files);
    return(return_value);
    
268</pre><p>
269</p><p>To output an XML document we have in memory we use the
<tt class="function">xsltSaveResultToFile</tt> function, where we specify
271the destination, the document and the stylesheet that has been applied
272to it. The stylesheet is required so that output-related information
273contained in the stylesheet, such as the encoding to be used, is used
274in output. If no transformation has taken place, which will happen
275when the user specifies no stylesheets at all in the command line, we
276use the <tt class="function">xmlDocDump</tt> libxml function that saves the
277source document to the file without further ado.</p><p>As parsed stylesheets take up space in memory, we take care to
278free that memory after use with a call to
<tt class="function">xsltFreeStylesheet</tt>. When all work is done, we
clean up all global variables used by the XSLT library using
<tt class="function">xsltCleanupGlobals</tt>. Likewise, all global memory
allocated for the XML parser is reclaimed by a call to
<tt class="function">xmlCleanupParser</tt>. Before returning we deallocate
the memory allocated for holding the pointers to the XML documents
285and stylesheets.</p></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id2771153"></a>*NIX Compiling and Linking</h2></div></div></div><p>Compiling and linking in a *NIX environment
286is easy, as the required libraries are almost certain to be already in
287place (remember that libxml and libxslt are used by the GNOME project,
288so they are present in most installations). The program can be
289dynamically linked so that its footprint is minimized, or statically
290linked, so that it stands by itself, carrying all required code.</p><p>For dynamic linking the following one liner will do:</p><p>
<b class="userinput"><tt>gcc -o libxslt_pipes -Wall -I/usr/include/libxml2 libxslt_pipes.c
-L/usr/lib -lxslt -lxml2</tt></b>
293</p><p>We assume that the necessary header files are in <tt class="filename">/usr/include/libxml2</tt> and that the
294required libraries (<tt class="filename">libxslt.so</tt>,
295<tt class="filename">libxml2.so</tt>) are in <tt class="filename">/usr/lib</tt>.</p><p>In general, a program may need to link to additional libraries,
296depending on the processing it actually performs. A good way to start
297is to use the <span><b class="command">xslt-config</b></span> script. The
298<tt class="option">--help</tt> option displays usage
299information. Running</p><p>
300  <b class="userinput"><tt>
301    xslt-config --cflags
302  </tt></b>
303</p><p>we get compile flags, while running</p><p>
304  <b class="userinput"><tt>
305    xslt-config --libs
306  </tt></b>
307</p><p>we get the library settings for the linker.</p><p>For static linking we must list more libraries than we did for
dynamic linking, as the libraries on which the libxml and libxslt
309libraries depend are also needed. Using <span><b class="command">xslt-config</b></span>
310on a particular installation we create the following one-liner:</p><p>
311<b class="userinput"><tt>
312gcc -o libxslt_pipes -Wall -I/usr/include/libxml2 libxslt_pipes.c
313-static -L/usr/lib -lxslt -lxml2 -lz -lpthread -lm
314</tt></b>
315</p><p>If we get warnings to the effect that some function in
316statically linked applications requires at runtime the shared
317libraries used from the glibc version used for linking, that means
318that the binary is not completely static. Although we statically
319linked against the GNU C runtime library glibc, glibc uses external
libraries to perform some of its functions. Libraries of the same version
must be present on the system where we want the application to run. One way
to avoid this is to use an alternative C runtime, for example <a href="http://www.uclibc.org" target="_top">uClibc</a>, which requires obtaining
323and building a uClibc toolchain first (if the reason for trying to get
324a statically linked version of the program is to embed it somewhere,
325using uClibc might be a good idea anyway).
326</p></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="windows-build"></a>MS-Windows Compiling and
327Linking</h2></div></div></div><p>Compiling and linking in MS-Windows requires
328some attention. First, the MS-Windows ports must be
329downloaded and installed in the programming workstation. The ports are
330available in <a href="http://www.zlatkovic.com/libxml.en.html" target="_top">Igor
331Zlatkovi&#263;'s site</a>. We need the ports for iconv, zlib, libxml,
332and libxslt. In contrast to *NIX environments, we
333cannot assume that the libraries needed will be present in other
334computers where the program will be used. One solution is to
335distribute the program along with the necessary dynamic
336libraries. Another solution is to statically link the program so that
337only a single executable file will have to be distributed.</p><p>We assume that we have decompressed the downloaded ports and
338have placed the required contents of their <tt class="filename">include</tt> directories in an <tt class="filename">include</tt> directory in our file system. The
339required contents include everything apart from the <tt class="filename">libexslt</tt> directory of the libxslt port,
as we are not using EXSLT (an initiative to provide extensions to
341XSLT) in this project. In order to compile the program we have to make
342sure that all necessary header files are included. When using the
343Microsoft compiler this translates to adding the required
344<tt class="option">/I</tt> switches in the command line. If using a Visual
345Studio product the same effect is attained by specifying additional
346include directories in the compilation options. In the end, if the
347headers have been copied in <tt class="filename">C:\include</tt> the command line must contain
<tt class="option">/I"C:\include" /I"C:\include\libxslt"
349/I"C:\include\libxml"</tt>.</p><p>This being a C program, it needs to be compiled against an
350implementation of the C libraries. Microsoft provides various
351implementations. The ports, however, have been compiled against the
<tt class="filename">msvcrt.dll</tt> implementation, so it is wise to use
the same runtime in our project to avoid
unexpected runtime crashes. The <tt class="filename">msvcrt.dll</tt> is a
multi-threaded implementation and is specified by giving
<tt class="option">/MD</tt> as a compiler option. Unfortunately, the
correspondence between the <tt class="option">/MD</tt> switch and
<tt class="filename">msvcrt.dll</tt> breaks after version 6 of the
Microsoft compiler. In version 7 and later (i.e., Visual Studio .NET),
<tt class="option">/MD</tt> links against a different DLL; in version 7.1
this is <tt class="filename">msvcr71.dll</tt>. The end result of this bit
of esoterica is that if you dynamically link your application against
the ports with a compiler whose version is greater than 6, your program is
likely to crash unexpectedly. Alternatively, you may wish to compile
iconv, zlib, libxml and libxslt yourself, using the new runtime
366library. This is not a tall order, and some details are given
367<a href="#windows-ports-build" title="Building the Ports in
368MS-Windows">below</a>.</p><p>There are three kinds of libraries in MS-Windows. Dynamically
Linked Libraries (DLLs), like the <tt class="filename">msvcrt.dll</tt> we met
370above, are used for dynamic linking; an application links to them at
371runtime, so the application does not include the code contained in
372them. Static libraries are used for static linking; an application
373adds the libraries' code to its own code at link time. Import
374libraries are used when building an application that uses DLLs. For
375the application to be built, the linker must somehow find the
376definitions of the functions that will be provided in runtime by the
377DLLs, otherwise it will complain about unresolved references. Import
378libraries contain function stubs that, for each DLL function we want
379to call, know where to look for it in the DLL. In essence, in order to
380use a DLL we must link against its corresponding import library. DLLs
381have a <tt class="filename">.dll</tt> suffix; static and import libraries
382both have a <tt class="filename">.lib</tt> suffix. In the MS-Windows ports
383of libxml and libxslt static libraries are distinguished by their name
384ending in <tt class="filename">_a.lib</tt>, while in the zlib port the
385import library is <tt class="filename">zdll.lib</tt> and the static library
386is <tt class="filename">zlib.lib</tt>. In what follows we assume we have a
387<tt class="filename">lib</tt> directory in our filesystem
388where we place the libraries we need for linking.</p><p>If we want to link dynamically we must make sure the <tt class="filename">lib</tt> directory contains
389<tt class="filename">iconv.lib</tt>, <tt class="filename">libxslt.lib</tt>,
390<tt class="filename">libxml2.lib</tt>, and
391<tt class="filename">zdll.lib</tt>. When using the Microsoft linker this
392translates to adding the required <tt class="option">/LIBPATH</tt>
393switch and the necessary libraries in the command line. In Visual
394Studio we must specify an additional library directory for <tt class="filename">lib</tt> and put the necessary libraries in
395the additional dependencies. In the end, the command line must include
396<tt class="option">/LIBPATH:"C:\lib" "lib\iconv.lib" "lib\libxslt.lib"
397"lib\libxml2.lib" "lib\zdll.lib"</tt>, provided the libraries'
398directory is <tt class="filename">C:\lib</tt>. In order
399for the resulting executable to run, the ports DLLs must be present;
400one way is to place all DLLs contained in the ports in the home
401directory of our application, and make sure they are distributed
402together.</p><p>If we want to link statically we must make sure the <tt class="filename">lib</tt> directory contains
403<tt class="filename">iconv_a.lib</tt>, <tt class="filename">libxslt_a.lib</tt>,
404<tt class="filename">libxml2_a.lib</tt>, and
405<tt class="filename">zlib.lib</tt>. Adding <tt class="filename">lib</tt> as a library directory and putting
406the necessary libraries in the additional dependencies, we get a
407command line that should include <tt class="option">/LIBPATH:"C:\lib"
408"lib\iconv_a.lib" "lib\libxslt_a.lib" "lib\libxml2_a.lib"
409"lib\zlib.lib"</tt>. The resulting executable is much bigger
410than if we linked dynamically; it is, however, self-contained and can
411be distributed more easily, in theory at least. In practice, however,
412the executable is not completely static. We saw that the ports are
413compiled against <tt class="filename">msvcrt.dll</tt>, so the program does
require that DLL at runtime. Moreover, when using Microsoft developer
tools with a version number greater than 6, our own code no longer
uses <tt class="filename">msvcrt.dll</tt> but another runtime,
such as <tt class="filename">msvcr71.dll</tt>, and we then need that DLL. In
contrast to <tt class="filename">msvcrt.dll</tt> it may not be present on
419the target computer, so we may have to copy it along.</p><div class="sect2" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="windows-ports-build"></a>Building the Ports in
MS-Windows</h3></div></div></div><p>The source code of the ports is readily available on the web
from the sites of the individual ports. Each port can be built without
422problems in an MS-Windows environment using Microsoft development
423tools.  The necessary command line tools (compiler, linker,
424<span><b class="command">nmake</b></span>) must be available. This means running a
425batch file called <span><b class="command">vcvars32.bat</b></span> that comes with
426Visual Studio (its exact location in the directory tree may vary
427depending on the version of Visual Studio, but a file search will find
428it anyway). Makefiles for the Microsoft tools are found in all
429ports. They are distinguished by their suffix, e.g.,
430<tt class="filename">Makefile.msvc</tt> or
431<tt class="filename">Makefile.msc</tt>. To build zlib it suffices to run
432<span><b class="command">nmake</b></span> against <tt class="filename">Makefile.msc</tt>
433(i.e., with the <tt class="option">/F</tt> option); similarly, to build
434<tt class="filename">iconv</tt> it suffices to run <span><b class="command">nmake</b></span>
435against <tt class="filename">Makefile.msvc</tt>. Building libxml and
436libxslt requires an extra configuration step; we must run the
437<tt class="filename">configure.js</tt> configuration script with the
438<span><b class="command">cscript</b></span> command. <tt class="filename">configure.js</tt>
439is found in the <tt class="filename">win32</tt> directory
440in the distributions. It is written in JScript, Microsoft's
441implementation of the ECMA 262 language specification (ECMAScript
Edition 3), a JavaScript offspring. The configuration script takes a
443number of parameters detailing our environment and needs;
444<b class="userinput"><tt>cscript configure.js help</tt></b> documents
445them.</p><p>It is wise to read all documentation files in the source
446distributions before starting; moreover, pay attention to the
447dependencies between the ports. If we configure libxml and libxslt to
448use iconv and zlib we must build these two first and make sure their
449headers and libraries can be found by the compiler and the
450linker when building libxml and libxslt.</p></div></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id2839739"></a>zlib, iconv and All That</h2></div></div></div><p>We saw that libxml and libxslt depend on various other
451libraries, for instance zlib, iconv, and so forth. Taking a look into
452them gives us clues on the capabilities of libxml and libxslt.</p><p><a href="http://www.zlib.org" target="_top">zlib</a> is a free general
453purpose lossless data compression library. It is a venerable
454workhorse; more than <a href="http://www.gzip.org/zlib/apps.html" target="_top">500 applications</a>
455(both commercial and open source) seem to use the library. libxml uses
456zlib so that it can read from or write to compressed files
457directly. The <tt class="function">xmlParseFile</tt> function can
458transparently parse a compressed document to produce an
459<span class="structname">xmlDoc</span>. If we want to create a compressed
460document with libxml we can use an
461<span class="structname">xmlTextWriterPtr</span> (obtained through
462<tt class="function">xmlNewTextWriterDoc</tt>), or another related
463structure from <tt class="filename">libxml/xmlwriter.h</tt>, with
464compression enabled.</p><p>XML allows documents to use a variety of different character
465encodings. <a href="http://www.gnu.org/software/libiconv" target="_top">iconv</a> is a free
466library for converting between different character encodings.  libxml
467provides a set of default converters for some encodings: UTF-8, UTF-16
468(little endian and big endian), ISO-8859-1, ASCII, and HTML (a
469specific handler for the conversion of UTF-8 to ASCII with HTML
470predefined entities like &amp;copy; for the copyright sign). However,
471when compiled with iconv support, libxml and libxslt can handle the
472full range of encodings provided by iconv; these should cover most
473needs.</p><p>libxml and libxslt can be used in multi-threaded
474applications. In MS-Windows they are linked against
475<tt class="filename">MSVCRT.DLL</tt> (or one of its descendants, as we saw
476<a href="#windows-build" title="MS-Windows Compiling and
477Linking">above</a>). In *NIX the pthreads
478(POSIX threads) library is used.</p></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id2839841"></a>The Complete Program</h2></div></div></div><p>
The complete program listing is given below. The program is also
<a href="libxslt_pipes.c" target="_top">available online</a>.
</p><p>
</p><pre class="programlisting">
/*
 * libxslt_pipes.c: a program for performing a series of XSLT
 * transformations
 *
 * Written by Panos Louridas, based on libxslt_tutorial.c by John Fleck.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 *
 */

#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include &lt;stdlib.h&gt;

#include &lt;libxslt/transform.h&gt;
#include &lt;libxslt/xsltutils.h&gt;

extern int xmlLoadExtDtdDefaultValue;

static void usage(const char *name) {
    printf("Usage: %s [options] stylesheet [stylesheet ...] file [file ...]\n",
            name);
    printf("      --out file: send output to file\n");
    printf("      --param name value: pass a (parameter,value) pair\n");
}

int main(int argc, char **argv) {
    int arg_indx;
    const char *params[16 + 1];
    int params_indx = 0;
    int stylesheet_indx = 0;
    int file_indx = 0;
    int i, j, k;
    FILE *output_file = stdout;
    xsltStylesheetPtr *stylesheets =
        (xsltStylesheetPtr *) calloc(argc, sizeof(xsltStylesheetPtr));
    xmlDocPtr *files = (xmlDocPtr *) calloc(argc, sizeof(xmlDocPtr));
    xmlDocPtr doc, res;
    int return_value = 0;

    if (argc &lt;= 1) {
        usage(argv[0]);
        return_value = 1;
        goto finish;
    }

    /* Collect arguments */
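    for (arg_indx = 1; arg_indx &lt; argc; arg_indx++) {
        if (argv[arg_indx][0] != '-')
            break;
        if ((!strcmp(argv[arg_indx], "-param"))
                || (!strcmp(argv[arg_indx], "--param"))) {
            /* The option must be followed by a name and a value */
            if (arg_indx + 2 &gt;= argc) {
                fprintf(stderr, "%s requires a name and a value\n",
                        argv[arg_indx]);
                return_value = 1;
                goto finish;
            }
            arg_indx++;
            params[params_indx++] = argv[arg_indx++];
            params[params_indx++] = argv[arg_indx];
            if (params_indx &gt;= 16) {
                fprintf(stderr, "too many params\n");
                return_value = 1;
                goto finish;
            }
        }  else if ((!strcmp(argv[arg_indx], "-o"))
                || (!strcmp(argv[arg_indx], "--out"))) {
            /* The option must be followed by a file name */
            if (arg_indx + 1 &gt;= argc) {
                fprintf(stderr, "%s requires a file name\n", argv[arg_indx]);
                return_value = 1;
                goto finish;
            }
            arg_indx++;
            output_file = fopen(argv[arg_indx], "w");
            if (output_file == NULL) {
                fprintf(stderr, "cannot open %s for writing\n",
                        argv[arg_indx]);
                return_value = 1;
                goto finish;
            }
        } else {
            fprintf(stderr, "Unknown option %s\n", argv[arg_indx]);
            usage(argv[0]);
            return_value = 1;
            goto finish;
        }
    }
    /* The parameter array passed to libxslt must be NULL-terminated */
    params[params_indx] = 0;

    /* Collect and parse stylesheets and files to be transformed */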
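    /* Set the parser defaults first, so that they apply to the files
       parsed in the loop below */
    xmlSubstituteEntitiesDefault(1);
    xmlLoadExtDtdDefaultValue = 1;

    for (; arg_indx &lt; argc; arg_indx++) {
        int is_stylesheet = 0;
        char *argument =
            (char *) malloc(sizeof(char) * (strlen(argv[arg_indx]) + 1));
        if (argument == NULL) {
            fprintf(stderr, "out of memory\n");
            return_value = 1;
            goto finish;
        }
        /* strtok modifies its input, so work on a copy of the argument */
        strcpy(argument, argv[arg_indx]);
        if (strtok(argument, ".")) {
            char *suffix = strtok(0, ".");
            is_stylesheet = (suffix &amp;&amp; !strcmp(suffix, "xsl"));
        }
        free(argument);

        /* A NULL entry would end the NULL-terminated arrays early,
           so stop if a stylesheet or a file cannot be parsed */
        if (is_stylesheet) {
            stylesheets[stylesheet_indx] =
                xsltParseStylesheetFile((const xmlChar *)argv[arg_indx]);
            if (stylesheets[stylesheet_indx++] == NULL) {
                fprintf(stderr, "cannot parse %s\n", argv[arg_indx]);
                return_value = 1;
                goto finish;
            }
        } else {
            files[file_indx] = xmlParseFile(argv[arg_indx]);
            if (files[file_indx++] == NULL) {
                fprintf(stderr, "cannot parse %s\n", argv[arg_indx]);
                return_value = 1;
                goto finish;
            }
        }
    }

    /* Process files */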
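    for (i = 0; files[i]; i++) {
        doc = files[i];
        res = doc;
        /* Feed the result of each transformation to the next stylesheet */
        for (j = 0; stylesheets[j]; j++) {
            res = xsltApplyStylesheet(stylesheets[j], doc, params);
            xmlFreeDoc(doc);
            doc = res;
            /* Stop the pipeline for this file if a transformation fails */
            if (res == NULL) {
                fprintf(stderr, "transformation %d failed for file %d\n",
                        j + 1, i + 1);
                return_value = 1;
                break;
            }
        }

        /* Nothing to save if the pipeline failed */
        if (res == NULL)
            continue;

        if (stylesheets[0]) {
            xsltSaveResultToFile(output_file, res, stylesheets[j-1]);
        } else {
            xmlDocDump(output_file, res);
        }
        xmlFreeDoc(res);
    }

    fclose(output_file);

    for (k = 0; stylesheets[k]; k++) {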
        xsltFreeStylesheet(stylesheets[k]);
    }

    xsltCleanupGlobals();
    xmlCleanupParser();

 finish:
    free(stylesheets);
    free(files);
    return return_value;
}

</pre><p>
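The listing above decides whether an argument is a stylesheet by looking
at the second token that <tt class="function">strtok</tt> produces, so a name
with more than one dot, such as <tt class="filename">my.style.xsl</tt>, would
be treated as an input file. The following is a minimal sketch of an
alternative test based on <tt class="function">strrchr</tt>, which looks only
at the text after the last dot. The helper name
<tt class="function">has_xsl_suffix</tt> is hypothetical and is not part of
<tt class="filename">libxslt_pipes.c</tt>; the sketch also assumes that
<tt class="literal">/</tt> is the only path separator.
</p><pre class="programlisting">
#include &lt;string.h&gt;

/*
 * has_xsl_suffix is a hypothetical helper, not part of libxslt_pipes.c.
 * It returns 1 when the text after the last '.' in name is "xsl",
 * and 0 otherwise.
 */
static int has_xsl_suffix(const char *name) {
    const char *dot = strrchr(name, '.');   /* last '.' or NULL */
    const char *slash = strrchr(name, '/'); /* last '/' or NULL */

    /* A dot inside a directory name (e.g. "conf.d/input") is not a suffix */
    if (dot == NULL || (slash != NULL &amp;&amp; dot &lt; slash))
        return 0;
    return strcmp(dot + 1, "xsl") == 0;
}
</pre><p>
In the parsing loop, a call such as
<tt class="function">has_xsl_suffix(argv[arg_indx])</tt> could then stand in
for the <tt class="function">strtok</tt> test, which would also remove the
need for the temporary copy of the argument.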
</p></div></div></body></html>