<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>libxslt: An Extended Tutorial</title><meta name="generator" content="DocBook XSL Stylesheets V1.66.0"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="article" lang="en"><div class="titlepage"><div><div><h2 class="title"><a name="libxslt"></a>libxslt: An Extended Tutorial</h2></div><div><div class="author"><h3 class="author"><span class="firstname">Panos</span> <span class="surname">Louridas</span></h3></div></div><div><p class="copyright">Copyright © 2004 Panagiotis Louridas</p></div><div><div class="legalnotice"><a name="id2839296"></a><p>Permission is hereby granted, free of charge, to
any person obtaining a copy of this software and associated
documentation files (the "Software"), to deal in the Software
without restriction, including without limitation the rights to use,
copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software
is furnished to do so, subject to the following conditions:
</p><p>The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
</p><p>THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.</p></div></div></div><hr></div><div class="toc"><p><b>Table of Contents</b></p><dl><dt><span class="sect1"><a href="#id2771767">Introduction</a></span></dt><dt><span class="sect1"><a href="#id2771862">Setting the Scene</a></span></dt><dt><span class="sect1"><a href="#id2799225">Program Start</a></span></dt><dt><span class="sect1"><a href="#id2799358">Arguments Collection</a></span></dt><dt><span class="sect1"><a href="#id2799396">Parsing</a></span></dt><dt><span class="sect1"><a href="#id2771038">File Processing</a></span></dt><dt><span class="sect1"><a href="#id2771153">*NIX Compiling and Linking</a></span></dt><dt><span class="sect1"><a href="#windows-build">MS-Windows Compiling and Linking</a></span></dt><dd><dl><dt><span class="sect2"><a href="#windows-ports-build">Building the Ports in MS-Windows</a></span></dt></dl></dd><dt><span class="sect1"><a href="#id2839739">zlib, iconv and All That</a></span></dt><dt><span class="sect1"><a href="#id2839841">The Complete Program</a></span></dt></dl></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id2771767"></a>Introduction</h2></div></div></div><p>The Extensible Stylesheet Language Transformations (XSLT)
specification defines an XML template language for transforming XML
documents. An XSLT engine reads an XSLT file and an XML document and
transforms the document accordingly.</p><p>We want to perform a series of XSLT transformations on a series
of documents. An obvious solution is to use the operating system's
pipe mechanism and start a series of transformation processes, each
one taking as input the output of the previous transformation.
It
would be interesting, though, and perhaps more efficient if we could
do our job within a single process.</p><p>libxslt is a library for doing XSLT transformations. It is built
on libxml, which is a library for handling XML documents. libxml and
libxslt are used by the GNOME project. Although developed in the
*NIX world, both libxml and libxslt have been
ported to the MS-Windows platform. In principle an application using
libxslt should be easily portable between the two systems. In
practice, however, various wrinkles arise. These do not have
anything to do with libxml or libxslt per se, but rather with the
different compilation and linking procedures of each system.</p><p>The presented solution is an extension of <a href="http://xmlsoft.org/XSLT/tutorial/libxslttutorial.html" target="_top">John
Fleck's libxslt tutorial</a>, but the present tutorial tries to be
self-contained. It develops a minimal libxslt application
(libxslt_pipes) that can perform a series of transformations on a
series of files in a pipe-like manner. An invocation might be:</p><p>
  <b class="userinput"><tt>
    libxslt_pipes --out results.xml foo.xsl bar.xsl doc1.xml doc2.xml
  </tt></b>
</p><p>The <tt class="filename">foo.xsl</tt> stylesheet will be applied to
<tt class="filename">doc1.xml</tt> and the <tt class="filename">bar.xsl</tt>
stylesheet will be applied to the resulting document; then the two
stylesheets will be applied in the same sequence to
<tt class="filename">doc2.xml</tt>. The results are sent to
<tt class="filename">results.xml</tt> (if no output is specified they are
sent to standard output).</p><p>The application is compiled on both *NIX
systems and MS-Windows, where by *NIX systems we
mean Linux, BSD, and other members of the
family.
The gcc suite is used on the *NIX platform
and the Microsoft compiler and linker are used on the
MS-Windows platform.</p></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id2771862"></a>Setting the Scene</h2></div></div></div><p>
We need to include the necessary header files:

</p><pre class="programlisting">

  #include <stdio.h>
  #include <string.h>
  #include <stdlib.h>

  #include <libxslt/transform.h>
  #include <libxslt/xsltutils.h>

</pre><p>
</p><p>The first group of include directives pulls in headers of the
standard C library. The headers we need to make libxslt work are in the
second group. The <tt class="filename">transform.h</tt> header file
declares the API that does the bulk of the actual processing. The
<tt class="filename">xsltutils.h</tt> header file declares the API for some
generic utility functions of the XSLT engine; among other things,
saving to a file, which is what we need it for.</p><p>
If our input files use entities defined in external subsets, we need
to tell libxslt to load them. The global variable
<tt class="function">xmlLoadExtDtdDefaultValue</tt>, defined in
<tt class="filename">libxml/globals.h</tt>, is responsible for that. As the
variable is defined outside our program we must specify external
linkage:
  </p><pre class="programlisting">
    extern int xmlLoadExtDtdDefaultValue;
  </pre><p>
</p><p>
The program is called from the command line. We anticipate that the
user may not call it the right way, so we define a function for
describing its usage:
</p><pre class="programlisting">
  static void usage(const char *name) {
      printf("Usage: %s [options] stylesheet [stylesheet ...] file [file ...]\n",
             name);
      printf("      --out file: send output to file\n");
      printf("      --param name value: pass a (parameter,value) pair\n");
  }
</pre><p>
</p></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id2799225"></a>Program Start</h2></div></div></div><p>We need to define a few variables that are used throughout the
program:
</p><pre class="programlisting">
  int main(int argc, char **argv) {
      int arg_indx;
      const char *params[16 + 1];
      int params_indx = 0;
      int stylesheet_indx = 0;
      int file_indx = 0;
      int i, j, k;
      FILE *output_file = stdout;
      xsltStylesheetPtr *stylesheets =
          (xsltStylesheetPtr *) calloc(argc, sizeof(xsltStylesheetPtr));
      xmlDocPtr *files = (xmlDocPtr *) calloc(argc, sizeof(xmlDocPtr));
      xmlDocPtr doc, res;   /* current document and transformation result */
      int return_value = 0;
</pre><p>
</p><p>The <tt class="varname">arg_indx</tt> integer is an index used to
iterate over the program arguments. The <tt class="varname">params</tt>
string array is used to collect the XSLT parameters. In XSLT,
additional information may be passed to the processor via
parameters. The user of the program specifies these as key-value pairs
on the command line following the <b class="userinput"><tt>--param</tt></b>
command line argument. We accept up to 8 such key-value pairs, which
we track with the <tt class="varname">params_indx</tt> integer. libxslt
expects the parameters array to be null-terminated, so we have to
allocate one extra place (16 + 1) for it. The
<tt class="varname">stylesheet_indx</tt> and
<tt class="varname">file_indx</tt> integers are indices for storing the
stylesheets and the files to be processed. The <tt class="varname">i</tt>, <tt class="varname">j</tt>,
<tt class="varname">k</tt> integers are additional indices for iteration
purposes, and <tt class="varname">return_value</tt> is the value the program
returns to the operating system.
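</p><p>The sizing of the <tt class="varname">params</tt> array can be
illustrated with a small, self-contained sketch. The
<tt class="function">add_param</tt> helper and the
<tt class="varname">MAX_PARAM_PAIRS</tt> constant are hypothetical names
introduced only for this illustration; they are not part of libxslt or of
the program developed here. The sketch assumes the limit of 8 pairs
mentioned above and shows why the array needs 16 + 1 slots:</p>

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARAM_PAIRS 8   /* assumption: the 8 pairs mentioned in the text */

/* Hypothetical helper, not part of libxslt: append one (name, value)
 * pair to an array of 2 * MAX_PARAM_PAIRS + 1 slots, keeping the array
 * NULL-terminated after every call.
 * Returns 0 on success, -1 if there is no room for another pair. */
static int add_param(const char **params, int *count,
                     const char *name, const char *value)
{
    assert(params != NULL && count != NULL);
    if (*count + 2 > 2 * MAX_PARAM_PAIRS)
        return -1;              /* the array already holds 8 pairs */
    params[(*count)++] = name;
    params[(*count)++] = value;
    params[*count] = NULL;      /* the extra slot holds the terminator */
    return 0;
}
```

<p>With 8 pairs stored, the counter is 16 and the terminator occupies the
seventeenth slot; a ninth call is refused rather than writing past the end
of the array.</p><p>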
We expect the result of the
transformation to go to the standard output in most cases, but the user
may wish otherwise via the <tt class="option">--out</tt> command line
option, so we need to keep track of the situation with the
<tt class="varname">output_file</tt> file pointer.</p><p>In libxslt, XSLT stylesheets are internally stored in
<span class="structname">xsltStylesheet</span> structures; similarly, in
libxml XML documents are stored in <span class="structname">xmlDoc</span>
structures. <span class="type">xsltStylesheetPtr</span> and <span class="type">xmlDocPtr</span>
are simply typedefs of pointers to them. The user may specify any
number of stylesheets that will be applied to the documents one after
the other. To save time we parse the stylesheets and the documents as
we read them from the command line and keep the parsed representation
of them. The parsed results are kept in arrays. These are dynamically
allocated and sized to the number of arguments; this wastes some
space, but not much (the size of <span class="type">xsltStylesheetPtr</span> and
<span class="type">xmlDocPtr</span> is the size of a pointer) and simplifies code
later on. The array memory is allocated with
<tt class="function">calloc</tt> to ensure contents are initialised to
zero.
</p></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id2799358"></a>Arguments Collection</h2></div></div></div><p>If the program gets no arguments at all, we print the usage
description, set the program return value to 1 and exit.
Instead of
returning directly we go (literally) to the end of the program text
where some housekeeping takes place.</p><p>
</p><pre class="programlisting">

    if (argc <= 1) {
        usage(argv[0]);
        return_value = 1;
        goto finish;
    }

    /* Collect arguments */
    for (arg_indx = 1; arg_indx < argc; arg_indx++) {
        if (argv[arg_indx][0] != '-')
            break;
        if ((!strcmp(argv[arg_indx], "-param"))
            || (!strcmp(argv[arg_indx], "--param"))) {
            /* a name and a value must follow the option */
            if (arg_indx + 2 >= argc) {
                fprintf(stderr, "missing name or value after %s\n",
                        argv[arg_indx]);
                return_value = 1;
                goto finish;
            }
            /* check for room before storing, so that 8 pairs fit */
            if (params_indx + 2 > 16) {
                fprintf(stderr, "too many params\n");
                return_value = 1;
                goto finish;
            }
            arg_indx++;
            params[params_indx++] = argv[arg_indx++];
            params[params_indx++] = argv[arg_indx];
        } else if ((!strcmp(argv[arg_indx], "-o"))
                   || (!strcmp(argv[arg_indx], "--out"))) {
            arg_indx++;
            if (arg_indx >= argc) {
                fprintf(stderr, "missing file name after --out\n");
                return_value = 1;
                goto finish;
            }
            output_file = fopen(argv[arg_indx], "w");
            if (output_file == NULL) {
                perror(argv[arg_indx]);
                return_value = 1;
                goto finish;
            }
        } else {
            fprintf(stderr, "Unknown option %s\n", argv[arg_indx]);
            usage(argv[0]);
            return_value = 1;
            goto finish;
        }
    }
    params[params_indx] = 0;

</pre><p>
</p><p>If the user passes arguments we have to collect them. This is a
matter of iterating over the program argument list while we encounter
arguments starting with a dash. The XSLT parameters are put into the
<tt class="varname">params</tt> array and the <tt class="varname">output_file</tt>
is set to the user request, if any. Before storing a pair we check that
both a name and a value follow the option and that the array still has
room. After processing all the parameter
key-value pairs we set the last element of the <tt class="varname">params</tt>
array to null.
</p></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id2799396"></a>Parsing</h2></div></div></div><p>The rest of the argument list is taken to be stylesheets and
files to be transformed. Stylesheets are identified by their suffix,
which is expected to be xsl (case sensitive).
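</p><p>As an aside, the suffix test can also be written without
<tt class="function">strtok</tt>. The program in this tutorial splits the
argument at dots and looks at the second piece, so a name such as
<tt class="filename">my.style.xsl</tt> would not be recognised as a
stylesheet. The sketch below is an alternative, not the method used in the
program: <tt class="function">has_xsl_suffix</tt> is a hypothetical helper
that looks only at the text after the last dot of the last path component,
assuming <tt class="filename">/</tt> as the directory separator:</p>

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the file name ends in ".xsl"
 * (case sensitive), 0 otherwise. Only the text after the last dot of
 * the last path component is examined; '/' is assumed to be the
 * directory separator. */
static int has_xsl_suffix(const char *filename)
{
    const char *base, *dot;

    assert(filename != NULL);
    base = strrchr(filename, '/');      /* skip any directory part */
    base = base ? base + 1 : filename;
    dot = strrchr(base, '.');           /* last dot in the file name */
    return dot != NULL && strcmp(dot + 1, "xsl") == 0;
}
```

<p>Unlike the <tt class="function">strtok</tt> version, this helper does
not need a scratch copy of the argument, since it never modifies the
string it inspects.</p><p>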
All other files are
assumed to be XML documents, regardless of suffix.</p><p>
</p><pre class="programlisting">

    /* Collect and parse stylesheets and files to be transformed */
    for (; arg_indx < argc; arg_indx++) {
        /* strtok modifies its input, so work on a copy of the argument */
        char *argument =
            (char *) malloc(sizeof(char) * (strlen(argv[arg_indx]) + 1));
        strcpy(argument, argv[arg_indx]);
        if (strtok(argument, ".")) {
            /* the piece following the first dot is taken as the suffix */
            char *suffix = strtok(NULL, ".");
            if (suffix && !strcmp(suffix, "xsl")) {
                stylesheets[stylesheet_indx++] =
                    xsltParseStylesheetFile((const xmlChar *)argv[arg_indx]);
            } else {
                files[file_indx++] = xmlParseFile(argv[arg_indx]);
            }
        } else {
            files[file_indx++] = xmlParseFile(argv[arg_indx]);
        }
        free(argument);
    }

</pre><p>
</p><p>Stylesheets are parsed using the
<tt class="function">xsltParseStylesheetFile</tt>
function. <tt class="function">xsltParseStylesheetFile</tt> takes as
argument a pointer to an <span class="type">xmlChar</span>, a typedef of an
unsigned char; in effect, the filename of the stylesheet. The
resulting <span class="type">xsltStylesheetPtr</span> is placed in the
<tt class="varname">stylesheets</tt> array. In the same vein, XML files are
parsed using the <tt class="function">xmlParseFile</tt> function that takes
as argument the file's name; the resulting <span class="type">xmlDocPtr</span> is
placed in the <tt class="varname">files</tt> array.
</p></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id2771038"></a>File Processing</h2></div></div></div><p>All stylesheets are applied to each file one after the
other. Stylesheets are applied with the
<tt class="function">xsltApplyStylesheet</tt> function that takes as
arguments the stylesheet to be applied, the file to be transformed and
any parameters we have collected.
The in-memory representation of an
XML document takes space, so after each transformation we free the
document that served as its input using the
<tt class="function">xmlFreeDoc</tt> function. The final result is then saved to the
specified output.</p><p>
</p><pre class="programlisting">

    /* Process files */
    for (i = 0; files[i]; i++) {
        doc = files[i];
        res = doc;
        for (j = 0; stylesheets[j]; j++) {
            res = xsltApplyStylesheet(stylesheets[j], doc, params);
            xmlFreeDoc(doc);
            doc = res;
        }

        if (stylesheets[0]) {
            xsltSaveResultToFile(output_file, res, stylesheets[j-1]);
        } else {
            xmlDocDump(output_file, res);
        }
        xmlFreeDoc(res);
    }

    fclose(output_file);

    for (k = 0; stylesheets[k]; k++) {
        xsltFreeStylesheet(stylesheets[k]);
    }

    xsltCleanupGlobals();
    xmlCleanupParser();

 finish:
    free(stylesheets);
    free(files);
    return(return_value);

</pre><p>
</p><p>To output an XML document we have in memory we use the
<tt class="function">xsltSaveResultToFile</tt> function, where we specify
the destination, the document and the stylesheet that has been applied
to it. The stylesheet is required so that output-related information
contained in the stylesheet, such as the encoding to be used, is honoured
in the output. If no transformation has taken place, which will happen
when the user specifies no stylesheets at all on the command line, we
use the <tt class="function">xmlDocDump</tt> libxml function that saves the
source document to the file without further ado.</p><p>As parsed stylesheets take up space in memory, we take care to
free that memory after use with a call to
<tt class="function">xsltFreeStylesheet</tt>. When all work is done, we
clean up all global variables used by the XSLT library using
<tt class="function">xsltCleanupGlobals</tt>. Likewise, all global memory
allocated for the XML parser is reclaimed by a call to
<tt class="function">xmlCleanupParser</tt>.
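</p><p>The ownership pattern of the processing loop (apply a stage, free
its input, carry the result forward) can be tried out without libxslt. The
following sketch is only an analogy under stated assumptions: plain C
strings stand in for <span class="type">xmlDocPtr</span> documents, and
<tt class="function">run_pipeline</tt>, <tt class="function">to_upper</tt>
and <tt class="function">add_bang</tt> are hypothetical names invented for
the illustration. Unlike the loop above, it also stops as soon as a stage
fails:</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* A stage returns a newly allocated result, or NULL on failure.
 * It never frees its input. */
typedef char *(*stage_fn)(const char *input);

/* Example stage: upper-case copy of the input. */
static char *to_upper(const char *input)
{
    size_t i, n = strlen(input);
    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        out[i] = (char) toupper((unsigned char) input[i]);
    out[n] = '\0';
    return out;
}

/* Example stage: copy of the input with '!' appended. */
static char *add_bang(const char *input)
{
    size_t n = strlen(input);
    char *out = malloc(n + 2);
    if (out == NULL)
        return NULL;
    memcpy(out, input, n);
    out[n] = '!';
    out[n + 1] = '\0';
    return out;
}

/* Run doc through a NULL-terminated array of stages, freeing each
 * intermediate document as soon as its successor exists. Takes
 * ownership of doc; the caller frees the returned string. */
static char *run_pipeline(char *doc, stage_fn *stages)
{
    int j;
    assert(stages != NULL);
    for (j = 0; doc != NULL && stages[j] != NULL; j++) {
        char *res = stages[j](doc);
        free(doc);      /* the input of this stage is no longer needed */
        doc = res;      /* a NULL result ends the loop */
    }
    return doc;
}
```

<p>With an empty stage array the input is returned unchanged, which
mirrors the case where the user gives no stylesheets and the source
document is written out as it is.</p><p>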
Before returning we deallocate
the memory allocated for holding the pointers to the XML documents
and stylesheets.</p></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id2771153"></a>*NIX Compiling and Linking</h2></div></div></div><p>Compiling and linking in a *NIX environment
is easy, as the required libraries are almost certain to be already in
place (remember that libxml and libxslt are used by the GNOME project,
so they are present in most installations). The program can be
dynamically linked so that its footprint is minimized, or statically
linked, so that it stands by itself, carrying all required code.</p><p>For dynamic linking the following one-liner will do:</p><p>
<b class="userinput"><tt>gcc -o libxslt_pipes -Wall -I/usr/include/libxml2 libxslt_pipes.c
-L/usr/lib -lxslt -lxml2</tt></b>
</p><p>We assume that the necessary header files are in <tt class="filename">/usr/include/libxml2</tt> and that the
required libraries (<tt class="filename">libxslt.so</tt>,
<tt class="filename">libxml2.so</tt>) are in <tt class="filename">/usr/lib</tt>. The libraries are listed after the
source file, as some linkers resolve symbols strictly from left to right.</p><p>In general, a program may need to link to additional libraries,
depending on the processing it actually performs. A good way to start
is to use the <span><b class="command">xslt-config</b></span> script. The
<tt class="option">--help</tt> option displays usage
information. Running</p><p>
  <b class="userinput"><tt>
    xslt-config --cflags
  </tt></b>
</p><p>we get compile flags, while running</p><p>
  <b class="userinput"><tt>
    xslt-config --libs
  </tt></b>
</p><p>we get the library settings for the linker.</p><p>For static linking we must list more libraries than we did for
dynamic linking, as the libraries on which the libxml and libxslt
libraries depend are also needed.
Using <span><b class="command">xslt-config</b></span>
on a particular installation we create the following one-liner:</p><p>
<b class="userinput"><tt>
gcc -o libxslt_pipes -Wall -I/usr/include/libxml2 libxslt_pipes.c
-static -L/usr/lib -lxslt -lxml2 -lz -lpthread -lm
</tt></b>
</p><p>If the linker warns that some function in a
statically linked application requires, at runtime, the shared
libraries from the glibc version used for linking, the binary
is not completely static. Although we statically
linked against the GNU C runtime library glibc, glibc uses external
libraries to perform some of its functions. The same library versions
must be present on the system where we want the application to run. One way
to avoid this is to use an alternative C runtime, for example <a href="http://www.uclibc.org" target="_top">uClibc</a>, which requires obtaining
and building a uClibc toolchain first (if the reason for trying to get
a statically linked version of the program is to embed it somewhere,
using uClibc might be a good idea anyway).
</p></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="windows-build"></a>MS-Windows Compiling and Linking</h2></div></div></div><p>Compiling and linking in MS-Windows requires
some attention. First, the MS-Windows ports must be
downloaded and installed on the programming workstation. The ports are
available on <a href="http://www.zlatkovic.com/libxml.en.html" target="_top">Igor
Zlatković's site</a>. We need the ports for iconv, zlib, libxml,
and libxslt. In contrast to *NIX environments, we
cannot assume that the libraries needed will be present on other
computers where the program will be used. One solution is to
distribute the program along with the necessary dynamic
libraries.
Another solution is to statically link the program so that
only a single executable file will have to be distributed.</p><p>We assume that we have decompressed the downloaded ports and
have placed the required contents of their <tt class="filename">include</tt> directories in an <tt class="filename">include</tt> directory in our file system. The
required contents include everything apart from the <tt class="filename">libexslt</tt> directory of the libxslt port,
as we are not using EXSLT (an initiative to provide extensions to
XSLT) in this project. In order to compile the program we have to make
sure that all necessary header files are included. When using the
Microsoft compiler this translates to adding the required
<tt class="option">/I</tt> switches on the command line. If using a Visual
Studio product the same effect is attained by specifying additional
include directories in the compilation options. In the end, if the
headers have been copied to <tt class="filename">C:\include</tt> the command line must contain
<tt class="option">/I"C:\include" /I"C:\include\libxslt"
/I"C:\include\libxml"</tt>.</p><p>This being a C program, it needs to be compiled against an
implementation of the C runtime library. Microsoft provides various
implementations. The ports, however, have been compiled against the
<tt class="filename">msvcrt.dll</tt> implementation, so it is wise to use
the same runtime in our project, unless we wish to run into
unexpected runtime crashes. The <tt class="filename">msvcrt.dll</tt> is a
multi-threaded implementation and is specified by giving
<tt class="option">/MD</tt> as a compiler option. Unfortunately, the
correspondence between the <tt class="option">/MD</tt> switch and
<tt class="filename">msvcrt.dll</tt> breaks after version 6 of the
Microsoft compiler.
In version 7 and later (i.e., Visual Studio .NET),
<tt class="option">/MD</tt> links against a different DLL; in version 7.1
this is <tt class="filename">msvcr71.dll</tt>. The end result of this bit
of esoterica is that if you try to dynamically link your application
with a compiler whose version is greater than 6, your program is
likely to crash unexpectedly. Alternatively, you may wish to compile
iconv, zlib, libxml and libxslt yourself, using the new runtime
library. This is not a tall order, and some details are given
<a href="#windows-ports-build" title="Building the Ports in MS-Windows">below</a>.</p><p>There are three kinds of libraries in MS-Windows. Dynamically
Linked Libraries (DLLs), like the <tt class="filename">msvcrt.dll</tt> we met
above, are used for dynamic linking; an application links to them at
runtime, so the application does not include the code contained in
them. Static libraries are used for static linking; an application
adds the libraries' code to its own code at link time. Import
libraries are used when building an application that uses DLLs. For
the application to be built, the linker must somehow find the
definitions of the functions that will be provided at runtime by the
DLLs, otherwise it will complain about unresolved references. Import
libraries contain function stubs that, for each DLL function we want
to call, know where to look for it in the DLL. In essence, in order to
use a DLL we must link against its corresponding import library. DLLs
have a <tt class="filename">.dll</tt> suffix; static and import libraries
both have a <tt class="filename">.lib</tt> suffix. In the MS-Windows ports
of libxml and libxslt static libraries are distinguished by their name
ending in <tt class="filename">_a.lib</tt>, while in the zlib port the
import library is <tt class="filename">zdll.lib</tt> and the static library
is <tt class="filename">zlib.lib</tt>.
In what follows we assume we have a
<tt class="filename">lib</tt> directory in our filesystem
where we place the libraries we need for linking.</p><p>If we want to link dynamically we must make sure the <tt class="filename">lib</tt> directory contains
<tt class="filename">iconv.lib</tt>, <tt class="filename">libxslt.lib</tt>,
<tt class="filename">libxml2.lib</tt>, and
<tt class="filename">zdll.lib</tt>. When using the Microsoft linker this
translates to adding the required <tt class="option">/LIBPATH</tt>
switch and the necessary libraries on the command line. In Visual
Studio we must specify an additional library directory for <tt class="filename">lib</tt> and put the necessary libraries in
the additional dependencies. In the end, the command line must include
<tt class="option">/LIBPATH:"C:\lib" "lib\iconv.lib" "lib\libxslt.lib"
"lib\libxml2.lib" "lib\zdll.lib"</tt>, provided the libraries'
directory is <tt class="filename">C:\lib</tt>. In order
for the resulting executable to run, the ports' DLLs must be present;
one way is to place all DLLs contained in the ports in the home
directory of our application, and make sure they are distributed
together.</p><p>If we want to link statically we must make sure the <tt class="filename">lib</tt> directory contains
<tt class="filename">iconv_a.lib</tt>, <tt class="filename">libxslt_a.lib</tt>,
<tt class="filename">libxml2_a.lib</tt>, and
<tt class="filename">zlib.lib</tt>. Adding <tt class="filename">lib</tt> as a library directory and putting
the necessary libraries in the additional dependencies, we get a
command line that should include <tt class="option">/LIBPATH:"C:\lib"
"lib\iconv_a.lib" "lib\libxslt_a.lib" "lib\libxml2_a.lib"
"lib\zlib.lib"</tt>. The resulting executable is much bigger
than if we linked dynamically; it is, however, self-contained and can
be distributed more easily, in theory at least.
In practice, however,
the executable is not completely static. We saw that the ports are
compiled against <tt class="filename">msvcrt.dll</tt>, so the program does
require that DLL at runtime. Moreover, when using a version of the
Microsoft developer tools with a version number greater than 6, we are
no longer using <tt class="filename">msvcrt.dll</tt>, but another runtime
like <tt class="filename">msvcr71.dll</tt>, and we then need that DLL. In
contrast to <tt class="filename">msvcrt.dll</tt> it may not be present on
the target computer, so we may have to copy it along.</p><div class="sect2" lang="en"><div class="titlepage"><div><div><h3 class="title"><a name="windows-ports-build"></a>Building the Ports in MS-Windows</h3></div></div></div><p>The source code of the ports is readily available on the web;
one has to check the ports' sites. Each port can be built without
problems in an MS-Windows environment using Microsoft development
tools. The necessary command line tools (compiler, linker,
<span><b class="command">nmake</b></span>) must be available. This means running a
batch file called <span><b class="command">vcvars32.bat</b></span> that comes with
Visual Studio (its exact location in the directory tree may vary
depending on the version of Visual Studio, but a file search will find
it anyway). Makefiles for the Microsoft tools are found in all
ports. They are distinguished by their suffix, e.g.,
<tt class="filename">Makefile.msvc</tt> or
<tt class="filename">Makefile.msc</tt>. To build zlib it suffices to run
<span><b class="command">nmake</b></span> against <tt class="filename">Makefile.msc</tt>
(i.e., with the <tt class="option">/F</tt> option); similarly, to build
<tt class="filename">iconv</tt> it suffices to run <span><b class="command">nmake</b></span>
against <tt class="filename">Makefile.msvc</tt>.
Building libxml and
libxslt requires an extra configuration step; we must run the
<tt class="filename">configure.js</tt> configuration script with the
<span><b class="command">cscript</b></span> command. <tt class="filename">configure.js</tt>
is found in the <tt class="filename">win32</tt> directory
in the distributions. It is written in JScript, Microsoft's
implementation of the ECMA 262 language specification (ECMAScript
Edition 3), a JavaScript offspring. The configuration script takes a
number of parameters detailing our environment and needs;
<b class="userinput"><tt>cscript configure.js help</tt></b> documents
them.</p><p>It is wise to read all documentation files in the source
distributions before starting; moreover, pay attention to the
dependencies between the ports. If we configure libxml and libxslt to
use iconv and zlib we must build these two first and make sure their
headers and libraries can be found by the compiler and the
linker when building libxml and libxslt.</p></div></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id2839739"></a>zlib, iconv and All That</h2></div></div></div><p>We saw that libxml and libxslt depend on various other
libraries, for instance zlib, iconv, and so forth. Taking a look into
them gives us clues on the capabilities of libxml and libxslt.</p><p><a href="http://www.zlib.org" target="_top">zlib</a> is a free general
purpose lossless data compression library. It is a venerable
workhorse; more than <a href="http://www.gzip.org/zlib/apps.html" target="_top">500 applications</a>
(both commercial and open source) seem to use the library. libxml uses
zlib so that it can read from or write to compressed files
directly. The <tt class="function">xmlParseFile</tt> function can
transparently parse a compressed document to produce an
<span class="structname">xmlDoc</span>.
If we want to create a compressed
document with libxml we can use an
<span class="structname">xmlTextWriterPtr</span> (obtained through
<tt class="function">xmlNewTextWriterDoc</tt>), or another related
structure from <tt class="filename">libxml/xmlwriter.h</tt>, with
compression enabled.</p><p>XML allows documents to use a variety of different character
encodings. <a href="http://www.gnu.org/software/libiconv" target="_top">iconv</a> is a free
library for converting between different character encodings. libxml
provides a set of default converters for some encodings: UTF-8, UTF-16
(little endian and big endian), ISO-8859-1, ASCII, and HTML (a
specific handler for the conversion of UTF-8 to ASCII with HTML
predefined entities like &copy; for the copyright sign). However,
when compiled with iconv support, libxml and libxslt can handle the
full range of encodings provided by iconv; these should cover most
needs.</p><p>libxml and libxslt can be used in multi-threaded
applications. In MS-Windows they are linked against
<tt class="filename">MSVCRT.DLL</tt> (or one of its descendants, as we saw
<a href="#windows-build" title="MS-Windows Compiling and Linking">above</a>). In *NIX the pthreads
(POSIX threads) library is used.</p></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id2839841"></a>The Complete Program</h2></div></div></div><p>
The complete program listing is given below. The program is also
<a href="libxslt_pipes.c" target="_top">available online</a>.
</p><p>
</p><pre class="programlisting">
/*
 * libxslt_pipes.c: a program for performing a series of XSLT
 * transformations
 *
 * Written by Panos Louridas, based on libxslt_tutorial.c by John Fleck.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Cambridge, MA 02139, USA.
 *
 */

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#include <libxslt/transform.h>
#include <libxslt/xsltutils.h>

extern int xmlLoadExtDtdDefaultValue;

static void usage(const char *name) {
    printf("Usage: %s [options] stylesheet [stylesheet ...] file [file ...]\n",
           name);
    printf("      --out file: send output to file\n");
    printf("      --param name value: pass a (parameter,value) pair\n");
}

int main(int argc, char **argv) {
    int arg_indx;
    const char *params[16 + 1];
    int params_indx = 0;
    int stylesheet_indx = 0;
    int file_indx = 0;
    int i, j, k;
    FILE *output_file = stdout;
    xsltStylesheetPtr *stylesheets =
        (xsltStylesheetPtr *) calloc(argc, sizeof(xsltStylesheetPtr));
    xmlDocPtr *files = (xmlDocPtr *) calloc(argc, sizeof(xmlDocPtr));
    xmlDocPtr doc, res;
    int return_value = 0;

    if (argc <= 1) {
        usage(argv[0]);
        return_value = 1;
        goto finish;
    }

    /* Collect arguments */
    for (arg_indx = 1; arg_indx < argc; arg_indx++) {
        if (argv[arg_indx][0] != '-')
            break;
        if ((!strcmp(argv[arg_indx], "-param"))
            || (!strcmp(argv[arg_indx], "--param"))) {
            /* a name and a value must follow the option */
            if (arg_indx + 2 >= argc) {
                fprintf(stderr, "missing name or value after %s\n",
                        argv[arg_indx]);
                return_value = 1;
                goto finish;
            }
            /* check for room before storing, so that 8 pairs fit */
            if (params_indx + 2 > 16) {
                fprintf(stderr, "too many params\n");
                return_value = 1;
                goto finish;
            }
            arg_indx++;
            params[params_indx++] = argv[arg_indx++];
            params[params_indx++] = argv[arg_indx];
        } else if ((!strcmp(argv[arg_indx], "-o"))
                   || (!strcmp(argv[arg_indx], "--out"))) {
            arg_indx++;
            if (arg_indx >= argc) {
                fprintf(stderr, "missing file name after --out\n");
                return_value = 1;
                goto finish;
            }
            output_file = fopen(argv[arg_indx], "w");
            if (output_file == NULL) {
                perror(argv[arg_indx]);
                return_value = 1;
                goto finish;
            }
        } else {
            fprintf(stderr, "Unknown option %s\n", argv[arg_indx]);
            usage(argv[0]);
            return_value = 1;
            goto finish;
        }
    }
    params[params_indx] = 0;

    /* Set the parser defaults before any file is parsed,
       so that they apply to the input documents */
    xmlSubstituteEntitiesDefault(1);
    xmlLoadExtDtdDefaultValue = 1;

    /* Collect and parse stylesheets and files to be transformed */
    for (; arg_indx < argc; arg_indx++) {
        /* strtok modifies its input, so work on a copy of the argument */
        char *argument =
            (char *) malloc(sizeof(char) * (strlen(argv[arg_indx]) + 1));
        strcpy(argument, argv[arg_indx]);
        if (strtok(argument, ".")) {
            /* the piece following the first dot is taken as the suffix */
            char *suffix = strtok(NULL, ".");
            if (suffix && !strcmp(suffix, "xsl")) {
                stylesheets[stylesheet_indx++] =
                    xsltParseStylesheetFile((const xmlChar *)argv[arg_indx]);
            } else {
                files[file_indx++] = xmlParseFile(argv[arg_indx]);
            }
        } else {
            files[file_indx++] = xmlParseFile(argv[arg_indx]);
        }
        free(argument);
    }

    /* Process files */
    for (i = 0; files[i]; i++) {
        doc = files[i];
        res = doc;
        for (j = 0; stylesheets[j]; j++) {
            res = xsltApplyStylesheet(stylesheets[j], doc, params);
            xmlFreeDoc(doc);
            doc = res;
        }

        if (stylesheets[0]) {
            xsltSaveResultToFile(output_file, res, stylesheets[j-1]);
        } else {
            xmlDocDump(output_file, res);
        }
        xmlFreeDoc(res);
    }

    fclose(output_file);

    for (k = 0; stylesheets[k]; k++) {
        xsltFreeStylesheet(stylesheets[k]);
    }

    xsltCleanupGlobals();
    xmlCleanupParser();

 finish:
    free(stylesheets);
    free(files);
    return(return_value);
}

</pre><p>
</p></div></div></body></html>