xmlwf.1 (104349) | xmlwf.1 (178848) |
---|---|
1.\" This manpage has been automatically generated by docbook2man 2.\" from a DocBook document. This tool can be found at: 3.\" <http://shell.ipoline.com/~elmert/comp/docbook2X/> 4.\" Please send any bug reports, improvements, comments, patches, 5.\" etc. to Steve Cheng <steve@ggi-project.org>. | 1.\" This manpage has been automatically generated by docbook2man 2.\" from a DocBook document. This tool can be found at: 3.\" <http://shell.ipoline.com/~elmert/comp/docbook2X/> 4.\" Please send any bug reports, improvements, comments, patches, 5.\" etc. to Steve Cheng <steve@ggi-project.org>. |
6.TH "XMLWF" "1" "22 April 2002" "" "" | 6.TH "XMLWF" "1" "24 January 2003" "" "" |
7.SH NAME 8xmlwf \- Determines if an XML document is well-formed 9.SH SYNOPSIS 10 11\fBxmlwf\fR [ \fB-s\fR] [ \fB-n\fR] [ \fB-p\fR] [ \fB-x\fR] [ \fB-e \fIencoding\fB\fR] [ \fB-w\fR] [ \fB-d \fIoutput-dir\fB\fR] [ \fB-c\fR] [ \fB-m\fR] [ \fB-r\fR] [ \fB-t\fR] [ \fB-v\fR] [ \fBfile ...\fR] 12 13.SH "DESCRIPTION" 14.PP | 7.SH NAME 8xmlwf \- Determines if an XML document is well-formed 9.SH SYNOPSIS 10 11\fBxmlwf\fR [ \fB-s\fR] [ \fB-n\fR] [ \fB-p\fR] [ \fB-x\fR] [ \fB-e \fIencoding\fB\fR] [ \fB-w\fR] [ \fB-d \fIoutput-dir\fB\fR] [ \fB-c\fR] [ \fB-m\fR] [ \fB-r\fR] [ \fB-t\fR] [ \fB-v\fR] [ \fBfile ...\fR] 12 13.SH "DESCRIPTION" 14.PP |
15\fBxmlwf\fR uses the Expat library to determine 16if an XML document is well-formed. It is non-validating. | 15\fBxmlwf\fR uses the Expat library to 16determine if an XML document is well-formed. It is 17non-validating. |
17.PP | 18.PP |
18If you do not specify any files on the command-line, 19and you have a recent version of xmlwf, the input 20file will be read from stdin. | 19If you do not specify any files on the command-line, and you 20have a recent version of \fBxmlwf\fR, the 21input file will be read from standard input. |
21.SH "WELL-FORMED DOCUMENTS" 22.PP 23A well-formed document must adhere to the 24following rules: 25.TP 0.2i 26\(bu 27The file begins with an XML declaration. For instance, 28<?xml version="1.0" standalone="yes"?>. | 22.SH "WELL-FORMED DOCUMENTS" 23.PP 24A well-formed document must adhere to the 25following rules: 26.TP 0.2i 27\(bu 28The file begins with an XML declaration. For instance, 29<?xml version="1.0" standalone="yes"?>. |
29\fBNOTE:\fR xmlwf does not currently | 30\fBNOTE:\fR 31\fBxmlwf\fR does not currently |
30check for a valid XML declaration. 31.TP 0.2i 32\(bu 33Every start tag is either empty (<tag/>) 34or has a corresponding end tag. 35.TP 0.2i 36\(bu 37There is exactly one root element. This element must contain --- 5 unchanged lines hidden --- 43All elements nest properly. 44.TP 0.2i 45\(bu 46All attribute values are enclosed in quotes (either single 47or double). 48.PP 49If the document has a DTD, and it strictly complies with that 50DTD, then the document is also considered \fBvalid\fR. | 32check for a valid XML declaration. 33.TP 0.2i 34\(bu 35Every start tag is either empty (<tag/>) 36or has a corresponding end tag. 37.TP 0.2i 38\(bu 39There is exactly one root element. This element must contain --- 5 unchanged lines hidden --- 45All elements nest properly. 46.TP 0.2i 47\(bu 48All attribute values are enclosed in quotes (either single 49or double). 50.PP 51If the document has a DTD, and it strictly complies with that 52DTD, then the document is also considered \fBvalid\fR. |
51xmlwf is a non-validating parser -- it does not check the DTD. 52However, it does support external entities (see the -x option). | 53\fBxmlwf\fR is a non-validating parser -- 54it does not check the DTD. However, it does support 55external entities (see the \fB-x\fR option). |
53.SH "OPTIONS" 54.PP 55When an option includes an argument, you may specify the argument either | 56.SH "OPTIONS" 57.PP 58When an option includes an argument, you may specify the argument either |
56separate ("d output") or mashed ("-doutput"). xmlwf supports both. | 59separately ("\fB-d\fR output") or concatenated with the 60option ("\fB-d\fRoutput"). \fBxmlwf\fR 61supports both. |
57.TP 58\fB-c\fR | 62.TP 63\fB-c\fR |
59If the input file is well-formed and xmlwf doesn't 60encounter any errors, the input file is simply copied to | 64If the input file is well-formed and \fBxmlwf\fR 65doesn't encounter any errors, the input file is simply copied to |
61the output directory unchanged. | 66the output directory unchanged. |
62This implies no namespaces (turns off -n) and 63requires -d to specify an output file. | 67This implies no namespaces (turns off \fB-n\fR) and 68requires \fB-d\fR to specify an output file. |
64.TP 65\fB-d output-dir\fR 66Specifies a directory to contain transformed 67representations of the input files. | 69.TP 70\fB-d output-dir\fR 71Specifies a directory to contain transformed 72representations of the input files. |
68By default, -d outputs a canonical representation | 73By default, \fB-d\fR outputs a canonical representation |
69(described below). | 74(described below). |
70You can select different output formats using -c and -m. | 75You can select different output formats using \fB-c\fR 76and \fB-m\fR. |
71 72The output filenames will 73be exactly the same as the input filenames or "STDIN" if the input is | 77 78The output filenames will 79be exactly the same as the input filenames or "STDIN" if the input is |
74coming from STDIN. Therefore, you must be careful that the | 80coming from standard input. Therefore, you must be careful that the |
75output file does not go into the same directory as the input | 81output file does not go into the same directory as the input |
76file. Otherwise, xmlwf will delete the input file before 77it generates the output file (just like running | 82file. Otherwise, \fBxmlwf\fR will delete the 83input file before it generates the output file (just like running |
78cat < file > file in most shells). 79 80Two structurally equivalent XML documents have a byte-for-byte 81identical canonical XML representation. 82Note that ignorable white space is considered significant and 83is treated equivalently to data. 84More on canonical XML can be found at 85http://www.jclark.com/xml/canonxml.html . 86.TP 87\fB-e encoding\fR 88Specifies the character encoding for the document, overriding | 84cat < file > file in most shells). 85 86Two structurally equivalent XML documents have a byte-for-byte 87identical canonical XML representation. 88Note that ignorable white space is considered significant and 89is treated equivalently to data. 90More on canonical XML can be found at 91http://www.jclark.com/xml/canonxml.html . 92.TP 93\fB-e encoding\fR 94Specifies the character encoding for the document, overriding |
89any document encoding declaration. xmlwf 90has four built-in encodings: | 95any document encoding declaration. \fBxmlwf\fR 96supports four built-in encodings: |
91US-ASCII, 92UTF-8, 93UTF-16, and 94ISO-8859-1. | 97US-ASCII, 98UTF-8, 99UTF-16, and 100ISO-8859-1. |
95Also see the -w option. | 101Also see the \fB-w\fR option. |
96.TP 97\fB-m\fR 98Outputs some strange sort of XML file that completely | 102.TP 103\fB-m\fR 104Outputs some strange sort of XML file that completely |
99describes the the input file, including character postitions. 100Requires -d to specify an output file. | 105describes the input file, including character positions. 106Requires \fB-d\fR to specify an output file. |
101.TP 102\fB-n\fR 103Turns on namespace processing. (describe namespaces) | 107.TP 108\fB-n\fR 109Turns on namespace processing. (describe namespaces) |
104-c disables namespaces. | 110\fB-c\fR disables namespaces. |
105.TP 106\fB-p\fR 107Tells xmlwf to process external DTDs and parameter 108entities. 109 | 111.TP 112\fB-p\fR 113Tells xmlwf to process external DTDs and parameter 114entities. 115 |
110Normally xmlwf never parses parameter entities. 111-p tells it to always parse them. 112-p implies -x. | 116Normally \fBxmlwf\fR never parses parameter 117entities. \fB-p\fR tells it to always parse them. 118\fB-p\fR implies \fB-x\fR. |
113.TP 114\fB-r\fR | 119.TP 120\fB-r\fR |
115Normally xmlwf memory-maps the XML file before parsing. 116-r turns off memory-mapping and uses normal file IO calls instead. | 121Normally \fBxmlwf\fR memory-maps the XML file 122before parsing; this can result in faster parsing on many 123platforms. 124\fB-r\fR turns off memory-mapping and uses normal file 125IO calls instead. |
117Of course, memory-mapping is automatically turned off | 126Of course, memory-mapping is automatically turned off |
118when reading from STDIN. | 127when reading from standard input. 128 129Use of memory-mapping can cause some platforms to report 130substantially higher memory usage for 131\fBxmlwf\fR, but this appears to be a matter of 132the operating system reporting memory in a strange way; there is 133not a leak in \fBxmlwf\fR. |
119.TP 120\fB-s\fR 121Prints an error if the document is not standalone. 122A document is standalone if it has no external subset and no 123references to parameter entities. 124.TP 125\fB-t\fR 126Turns on timings. This tells Expat to parse the entire file, 127but not perform any processing. 128This gives a fairly accurate idea of the raw speed of Expat itself 129without client overhead. | 134.TP 135\fB-s\fR 136Prints an error if the document is not standalone. 137A document is standalone if it has no external subset and no 138references to parameter entities. 139.TP 140\fB-t\fR 141Turns on timings. This tells Expat to parse the entire file, 142but not perform any processing. 143This gives a fairly accurate idea of the raw speed of Expat itself 144without client overhead. |
130-t turns off most of the output options (-d, -m -c, ...). | 145\fB-t\fR turns off most of the output options 146(\fB-d\fR, \fB-m\fR, \fB-c\fR, 147\&...). |
131.TP 132\fB-v\fR | 148.TP 149\fB-v\fR |
133Prints the version of the Expat library being used, and then exits. | 150Prints the version of the Expat library being used, including some 151information on the compile-time configuration of the library, and 152then exits. |
134.TP 135\fB-w\fR | 153.TP 154\fB-w\fR |
136Enables Windows code pages. 137Normally, xmlwf will throw an error if it runs across 138an encoding that it is not equipped to handle itself. With 139-w, xmlwf will try to use a Windows code page. See 140also -e. | 155Enables support for Windows code pages. 156Normally, \fBxmlwf\fR will throw an error if it 157runs across an encoding that it is not equipped to handle itself. With 158\fB-w\fR, xmlwf will try to use a Windows code 159page. See also \fB-e\fR. |
141.TP 142\fB-x\fR 143Turns on parsing external entities. 144 145Non-validating parsers are not required to resolve external 146entities, or even expand entities at all. 147Expat always expands internal entities (?), 148but external entity parsing must be enabled explicitly. --- 10 unchanged lines hidden --- 159And here are some examples of external entities: 160 161.nf 162<!ENTITY header SYSTEM "header-&vers;.xml"> (parsed) 163<!ENTITY logo SYSTEM "logo.png" PNG> (unparsed) 164.fi 165.TP 166\fB--\fR | 160.TP 161\fB-x\fR 162Turns on parsing external entities. 163 164Non-validating parsers are not required to resolve external 165entities, or even expand entities at all. 166Expat always expands internal entities (?), 167but external entity parsing must be enabled explicitly. --- 10 unchanged lines hidden --- 178And here are some examples of external entities: 179 180.nf 181<!ENTITY header SYSTEM "header-&vers;.xml"> (parsed) 182<!ENTITY logo SYSTEM "logo.png" PNG> (unparsed) 183.fi 184.TP 185\fB--\fR |
167For some reason, xmlwf specifically ignores "--" 168anywhere it appears on the command line. | 186(Two hyphens.) 187Terminates the list of options. This is only needed if a filename 188starts with a hyphen. For example: 189 190.nf 191xmlwf -- -myfile.xml 192.fi 193 194will run \fBxmlwf\fR on the file 195\fI-myfile.xml\fR. |
169.PP | 196.PP |
170Older versions of xmlwf do not support reading from STDIN. | 197Older versions of \fBxmlwf\fR do not support 198reading from standard input. |
171.SH "OUTPUT" 172.PP | 199.SH "OUTPUT" 200.PP |
173If an input file is not well-formed, xmlwf outputs 174a single line describing the problem to STDOUT. 175If a file is well formed, xmlwf outputs nothing. | 201If an input file is not well-formed, 202\fBxmlwf\fR prints a single line describing 203the problem to standard output. If a file is well formed, 204\fBxmlwf\fR outputs nothing. |
176Note that the result code is \fBnot\fR set. 177.SH "BUGS" 178.PP 179According to the W3C standard, an XML file without a 180declaration at the beginning is not considered well-formed. | 205Note that the result code is \fBnot\fR set. 206.SH "BUGS" 207.PP 208According to the W3C standard, an XML file without a 209declaration at the beginning is not considered well-formed. |
181However, xmlwf allows this to pass. | 210However, \fBxmlwf\fR allows this to pass. |
182.PP | 211.PP |
183xmlwf returns a 0 - noerr result, even if the file is 184not well-formed. There is no good way for a program to use 185xmlwf to quickly check a file -- it must parse xmlwf's STDOUT. | 212\fBxmlwf\fR returns a 0 - noerr result, 213even if the file is not well-formed. There is no good way for 214a program to use \fBxmlwf\fR to quickly 215check a file -- it must parse \fBxmlwf\fR's 216standard output. |
186.PP | 217.PP |
187The errors should go to STDERR, not stdout. | 218The errors should go to standard error, not standard output. |
188.PP | 219.PP |
189There should be a way to get -d to send its output to STDOUT 190rather than forcing the user to send it to a file. | 220There should be a way to get \fB-d\fR to send its 221output to standard output rather than forcing the user to send 222it to a file. |
191.PP | 223.PP |
192I have no idea why anyone would want to use the -d, -c 193and -m options. If someone could explain it to me, I'd 194like to add this information to this manpage. | 224I have no idea why anyone would want to use the 225\fB-d\fR, \fB-c\fR, and 226\fB-m\fR options. If someone could explain it to 227me, I'd like to add this information to this manpage. |
195.SH "ALTERNATIVES" 196.PP 197Here are some XML validators on the web: 198 199.nf 200http://www.hcrc.ed.ac.uk/~richard/xml-check.html 201http://www.stg.brown.edu/service/xmlvalid/ 202http://www.scripting.com/frontier5/xml/code/xmlValidator.html 203http://www.xml.com/pub/a/tools/ruwf/check.html | 228.SH "ALTERNATIVES" 229.PP 230Here are some XML validators on the web: 231 232.nf 233http://www.hcrc.ed.ac.uk/~richard/xml-check.html 234http://www.stg.brown.edu/service/xmlvalid/ 235http://www.scripting.com/frontier5/xml/code/xmlValidator.html 236http://www.xml.com/pub/a/tools/ruwf/check.html |
237.fi 238.SH "SEE ALSO" 239.PP 240 241.nf 242The Expat home page: http://www.libexpat.org/ 243The W3 XML specification: http://www.w3.org/TR/REC-xml 244.fi 245.SH "AUTHOR" 246.PP 247This manual page was written by Scott Bronson <bronson@rinspin.com> for 248the Debian GNU/Linux system (but may be used by others). Permission is 249granted to copy, distribute and/or modify this document under 250the terms of the GNU Free Documentation 251License, Version 1.1. |
|
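The BUGS section in both revisions notes that xmlwf returns a zero result even for malformed input, so a calling program has to parse its standard output. As a rough illustration of a workaround (not part of xmlwf itself), the sketch below does a similar non-validating well-formedness check through Python's standard `xml.parsers.expat` binding to Expat. It reports problems on standard error and computes a nonzero status. The names `check_well_formed` and `main` are hypothetical and chosen only for this example. The message format is an assumption and does not reproduce xmlwf's output.

```python
import sys
import xml.parsers.expat


def check_well_formed(data: bytes):
    """Return (ok, message); message is empty when the input parses cleanly.

    Hypothetical helper: a non-validating check, similar in spirit to what
    the man page describes, done through Python's Expat binding.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of the document.
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as err:
        reason = xml.parsers.expat.ErrorString(err.code)
        return False, f"{reason} at line {err.lineno}, column {err.offset}"
    return True, ""


def main(paths):
    """Check each file; return 1 if any file is not well-formed, else 0."""
    status = 0
    for path in paths:
        with open(path, "rb") as handle:
            ok, message = check_well_formed(handle.read())
        if not ok:
            # Errors go to standard error, as the BUGS section suggests.
            print(f"{path}: {message}", file=sys.stderr)
            status = 1
    return status
```

To use this from a script, pass the file names to `main` and hand its return value to `sys.exit`, so that a shell or build tool can test the exit status directly instead of parsing output.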