
=head1 NAME

Pod::Simple - framework for parsing Pod

=head1 SYNOPSIS

 TODO

=head1 DESCRIPTION

Pod::Simple is a Perl library for parsing text in the Pod ("plain old
documentation") markup language that is typically used for writing
documentation for Perl and for Perl modules. The Pod format is explained
in L<perlpod>; the most common formatter is called C<perldoc>.

Be sure to read L</ENCODING> if your Pod contains non-ASCII characters.

Pod formatters can use Pod::Simple to parse Pod documents and render them into
plain text, HTML, or any number of other formats. Typically, such formatters
will be subclasses of Pod::Simple, and so they will inherit its methods, like
C<parse_file>. But note that Pod::Simple does not understand or
properly parse Perl itself, so if you have a file which contains a Perl
program that has a multi-line quoted string which has lines that look
like Pod, Pod::Simple will treat them as Pod. This can be avoided if
the file makes these into indented here documents instead.

If you're reading this document just because you have a Pod-processing
subclass that you want to use, this document (plus the documentation for the
subclass) is probably all you need to read.

If you're reading this document because you want to write a formatter
subclass, continue reading it and then read L<Pod::Simple::Subclassing>, and
then possibly even read L<perlpodspec> (some of which is for parser-writers,
but much of which is notes to formatter-writers).

=head1 MAIN METHODS

=over

=item C<< $parser = I<SomeClass>->new(); >>

This returns a new parser object, where I<C<SomeClass>> is a subclass
of Pod::Simple.

=item C<< $parser->output_fh( *OUT ); >>

This sets the filehandle that C<$parser>'s output will be written to.
You can pass C<*STDOUT> or C<*STDERR>, otherwise you should probably do
something like this:

  my $outfile = "output.txt";
  open my $out_fh, '>', $outfile or die "Can't write to $outfile: $!";
  $parser->output_fh($out_fh);

...before you call one of the C<< $parser->parse_I<whatever> >> methods.

=item C<< $parser->output_string( \$somestring ); >>

This sets the string that C<$parser>'s output will be sent to,
instead of any filehandle.


=item C<< $parser->parse_file( I<$some_filename> ); >>

=item C<< $parser->parse_file( *INPUT_FH ); >>

This reads the Pod content of the file (or filehandle) that you specify,
and processes it with that C<$parser> object, according to however
C<$parser>'s class works, and according to whatever parser options you
have set up for this C<$parser> object.

=item C<< $parser->parse_string_document( I<$all_content> ); >>

This works just like C<parse_file> except that it reads the Pod
content not from a file, but from a string that you have already
in memory.

=item C<< $parser->parse_lines( I<...@lines...>, undef ); >>

This processes the lines in C<@lines> (where each list item must be a
defined value, and must contain exactly one line of content -- so no
items like C<"foo\nbar"> are allowed). The final C<undef> is used to
indicate the end of the document being parsed.

The other C<parse_I<whatever>> methods are meant to be called only once
per C<$parser> object; but C<parse_lines> can be called as many times per
C<$parser> object as you want, as long as the last call (and only
the last call) ends with an C<undef> value.


=item C<< $parser->content_seen >>

This returns true only if there has been any real content seen for this
document. Returns false in cases where the document contains content,
but does not make use of any Pod markup.
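
The methods described so far can be combined as in the following minimal
sketch. It assumes the L<Pod::Simple::Text> formatter subclass is installed;
the sample Pod text and the variable names are made up for illustration. The
first parser reads a whole document from a string, and the second is fed the
same kind of content through two C<parse_lines> calls, only the last of which
ends with C<undef>.

```perl
use strict;
use warnings;
use Pod::Simple::Text;

# Illustrative Pod source held in memory.
my $pod = "=head1 NAME\n\nExample - a tiny document\n\n=cut\n";

# Parse the whole string and collect the rendered text in $output.
my $parser = Pod::Simple::Text->new;
my $output = '';
$parser->output_string(\$output);
$parser->parse_string_document($pod);

print $parser->content_seen ? "Pod content seen\n" : "no Pod content\n";
print $output;

# Feed a second parser one line per list item, across two calls.
# Only the final call ends with undef.
my $line_parser = Pod::Simple::Text->new;
my $by_line     = '';
$line_parser->output_string(\$by_line);
$line_parser->parse_lines('=head1 NAME', '', 'Example - fed line by line');
$line_parser->parse_lines('', '=cut', undef);

print $by_line;
```

Using a fresh parser object for each document follows the advice above that
the C<parse_I<whatever>> methods other than C<parse_lines> are meant to be
called only once per object.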

=item C<< I<SomeClass>->filter( I<$filename> ); >>

=item C<< I<SomeClass>->filter( I<*INPUT_FH> ); >>

=item C<< I<SomeClass>->filter( I<\$document_content> ); >>

This is a shortcut method for creating a new parser object, setting the
output handle to STDOUT, and then processing the specified file (or
filehandle, or in-memory document). This is handy for one-liners like
this:

  perl -MPod::Simple::Text -e "Pod::Simple::Text->filter('thingy.pod')"

=back



=head1 SECONDARY METHODS

Some of these methods might be of interest to general users, as
well as of interest to formatter-writers.

Note that the general pattern here is that the accessor-methods
read the attribute's value with C<< $value = $parser->I<attribute> >>
and set the attribute's value with
C<< $parser->I<attribute>(I<newvalue>) >>. For each accessor, I typically
only mention one syntax or another, based on which I think you are actually
most likely to use.


=over

=item C<< $parser->parse_characters( I<SOMEVALUE> ) >>

The Pod parser normally expects to read octets and to convert those octets
to characters based on the C<=encoding> declaration in the Pod source. Set
this option to a true value to indicate that the Pod source is already a Perl
character stream. This tells the parser to ignore any C<=encoding> command
and to skip all the code paths involving decoding octets.

=item C<< $parser->no_whining( I<SOMEVALUE> ) >>

If you set this attribute to a true value, you will suppress the
parser's complaints about irregularities in the Pod coding. By default,
this attribute's value is false, meaning that irregularities will
be reported.

Note that setting this attribute to true won't suppress one or two kinds
of complaints about rarely occurring unrecoverable errors.
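
To illustrate the effect of C<no_whining>, here is a small sketch that renders
the same flawed document twice, once with default settings and once with
complaints suppressed. It assumes L<Pod::Simple::Text> as the formatter; the
C<render> helper, its C<quiet> option, and the sample document (which uses the
unknown directive C<=foo>) are hypothetical names invented for this example.

```perl
use strict;
use warnings;
use Pod::Simple::Text;

# A document with one irregularity: "=foo" is not a known Pod directive.
my $pod = "=head1 NAME\n\nDemo - a document with a mistake\n\n"
        . "=foo bar\n\n=cut\n";

# Render $pod as text, optionally with complaints suppressed.
sub render {
    my (%opt) = @_;
    my $parser = Pod::Simple::Text->new;
    $parser->no_whining(1) if $opt{quiet};
    my $text = '';
    $parser->output_string(\$text);
    $parser->parse_string_document($pod);
    return $text;
}

my $default = render();
my $quiet   = render(quiet => 1);

# Report whether each rendering contains an errata section.
for my $case ([default => $default], [quiet => $quiet]) {
    my ($name, $text) = @{$case};
    my $has = $text =~ /POD ERRORS/ ? 'has' : 'has no';
    print "$name output $has POD ERRORS section\n";
}
```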


=item C<< $parser->no_errata_section( I<SOMEVALUE> ) >>

If you set this attribute to a true value, you will stop the parser from
generating a "POD ERRORS" section at the end of the document. By
default, this attribute's value is false, meaning that an errata section
will be generated, as necessary.


=item C<< $parser->complain_stderr( I<SOMEVALUE> ) >>

If you set this attribute to a true value, it will send reports of
parsing errors to STDERR. By default, this attribute's value is false,
meaning that no output is sent to STDERR.

Setting C<complain_stderr> also sets C<no_errata_section>.


=item C<< $parser->source_filename >>

This returns the filename that this parser object was set to read from.


=item C<< $parser->doc_has_started >>

This returns true if C<$parser> has read from a source, and has seen
Pod content in it.


=item C<< $parser->source_dead >>

This returns true if C<$parser> has read from a source, and has come to the
end of that source.

=item C<< $parser->strip_verbatim_indent( I<SOMEVALUE> ) >>

The perlpod spec for a Verbatim paragraph is "It should be reproduced
exactly...", which means that the whitespace you've used to indent your
verbatim blocks will be preserved in the output. This can be annoying for
outputs such as HTML, where that whitespace will remain in front of every
line. It's an unfortunate case where syntax is turned into semantics.

If the POD you're parsing adheres to a consistent indentation policy, you can
have such indentation stripped from the beginning of every line of your
verbatim blocks. This method tells Pod::Simple what to strip.
For two-space indents, you'd use:

  $parser->strip_verbatim_indent('  ');

For tab indents, you'd use a tab character:

  $parser->strip_verbatim_indent("\t");

If the POD is inconsistent about the indentation of verbatim blocks, but you
have figured out a heuristic to determine how much a particular verbatim block
is indented, you can pass a code reference instead. The code reference will be
executed with one argument, an array reference of all the lines in the
verbatim block, and should return the value to be stripped from each line. For
example, if you decide that you're fine to use the first line of the verbatim
block to set the standard for indentation of the rest of the block, you can
look at the first line and return the appropriate value, like so:

  $parser->strip_verbatim_indent(sub {
      my $lines = shift;
      (my $indent = $lines->[0]) =~ s/\S.*//;
      return $indent;
  });

If you'd rather treat each line individually, you can do that, too, by just
transforming them in-place in the code reference and returning C<undef>. Say
that you don't want I<any> lines indented. You can do something like this:

  $parser->strip_verbatim_indent(sub {
      my $lines = shift;
      s/^\s+// for @{ $lines };
      return undef;
  });

=item C<< $parser->expand_verbatim_tabs( I<n> ) >>

Default: 8

If, after any stripping of indentation in verbatim blocks, there remain
tabs, this method call indicates what to do with them. C<0>
means leave them as tabs; any other number indicates that each tab is to
be translated so as to have tab stops every C<n> columns.

This is independent of other methods (except that it operates after any
verbatim input stripping is done).

Like the other methods, the input parameter is not checked for validity.
A value that is C<undef> or contains non-digits has the same effect as 8.
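
The string and code-reference forms of C<strip_verbatim_indent> described
above can be compared side by side. The following is only a sketch: it assumes
L<Pod::Simple::SimpleTree> is available so that the verbatim text can be read
from the parse tree, and the C<verbatim_text> helper and the sample document
are hypothetical.

```perl
use strict;
use warnings;
use Pod::Simple::SimpleTree;

# One verbatim block whose lines are indented by 4 and 6 spaces.
my $pod = join "\n",
    '=head1 EXAMPLE',
    '',
    '    my $x = 1;',
    '      my $y = 2;',
    '',
    '=cut',
    '';

# Parse $pod with the given strip_verbatim_indent setting (if any)
# and return the text of the first verbatim block in the tree.
sub verbatim_text {
    my ($strip) = @_;
    my $parser = Pod::Simple::SimpleTree->new;
    $parser->strip_verbatim_indent($strip) if defined $strip;
    $parser->parse_string_document($pod);
    my $root = $parser->root;    # [ 'Document', \%attrs, @children ]
    for my $node (@{$root}[2 .. $#{$root}]) {
        next unless ref $node eq 'ARRAY' && $node->[0] eq 'Verbatim';
        return join '', grep { !ref } @{$node}[2 .. $#{$node}];
    }
    return;
}

my $untouched = verbatim_text();          # indentation kept as written
my $fixed     = verbatim_text('    ');    # strip four leading spaces
my $per_line  = verbatim_text(sub {
    my $lines = shift;
    s/^\s+// for @{$lines};    # edit every line in place
    return undef;              # nothing further to strip
});

print "--- no stripping ---\n$untouched\n";
print "--- fixed string ---\n$fixed\n";
print "--- code reference ---\n$per_line\n";
```

The fixed string removes the same prefix from every line, so relative
indentation inside the block survives, while the in-place code reference
flattens every line.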

=back

=head1 TERTIARY METHODS

=over

=item C<< $parser->abandon_output_fh() >>X<abandon_output_fh>

Cancel output to the file handle. Any POD read by the C<$parser> is not
affected.

=item C<< $parser->abandon_output_string() >>X<abandon_output_string>

Cancel output to the output string. Any POD read by the C<$parser> is not
affected.

=item C<< $parser->accept_code( @codes ) >>X<accept_code>

Alias for L<< accept_codes >>.

=item C<< $parser->accept_codes( @codes ) >>X<accept_codes>

Allows C<$parser> to accept a list of L<perlpod/Formatting Codes>. This can be
used to implement user-defined codes.

=item C<< $parser->accept_directive_as_data( @directives ) >>X<accept_directive_as_data>

Allows C<$parser> to accept a list of directives for data paragraphs. A
directive is the label of a L<perlpod/Command Paragraph>. A data paragraph is
one delimited by C<< =begin/=for/=end >> directives. This can be used to
implement user-defined directives.

=item C<< $parser->accept_directive_as_processed( @directives ) >>X<accept_directive_as_processed>

Allows C<$parser> to accept a list of directives for processed paragraphs. A
directive is the label of a L<perlpod/Command Paragraph>. A processed
paragraph is also known as L<perlpod/Ordinary Paragraph>. This can be used to
implement user-defined directives.

=item C<< $parser->accept_directive_as_verbatim( @directives ) >>X<accept_directive_as_verbatim>

Allows C<$parser> to accept a list of directives for L<perlpod/Verbatim
Paragraph>. A directive is the label of a L<perlpod/Command Paragraph>. This
can be used to implement user-defined directives.

=item C<< $parser->accept_target( @targets ) >>X<accept_target>

Alias for L<< accept_targets >>.

=item C<< $parser->accept_target_as_text( @targets ) >>X<accept_target_as_text>

Alias for L<< accept_targets_as_text >>.

=item C<< $parser->accept_targets( @targets ) >>X<accept_targets>

Accepts targets for C<< =begin/=for/=end >> sections of the POD.

=item C<< $parser->accept_targets_as_text( @targets ) >>X<accept_targets_as_text>

Accepts targets for C<< =begin/=for/=end >> sections that should be parsed as
POD. For details, see L<< perlpodspec/About Data Paragraphs >>.

=item C<< $parser->any_errata_seen() >>X<any_errata_seen>

Returns true if any errata were seen.

I<Example:>

  die "too many errors\n" if $parser->any_errata_seen();

=item C<< $parser->errata_seen() >>X<errata_seen>

Returns a hash reference of all errata seen, both whines and screams. The
hash reference's keys are the line numbers, and each value is an array
reference of the errors for that line.

I<Example:>

  if ( $parser->any_errata_seen() ) {
     $logger->log( $parser->errata_seen() );
  }

=item C<< $parser->detected_encoding() >>X<detected_encoding>

Returns the encoding corresponding to C<< =encoding >>, but only if the
encoding was recognized and handled.

=item C<< $parser->encoding() >>X<encoding>

Returns the encoding of the document, even if the encoding is not correctly
handled.

=item C<< $parser->parse_from_file( $source, $to ) >>X<parse_from_file>

Parses from C<$source> file to C<$to> file. Similar to L<<
Pod::Parser/parse_from_file >>.

=item C<< $parser->scream( @error_messages ) >>X<scream>

Log an error that can't be ignored.

=item C<< $parser->unaccept_code( @codes ) >>X<unaccept_code>

Alias for L<< unaccept_codes >>.

=item C<< $parser->unaccept_codes( @codes ) >>X<unaccept_codes>

Removes C<< @codes >> as valid codes for the parse.

=item C<< $parser->unaccept_directive( @directives ) >>X<unaccept_directive>

Alias for L<< unaccept_directives >>.

=item C<< $parser->unaccept_directives( @directives ) >>X<unaccept_directives>

Removes C<< @directives >> as valid directives for the parse.

=item C<< $parser->unaccept_target( @targets ) >>X<unaccept_target>

Alias for L<< unaccept_targets >>.

=item C<< $parser->unaccept_targets( @targets ) >>X<unaccept_targets>

Removes C<< @targets >> as valid targets for the parse.

=item C<< $parser->version_report() >>X<version_report>

Returns a string describing the version.

=item C<< $parser->whine( @error_messages ) >>X<whine>

Log an error unless C<< $parser->no_whining( TRUE ); >>.

=back

=head1 ENCODING

The Pod::Simple parser expects to read B<octets>. The parser will decode the
octets into Perl's internal character string representation using the value of
the C<=encoding> declaration in the POD source.

If the POD source does not include an C<=encoding> declaration, the parser will
attempt to guess the encoding (selecting one of UTF-8 or CP 1252) by examining
the first non-ASCII bytes and applying the heuristic described in
L<perlpodspec>. (If the POD source contains only ASCII bytes, the
encoding is assumed to be ASCII.)

If you set the C<parse_characters> option to a true value, the parser will
expect characters rather than octets, will ignore any C<=encoding>, and will
make no attempt to decode the input.

=head1 SEE ALSO

L<Pod::Simple::Subclassing>

L<perlpod|perlpod>

L<perlpodspec|perlpodspec>

L<Pod::Escapes|Pod::Escapes>

L<perldoc>

=head1 SUPPORT

Questions or discussion about POD and Pod::Simple should be sent to the
pod-people@perl.org mailing list. Send an empty email to
pod-people-subscribe@perl.org to subscribe.

This module is managed in an open GitHub repository,
L<https://github.com/perl-pod/pod-simple/>.
Feel free to fork and contribute, or
to clone L<git://github.com/perl-pod/pod-simple.git> and send patches!

Please use L<https://github.com/perl-pod/pod-simple/issues/new> to file a bug
report.

=head1 COPYRIGHT AND DISCLAIMERS

Copyright (c) 2002 Sean M. Burke.

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

This program is distributed in the hope that it will be useful, but
without any warranty; without even the implied warranty of
merchantability or fitness for a particular purpose.

=head1 AUTHOR

Pod::Simple was created by Sean M. Burke <sburke@cpan.org>.
But don't bother him, he's retired.

Pod::Simple is maintained by:

=over

=item * Allison Randal C<allison@perl.org>

=item * Hans Dieter Pearcey C<hdp@cpan.org>

=item * David E. Wheeler C<dwheeler@cpan.org>

=item * Karl Williamson C<khw@cpan.org>

=back

Documentation has been contributed by:

=over

=item * Gabor Szabo C<szabgab@gmail.com>

=item * Shawn H Corey C<SHCOREY at cpan.org>

=back

=cut