README

NAME
    DateTime::Format::Builder - Create DateTime parser classes and objects.

SYNOPSIS
        package DateTime::Format::Brief;
        our $VERSION = '0.07';
        use DateTime::Format::Builder
        (
            parsers => {
                parse_datetime => [
                {
                    regex => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
                    params => [qw( year month day hour minute second )],
                },
                {
                    regex => qr/^(\d{4})(\d\d)(\d\d)$/,
                    params => [qw( year month day )],
                },
                ],
            }
        );

DESCRIPTION
    DateTime::Format::Builder creates DateTime parsers. Many string formats
    of dates and times are simple and just require a basic regular
    expression to extract the relevant information. Builder provides a
    simple way to do this without writing reams of structural code.

    Builder provides a number of methods, most of which you'll never need,
    or at least rarely need. They're provided mainly to expose the module's
    innards to any subclasses, or for when you need to do something
    slightly beyond what I expected.

TUTORIAL
    See DateTime::Format::Builder::Tutorial.

ERROR HANDLING AND BAD PARSES
    Often, I will speak of `undef' being returned; however, that's not
    strictly true.

    When a simple single specification is given for a method, the method
    isn't given a single parser directly. It's given a wrapper that will
    call `on_fail()' if the single parser returns `undef'. The single parser
    must return `undef' so that a multiple parser can work nicely and actual
    errors can be thrown from any of the callbacks.

    Similarly, a multiple parser will only call `on_fail()' right at the
    end, when it has tried every parser it could.

    `on_fail()' (see later) is defined, by default, to throw an error.

    Multiple parser specifications can also specify `on_fail' with a coderef
    as an argument in the options block. This will take precedence over the
    inheritable and over-ridable method.

    That said, don't throw real errors from callbacks in multiple parser
    specifications unless you really want parsing to stop right there and
    not try any other parsers.

    In summary: calling a method will result in either a `DateTime' object
    being returned or an error being thrown (unless you've overridden
    `on_fail()' or `create_method()', or you've specified an `on_fail' key to
    a multiple parser specification).

    Individual parsers (be they multiple parsers or single parsers) will
    return either the `DateTime' object or `undef'.
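
    As a rough sketch of the summary above: a failed parse can be caught
    with `eval', or silenced through an `on_fail' key in the options block.
    The package names `My::Format::Strict' and `My::Format::Lenient' are
    made up for illustration, and having the custom `on_fail' simply return
    nothing is an assumption about the simplest useful handler, not
    something Builder requires.

```perl
use strict;
use warnings;

# Hypothetical class: default behaviour, a failed parse throws an error.
package My::Format::Strict;
use DateTime::Format::Builder (
    parsers => {
        parse_datetime => {
            regex  => qr/^(\d{4})-(\d\d)-(\d\d)$/,
            params => [qw( year month day )],
        },
    },
);

# Hypothetical class: the options block (a leading arrayref) supplies its
# own on_fail, which here just returns without throwing.
package My::Format::Lenient;
use DateTime::Format::Builder (
    parsers => {
        parse_datetime => [
            [ on_fail => sub { return } ],
            {
                regex  => qr/^(\d{4})-(\d\d)-(\d\d)$/,
                params => [qw( year month day )],
            },
        ],
    },
);

package main;

# A good parse hands back a DateTime object.
my $ok = My::Format::Strict->parse_datetime('2003-07-16');
print $ok->ymd, "\n";

# A bad parse throws, so wrap the call in eval if you want to carry on.
my $bad   = eval { My::Format::Strict->parse_datetime('not a date') };
my $error = $@;
print "strict parser threw an error\n" if $error;

# With the custom on_fail nothing is thrown; check the return value.
my $quiet = My::Format::Lenient->parse_datetime('not a date');
print "lenient parser returned undef\n" unless defined $quiet;
```

    Which style suits you depends on whether a bad date is exceptional in
    your program or just another kind of input.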

SINGLE SPECIFICATIONS
    A single specification is a hash ref of instructions on how to create a
    parser.

    The precise set of keys and values varies according to parser type.
    There are some common ones though:

    *   length is an optional parameter that can be used to specify that
        this particular *regex* is only applicable to strings of a certain
        fixed length. This can be used to make parsers more efficient. It's
        strongly recommended that any parser that can use this parameter
        does.

        You may happily specify the same length twice. The parsers will be
        tried in order of specification.

        You can also specify multiple lengths by giving it an arrayref of
        numbers rather than just a single scalar. If doing so, please keep
        the number of lengths to a minimum.

        If any specifications without *length*s are given and the particular
        *length* parser fails, then the non-*length* parsers are tried.

        This parameter is ignored unless the specification is part of a
        multiple parser specification.

    *   label provides a name for the specification and is passed to some of
        the callbacks about to be mentioned.

    *   on_match and on_fail are callbacks. Both routines will be called
        with parameters of:

        *   input, being the input to the parser (after any preprocessing
            callbacks).

        *   label, being the label of the parser, if there is one.

        *   self, being the object on which the method has been invoked
            (which may just be a class name). Naturally, you can then invoke
            your own methods on it to get information you want.

        *   args, being an arrayref of any passed arguments, if any. If
            there were no arguments, then this parameter is not given.

        These routines will be called depending on whether the regex match
        succeeded or failed.

    *   preprocess is a callback provided for cleaning up input prior to
        parsing. It's given a hash as arguments with the following keys:

        *   input being the datetime string the parser was given (if using
            multiple specifications and an overall *preprocess* then this is
            the date after it's been through that preprocessor).

        *   parsed being the state of parsing so far. Usually empty at this
            point unless an overall *preprocess* was given. Items may be
            placed in it and will be given to any postprocessor and
            `DateTime->new' (unless the postprocessor deletes it).

        *   self, args, label as per *on_match* and *on_fail*.

        The return value from the routine is what is given to the *regex*.
        Note that this is the last code stop before the match.

        Note: mixing *length* and a *preprocess* that modifies the length of
        the input string is probably not what you meant to do. You probably
        meant to use the *multiple parser* variant of *preprocess* which is
        done before any length calculations. This `single parser' variant of
        *preprocess* is performed after any length calculations.

    *   postprocess is the last code stop before `DateTime->new()' is
        called. It's given the same arguments as *preprocess*. This allows
        it to modify the parsed parameters after the parse and before the
        creation of the object. For example, you might use:

            {
                regex  => qr/^(\d\d) (\d\d) (\d\d)$/,
                params => [qw( year  month  day   )],
                postprocess => \&_fix_year,
            }

        where `_fix_year' is defined as:

            sub _fix_year
            {
                my %args = @_;
                my ($date, $p) = @args{qw( input parsed )};
                $p->{year} += $p->{year} > 69 ? 1900 : 2000;
                return 1;
            }

        This will cause the two digit years to be corrected according to the
        cut off. If the year was '69' or lower, then it is made into 2069
        (or 2045, or whatever the year was parsed as). Otherwise it is
        assumed to be 19xx. The DateTime::Format::Mail module uses code
        similar to this (only it allows the cut off to be configured and it
        doesn't use Builder).

        Note: It is very important to return an explicit value from the
        *postprocess* callback. If the return value is false then the parse
        is taken to have failed. If the return value is true, then the parse
        is taken to have succeeded and `DateTime->new()' is called.
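
    To see several of these keys working together, here is a small sketch
    that combines *length*, *label*, *on_match* and *postprocess* in one
    multiple parser specification. The class name `My::Format::Short', the
    `fix_year' helper and the `@matched' array are invented for the
    example; only the keys themselves come from the list above.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

my @matched;    # labels of the specifications that matched, in order

# Expand a two digit year, in the same way as _fix_year above.
sub fix_year {
    my %args = @_;
    my $p    = $args{parsed};
    $p->{year} += $p->{year} > 69 ? 1900 : 2000;
    return 1;    # a true value lets DateTime->new() go ahead
}

# Shared on_match callback: record which specification matched.
my $note_label = sub {
    my %args = @_;
    push @matched, $args{label};
};

DateTime::Format::Builder->create_class(
    class   => 'My::Format::Short',    # hypothetical class name
    parsers => {
        parse_datetime => [
            {
                label       => 'two_digit_year',
                length      => 8,      # e.g. "79 07 16"
                regex       => qr/^(\d\d) (\d\d) (\d\d)$/,
                params      => [qw( year month day )],
                postprocess => \&fix_year,
                on_match    => $note_label,
            },
            {
                label    => 'four_digit_year',
                length   => 10,        # e.g. "2003 07 16"
                regex    => qr/^(\d{4}) (\d\d) (\d\d)$/,
                params   => [qw( year month day )],
                on_match => $note_label,
            },
        ],
    },
);

my $old = My::Format::Short->parse_datetime('79 07 16');
my $new = My::Format::Short->parse_datetime('2003 07 16');

print $old->ymd, "\n";
print $new->ymd, "\n";
print "matched: @matched\n";
```

    Because each specification declares its *length*, only the one whose
    length fits the input is tried first.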

    See the documentation for the individual parsers for their valid keys.

    Parsers at the time of writing are:

    *   DateTime::Format::Builder::Parser::Regex - provides regular
        expression based parsing.

    *   DateTime::Format::Builder::Parser::Strptime - provides strptime
        based parsing.

  Subroutines / coderefs as specifications.
    A single parser specification can be a coderef. This was added mostly
    because it could be and because I knew someone, somewhere, would want to
    use it.

    If the specification is a reference to a piece of code, be it a
    subroutine, anonymous, or whatever, then it's passed more or less
    straight through. The code should return `undef' in event of failure (or
    any false value, but `undef' is strongly preferred), or a true value in
    the event of success (ideally a `DateTime' object or some object that
    has the same interface).

    This all said, I generally wouldn't recommend using this feature unless
    you have to.

  Callbacks
    I mention a number of callbacks in this document.

    Any time you see a callback being mentioned, you can, if you like,
    substitute an arrayref of coderefs rather than having the straight
    coderef.

MULTIPLE SPECIFICATIONS
    These are very easily described as an array of single specifications.

    Note that if the first element of the array is an arrayref, then you're
    specifying options.

    *   preprocess lets you specify a preprocessor that is called before any
        of the parsers are tried. This lets you do things like strip off
        timezones or any unnecessary data. The most common use people have
        for it at present is to get the input date to a particular length so
        that the *length* is usable (DateTime::Format::ICal would use it to
        strip off the variable length timezone).

        Arguments are as for the *single parser* *preprocess* variant with
        the exception that *label* is never given.

    *   on_fail should be a reference to a subroutine that is called if the
        parser fails. If this is not provided, the default action is to call
        `DateTime::Format::Builder::on_fail', or the `on_fail' method of the
        subclass of DTFB that was used to create the parser.

EXECUTION FLOW
    Builder allows you to plug in a fair few callbacks, which can make
    following how a parse failed (or succeeded unexpectedly) somewhat
    tricky.

  For Single Specifications
    A single specification will do the following:

    User calls parser:

           my $dt = $class->parse_datetime( $string );

    1   *preprocess* is called. It's given `$string' and a reference to the
        parsing workspace hash, which we'll call `$p'. At this point, `$p'
        is empty. The return value is used as `$date' for the rest of this
        single parser. Anything put in `$p' is also used for the rest of
        this single parser.

    2   *regex* is applied.

    3   If *regex* did not match, then *on_fail* is called (and is given
        `$date' and also *label* if it was defined). Any return value is
        ignored and the next thing is for the single parser to return
        `undef'.

        If *regex* did match, then *on_match* is called with the same
        arguments as would be given to *on_fail*. The return value is
        similarly ignored, but we then move to step 4 rather than exiting
        the parser.

    4   *postprocess* is called with `$date' and a filled out `$p'. The
        return value is taken as an indication of whether the parse was a
        success or not. If it wasn't a success then the single parser will
        exit at this point, returning `undef'.

    5   `DateTime->new()' is called and the user is given the resultant
        `DateTime' object.

    See the section on error handling regarding the `undef's mentioned
    above.
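
    One way to watch this flow happen is to have every callback record its
    name as it runs. The following is only a sketch: the class name
    `My::Format::Traced' and the `@trace' array are invented here, and the
    callbacks do nothing useful beyond the bookkeeping.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

my @trace;    # names of the callbacks, in the order they ran

DateTime::Format::Builder->create_class(
    class   => 'My::Format::Traced',    # hypothetical class name
    parsers => {
        parse_datetime => {
            regex      => qr/^(\d{4})-(\d\d)-(\d\d)$/,
            params     => [qw( year month day )],
            preprocess => sub {
                my %args = @_;
                push @trace, 'preprocess';
                return $args{input};    # becomes $date for the regex
            },
            on_match    => sub { push @trace, 'on_match' },
            on_fail     => sub { push @trace, 'on_fail' },
            postprocess => sub { push @trace, 'postprocess'; return 1 },
        },
    },
);

# The regex matches: steps 1 and 2, on_match in step 3, then steps 4 and 5.
my $dt       = My::Format::Traced->parse_datetime('2003-07-16');
my @ok_trace = @trace;

# The regex does not match: steps 1 and 2, on_fail in step 3, and then
# the method level on_fail() throws, so the call is wrapped in eval.
@trace = ();
my $failed    = !eval { My::Format::Traced->parse_datetime('16/07/2003'); 1 };
my @bad_trace = @trace;

print "matched:   @ok_trace\n";
print "unmatched: @bad_trace\n";
```

    The same trick is handy when a parse fails, or succeeds, and you can't
    tell which callback was responsible.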

  For Multiple Specifications
    With multiple specifications:

    User calls parser:

          my $dt = $class->complex_parse( $string );

    1   The overall *preprocess*or is called and is given `$string' and the
        hashref `$p' (identically to the per parser *preprocess* mentioned
        in the previous flow).

        If the callback modifies `$p' then a copy of `$p' is given to each
        of the individual parsers. This is so parsers won't accidentally
        pollute each other's workspace.

    2   If an appropriate length specific parser is found, then it is called
        and the single parser flow (see the previous section) is followed,
        and the parser is given a copy of `$p' and the return value of the
        overall *preprocess*or as `$date'.

        If a `DateTime' object was returned, we go straight back to the
        user.

        If no appropriate parser was found, or the parser returned `undef',
        then we progress to step 3!

    3   Any non-*length* based parsers are tried in the order they were
        specified.

        For each of those the single specification flow above is performed,
        and is given a copy of the output from the overall preprocessor.

        If a real `DateTime' object is returned then we exit back to the
        user.

        If no parser could parse, then an error is thrown.

    See the section on error handling regarding the `undef's mentioned
    above.
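
    A sketch of that flow, loosely in the spirit of the ICal use mentioned
    earlier: an overall preprocessor strips a trailing "Z" and records the
    time zone in `$p', after which the remaining string is dispatched by
    length. The class name `My::Format::Stamp' and the `strip_utc_marker'
    helper are made up for this example.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

# Overall preprocessor: runs once, before any length is measured.
sub strip_utc_marker {
    my %args = @_;
    my ( $date, $p ) = @args{qw( input parsed )};
    $p->{time_zone} = 'UTC' if $date =~ s/Z$//;    # copied to each parser
    return $date;
}

DateTime::Format::Builder->create_class(
    class   => 'My::Format::Stamp',    # hypothetical class name
    parsers => {
        parse_datetime => [
            [ preprocess => \&strip_utc_marker ],    # options block
            {
                length => 15,                        # 20030716T120000
                regex  => qr/^(\d{4})(\d\d)(\d\d)T(\d\d)(\d\d)(\d\d)$/,
                params => [qw( year month day hour minute second )],
            },
            {
                length => 8,                         # 20030716
                regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
                params => [qw( year month day )],
            },
            {
                # No length: only tried in step 3.
                regex  => qr/^(\d{4})-(\d\d?)-(\d\d?)$/,
                params => [qw( year month day )],
            },
        ],
    },
);

my $utc    = My::Format::Stamp->parse_datetime('20030716T120000Z');
my $day    = My::Format::Stamp->parse_datetime('20030716');
my $dashed = My::Format::Stamp->parse_datetime('2003-7-16');

print $utc->iso8601, ' ', $utc->time_zone->name, "\n";
print $day->ymd,    "\n";
print $dashed->ymd, "\n";
```

    Note that the length is measured after the "Z" has gone, which is why
    the first specification asks for 15 characters rather than 16.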

METHODS
    In the general course of things you won't need any of the methods. Life
    often throws unexpected things at us so the methods are all available
    for use.

  import
    `import()' is a wrapper for `create_class()'. If you specify the *class*
    option (see documentation for `create_class()') it will be ignored.

  create_class
    This method can be used as the runtime equivalent of `import()'. That
    is, it takes the exact same parameters as when one does:

       use DateTime::Format::Builder ( blah blah blah )

    That can be (almost) equivalently written as:

       use DateTime::Format::Builder;
       DateTime::Format::Builder->create_class( blah blah blah );

    The difference being that the first is done at compile time while the
    second is done at run time.

    In the tutorial I said there were only two parameters at present. I
    lied. There are actually five of them.

    *   parsers takes a hashref of methods and their parser specifications.
        See the tutorial above for details.

        Note that if you define a subroutine of the same name as one of the
        methods you define here, an error will be thrown.

    *   constructor determines whether and how to create a `new()' function
        in the new class. If given a true value, a constructor is created.
        If given a false value, one isn't.

        If given an anonymous sub or a reference to a sub then that is used
        as `new()'.

        The default is `1' (that is, create a constructor using our default
        code which simply creates a hashref and blesses it).

        If your class defines its own `new()' method it will not be
        overwritten. If you define your own `new()' and also tell Builder to
        define one an error will be thrown.

    *   verbose takes a value. If the value is undef, then logging is
        disabled. If the value is a filehandle then that's where logging
        will go. If it's a true value, then output will go to `STDERR'.

        Alternatively, call `$DateTime::Format::Builder::verbose()' with the
        relevant value. Whichever value is given more recently is adhered
        to.

        Be aware that verbosity is a global setting.

    *   class is optional and specifies the name of the class in which to
        create the specified methods.

        If using this method in the guise of `import()' then this field will
        cause an error so it is only of use when calling as
        `create_class()'.

    *   version is also optional and specifies the value to give `$VERSION'
        in the class. It's generally not recommended unless you're combining
        with the *class* option. An `ExtUtils::MakeMaker' / `CPAN' compliant
        version specification is much better.

    In addition to creating any of the methods it also creates a `new()'
    method that can instantiate (or clone) objects.
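
    For instance, a class can be put together entirely at run time. This is
    a minimal sketch; the class name `My::Format::Runtime', the method name
    `parse_date' and the version string are all invented for the example.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

# Build the class at run time instead of via "use ... ( ... )".
DateTime::Format::Builder->create_class(
    class   => 'My::Format::Runtime',    # hypothetical class name
    version => '0.01',                   # becomes $My::Format::Runtime::VERSION
    parsers => {
        parse_date => {                  # hypothetical method name
            regex  => qr/^(\d{4})-(\d\d)-(\d\d)$/,
            params => [qw( year month day )],
        },
    },
);

# The constructor option was left at its default, so new() exists.
my $fmt = My::Format::Runtime->new();
my $dt  = $fmt->parse_date('2003-07-16');

print ref($fmt), ' ', My::Format::Runtime->VERSION, "\n";
print $dt->ymd, "\n";
```

    The *class* option is what makes this useful: without it the methods
    would be created in whichever package made the call.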

SUBCLASSING
    In the rest of the documentation I've often lied in order to get some of
    the ideas across more easily. The thing is, this module's very flexible.
    You can get markedly different behaviour from simply subclassing it and
    overriding some methods.

  create_method
    Given a parser coderef, returns a coderef that is suitable to be a
    method.

    The default action is to call `on_fail()' in the event of a non-parse,
    but you can make it do whatever you want.

  on_fail
    This is called in the event of a non-parse (unless you've overridden
    `create_method()' to do something else).

    The single argument is the input string. The default action is to call
    `croak()'. Above, where I've said parsers or methods throw errors, this
    is the method that is doing the error throwing.

    You could conceivably override this method to, say, return `undef'.

USING BUILDER OBJECTS aka USERS USING BUILDER
    The methods listed in the METHODS section are all you generally need
    when creating your own class. Sometimes you may not want a full blown
    class to parse something just for this one program. Some methods are
    provided to make that task easier.

  new
    The basic constructor. It takes no arguments, merely returns a new
    `DateTime::Format::Builder' object.

        my $parser = DateTime::Format::Builder->new();

    If called as a method on an object (rather than as a class method), then
    it clones the object.

        my $clone = $parser->new();

  clone
    Provided for those who prefer an explicit `clone()' method rather than
    using `new()' as an object method.

        my $clone_of_clone = $clone->clone();

  parser
    Given either a single or multiple parser specification, sets the object
    to have a parser based on that specification.

        $parser->parser(
            regex  => qr/^ (\d{4}) (\d\d) (\d\d) $/x,
            params => [qw( year    month  day    )],
        );

    The arguments given to `parser()' are handed directly to
    `create_parser()'. The resultant parser is passed to `set_parser()'.

    If called as an object method, it returns the object.

    If called as a class method, it creates a new object, sets its parser
    and returns that object.

  set_parser
    Sets the parser of the object to the given parser.

       $parser->set_parser( $coderef );

    Note: this method does not take specifications. It also does not take
    anything except coderefs. Luckily, coderefs are what most of the other
    methods produce.

    The method return value is the object itself.

  get_parser
    Returns the parser the object is using.

       my $code = $parser->get_parser();

  parse_datetime
    Given a string, it calls the parser and returns the `DateTime' object
    that results.

       my $dt = $parser->parse_datetime( "1979 07 16" );

    The return value, if not a `DateTime' object, is whatever the parser
    wants to return. Generally this means that if the parse failed an error
    will be thrown.
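
    Put together, a throwaway parser for a single script might look like
    the sketch below. The regex is written without the `x' modifier so that
    the literal spaces in "1979 07 16" have to match; the variable names
    are just for illustration.

```perl
use strict;
use warnings;
use DateTime::Format::Builder;

# Make an object and give it a single specification.
my $parser = DateTime::Format::Builder->new();
$parser->parser(
    regex  => qr/^(\d{4}) (\d\d) (\d\d)$/,
    params => [qw( year month day )],
);

my $dt = $parser->parse_datetime('1979 07 16');
print $dt->ymd, "\n";

# get_parser() hands back the coderef that parser() installed.
my $code = $parser->get_parser();
print "parser is a ", ref($code), " reference\n";

# A clone carries the same parser with it.
my $clone = $parser->clone();
my $later = $clone->parse_datetime('2003 07 16');
print $later->ymd, "\n";
```

    This avoids creating a package at all, which is usually enough when
    only one part of one program needs the format.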

  format_datetime
    If you call this function, it will throw an error.

LONGER EXAMPLES
    Some longer examples are provided in the distribution. These implement
    some of the common parsing DateTime modules using Builder. Each of them
    is, or was, a drop-in replacement for the corresponding module at the
    time of writing.

THANKS
    Dave Rolsky (DROLSKY) for kickstarting the DateTime project, writing
    DateTime::Format::ICal and DateTime::Format::MySQL, and some much needed
    review.

    Joshua Hoblitt (JHOBLITT) for the concept, some of the API, impetus for
    writing the multilength code (both one length with multiple parsers and
    single parser with multiple lengths), blame for the Regex custom
    constructor code, spotting a bug in Dispatch, and more much needed
    review.

    Kellan Elliott-McCrea (KELLAN) for even more review, suggestions,
    DateTime::Format::W3CDTF and the encouragement to rewrite these docs
    almost 100%!

    Claus Färber (CFAERBER) for having me get around to fixing the
    auto-constructor writing, providing the 'args'/'self' patch, and
    suggesting the multi-callbacks.

    Rick Measham (RICKM) for DateTime::Format::Strptime which Builder now
    supports.

    Matthew McGillis for pointing out that `on_fail' overriding should be
    simpler.

    Simon Cozens (SIMON) for saying it was cool.

SUPPORT
    Support for this module is provided via the datetime@perl.org email
    list. See http://lists.perl.org/ for more details.

    Alternatively, log bugs via the CPAN RT system, either on the web or by
    email:

        http://rt.cpan.org/NoAuth/ReportBug.html?Queue=DateTime%3A%3AFormat%3A%3ABuilder
        bug-datetime-format-builder@rt.cpan.org

    This makes it much easier for me to track things and thus means your
    problem is less likely to be neglected.

LICENCE AND COPYRIGHT
    Copyright © Iain Truskett, 2003. All rights reserved.

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself, either Perl version 5.000 or, at
    your option, any later version of Perl 5 you may have available.

    The full text of the licences can be found in the Artistic and COPYING
    files included with this module, or in perlartistic and perlgpl as
    supplied with Perl 5.8.1 and later.

AUTHOR
    Originally written by Iain Truskett <spoon@cpan.org>, who died on
    December 29, 2003.

    Maintained by Dave Rolsky <autarch@urth.org>.

SEE ALSO
    `datetime@perl.org' mailing list.

    http://datetime.perl.org/

    perl, DateTime, DateTime::Format::Builder::Tutorial,
    DateTime::Format::Builder::Parser