NAME
    JSON::XS - JSON serialising/deserialising, done correctly and fast

    (http://fleur.hio.jp/perldoc/mix/lib/JSON/XS.html)

SYNOPSIS
     use JSON::XS;

     # exported functions, they croak on error
     # and expect/generate UTF-8

     $utf8_encoded_json_text = encode_json $perl_hash_or_arrayref;
     $perl_hash_or_arrayref  = decode_json $utf8_encoded_json_text;

     # OO-interface

     $coder = JSON::XS->new->ascii->pretty->allow_nonref;
     $pretty_printed_unencoded = $coder->encode ($perl_scalar);
     $perl_scalar = $coder->decode ($unicode_json_text);

     # Note that JSON version 2.0 and above will automatically use JSON::XS
     # if available, at virtually no speed overhead either, so you should
     # be able to just:

     use JSON;

     # and do the same things, except that you have a pure-perl fallback now.

DESCRIPTION
    This module converts Perl data structures to JSON and vice versa. Its
    primary goal is to be *correct* and its secondary goal is to be *fast*.
    To reach the latter goal it was written in C.

    Beginning with version 2.0 of the JSON module, when both JSON and
    JSON::XS are installed, JSON will use JSON::XS as its backend (this
    can be overridden) with no overhead due to emulation (by inheriting
    constructor and methods). If JSON::XS is not available, it will fall
    back to the compatible JSON::PP module as backend, so using JSON
    instead of JSON::XS gives you a portable JSON API that can be fast
    when you need it to be and doesn't require a C compiler when that is
    a problem.

    As this is the n-th-something JSON module on CPAN, what was the reason
    to write yet another JSON module? While it seems there are many JSON
    modules, none of them correctly handle all corner cases, and in most
    cases their maintainers are unresponsive, gone missing, or not
    listening to bug reports for other reasons.

    See MAPPING, below, on how JSON::XS maps perl values to JSON values and
    vice versa.

  FEATURES
    * correct Unicode handling

        This module knows how to handle Unicode, documents how and when it
        does so, and even documents what "correct" means.

    * round-trip integrity

        When you serialise a perl data structure using only data types
        supported by JSON and Perl, the deserialised data structure is
        identical on the Perl level (e.g. the string "2.0" doesn't suddenly
        become "2" just because it looks like a number). There *are* minor
        exceptions to this, read the MAPPING section below to learn about
        those.

    * strict checking of JSON correctness

        There is no guessing, no generating of illegal JSON texts by
        default, and only JSON is accepted as input by default (the latter
        is a security feature).

    * fast

        Compared to other JSON modules and other serialisers such as
        Storable, this module usually compares favourably in terms of speed,
        too.

    * simple to use

        This module has both a simple functional interface as well as an
        object oriented interface.

    * reasonably versatile output formats

        You can choose between the most compact guaranteed-single-line
        format possible (nice for simple line-based protocols), a pure-ASCII
        format (for when your transport is not 8-bit clean, still supports
        the whole Unicode range), or a pretty-printed format (for when you
        want to read that stuff). Or you can combine those features in
        whatever way you like.

FUNCTIONAL INTERFACE
    The following convenience functions are provided by this module. They
    are exported by default:

    $json_text = encode_json $perl_scalar
        Converts the given Perl data structure to a UTF-8 encoded, binary
        string (that is, the string contains octets only). Croaks on error.

        This function call is functionally identical to:

           $json_text = JSON::XS->new->utf8->encode ($perl_scalar)

        except that it is faster.

    $perl_scalar = decode_json $json_text
        The opposite of "encode_json": expects a UTF-8 (binary) string and
        tries to parse that as a UTF-8 encoded JSON text, returning the
        resulting reference. Croaks on error.

        This function call is functionally identical to:

           $perl_scalar = JSON::XS->new->utf8->decode ($json_text)

        except that it is faster.

    $is_boolean = JSON::XS::is_bool $scalar
        Returns true if the passed scalar represents either JSON::XS::true
        or JSON::XS::false, two constants that act like 1 and 0,
        respectively, and are used to represent JSON "true" and "false"
        values in Perl.

        See MAPPING, below, for more information on how JSON values are
        mapped to Perl.

A FEW NOTES ON UNICODE AND PERL
    Since this often leads to confusion, here are a few very clear words on
    how Unicode works in Perl, modulo bugs.

    1. Perl strings can store characters with ordinal values > 255.
        This enables you to store Unicode characters as single characters in
        a Perl string - very natural.

    2. Perl does *not* associate an encoding with your strings.
        ... until you force it to, e.g. when matching it against a regex, or
        printing the scalar to a file, in which case Perl either interprets
        your string as locale-encoded text, octets/binary, or as Unicode,
        depending on various settings. In no case is an encoding stored
        together with your data, it is *use* that decides encoding, not any
        magical meta data.

    3. The internal utf-8 flag has no meaning with regards to the encoding
    of your string.
        Just ignore that flag unless you debug a Perl bug, a module written
        in XS or want to dive into the internals of perl.
        Otherwise it will only confuse you, as, despite the name, it says
        nothing about how your string is encoded. You can have Unicode
        strings with that flag set, with that flag clear, and you can have
        binary data with that flag set and that flag clear. Other
        possibilities exist, too.

        If you didn't know about that flag, all the better: pretend it
        doesn't exist.

    4. A "Unicode String" is simply a string where each character can be
    validly interpreted as a Unicode code point.
        If you have UTF-8 encoded data, it is no longer a Unicode string,
        but a Unicode string encoded in UTF-8, giving you a binary string.

    5. A string containing "high" (> 255) character values is *not* a UTF-8
    string.
        It's a fact. Learn to live with it.

    I hope this helps :)

OBJECT-ORIENTED INTERFACE
    The object oriented interface lets you configure your own encoding or
    decoding style, within the limits of supported formats.

    $json = new JSON::XS
        Creates a new JSON::XS object that can be used to de/encode JSON
        strings. All boolean flags described below are by default
        *disabled*.

        The mutators for flags all return the JSON object again and thus
        calls can be chained:

           my $json = JSON::XS->new->utf8->space_after->encode ({a => [1,2]})
           => {"a": [1, 2]}

    $json = $json->ascii ([$enable])
    $enabled = $json->get_ascii
        If $enable is true (or missing), then the "encode" method will not
        generate characters outside the code range 0..127 (which is ASCII).
        Any Unicode characters outside that range will be escaped using
        either a single \uXXXX (BMP characters) or a double \uHHHH\uLLLL
        escape sequence, as per RFC4627. The resulting encoded JSON text can
        be treated as a native Unicode string, an ascii-encoded,
        latin1-encoded or UTF-8 encoded string, or any other superset of
        ASCII.

        If $enable is false, then the "encode" method will not escape
        Unicode characters unless required by the JSON syntax or other
        flags. This results in a faster and more compact format.

        See also the section *ENCODING/CODESET FLAG NOTES* later in this
        document.

        The main use for this flag is to produce JSON texts that can be
        transmitted over a 7-bit channel, as the encoded JSON texts will not
        contain any 8 bit characters.

           JSON::XS->new->ascii (1)->encode ([chr 0x10401])
           => ["\ud801\udc01"]

    $json = $json->latin1 ([$enable])
    $enabled = $json->get_latin1
        If $enable is true (or missing), then the "encode" method will
        encode the resulting JSON text as latin1 (or iso-8859-1), escaping
        any characters outside the code range 0..255. The resulting string
        can be treated as a latin1-encoded JSON text or a native Unicode
        string. The "decode" method will not be affected in any way by this
        flag, as "decode" by default expects Unicode, which is a strict
        superset of latin1.

        If $enable is false, then the "encode" method will not escape
        Unicode characters unless required by the JSON syntax or other
        flags.

        See also the section *ENCODING/CODESET FLAG NOTES* later in this
        document.

        The main use for this flag is efficiently encoding binary data as
        JSON text, as most octets will not be escaped, resulting in a
        smaller encoded size. The disadvantage is that the resulting JSON
        text is encoded in latin1 (and must correctly be treated as such
        when storing and transferring), a rare encoding for JSON. It is
        therefore most useful when you want to store data structures known
        to contain binary data efficiently in files or databases, not when
        talking to other JSON encoders/decoders.

           JSON::XS->new->latin1->encode (["\x{89}\x{abc}"])
           => ["\x{89}\\u0abc"]    # (perl syntax, U+abc escaped, U+89 not)

    $json = $json->utf8 ([$enable])
    $enabled = $json->get_utf8
        If $enable is true (or missing), then the "encode" method will
        encode the JSON result into UTF-8, as required by many protocols,
        while the "decode" method expects to be handed a UTF-8-encoded
        string. Please note that UTF-8-encoded strings do not contain any
        characters outside the range 0..255, they are thus useful for
        bytewise/binary I/O. In future versions, enabling this option might
        enable autodetection of the UTF-16 and UTF-32 encoding families, as
        described in RFC4627.

        If $enable is false, then the "encode" method will return the JSON
        string as a (non-encoded) Unicode string, while "decode" likewise
        expects a Unicode string. Any decoding or encoding (e.g. to UTF-8
        or UTF-16) needs to be done yourself, e.g. using the Encode module.

        See also the section *ENCODING/CODESET FLAG NOTES* later in this
        document.

        Example, output UTF-16BE-encoded JSON:

           use Encode;
           $jsontext = encode "UTF-16BE", JSON::XS->new->encode ($object);

        Example, decode UTF-32LE-encoded JSON:

           use Encode;
           $object = JSON::XS->new->decode (decode "UTF-32LE", $jsontext);

    $json = $json->pretty ([$enable])
        This enables (or disables) all of the "indent", "space_before" and
        "space_after" (and in the future possibly more) flags in one call to
        generate the most readable (or most compact) form possible.

        Example, pretty-print some simple structure:

           my $json = JSON::XS->new->pretty(1)->encode ({a => [1,2]})
           =>
           {
              "a" : [
                 1,
                 2
              ]
           }

    $json = $json->indent ([$enable])
    $enabled = $json->get_indent
        If $enable is true (or missing), then the "encode" method will use a
        multiline format as output, putting every array member or
        object/hash key-value pair into its own line, indenting them
        properly.

        If $enable is false, no newlines or indenting will be produced, and
        the resulting JSON text is guaranteed not to contain any "newlines".

        This setting has no effect when decoding JSON texts.

    $json = $json->space_before ([$enable])
    $enabled = $json->get_space_before
        If $enable is true (or missing), then the "encode" method will add
        an extra optional space before the ":" separating keys from values
        in JSON objects.

        If $enable is false, then the "encode" method will not add any extra
        space at those places.

        This setting has no effect when decoding JSON texts. You will also
        most likely combine this setting with "space_after".

        Example, space_before enabled, space_after and indent disabled:

           {"key" :"value"}

    $json = $json->space_after ([$enable])
    $enabled = $json->get_space_after
        If $enable is true (or missing), then the "encode" method will add
        an extra optional space after the ":" separating keys from values in
        JSON objects and extra whitespace after the "," separating key-value
        pairs and array members.

        If $enable is false, then the "encode" method will not add any extra
        space at those places.

        This setting has no effect when decoding JSON texts.

        Example, space_before and indent disabled, space_after enabled:

           {"key": "value"}

    $json = $json->relaxed ([$enable])
    $enabled = $json->get_relaxed
        If $enable is true (or missing), then "decode" will accept some
        extensions to normal JSON syntax (see below). "encode" will not be
        affected in any way. *Be aware that this option makes you accept
        invalid JSON texts as if they were valid!* I suggest using this
        option only to parse application-specific files written by humans
        (configuration files, resource files etc.)

        If $enable is false (the default), then "decode" will only accept
        valid JSON texts.

        Currently accepted extensions are:

        * list items can have an end-comma

            JSON *separates* array elements and key-value pairs with commas.
            This can be annoying if you write JSON texts manually and want
            to be able to quickly append elements, so this extension accepts
            a comma at the end of such items, not just between them:

               [
                  1,
                  2, <- this comma not normally allowed
               ]
               {
                  "k1": "v1",
                  "k2": "v2", <- this comma not normally allowed
               }

        * shell-style '#'-comments

            Whenever JSON allows whitespace, shell-style comments are
            additionally allowed. They are terminated by the first
            carriage-return or line-feed character, after which more
            white-space and comments are allowed.

               [
                  1, # this comment not allowed in JSON
                     # neither this one...
               ]

    $json = $json->canonical ([$enable])
    $enabled = $json->get_canonical
        If $enable is true (or missing), then the "encode" method will
        output JSON objects by sorting their keys. This adds a comparatively
        high overhead.

        If $enable is false, then the "encode" method will output key-value
        pairs in the order Perl stores them (which will likely change
        between runs of the same script).

        This option is useful if you want the same data structure to be
        encoded as the same JSON text (given the same overall settings). If
        it is disabled, the same hash might be encoded differently even if
        it contains the same data, as key-value pairs have no inherent
        ordering in Perl.

        This setting has no effect when decoding JSON texts.

        This setting currently has no effect on tied hashes.

    $json = $json->allow_nonref ([$enable])
    $enabled = $json->get_allow_nonref
        If $enable is true (or missing), then the "encode" method can
        convert a non-reference into its corresponding string, number or
        null JSON value, which is an extension to RFC4627. Likewise,
        "decode" will accept those JSON values instead of croaking.

        If $enable is false, then the "encode" method will croak if it isn't
        passed an arrayref or hashref, as JSON texts must either be an
        object or array. Likewise, "decode" will croak if given something
        that is not a JSON object or array.

        Example, encode a Perl scalar as JSON value with enabled
        "allow_nonref", resulting in an invalid JSON text:

           JSON::XS->new->allow_nonref->encode ("Hello, World!")
           => "Hello, World!"

    $json = $json->allow_unknown ([$enable])
    $enabled = $json->get_allow_unknown
        If $enable is true (or missing), then "encode" will *not* throw an
        exception when it encounters values it cannot represent in JSON (for
        example, filehandles) but instead will encode a JSON "null" value.
        Note that blessed objects are not included here and are handled
        separately by "allow_blessed".

        If $enable is false (the default), then "encode" will throw an
        exception when it encounters anything it cannot encode as JSON.

        This option does not affect "decode" in any way, and it is
        recommended to leave it off unless you know your communications
        partner.

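        As an illustration of "allow_unknown", here is a minimal sketch
        that is not part of the original documentation. It assumes that a
        code reference counts as a value that cannot be represented in
        JSON; the hash contents and variable names are made up for the
        example, and "canonical" is enabled only to make the key order
        predictable.

```perl
use strict;
use warnings;
use JSON::XS;

# Hypothetical data: a code reference has no JSON representation.
my $data = { name => "job", callback => sub { 1 } };

# Default behaviour: encode croaks on the unrepresentable value.
my $strict = JSON::XS->new->canonical;
my $ok     = eval { $strict->encode ($data); 1 };
print "strict encode failed: $@" unless $ok;

# With allow_unknown enabled, the value is replaced by a JSON null.
my $lenient      = JSON::XS->new->canonical->allow_unknown;
my $lenient_text = $lenient->encode ($data);
print "$lenient_text\n";
```

        The receiving side cannot tell such a "null" apart from a genuine
        "undef", which is one reason to leave the option off unless you
        know your communications partner.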
    $json = $json->allow_blessed ([$enable])
    $enabled = $json->get_allow_blessed
        If $enable is true (or missing), then the "encode" method will not
        barf when it encounters a blessed reference. Instead, the value of
        the "convert_blessed" option will decide whether "null"
        ("convert_blessed" disabled or no "TO_JSON" method found) or a
        representation of the object ("convert_blessed" enabled and
        "TO_JSON" method found) is being encoded. Has no effect on "decode".

        If $enable is false (the default), then "encode" will throw an
        exception when it encounters a blessed object.

    $json = $json->convert_blessed ([$enable])
    $enabled = $json->get_convert_blessed
        If $enable is true (or missing), then "encode", upon encountering a
        blessed object, will check for the availability of the "TO_JSON"
        method on the object's class. If found, it will be called in scalar
        context and the resulting scalar will be encoded instead of the
        object. If no "TO_JSON" method is found, the value of
        "allow_blessed" will decide what to do.

        The "TO_JSON" method may safely call die if it wants. If "TO_JSON"
        returns other blessed objects, those will be handled in the same
        way. "TO_JSON" must take care of not causing an endless recursion
        cycle (== crash) in this case. The name of "TO_JSON" was chosen
        because other methods called by the Perl core (== not by the user of
        the object) are usually in upper case letters and to avoid
        collisions with any "to_json" function or method.

        This setting does not yet influence "decode" in any way, but in the
        future, global hooks might get installed that influence "decode" and
        are enabled by this setting.

        If $enable is false, then the "allow_blessed" setting will decide
        what to do when a blessed object is found.

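        The interplay of "allow_blessed" and "convert_blessed" can be
        sketched as follows. This example is an illustration, not part of
        the original documentation: the "Point" class, its fields and the
        variable names are hypothetical, and "canonical" is used only so
        that the key order is predictable.

```perl
use strict;
use warnings;
use JSON::XS;

package Point;    # hypothetical class, for illustration only

sub new     { my ($class, %args) = @_; bless {%args}, $class }
sub TO_JSON { my ($self) = @_; return { x => $self->{x}, y => $self->{y} } }

package main;

my $point = Point->new (x => 1, y => 2);

# allow_blessed alone: the object is encoded as null.
my $as_null = JSON::XS->new->allow_blessed->encode ([$point]);

# convert_blessed: TO_JSON is called and its return value is encoded.
my $converted = JSON::XS->new->canonical->convert_blessed->encode ([$point]);

print "$as_null\n$converted\n";
```

        Returning a plain (unblessed) hash reference from "TO_JSON", as
        done here, is a simple way to avoid the recursion problem
        mentioned above.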
    $json = $json->filter_json_object ([$coderef->($hashref)])
        When $coderef is specified, it will be called from "decode" each
        time it decodes a JSON object. The only argument is a reference to
        the newly-created hash. If the code reference returns a single
        scalar (which need not be a reference), this value (i.e. a copy of
        that scalar to avoid aliasing) is inserted into the deserialised
        data structure. If it returns an empty list (NOTE: *not* "undef",
        which is a valid scalar), the original deserialised hash will be
        inserted. This setting can slow down decoding considerably.

        When $coderef is omitted or undefined, any existing callback will be
        removed and "decode" will not change the deserialised hash in any
        way.

        Example, convert all JSON objects into the integer 5:

           my $js = JSON::XS->new->filter_json_object (sub { 5 });
           # returns [5]
           $js->decode ('[{}]');
           # throws an exception because allow_nonref is not enabled
           # so a lone 5 is not allowed.
           $js->decode ('{"a":1, "b":2}');

    $json = $json->filter_json_single_key_object ($key [=>
    $coderef->($value)])
        Works remotely similar to "filter_json_object", but is only called
        for JSON objects having a single key named $key.

        This $coderef is called before the one specified via
        "filter_json_object", if any. It gets passed the single value in the
        JSON object. If it returns a single value, it will be inserted into
        the data structure. If it returns nothing (not even "undef" but the
        empty list), the callback from "filter_json_object" will be called
        next, as if no single-key callback were specified.

        If $coderef is omitted or undefined, the corresponding callback will
        be disabled. There can only ever be one callback for a given key.

        As this callback gets called less often than the
        "filter_json_object" one, decoding speed will not usually suffer as
        much.
        Therefore, single-key objects make excellent targets to serialise
        Perl objects into, especially as single-key JSON objects are as
        close to the type-tagged value concept as JSON gets (it's basically
        an ID/VALUE tuple). Of course, JSON does not support this in any
        way, so you need to make sure your data never looks like a
        serialised Perl hash.

        Typical names for the single object key are "__class_whatever__", or
        "$__dollars_are_rarely_used__$" or "}ugly_brace_placement", or even
        things like "__class_md5sum(classname)__", to reduce the risk of
        clashing with real hashes.

        Example, decode JSON objects of the form "{ "__widget__" => <id> }"
        into the corresponding $WIDGET{<id>} object:

           # return whatever is in $WIDGET{5}:
           JSON::XS
              ->new
              ->filter_json_single_key_object (__widget__ => sub {
                    $WIDGET{ $_[0] }
                 })
              ->decode ('{"__widget__": 5}')

           # this can be used with a TO_JSON method in some "widget" class
           # for serialisation to json:
           sub WidgetBase::TO_JSON {
              my ($self) = @_;

              unless ($self->{id}) {
                 $self->{id} = ..get..some..id..;
                 $WIDGET{$self->{id}} = $self;
              }

              { __widget__ => $self->{id} }
           }

    $json = $json->shrink ([$enable])
    $enabled = $json->get_shrink
        Perl usually over-allocates memory a bit when allocating space for
        strings. This flag optionally resizes strings generated by either
        "encode" or "decode" to their minimum size possible. This can save
        memory when your JSON texts are either very very long or you have
        many short strings. It will also try to downgrade any strings to
        octet-form if possible: perl stores strings internally either in an
        encoding called UTF-X or in octet-form. The latter cannot store
        everything but uses less space in general (and some buggy Perl or C
        code might even rely on that internal representation being used).

        The actual definition of what shrink does might change in future
        versions, but it will always try to save space at the expense of
        time.

        If $enable is true (or missing), the string returned by "encode"
        will be shrunk-to-fit, while all strings generated by "decode" will
        also be shrunk-to-fit.

        If $enable is false, then the normal perl allocation algorithms are
        used. If you work with your data, then this is likely to be faster.

        In the future, this setting might control other things, such as
        converting strings that look like integers or floats into integers
        or floats internally (there is no difference on the Perl level),
        saving space.

    $json = $json->max_depth ([$maximum_nesting_depth])
    $max_depth = $json->get_max_depth
        Sets the maximum nesting level (default 512) accepted while encoding
        or decoding. If a higher nesting level is detected in JSON text or a
        Perl data structure, then the encoder and decoder will stop and
        croak at that point.

        Nesting level is defined by number of hash- or arrayrefs that the
        encoder needs to traverse to reach a given point or the number of
        "{" or "[" characters without their matching closing parenthesis
        crossed to reach a given character in a string.

        Setting the maximum depth to one disallows any nesting, so that
        ensures that the object is only a single hash/object or array.

        If no argument is given, the highest possible setting will be used,
        which is rarely useful.

        Note that nesting is implemented by recursion in C. The default
        value has been chosen to be as large as typical operating systems
        allow without crashing.

        See SECURITY CONSIDERATIONS, below, for more info on why this is
        useful.

    $json = $json->max_size ([$maximum_string_size])
    $max_size = $json->get_max_size
        Set the maximum length a JSON text may have (in bytes) where
        decoding is being attempted.
        The default is 0, meaning no limit. When "decode" is called on a
        string that is longer than this many bytes, it will not attempt to
        decode the string but throw an exception. This setting has no
        effect on "encode" (yet).

        If no argument is given, the limit check will be deactivated (same
        as when 0 is specified).

        See SECURITY CONSIDERATIONS, below, for more info on why this is
        useful.

    $json_text = $json->encode ($perl_scalar)
        Converts the given Perl data structure (a simple scalar or a
        reference to a hash or array) to its JSON representation. Simple
        scalars will be converted into JSON string or number sequences,
        while references to arrays become JSON arrays and references to
        hashes become JSON objects. Undefined Perl values (e.g. "undef")
        become JSON "null" values. Neither "true" nor "false" values will be
        generated.

    $perl_scalar = $json->decode ($json_text)
        The opposite of "encode": expects a JSON text and tries to parse it,
        returning the resulting simple scalar or reference. Croaks on error.

        JSON numbers and strings become simple Perl scalars. JSON arrays
        become Perl arrayrefs and JSON objects become Perl hashrefs. "true"
        becomes 1, "false" becomes 0 and "null" becomes "undef".

    ($perl_scalar, $characters) = $json->decode_prefix ($json_text)
        This works like the "decode" method, but instead of raising an
        exception when there is trailing garbage after the first JSON
        object, it will silently stop parsing there and return the number of
        characters consumed so far.

        This is useful if your JSON texts are not delimited by an outer
        protocol (which is not the brightest thing to do in the first place)
        and you need to know where the JSON text ends.

           JSON::XS->new->decode_prefix ("[1] the tail")
           => ([1], 3)

INCREMENTAL PARSING
    In some cases, there is the need for incremental parsing of JSON texts.
    While this module always has to keep both JSON text and resulting Perl
    data structure in memory at one time, it does allow you to parse a JSON
    stream incrementally. It does so by accumulating text until it has a
    full JSON object, which it then can decode. This process is similar to
    using "decode_prefix" to see if a full JSON object is available, but is
    much more efficient (and can be implemented with a minimum of method
    calls).

    JSON::XS will only attempt to parse the JSON text once it is sure it has
    enough text to get a decisive result, using a very simple but truly
    incremental parser. This means that it sometimes won't stop as early as
    the full parser, for example, it doesn't detect mismatched parentheses.
    The only thing it guarantees is that it starts decoding as soon as a
    syntactically valid JSON text has been seen. This means you need to set
    resource limits (e.g. "max_size") to ensure the parser will stop parsing
    in the presence of syntax errors.

    The following methods implement this incremental parser.

    [void, scalar or list context] = $json->incr_parse ([$string])
        This is the central parsing function. It can both append new text
        and extract objects from the stream accumulated so far (both of
        these functions are optional).

        If $string is given, then this string is appended to the already
        existing JSON fragment stored in the $json object.

        After that, if the function is called in void context, it will
        simply return without doing anything further. This can be used to
        add more text in as many chunks as you want.

        If the method is called in scalar context, then it will try to
        extract exactly *one* JSON object. If that is successful, it will
        return this object, otherwise it will return "undef". If there is a
        parse error, this method will croak just as "decode" would do (one
        can then use "incr_skip" to skip the erroneous part).
        This is the most common way of using the method.

        And finally, in list context, it will try to extract as many objects
        from the stream as it can find and return them, or the empty list
        otherwise. For this to work, there must be no separators between the
        JSON objects or arrays, instead they must be concatenated
        back-to-back. If an error occurs, an exception will be raised as in
        the scalar context case. Note that in this case, any
        previously-parsed JSON texts will be lost.

        Example: Parse some JSON arrays/objects in a given string and return
        them.

           my @objs = JSON::XS->new->incr_parse ("[5][7][1,2]");

    $lvalue_string = $json->incr_text
        This method returns the currently stored JSON fragment as an lvalue,
        that is, you can manipulate it. This *only* works when a preceding
        call to "incr_parse" in *scalar context* successfully returned an
        object. Under all other circumstances you must not call this
        function (I mean it: although in simple tests it might actually
        work, it *will* fail under real world conditions). As a special
        exception, you can also call this method before having parsed
        anything.

        This function is useful in two cases: a) finding the trailing text
        after a JSON object or b) parsing multiple JSON objects separated by
        non-JSON text (such as commas).

    $json->incr_skip
        This will reset the state of the incremental parser and will remove
        the text parsed so far from the input buffer. This is useful after
        "incr_parse" died, in which case the input buffer and incremental
        parser state is left unchanged, to skip the text parsed so far and
        to reset the parse state.

        The difference to "incr_reset" is that only text until the parse
        error occurred is removed.

    $json->incr_reset
        This completely resets the incremental parser, that is, after this
        call, it will be as if the parser had never parsed anything.

        This is useful if you want to repeatedly parse JSON objects and want
        to ignore any trailing data, which means you have to reset the
        parser after each successful decode.

  LIMITATIONS
    All options that affect decoding are supported, except "allow_nonref".
    The reason for this is that it cannot be made to work sensibly: JSON
    objects and arrays are self-delimited, i.e. you can concatenate them
    back to back and still decode them perfectly. This does not hold true
    for JSON numbers, however.

    For example, is the string 1 a single JSON number, or is it simply the
    start of 12? Or is 12 a single JSON number, or the concatenation of 1
    and 2? In neither case can you tell, and this is why JSON::XS takes the
    conservative route and disallows this case.

  EXAMPLES
    Some examples will make all this clearer. First, a simple example that
    works similarly to "decode_prefix": We want to decode the JSON object at
    the start of a string and identify the portion after the JSON object:

       my $text = "[1,2,3] hello";

       my $json = new JSON::XS;

       my $obj = $json->incr_parse ($text)
          or die "expected JSON object or array at beginning of string";

       my $tail = $json->incr_text;
       # $tail now contains " hello"

    Easy, isn't it?

    Now for a more complicated example: Imagine a hypothetical protocol
    where you read some requests from a TCP stream, and each request is a
    JSON array, without any separation between them (in fact, it is often
    useful to use newlines as "separators", as these get interpreted as
    whitespace at the start of the JSON text, which makes it possible to
    test said protocol with "telnet"...).


    Here is how you'd do it (it is trivial to write this in an event-based
    manner):

       my $json = JSON::XS->new;

       # read some data from the socket
       while (sysread $socket, my $buf, 4096) {

          # split and decode as many requests as possible
          for my $request ($json->incr_parse ($buf)) {
             # act on the $request
          }
       }

    Another complicated example: Assume you have a string with JSON objects
    or arrays, all separated by (optional) comma characters (e.g.
    "[1],[2], [3]"). To parse them, we have to skip the commas between the
    JSON texts, and here is where the lvalue-ness of "incr_text" comes in
    useful:

       my $text = "[1],[2], [3]";
       my $json = JSON::XS->new;

       # void context, so no parsing done
       $json->incr_parse ($text);

       # now extract as many objects as possible. note the
       # use of scalar context so incr_text can be called.
       while (my $obj = $json->incr_parse) {
          # do something with $obj

          # now skip the optional comma
          $json->incr_text =~ s/^ \s* , //x;
       }

    Now let's go for a very complex example: Assume that you have a gigantic
    JSON array-of-objects, many gigabytes in size, and you want to parse it,
    but you cannot load it into memory fully (this has actually happened in
    the real world :).

    Well, you lost, you have to implement your own JSON parser. But JSON::XS
    can still help you: You implement a (very simple) array parser and let
    JSON decode the array elements, which are all full JSON objects on their
    own (this wouldn't work if the array elements could be JSON numbers, for
    example):

       my $json = JSON::XS->new;

       # open the monster
       open my $fh, "<", "bigfile.json"
          or die "bigfile: $!";

       # first parse the initial "["
       for (;;) {
          sysread $fh, my $buf, 65536
             or die "read error: $!";
          $json->incr_parse ($buf); # void context, so no parsing

          # Exit the loop once we found and removed(!)
          # the initial "[".
          # In essence, we are (ab-)using the $json object as a simple scalar
          # we append data to.
          last if $json->incr_text =~ s/^ \s* \[ //x;
       }

       # now we have skipped the initial "[", so continue
       # parsing all the elements.
       for (;;) {
          # in this loop we read data until we got a single JSON object
          for (;;) {
             if (my $obj = $json->incr_parse) {
                # do something with $obj
                last;
             }

             # add more data
             sysread $fh, my $buf, 65536
                or die "read error: $!";
             $json->incr_parse ($buf); # void context, so no parsing
          }

          # in this loop we read data until we either found and parsed the
          # separating "," between elements, or the final "]"
          for (;;) {
             # first skip whitespace
             $json->incr_text =~ s/^\s*//;

             # if we find "]", we are done
             if ($json->incr_text =~ s/^\]//) {
                print "finished.\n";
                exit;
             }

             # if we find ",", we can continue with the next element
             if ($json->incr_text =~ s/^,//) {
                last;
             }

             # if we find anything else, we have a parse error!
             if (length $json->incr_text) {
                die "parse error near ", $json->incr_text;
             }

             # else add more data
             sysread $fh, my $buf, 65536
                or die "read error: $!";
             $json->incr_parse ($buf); # void context, so no parsing
          }
       }

    This is a complex example, but most of the complexity comes from the
    fact that we are trying to be correct (bear with me if I am wrong, I
    never ran the above example :).

MAPPING
    This section describes how JSON::XS maps Perl values to JSON values and
    vice versa. These mappings are designed to "do the right thing" in most
    circumstances automatically, preserving round-tripping characteristics
    (what you put in comes out as something equivalent).
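
    To make the round-tripping idea concrete, here is a minimal sketch. It
    is not taken from the JSON::XS distribution: the sample text and the
    variable names are made up for illustration, and "canonical" is enabled
    only so that hash keys come out in a predictable order and the two
    texts can be compared as strings.

```perl
use strict;
use warnings;
use JSON::XS;

# canonical sorts hash keys, so the re-encoded text is comparable as a string
my $coder = JSON::XS->new->canonical;

# hypothetical sample text covering each JSON type
my $text  = '{"a":[1,"2.0",null,true],"b":{"c":1.5}}';

my $perl  = $coder->decode ($text);   # JSON -> Perl
my $again = $coder->encode ($perl);   # Perl -> JSON

print $again eq $text ? "round trip ok\n" : "round trip differs\n";
```

    The string "2.0" stays a string and the boolean stays a boolean; which
    Perl value each JSON type turns into is described in the subsections
    of this chapter.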

    For the more enlightened: note that in the following descriptions,
    lowercase *perl* refers to the Perl interpreter, while uppercase *Perl*
    refers to the abstract Perl language itself.

  JSON -> PERL
    object
        A JSON object becomes a reference to a hash in Perl. No ordering of
        object keys is preserved (JSON does not preserve object key ordering
        itself).

    array
        A JSON array becomes a reference to an array in Perl.

    string
        A JSON string becomes a string scalar in Perl - Unicode codepoints
        in JSON are represented by the same codepoints in the Perl string,
        so no manual decoding is necessary.

    number
        A JSON number becomes either an integer, numeric (floating point) or
        string scalar in perl, depending on its range and any fractional
        parts. On the Perl level, there is no difference between those as
        Perl handles all the conversion details, but an integer may take
        slightly less memory and might represent more values exactly than
        floating point numbers.

        If the number consists of digits only, JSON::XS will try to
        represent it as an integer value. If that fails, it will try to
        represent it as a numeric (floating point) value if that is possible
        without loss of precision. Otherwise it will preserve the number as
        a string value (in which case you lose roundtripping ability, as the
        JSON number will be re-encoded to a JSON string).

        Numbers containing a fractional or exponential part will always be
        represented as numeric (floating point) values, possibly at a loss
        of precision (in which case you might lose perfect roundtripping
        ability, but the JSON number will still be re-encoded as a JSON
        number).

        Note that precision is not accuracy - binary floating point values
        cannot represent most decimal fractions exactly, and when converting
        from and to floating point, JSON::XS only guarantees precision up to
        but not including the least significant bit.

    true, false
        These JSON atoms become "JSON::XS::true" and "JSON::XS::false",
        respectively. They are overloaded to act almost exactly like the
        numbers 1 and 0. You can check whether a scalar is a JSON boolean by
        using the "JSON::XS::is_bool" function.

    null
        A JSON null atom becomes "undef" in Perl.

  PERL -> JSON
    The mapping from Perl to JSON is slightly more difficult, as Perl is a
    truly typeless language, so we can only guess which JSON type is meant
    by a Perl value.

    hash references
        Perl hash references become JSON objects. As there is no inherent
        ordering in hash keys (or JSON objects), they will usually be
        encoded in a pseudo-random order that can change between runs of the
        same program but stays generally the same within a single run of a
        program. JSON::XS can optionally sort the hash keys (determined by
        the *canonical* flag), so the same datastructure will serialise to
        the same JSON text (given same settings and version of JSON::XS),
        but this incurs a runtime overhead and is only rarely useful, e.g.
        when you want to compare some JSON text against another for
        equality.

    array references
        Perl array references become JSON arrays.

    other references
        Other unblessed references are generally not allowed and will cause
        an exception to be thrown, except for references to the integers 0
        and 1, which get turned into "false" and "true" atoms in JSON. You
        can also use "JSON::XS::false" and "JSON::XS::true" to improve
        readability.

            encode_json [\0, JSON::XS::true]     # yields [false,true]

    JSON::XS::true, JSON::XS::false
        These special values become JSON true and JSON false values,
        respectively. You can also use "\1" and "\0" directly if you want.

    blessed objects
        Blessed objects are not directly representable in JSON. See the
        "allow_blessed" and "convert_blessed" methods for the various
        options on how to deal with this: basically, you can choose between
        throwing an exception, encoding the reference as if it weren't
        blessed, or providing your own serialiser method.

    simple scalars
        Simple Perl scalars (any scalar that is not a reference) are the
        most difficult objects to encode: JSON::XS will encode undefined
        scalars as JSON "null" values, scalars that have last been used in a
        string context before encoding as JSON strings, and anything else as
        a number value:

            # dump as number
            encode_json [2]                      # yields [2]
            encode_json [-3.0e17]                # yields [-3e+17]
            my $value = 5; encode_json [$value]  # yields [5]

            # used as string, so dump as string
            print $value;
            encode_json [$value]                 # yields ["5"]

            # undef becomes null
            encode_json [undef]                  # yields [null]

        You can force the type to be a JSON string by stringifying it:

            my $x = 3.1; # some variable containing a number
            "$x";        # stringified
            $x .= "";    # another, more awkward way to stringify
            print $x;    # perl does it for you, too, quite often

        You can force the type to be a JSON number by numifying it:

            my $x = "3"; # some variable containing a string
            $x += 0;     # numify it, ensuring it will be dumped as a number
            $x *= 1;     # same thing, the choice is yours.

        You can not currently force the type in other, less obscure, ways.
        Tell me if you need this capability (but don't forget to explain why
        it's needed :).

        Note that numerical precision has the same meaning as under Perl (so
        binary to decimal conversion follows the same rules as in Perl,
        which can differ from other languages). Also, your perl interpreter
        might expose extensions to the floating point numbers of your
        platform, such as infinities or NaNs - these cannot be represented
        in JSON, and it is an error to pass those in.

ENCODING/CODESET FLAG NOTES
    The interested reader might have seen a number of flags that signify
    encodings or codesets - "utf8", "latin1" and "ascii". There seems to be
    some confusion on what these do, so here is a short comparison:

    "utf8" controls whether the JSON text created by "encode" (and expected
    by "decode") is UTF-8 encoded or not, while "latin1" and "ascii" only
    control whether "encode" escapes character values outside their
    respective codeset range. None of these flags conflict with each
    other, although some combinations make less sense than others.

    Care has been taken to make all flags symmetrical with respect to
    "encode" and "decode", that is, texts encoded with any combination of
    these flag values will be correctly decoded when the same flags are used
    - in general, if you use different flag settings while encoding vs. when
    decoding you likely have a bug somewhere.

    Below comes a verbose discussion of these flags. Note that a "codeset"
    is simply an abstract set of character-codepoint pairs, while an
    encoding takes those codepoint numbers and *encodes* them, in our case
    into octets. Unicode is (among other things) a codeset, UTF-8 is an
    encoding, and ISO-8859-1 (= latin 1) and ASCII are both codesets *and*
    encodings at the same time, which can be confusing.
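
    As a quick illustration of the codeset/encoding distinction, the
    following sketch encodes the same one-element array three ways. The
    sample character (U+20AC, the euro sign) and the variable names are my
    own choice and do not come from the module's documentation; only the
    "utf8" and "ascii" methods described in this manual are used.

```perl
use strict;
use warnings;
use JSON::XS;

# one character outside both ASCII and Latin-1: U+20AC (EURO SIGN)
my $data = ["\x{20ac}"];

# utf8 disabled (the default): the result is a Perl character string
my $chars  = JSON::XS->new->encode ($data);

# utf8 enabled: the result is a UTF-8 encoded octet string
my $octets = JSON::XS->new->utf8->encode ($data);

# ascii enabled: every character > 127 is escaped
my $ascii  = JSON::XS->new->ascii->encode ($data);

printf "unencoded: %d characters\n", length $chars;   # 5
printf "utf8:      %d octets\n",     length $octets;  # 7
print  "ascii:     $ascii\n";                         # ["\u20ac"]
```

    The euro sign counts as one character in the unencoded text but takes
    three octets once UTF-8 encoded, which accounts for the 5 versus 7.
    The "ascii" variant contains only 7-bit characters, so it reads the
    same whether or not "utf8" is also enabled.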

    "utf8" flag disabled
        When "utf8" is disabled (the default), then "encode"/"decode"
        generate and expect Unicode strings, that is, characters with high
        ordinal Unicode values (> 255) will be encoded as such characters,
        and likewise such characters are decoded as-is, no changes to them
        will be done, except "(re-)interpreting" them as Unicode codepoints
        or Unicode characters, respectively (to Perl, these are the same
        thing in strings unless you do funny/weird/dumb stuff).

        This is useful when you want to do the encoding yourself (e.g. when
        you want to have UTF-16 encoded JSON texts) or when some other layer
        does the encoding for you (for example, when printing to a terminal
        using a filehandle that transparently encodes to UTF-8 you certainly
        do NOT want to UTF-8 encode your data first and have Perl encode it
        another time).

    "utf8" flag enabled
        If the "utf8"-flag is enabled, "encode"/"decode" will encode all
        characters using the corresponding UTF-8 multi-byte sequence, and
        will expect your input strings to be encoded as UTF-8, that is, no
        "character" of the input string must have any value > 255, as UTF-8
        does not allow that.

        The "utf8" flag therefore switches between two modes: disabled means
        you will get a Unicode string in Perl, enabled means you get a
        UTF-8 encoded octet/binary string in Perl.

    "latin1" or "ascii" flags enabled
        With "latin1" (or "ascii") enabled, "encode" will escape characters
        with ordinal values > 255 (> 127 with "ascii") and encode the
        remaining characters as specified by the "utf8" flag.

        If "utf8" is disabled, then the result is also correctly encoded in
        those character sets (as both are proper subsets of Unicode, meaning
        that a Unicode string with all character values < 256 is the same
        thing as an ISO-8859-1 string, and a Unicode string with all
        character values < 128 is the same thing as an ASCII string in
        Perl).

        If "utf8" is enabled, you still get a correct UTF-8-encoded string,
        regardless of these flags, just some more characters will be escaped
        using "\uXXXX" than before.

        Note that ISO-8859-1-*encoded* strings are not compatible with UTF-8
        encoding, while ASCII-encoded strings are. That is because the
        ISO-8859-1 encoding is NOT a subset of UTF-8 (despite the ISO-8859-1
        *codeset* being a subset of Unicode), while ASCII is.

        Surprisingly, "decode" will ignore these flags and so treat all
        input values as governed by the "utf8" flag. If it is disabled, this
        allows you to decode ISO-8859-1- and ASCII-encoded strings, as both
        are strict subsets of Unicode. If it is enabled, you can correctly
        decode UTF-8 encoded strings.

        So neither "latin1" nor "ascii" are incompatible with the "utf8"
        flag - they only govern when the JSON output engine escapes a
        character or not.

        The main use for "latin1" is to relatively efficiently store binary
        data as JSON, at the expense of breaking compatibility with most
        JSON decoders.

        The main use for "ascii" is to force the output to not contain
        characters with values > 127, which means you can interpret the
        resulting string as UTF-8, ISO-8859-1, ASCII, KOI8-R or just about
        any character set and 8-bit-encoding, and still get the same data
        structure back. This is useful when your channel for JSON transfer
        is not 8-bit clean or the encoding might be mangled in between (e.g.
        in mail), and works because ASCII is a proper subset of most 8-bit
        and multibyte encodings in use in the world.

  JSON and ECMAscript
    JSON syntax is based on how literals are represented in javascript (the
    not-standardised predecessor of ECMAscript) which is presumably why it
    is called "JavaScript Object Notation".

    However, JSON is not a subset (and also not a superset of course) of
    ECMAscript (the standard) or javascript (whatever browsers actually
    implement).

    If you want to use javascript's "eval" function to "parse" JSON, you
    might run into parse errors for valid JSON texts, or the resulting data
    structure might not be queryable.

    One of the problems is that U+2028 and U+2029 are valid characters
    inside JSON strings, but are not allowed in ECMAscript string literals,
    so the following Perl fragment will not output something that can be
    guaranteed to be parsable by javascript's "eval":

       use JSON::XS;

       print encode_json [chr 0x2028];

    The right fix for this is to use a proper JSON parser in your javascript
    programs, and not rely on "eval" (see for example Douglas Crockford's
    json2.js parser).

    If this is not an option, you can, as a stop-gap measure, simply encode
    to ASCII-only JSON:

       use JSON::XS;

       print JSON::XS->new->ascii->encode ([chr 0x2028]);

    Note that this will enlarge the resulting JSON text quite a bit if you
    have many non-ASCII characters. You might be tempted to run some regexes
    to only escape U+2028 and U+2029, e.g.:

       # DO NOT USE THIS!
       my $json = JSON::XS->new->utf8->encode ([chr 0x2028]);
       $json =~ s/\xe2\x80\xa8/\\u2028/g; # escape U+2028
       $json =~ s/\xe2\x80\xa9/\\u2029/g; # escape U+2029
       print $json;

    Note that *this is a bad idea*: the above only works for U+2028 and
    U+2029 and thus only for fully ECMAscript-compliant parsers. Many
    existing javascript implementations, however, have issues with other
    characters as well - using "eval" naively simply *will* cause problems.

    Another problem is that some javascript implementations reserve some
    property names for their own purposes (which probably makes them
    non-ECMAscript-compliant). For example, Iceweasel reserves the
    "__proto__" property name for its own purposes.

    If that is a problem, you could try to filter the resulting JSON
    output for these property strings, e.g.:

       $json =~ s/"__proto__"\s*:/"__proto__renamed":/g;

    This works because "__proto__" is not valid outside of strings, so every
    occurrence of ""__proto__"\s*:" must be a string used as property name.

    If you know of other incompatibilities, please let me know.

  JSON and YAML
    You often hear that JSON is a subset of YAML. This is, however, a mass
    hysteria(*) and very far from the truth (as of the time of this
    writing), so let me state it clearly: *in general, there is no way to
    configure JSON::XS to output a data structure as valid YAML* that works
    in all cases.

    If you really must use JSON::XS to generate YAML, you should use this
    algorithm (subject to change in future versions):

       my $to_yaml = JSON::XS->new->utf8->space_after (1);
       my $yaml = $to_yaml->encode ($ref) . "\n";

    This will *usually* generate JSON texts that also parse as valid YAML.
    Please note that YAML has hardcoded limits on (simple) object key
    lengths that JSON doesn't have and also has different and incompatible
    unicode character escape syntax, so you should make sure that your hash
    keys are noticeably shorter than the 1024 "stream characters" YAML
    allows and that you do not have characters with codepoint values outside
    the Unicode BMP (basic multilingual plane). YAML also does not allow
    "\/" sequences in strings (which JSON::XS does not *currently* generate,
    but other JSON generators might).

    There might be other incompatibilities that I am not aware of (or the
    YAML specification has been changed yet again - it does so quite often).
    In general you should not try to generate YAML with a JSON generator or
    vice versa, or try to parse JSON with a YAML parser or vice versa:
    chances are high that you will run into severe interoperability problems
    when you least expect it.

    (*) I have been pressured multiple times by Brian Ingerson (one of the
        authors of the YAML specification) to remove this paragraph, despite
        him acknowledging that the actual incompatibilities exist. As I was
        personally bitten by this "JSON is YAML" lie, I refused and said I
        will continue to educate people about these issues, so others do not
        run into the same problem again and again. After this, Brian called
        me a (quote)*complete and worthless idiot*(unquote).

        In my opinion, instead of pressuring and insulting people who
        actually clarify issues with YAML and the wrong statements of some
        of its proponents, I would kindly suggest reading the JSON spec
        (which is not that difficult or long) and finally making YAML
        compatible with it, and educating users about the changes, instead
        of spreading lies about the real compatibility for many *years* and
        trying to silence people who point out that it isn't true.

        Addendum/2009: the YAML 1.2 spec is still incompatible with JSON,
        even though the incompatibilities have been documented (and are
        known to Brian) for many years and the spec makes explicit claims
        that YAML is a superset of JSON. It would be so easy to fix, but
        apparently, bullying people and corrupting userdata is so much
        easier.

  SPEED
    It seems that JSON::XS is surprisingly fast, as shown in the following
    tables. They have been generated with the help of the "eg/bench" program
    in the JSON::XS distribution, to make it easy to compare on your own
    system.

    First comes a comparison between various modules using a very short
    single-line JSON string (also available at
    <http://dist.schmorp.de/misc/json/short.json>).

       {"method": "handleMessage", "params": ["user1",
       "we were just talking"], "id": null, "array":[1,11,234,-5,1e5,1e7,
       1, 0]}

    It shows the number of encodes/decodes per second (JSON::XS uses the
    functional interface, while JSON::XS/2 uses the OO interface with
    pretty-printing and hashkey sorting enabled, and JSON::XS/3 enables
    shrink. JSON::DWIW/DS uses the deserialise function, while
    JSON::DWIW/FJ uses the from_json method). Higher is better:

       module        |     encode |     decode |
       --------------+------------+------------+
       JSON::DWIW/DS |  86302.551 | 102300.098 |
       JSON::DWIW/FJ |  86302.551 |  75983.768 |
       JSON::PP      |  15827.562 |   6638.658 |
       JSON::Syck    |  63358.066 |  47662.545 |
       JSON::XS      | 511500.488 | 511500.488 |
       JSON::XS/2    | 291271.111 | 388361.481 |
       JSON::XS/3    | 361577.931 | 361577.931 |
       Storable      |  66788.280 | 265462.278 |
       --------------+------------+------------+

    That is, JSON::XS is almost six times faster than JSON::DWIW on
    encoding, about five times faster on decoding, and over thirty to
    seventy times faster than JSON's pure perl implementation.
    It also compares favourably to Storable for small amounts of data.

    Next comes a longer test string (roughly 18KB, generated from Yahoo!
    Locals search API, <http://dist.schmorp.de/misc/json/long.json>):

       module        |     encode |     decode |
       --------------+------------+------------+
       JSON::DWIW/DS |   1647.927 |   2673.916 |
       JSON::DWIW/FJ |   1630.249 |   2596.128 |
       JSON::PP      |    400.640 |     62.311 |
       JSON::Syck    |   1481.040 |   1524.869 |
       JSON::XS      |  20661.596 |   9541.183 |
       JSON::XS/2    |  10683.403 |   9416.938 |
       JSON::XS/3    |  20661.596 |   9400.054 |
       Storable      |  19765.806 |  10000.725 |
       --------------+------------+------------+

    Again, JSON::XS leads by far (except for Storable which unsurprisingly
    decodes a bit faster).

    On large strings containing lots of high Unicode characters, some
    modules (such as JSON::PC) seem to decode faster than JSON::XS, but the
    result will be broken due to missing (or wrong) Unicode handling. Others
    refuse to decode or encode properly, so it was impossible to prepare a
    fair comparison table for that case.

SECURITY CONSIDERATIONS
    When you are using JSON in a protocol, talking to untrusted potentially
    hostile creatures requires relatively few measures.

    First of all, your JSON decoder should be secure, that is, should not
    have any buffer overflows. Obviously, this module should ensure that and
    I am trying hard to make that true, but you never know.

    Second, you need to avoid resource-starving attacks. That means you
    should limit the size of JSON texts you accept, or make sure that when
    your resources run out, that's just fine (e.g. by using a separate
    process that can crash safely). The size of a JSON text in octets or
    characters is usually a good indication of the size of the resources
    required to decode it into a Perl structure.
    While JSON::XS can check the size of the JSON text, it might be too
    late when you already have it in memory, so you might want to check the
    size before you accept the string.

    Third, JSON::XS recurses using the C stack when decoding objects and
    arrays. The C stack is a limited resource: for instance, on my amd64
    machine with 8MB of stack size I can decode around 180k nested arrays
    but only 14k nested JSON objects (due to perl itself recursing deeply on
    croak to free the temporary). If that is exceeded, the program crashes.
    To be conservative, the default nesting limit is set to 512. If your
    process has a smaller stack, you should adjust this setting accordingly
    with the "max_depth" method.

    Something else could bomb you, too, that I forgot to think of. In that
    case, you get to keep the pieces. I am always open for hints, though...

    Also keep in mind that JSON::XS might leak contents of your Perl data
    structures in its error messages, so when you serialise sensitive
    information you might want to make sure that exceptions thrown by
    JSON::XS will not end up in front of untrusted eyes.

    If you are using JSON::XS to return packets for consumption by
    JavaScript scripts in a browser you should have a look at
    <http://blog.archive.jpsykes.com/47/practical-csrf-and-json-security/>
    to see whether you are vulnerable to some common attack vectors (which
    really are browser design bugs, but it is still you who will have to
    deal with it, as major browser developers care only for features, not
    about getting security right).

THREADS
    This module is *not* guaranteed to be thread safe and there are no plans
    to change this until Perl gets thread support (as opposed to the
    horribly slow so-called "threads" which are simply slow and bloated
    process simulations - use fork, it's *much* faster, cheaper, better).

    (It might actually work, but you have been warned).

BUGS
    While the goal of this module is to be correct, that unfortunately does
    not mean it's bug-free, only that I think its design is bug-free. If you
    keep reporting bugs they will be fixed swiftly, though.

    Please refrain from using rt.cpan.org or any other bug reporting
    service. I put the contact address into my modules for a reason.

SEE ALSO
    The json_xs command line utility for quick experiments.

AUTHOR
     Marc Lehmann <schmorp@schmorp.de>
     http://home.schmorp.de/