=head1 NAME

Types::Serialiser - simple data types for common serialisation formats

=encoding utf-8

=head1 SYNOPSIS

=head1 DESCRIPTION

This module provides some extra datatypes that are used by common
serialisation formats such as JSON or CBOR. The idea is to have a
repository of simple/small constants and containers that can be shared by
different implementations so they become interoperable with each other.

=cut

package Types::Serialiser;

use common::sense; # required to suppress annoying warnings

our $VERSION = '1.0';

=head1 SIMPLE SCALAR CONSTANTS

Simple scalar constants are values that are overloaded to act like simple
Perl values, but have a (class) type to differentiate them from normal Perl
scalars. This is necessary because these have different representations in
the serialisation formats.

=head2 BOOLEANS (Types::Serialiser::Boolean class)

This type has only two instances, true and false. A natural representation
for these in Perl is C<1> and C<0>, but serialisation formats need to be
able to differentiate between them and mere numbers.

=over 4

=item $Types::Serialiser::true, Types::Serialiser::true

This value represents the "true" value. In most contexts it acts like
the number C<1>. It is up to you whether you use the variable form
(C<$Types::Serialiser::true>) or the constant form (C<Types::Serialiser::true>).

The constant is represented as a reference to a scalar containing C<1> -
implementations are allowed to directly test for this.

=item $Types::Serialiser::false, Types::Serialiser::false

This value represents the "false" value. In most contexts it acts like
the number C<0>. It is up to you whether you use the variable form
(C<$Types::Serialiser::false>) or the constant form (C<Types::Serialiser::false>).

The constant is represented as a reference to a scalar containing C<0> -
implementations are allowed to directly test for this.

=item $is_bool = Types::Serialiser::is_bool $value

Returns true iff C<$value> is either C<$Types::Serialiser::true> or
C<$Types::Serialiser::false>.

For example, you could differentiate between a perl true value and a
C<Types::Serialiser::true> by using this:

   $value && Types::Serialiser::is_bool $value

=item $is_true = Types::Serialiser::is_true $value

Returns true iff C<$value> is C<$Types::Serialiser::true>.

=item $is_false = Types::Serialiser::is_false $value

Returns true iff C<$value> is C<$Types::Serialiser::false>.

=back

=head2 ERROR (Types::Serialiser::Error class)

This class has only a single instance, C<error>. It is used to signal
an encoding or decoding error. In CBOR for example, an object that
couldn't be encoded will be represented by a CBOR undefined value, which
is represented by the error value in Perl.

=over 4

=item $Types::Serialiser::error, Types::Serialiser::error

This value represents the "error" value. Accessing values of this type
will throw an exception.

The constant is represented as a reference to a scalar containing C<undef>
- implementations are allowed to directly test for this.

=item $is_error = Types::Serialiser::is_error $value

Returns true iff C<$value> is C<$Types::Serialiser::error>.

=back

=cut

BEGIN {
   # for historical reasons, and to avoid extra dependencies in JSON::PP,
   # we alias *Types::Serialiser::Boolean with JSON::PP::Boolean.
   package JSON::PP::Boolean;

   *Types::Serialiser::Boolean:: = *JSON::PP::Boolean::;
}

{
   # this must be done before blessing to work around bugs
   # in perl < 5.18 (it seems to be fixed in 5.18).
   package Types::Serialiser::BooleanBase;

   # booleans act like the number stored in the referenced scalar
   use overload
      "0+"     => sub { ${$_[0]} },
      "++"     => sub { $_[0] = ${$_[0]} + 1 },
      "--"     => sub { $_[0] = ${$_[0]} - 1 },
      fallback => 1;

   @Types::Serialiser::Boolean::ISA = Types::Serialiser::BooleanBase::;
}

# the three singleton values: blessed references to 1, 0 and undef
our $true  = do { bless \(my $dummy = 1), Types::Serialiser::Boolean:: };
our $false = do { bless \(my $dummy = 0), Types::Serialiser::Boolean:: };
our $error = do { bless \(my $dummy    ), Types::Serialiser::Error::   };

sub true  () { $true  }
sub false () { $false }
sub error () { $error }

sub is_bool  ($) {           UNIVERSAL::isa $_[0], Types::Serialiser::Boolean:: }
sub is_true  ($) {  $_[0] && UNIVERSAL::isa $_[0], Types::Serialiser::Boolean:: }
sub is_false ($) { !$_[0] && UNIVERSAL::isa $_[0], Types::Serialiser::Boolean:: }
sub is_error ($) {           UNIVERSAL::isa $_[0], Types::Serialiser::Error::   }

package Types::Serialiser::Error;

# any attempt to use the error value as a number throws an exception
sub error {
   require Carp;
   Carp::croak ("caught attempt to use the Types::Serialiser::error value");
};

use overload
   "0+"     => \&error,
   "++"     => \&error,
   "--"     => \&error,
   fallback => 1;

=head1 NOTES FOR XS USERS

The recommended way to detect whether a scalar is one of these objects
is to check whether the stash is the C<Types::Serialiser::Boolean> or
C<Types::Serialiser::Error> stash, and then follow the scalar reference to
see if it's C<1> (true), C<0> (false) or C<undef> (error).

While it is possible to use an isa test, directly comparing stash pointers
is faster and guaranteed to work.

For historical reasons, the C<Types::Serialiser::Boolean> stash is
just an alias for C<JSON::PP::Boolean>. When printed, the classname
will usually be C<JSON::PP::Boolean>, but isa tests and stash pointer
comparison will normally work correctly (i.e. Types::Serialiser::true ISA
JSON::PP::Boolean, but also ISA Types::Serialiser::Boolean).
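
The same detection can be approximated in pure Perl. The following is only
a minimal sketch: C<special_kind> is a hypothetical helper that is not part
of this module, and it uses an isa test where XS code would compare stash
pointers directly.

   use Types::Serialiser;
   use Scalar::Util ();

   # hypothetical helper: returns "true", "false", "error" or undef
   sub special_kind {
      my ($value) = @_;

      # only blessed scalar references can be one of the special values
      return undef
         unless Scalar::Util::blessed ($value)
             && Scalar::Util::reftype ($value) eq "SCALAR";

      # follow the scalar reference to tell true (1) from false (0)
      return $$value ? "true" : "false"
         if UNIVERSAL::isa ($value, "Types::Serialiser::Boolean");

      # the error value is a reference to an undefined scalar
      return "error"
         if UNIVERSAL::isa ($value, "Types::Serialiser::Error")
            && !defined $$value;

      undef
   }

   my $kind = special_kind (Types::Serialiser::true);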

=head1 A GENERIC OBJECT SERIALISATION PROTOCOL

This section explains the object serialisation protocol used by
L<CBOR::XS>. It is meant to be generic enough to support any kind of
generic object serialiser.

This protocol is called "the Types::Serialiser object serialisation
protocol".

=head2 ENCODING

When the encoder encounters an object that it cannot otherwise encode (for
example, L<CBOR::XS> can encode a few special types itself, and will first
attempt to use the special C<TO_CBOR> serialisation protocol), it will
look up the C<FREEZE> method on the object.

Note that the C<FREEZE> method will normally be called I<during> encoding,
and I<MUST NOT> change the data structure that is being encoded in any
way, or it might cause memory corruption or worse.

If the C<FREEZE> method exists, the encoder will call it with two
arguments: the object to serialise, and a constant string that indicates
the name of the data model. For example, L<CBOR::XS> uses C<CBOR>, and the
L<JSON> and L<JSON::XS> modules (or any other JSON serialiser) would use
C<JSON> as the second argument.

The C<FREEZE> method can then return zero or more values to identify the
object instance. The serialiser is then supposed to encode the class name
and all of these return values (which must be encodable in the format)
using the relevant form for Perl objects. In CBOR for example, there is a
registered tag number for encoded perl objects.

The values that C<FREEZE> returns must be serialisable with the serialiser
that calls it. Therefore, it is recommended to use simple types such as
strings and numbers, and maybe array and hash references (basically, the
JSON data model). You can always use a more complex format for a specific
data model by checking the second argument, the data model.
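
As a minimal sketch of checking the data model, a C<FREEZE> method could
return a different representation for one model than for all others. The
class C<My::Point>, its C<x> and C<y> fields and the choice of return
values are hypothetical and only serve as an illustration:

   sub My::Point::FREEZE {
      my ($self, $model) = @_;

      # callers using the CBOR data model get a single hash reference ...
      return +{ x => $self->{x}, y => $self->{y} }
         if $model eq "CBOR";

      # ... every other data model gets two plain numbers
      ($self->{x}, $self->{y})
   }

   my $point = bless { x => 1, y => 2 }, "My::Point";

   my @for_cbor = $point->FREEZE ("CBOR"); # one hash reference
   my @for_json = $point->FREEZE ("JSON"); # the list (1, 2)

A class that does this has to make sure its C<THAW> method accepts every
shape that its C<FREEZE> method can return.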

The "data model" is not the same as the "data format" - the data model
indicates what types and kinds of return values can be returned from
C<FREEZE>. For example, in C<CBOR> it is permissible to return tagged CBOR
values, while JSON does not support these at all, so C<JSON> would be a
valid (but too limited) data model name for C<CBOR::XS>. Similarly, a
serialising format that supports more or less the same data model as JSON
could use C<JSON> as data model without losing anything.

=head2 DECODING

When the decoder then encounters such an encoded perl object, it should
look up the C<THAW> method on the stored classname, and invoke it with the
classname, the constant string to identify the data model/data format, and
all the return values returned by C<FREEZE>.

=head2 EXAMPLES

See the C<OBJECT SERIALISATION> section in the L<CBOR::XS> manpage for
more details, an example implementation, and code examples.

Here is an example C<FREEZE>/C<THAW> method pair:

   sub My::Object::FREEZE {
      my ($self, $model) = @_;

      ($self->{type}, $self->{id}, $self->{variant})
   }

   sub My::Object::THAW {
      my ($class, $model, $type, $id, $variant) = @_;

      $class->new (type => $type, id => $id, variant => $variant)
   }

=head1 BUGS

The use of L<overload> makes this module much heavier than it should be
(on my system, this module: 4kB RSS, overload: 260kB RSS).

=head1 SEE ALSO

Currently, L<JSON::XS> and L<CBOR::XS> use these types.

=head1 AUTHOR

 Marc Lehmann <schmorp@schmorp.de>
 http://home.schmorp.de/

=cut

1
