=head1 NAME

BerkeleyDB - Perl extension for Berkeley DB version 2, 3 or 4

=head1 SYNOPSIS

  use BerkeleyDB;

  $env = new BerkeleyDB::Env [OPTIONS] ;

  $db = tie %hash, 'BerkeleyDB::Hash', [OPTIONS] ;
  $db = new BerkeleyDB::Hash [OPTIONS] ;

  $db = tie %hash, 'BerkeleyDB::Btree', [OPTIONS] ;
  $db = new BerkeleyDB::Btree [OPTIONS] ;

  $db = tie @array, 'BerkeleyDB::Recno', [OPTIONS] ;
  $db = new BerkeleyDB::Recno [OPTIONS] ;

  $db = tie @array, 'BerkeleyDB::Queue', [OPTIONS] ;
  $db = new BerkeleyDB::Queue [OPTIONS] ;

  $db = new BerkeleyDB::Unknown [OPTIONS] ;

  $status = BerkeleyDB::db_remove [OPTIONS]
  $status = BerkeleyDB::db_rename [OPTIONS]
  $status = BerkeleyDB::db_verify [OPTIONS]

  $hash{$key} = $value ;
  $value = $hash{$key} ;
  each %hash ;
  keys %hash ;
  values %hash ;

  $status = $db->db_get()
  $status = $db->db_put() ;
  $status = $db->db_del() ;
  $status = $db->db_sync() ;
  $status = $db->db_close() ;
  $status = $db->db_pget()
  $hash_ref = $db->db_stat() ;
  $status = $db->db_key_range();
  $type = $db->type() ;
  $status = $db->status() ;
  $boolean = $db->byteswapped() ;
  $status = $db->truncate($count) ;
  $status = $db->compact($start, $stop, $c_data, $flags, $end);

  $bool = $env->cds_enabled();
  $bool = $db->cds_enabled();
  $lock = $db->cds_lock();
  $lock->cds_unlock();

  ($flag, $old_offset, $old_length) = $db->partial_set($offset, $length) ;
  ($flag, $old_offset, $old_length) = $db->partial_clear() ;

  $cursor = $db->db_cursor([$flags]) ;
  $newcursor = $cursor->c_dup([$flags]);
  $status = $cursor->c_get() ;
  $status = $cursor->c_put() ;
  $status = $cursor->c_del() ;
  $status = $cursor->c_count() ;
  $status = $cursor->c_pget() ;
  $status = $cursor->status() ;
  $status = $cursor->c_close() ;

  $cursor = $db->db_join() ;
  $status = $cursor->c_get() ;
  $status = $cursor->c_close() ;

  $status = $env->txn_checkpoint()
  $hash_ref = $env->txn_stat()
  $status = $env->setmutexlocks()
  $status = $env->set_flags()
  $status = $env->set_timeout()
  $status = $env->lsn_reset()

  $txn = $env->txn_begin() ;
  $db->Txn($txn);
  $txn->Txn($db1, $db2,...);
  $status = $txn->txn_prepare()
  $status = $txn->txn_commit()
  $status = $txn->txn_abort()
  $status = $txn->txn_id()
  $status = $txn->txn_discard()
  $status = $txn->set_timeout()

  $status = $env->set_lg_dir();
  $status = $env->set_lg_bsize();
  $status = $env->set_lg_max();

  $status = $env->set_data_dir() ;
  $status = $env->set_tmp_dir() ;
  $status = $env->set_verbose() ;
  $db_env_ptr = $env->DB_ENV() ;

  $BerkeleyDB::Error
  $BerkeleyDB::db_version

  # DBM Filters
  $old_filter = $db->filter_store_key ( sub { ... } ) ;
  $old_filter = $db->filter_store_value( sub { ... } ) ;
  $old_filter = $db->filter_fetch_key ( sub { ... } ) ;
  $old_filter = $db->filter_fetch_value( sub { ... } ) ;

  # deprecated, but supported
  $txn_mgr = $env->TxnMgr();
  $status = $txn_mgr->txn_checkpoint()
  $hash_ref = $txn_mgr->txn_stat()
  $txn = $txn_mgr->txn_begin() ;

=head1 DESCRIPTION

B<NOTE: This document is still under construction. Expect it to be
incomplete in places.>

This Perl module provides an interface to most of the functionality
available in Berkeley DB versions 2, 3 and 4. In general it is safe to assume
that the interface provided here is identical to the Berkeley DB
interface. The main changes have been to make the Berkeley DB API work
in a Perl way. Note that if you are using Berkeley DB 2.x, the new
features available in Berkeley DB 3.x or DB 4.x are not available via
this module.

The reader is expected to be familiar with the Berkeley DB
documentation. Where the interface provided here is identical to the
Berkeley DB library and the...
TODO

The B<db_appinit>, B<db_cursor>, B<db_open> and B<db_txn> man pages are
particularly relevant.

The interface to Berkeley DB is implemented with a number of Perl
classes.

=head1 The BerkeleyDB::Env Class

The B<BerkeleyDB::Env> class provides an interface to the Berkeley DB
function B<db_appinit> in Berkeley DB 2.x or B<db_env_create> and
B<DBENV-E<gt>open> in Berkeley DB 3.x/4.x. Its purpose is to initialise a
number of sub-systems that can then be used in a consistent way in all
the databases you make use of in the environment.

If you don't intend using transactions, locking or logging, then you
shouldn't need to make use of B<BerkeleyDB::Env>.

Note that an environment consists of a number of files that Berkeley DB
manages behind the scenes for you. When you first use an environment, it
needs to be explicitly created. This is done by including C<DB_CREATE>
with the C<Flags> parameter, described below.

=head2 Synopsis

  $env = new BerkeleyDB::Env
             [ -Home         => $path, ]
             [ -Server       => $name, ]
             [ -Cachesize    => $number, ]
             [ -Config       => { name => value, name => value }, ]
             [ -ErrFile      => filename, ]
             [ -MsgFile      => filename, ]
             [ -ErrPrefix    => "string", ]
             [ -Flags        => number, ]
             [ -SetFlags     => bitmask, ]
             [ -LockDetect   => number, ]
             [ -SharedMemKey => number, ]
             [ -Verbose      => boolean, ]
             [ -Encrypt      => { Password => "string",
                                  Flags    => number }, ]

All the parameters to the BerkeleyDB::Env constructor are optional.

=over 5

=item -Home

If present, this parameter should point to an existing directory. Any
files that I<aren't> specified with an absolute path in the sub-systems
that are initialised by the BerkeleyDB::Env class will be assumed to
live in the B<Home> directory.

For example, in the code fragment below the database "fred.db" will be
opened in the directory "/home/databases" because it was specified as a
relative path, but "joe.db" will be opened in "/other" because it was
part of an absolute path.

    $env = new BerkeleyDB::Env
             -Home => "/home/databases"
    ...

    $db1 = new BerkeleyDB::Hash
             -Filename => "fred.db",
             -Env => $env
    ...

    $db2 = new BerkeleyDB::Hash
             -Filename => "/other/joe.db",
             -Env => $env
    ...

=item -Server

If present, this parameter should be the hostname of a server that is running
the Berkeley DB RPC server. All databases will be accessed via the RPC server.

=item -Encrypt

If present, this parameter will enable encryption of all data before
it is written to the database. This parameter must be given a hash
reference. The format is shown below.

    -Encrypt => { -Password => "abc", Flags => DB_ENCRYPT_AES }

Valid values for the Flags are 0 or C<DB_ENCRYPT_AES>.

This option requires Berkeley DB 4.1 or better.

=item -Cachesize

If present, this parameter sets the size of the environment's shared memory
buffer pool.

=item -SharedMemKey

If present, this parameter sets the base segment ID for the shared memory
region used by Berkeley DB.

This option requires Berkeley DB 3.1 or better.

Use C<$env-E<gt>get_shm_key($id)> to find out the base segment ID used
once the environment is open.

=item -ThreadCount

If present, this parameter declares the approximate number of threads that
will be used in the database environment. This parameter is only necessary
when the $env->failchk method will be used. It does not actually set the
maximum number of threads but rather is used to determine memory sizing.

This option requires Berkeley DB 4.4 or better. It is only supported on
Unix/Linux.

=item -Config

This is a variation on the C<-Home> parameter, but it allows finer
control of where specific types of files will be stored.

The parameter expects a reference to a hash. Valid keys are:
B<DB_DATA_DIR>, B<DB_LOG_DIR> and B<DB_TMP_DIR>

The code below shows an example of how it can be used.

    $env = new BerkeleyDB::Env
             -Config => { DB_DATA_DIR => "/home/databases",
                          DB_LOG_DIR  => "/home/logs",
                          DB_TMP_DIR  => "/home/tmp"
                        }
    ...

=item -ErrFile

Expects a filename or filehandle. Any errors generated internally by
Berkeley DB will be logged to this file. A useful debug setting is to
open environments with either

    -ErrFile => *STDOUT

or

    -ErrFile => *STDERR

=item -ErrPrefix

Allows a prefix to be added to the error messages before they are sent
to B<-ErrFile>.

=item -Flags

The B<Flags> parameter specifies both which sub-systems to initialise,
as well as a number of environment-wide options.
See the Berkeley DB documentation for more details of these options.

Any of the following can be specified by OR'ing them:

B<DB_CREATE>

If any of the files specified do not already exist, create them.

B<DB_INIT_CDB>

Initialise the Concurrent Access Methods

B<DB_INIT_LOCK>

Initialise the Locking sub-system.

B<DB_INIT_LOG>

Initialise the Logging sub-system.

B<DB_INIT_MPOOL>

Initialise the ...

B<DB_INIT_TXN>

Initialise the ...

B<DB_MPOOL_PRIVATE>

Initialise the ...

B<DB_INIT_MPOOL> is also specified.

Initialise the ...

B<DB_NOMMAP>

Initialise the ...

B<DB_RECOVER>

B<DB_RECOVER_FATAL>

B<DB_THREAD>

B<DB_TXN_NOSYNC>

B<DB_USE_ENVIRON>

B<DB_USE_ENVIRON_ROOT>

=item -SetFlags

Calls DB_ENV->set_flags with the supplied bitmask.
Use this when you need to make
use of DB_ENV->set_flags before DB_ENV->open is called.

Only valid when Berkeley DB 3.x or better is used.

=item -LockDetect

Specifies what to do when a lock conflict occurs. The value should be one of

B<DB_LOCK_DEFAULT>

B<DB_LOCK_OLDEST>

B<DB_LOCK_RANDOM>

B<DB_LOCK_YOUNGEST>

=item -Verbose

Add extra debugging information to the messages sent to B<-ErrFile>.

=back

=head2 Methods

The environment class has the following methods:

=over 5

=item $env->errPrefix("string") ;

This method is identical to the B<-ErrPrefix> flag. It allows the
error prefix string to be changed dynamically.

=item $env->set_flags(bitmask, 1|0);

=item $txn = $env->TxnMgr()

Constructor for creating a B<TxnMgr> object.
See L<"TRANSACTIONS"> for more details of using transactions.

This method is deprecated. Access the transaction methods using the B<txn_>
methods below from the environment object directly.

=item $env->txn_begin()

TODO

=item $env->txn_stat()

TODO

=item $env->txn_checkpoint()

TODO

=item $env->status()

Returns the status of the last BerkeleyDB::Env method.

=item $env->DB_ENV()

Returns a pointer to the underlying DB_ENV data structure that Berkeley
DB uses.

=item $env->get_shm_key($id)

Writes the base segment ID for the shared memory region used by the
Berkeley DB environment into C<$id>. Returns 0 on success.

This option requires Berkeley DB 4.2 or better.

Use the C<-SharedMemKey> option when opening the environment to set the
base segment ID.

=item $env->set_isalive()

Set the callback that determines if the thread of control, identified by
the pid and tid arguments, is still running. This method should only be
used in combination with $env->failchk.

This option requires Berkeley DB 4.4 or better.

=item $env->failchk($flags)

The $env->failchk method checks for threads of control (either a true
thread or a process) that have exited while manipulating Berkeley DB
library data structures, while holding a logical database lock, or with an
unresolved transaction (that is, a transaction that was never aborted or
committed).

If $env->failchk determines a thread of control exited while holding
database read locks, it will release those locks. If $env->failchk
determines a thread of control exited with an unresolved transaction, the
transaction will be aborted.

Applications calling the $env->failchk method must have already called the
$env->set_isalive method, on the same DB environment, and must have
configured their database environment using the -ThreadCount flag. The
ThreadCount flag cannot be used on an environment that wasn't previously
initialized with it.

This option requires Berkeley DB 4.4 or better.

=item $env->stat_print

Prints statistical information.

If the C<MsgFile> option is specified the output will be sent to the
file. Otherwise output is sent to standard output.

This option requires Berkeley DB 4.3 or better.

=item $env->lock_stat_print

Prints locking subsystem statistics.

If the C<MsgFile> option is specified the output will be sent to the
file. Otherwise output is sent to standard output.

This option requires Berkeley DB 4.3 or better.

=item $env->mutex_stat_print

Prints mutex subsystem statistics.

If the C<MsgFile> option is specified the output will be sent to the
file. Otherwise output is sent to standard output.

This option requires Berkeley DB 4.4 or better.

=item $env->set_timeout($timeout, $flags)

=item $env->status()

Returns the status of the last BerkeleyDB::Env method.

=back

=head2 Examples

TODO.

=head1 Global Classes

  $status = BerkeleyDB::db_remove [OPTIONS]
  $status = BerkeleyDB::db_rename [OPTIONS]
  $status = BerkeleyDB::db_verify [OPTIONS]

=head1 THE DATABASE CLASSES

B<BerkeleyDB> supports the following database formats:

=over 5

=item B<BerkeleyDB::Hash>

This database type allows arbitrary key/value pairs to be stored in data
files. This is equivalent to the functionality provided by other
hashing packages like DBM, NDBM, ODBM, GDBM, and SDBM. Remember though,
the files created using B<BerkeleyDB::Hash> are not compatible with any
of the other packages mentioned.

A default hashing algorithm, which will be adequate for most applications,
is built into BerkeleyDB. If you do need to use your own hashing algorithm
it is possible to write your own in Perl and have B<BerkeleyDB> use
it instead.

=item B<BerkeleyDB::Btree>

The Btree format allows arbitrary key/value pairs to be stored in a
B+tree.

As with the B<BerkeleyDB::Hash> format, it is possible to provide a
user defined Perl routine to perform the comparison of keys. By default,
though, the keys are stored in lexical order.

=item B<BerkeleyDB::Recno>

TODO.

=item B<BerkeleyDB::Queue>

TODO.

=item B<BerkeleyDB::Unknown>

This isn't a database format at all. It is used when you want to open an
existing Berkeley DB database without having to know what type it is.

=back

Each of the database formats described above is accessed via a
corresponding B<BerkeleyDB> class. These will be described in turn in
the next sections.

=head1 BerkeleyDB::Hash

Equivalent to calling B<db_open> with type B<DB_HASH> in Berkeley DB 2.x and
calling B<db_create> followed by B<DB-E<gt>open> with type B<DB_HASH> in
Berkeley DB 3.x or greater.
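
As a rough illustration of what opening a Hash database looks like from
Perl, here is a minimal sketch using the tied interface. The file name
C<fruit.db> and the keys and values are assumptions made up for this
sketch; C<DB_CREATE> is passed because the file does not exist beforehand.

```perl

    use strict;
    use warnings;
    use BerkeleyDB;

    # Hypothetical scratch file, used only by this sketch.
    my $filename = "fruit.db";
    unlink $filename;

    # Tie a Perl hash to a Hash database, creating the file if needed.
    my %h;
    tie %h, 'BerkeleyDB::Hash',
        -Filename => $filename,
        -Flags    => DB_CREATE
      or die "Cannot open $filename: $! $BerkeleyDB::Error\n";

    # Ordinary hash operations now read from and write to the database.
    $h{apple}  = "red";
    $h{banana} = "yellow";
    delete $h{apple};

    # Collect what is left; sorted because Hash order is unspecified.
    my @pairs = map { "$_ -> $h{$_}" } sort keys %h;
    print "$_\n" for @pairs;

    untie %h;
    unlink $filename;    # tidy up the scratch file

```

Checking the return value of C<tie> and reporting C<$BerkeleyDB::Error>
is the usual way to find out why an open failed.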

Two forms of constructor are supported:

  $db = new BerkeleyDB::Hash
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],
                # BerkeleyDB::Hash specific
                [ -Ffactor       => number,]
                [ -Nelem         => number,]
                [ -Hash          => code reference,]
                [ -DupCompare    => code reference,]

and this

  [$db =] tie %hash, 'BerkeleyDB::Hash',
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],
                # BerkeleyDB::Hash specific
                [ -Ffactor       => number,]
                [ -Nelem         => number,]
                [ -Hash          => code reference,]
                [ -DupCompare    => code reference,]

When the "tie" interface is used, reading from and writing to the database
is achieved via the tied hash. In this case the database operates like
a Perl associative array that happens to be stored on disk.

In addition to the high-level tied hash interface, it is possible to
make use of the underlying methods provided by Berkeley DB.

=head2 Options

In addition to the standard set of options (see L<COMMON OPTIONS>)
B<BerkeleyDB::Hash> supports these options:

=over 5

=item -Property

Used to specify extra flags when opening a database. The following
flags may be specified by bitwise OR'ing together one or more of the
following values:

B<DB_DUP>

When creating a new database, this flag enables the storing of duplicate
keys in the database.
If B<DB_DUPSORT> is not specified as well, the
duplicates are stored in the order they are created in the database.

B<DB_DUPSORT>

Enables the sorting of duplicate keys in the database. Ignored if
B<DB_DUP> isn't also specified.

=item -Ffactor

=item -Nelem

See the Berkeley DB documentation for details of these options.

=item -Hash

Allows you to provide a user defined hash function. If not specified,
a default hash function is used. Here is a template for a user-defined
hash function

    sub hash
    {
        my ($data) = shift ;
        ...
        # return the hash value for $data
        return $hash ;
    }

    tie %h, "BerkeleyDB::Hash",
        -Filename => $filename,
        -Hash     => \&hash,
        ...

See L<""> for an example.

=item -DupCompare

Used in conjunction with the B<DB_DUPSORT> flag.

    sub compare
    {
        my ($key1, $key2) = @_ ;
        ...
        # return  0 if $key1 eq $key2
        #        -1 if $key1 lt $key2
        #         1 if $key1 gt $key2
        return $key1 cmp $key2 ;
    }

    tie %h, "BerkeleyDB::Hash",
        -Filename   => $filename,
        -Property   => DB_DUP|DB_DUPSORT,
        -DupCompare => \&compare,
        ...

=back

=head2 Methods

B<BerkeleyDB::Hash> only supports the standard database methods.
See L<COMMON DATABASE METHODS>.

=head2 A Simple Tied Hash Example

## simpleHash

Here is the output:

    Banana Exists

    orange -> orange
    tomato -> red
    banana -> yellow

Note that, as with ordinary associative arrays, the keys retrieved from
a Hash database come back in an apparently random order.

=head2 Another Simple Hash Example

Do the same as the previous example but not using tie.

## simpleHash2

=head2 Duplicate keys

The code below is a variation on the examples above. This time the hash has
been inverted. The key this time is colour and the value is the fruit name.
The B<DB_DUP> flag has been specified to allow duplicates.

##dupHash

Here is the output:

    orange -> orange
    yellow -> banana
    red -> apple
    red -> tomato
    green -> banana
    green -> apple

=head2 Sorting Duplicate Keys

In the previous example, when there were duplicate keys, the values were
returned in the order in which they were stored. The code below is
identical to the previous example except the B<DB_DUPSORT> flag is
specified.

##dupSortHash

Notice that in the output below the duplicate values are sorted.

    orange -> orange
    yellow -> banana
    red -> apple
    red -> tomato
    green -> apple
    green -> banana

=head2 Custom Sorting Duplicate Keys

Another variation

TODO

=head2 Changing the hash

TODO

=head2 Using db_stat

TODO

=head1 BerkeleyDB::Btree

Equivalent to calling B<db_open> with type B<DB_BTREE> in Berkeley DB 2.x and
calling B<db_create> followed by B<DB-E<gt>open> with type B<DB_BTREE> in
Berkeley DB 3.x or greater.

Two forms of constructor are supported:

  $db = new BerkeleyDB::Btree
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],
                # BerkeleyDB::Btree specific
                [ -Minkey        => number,]
                [ -Compare       => code reference,]
                [ -DupCompare    => code reference,]
                [ -Prefix        => code reference,]

and this

  [$db =] tie %hash, 'BerkeleyDB::Btree',
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],
                # BerkeleyDB::Btree specific
                [ -Minkey        => number,]
                [ -Compare       => code reference,]
                [ -DupCompare    => code reference,]
                [ -Prefix        => code reference,]

=head2 Options

In addition to the standard set of options (see L<COMMON OPTIONS>)
B<BerkeleyDB::Btree> supports these options:

=over 5

=item -Property

Used to specify extra flags when opening a database. The following
flags may be specified by bitwise OR'ing together one or more of the
following values:

B<DB_DUP>

When creating a new database, this flag enables the storing of duplicate
keys in the database. If B<DB_DUPSORT> is not specified as well, the
duplicates are stored in the order they are created in the database.

B<DB_DUPSORT>

Enables the sorting of duplicate keys in the database. Ignored if
B<DB_DUP> isn't also specified.

=item Minkey

TODO

=item Compare

Allows you to override the default sort order used in the database. See
L<"Changing the sort order"> for an example.

    sub compare
    {
        my ($key1, $key2) = @_ ;
        ...
        # return  0 if $key1 eq $key2
        #        -1 if $key1 lt $key2
        #         1 if $key1 gt $key2
        return $key1 cmp $key2 ;
    }

    tie %h, "BerkeleyDB::Btree",
        -Filename => $filename,
        -Compare  => \&compare,
        ...

=item Prefix

    sub prefix
    {
        my ($key1, $key2) = @_ ;
        ...
        # return number of bytes of $key2 which are
        # necessary to determine that it is greater than $key1
        return $bytes ;
    }

    tie %h, "BerkeleyDB::Btree",
        -Filename => $filename,
        -Prefix   => \&prefix,
        ...

=item DupCompare

    sub compare
    {
        my ($key1, $key2) = @_ ;
        ...
        # return  0 if $key1 eq $key2
        #        -1 if $key1 lt $key2
        #         1 if $key1 gt $key2
        return $key1 cmp $key2 ;
    }

    tie %h, "BerkeleyDB::Btree",
        -Filename   => $filename,
        -DupCompare => \&compare,
        ...

=item set_bt_compress

Enables compression of the btree data. The callback interface is not
supported at present. Requires Berkeley DB 4.8 or better.

=back

=head2 Methods

B<BerkeleyDB::Btree> supports the following database methods.
See also L<COMMON DATABASE METHODS>.

All the methods below return 0 to indicate success.

=over 5

=item $status = $db->db_key_range($key, $less, $equal, $greater [, $flags])

Given a key, C<$key>, this method returns the proportion of keys less than
C<$key> in C<$less>, the proportion equal to C<$key> in C<$equal> and the
proportion greater than C<$key> in C<$greater>.

The proportion is returned as a double in the range 0.0 to 1.0.

=back

=head2 A Simple Btree Example

The code below is a simple example of using a btree database.
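
A minimal sketch of such a script follows. The file name C<tree> and the
values stored are assumptions made for illustration; only the keys
matter for the ordering.

```perl

    use strict;
    use warnings;
    use BerkeleyDB;

    # Hypothetical scratch file, used only by this sketch.
    my $filename = "tree";
    unlink $filename;

    my %h;
    tie %h, 'BerkeleyDB::Btree',
        -Filename => $filename,
        -Flags    => DB_CREATE
      or die "Cannot open $filename: $! $BerkeleyDB::Error\n";

    # Add a few key/value pairs, then delete one of them.
    $h{'Wall'}  = 'Larry';
    $h{'Smith'} = 'John';
    $h{'mouse'} = 'mickey';
    $h{'duck'}  = 'donald';
    delete $h{'duck'};

    # No explicit sort is needed: the btree hands the keys back
    # in the order it stores them.
    my @keys = keys %h;
    print "$_\n" for @keys;

    untie %h;
    unlink $filename;    # tidy up the scratch file

```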

## btreeSimple

Here is the output from the code above. The keys have been sorted using
Berkeley DB's default sorting algorithm.

    Smith
    Wall
    mouse

=head2 Changing the sort order

It is possible to supply your own sorting algorithm if the one that Berkeley
DB uses isn't suitable. The code below is identical to the previous example
except for the case insensitive compare function.

## btreeSortOrder

Here is the output from the code above.

    mouse
    Smith
    Wall

There are a few points to bear in mind if you want to change the
ordering in a BTREE database:

=over 5

=item 1.

The new compare function must be specified when you create the database.

=item 2.

You cannot change the ordering once the database has been created. Thus
you must use the same compare function every time you access the
database.

=back

=head2 Using db_stat

TODO

=head1 BerkeleyDB::Recno

Equivalent to calling B<db_open> with type B<DB_RECNO> in Berkeley DB 2.x and
calling B<db_create> followed by B<DB-E<gt>open> with type B<DB_RECNO> in
Berkeley DB 3.x or greater.

Two forms of constructor are supported:

  $db = new BerkeleyDB::Recno
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],
                # BerkeleyDB::Recno specific
                [ -Delim         => byte,]
                [ -Len           => number,]
                [ -Pad           => byte,]
                [ -Source        => filename,]

and this

  [$db =] tie @array, 'BerkeleyDB::Recno',
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],
                # BerkeleyDB::Recno specific
                [ -Delim         => byte,]
                [ -Len           => number,]
                [ -Pad           => byte,]
                [ -Source        => filename,]

=head2 A Recno Example

Here is a simple example that uses RECNO (if you are using a version
of Perl earlier than 5.004_57 this example won't work -- see
L<Extra RECNO Methods> for a workaround).

## simpleRecno

Here is the output from the script:

    The array contains 5 entries
    popped black
    shifted white
    Element 1 Exists with value blue
    The last element is green
    The 2nd last element is yellow

=head1 BerkeleyDB::Queue

Equivalent to calling B<db_create> followed by B<DB-E<gt>open> with
type B<DB_QUEUE> in Berkeley DB 3.x or greater. This database format
isn't available if you use Berkeley DB 2.x.

Two forms of constructor are supported:

  $db = new BerkeleyDB::Queue
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],
                # BerkeleyDB::Queue specific
                [ -Len           => number,]
                [ -Pad           => byte,]
                [ -ExtentSize    => number, ]

and this

  [$db =] tie @array, 'BerkeleyDB::Queue',
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],
                # BerkeleyDB::Queue specific
                [ -Len           => number,]
                [ -Pad           => byte,]

=head1 BerkeleyDB::Unknown

This class is used to open an existing database.

Equivalent to calling B<db_open> with type B<DB_UNKNOWN> in Berkeley DB 2.x and
calling B<db_create> followed by B<DB-E<gt>open> with type B<DB_UNKNOWN> in
Berkeley DB 3.x or greater.

The constructor looks like this:

  $db = new BerkeleyDB::Unknown
                [ -Filename      => "filename", ]
                [ -Subname       => "sub-database name", ]
                [ -Flags         => flags,]
                [ -Property      => flags,]
                [ -Mode          => number,]
                [ -Cachesize     => number,]
                [ -Lorder        => number,]
                [ -Pagesize      => number,]
                [ -Env           => $env,]
                [ -Txn           => $txn,]
                [ -Encrypt       => { Password => "string",
                                      Flags    => number }, ],

=head2 An example

=head1 COMMON OPTIONS

All database access class constructors support the common set of
options defined below. All are optional.

=over 5

=item -Filename

The database filename. If no filename is specified, a temporary file will
be created and removed once the program terminates.

=item -Subname

Specifies the name of the sub-database to open.
This option is only valid if you are using Berkeley DB 3.x or greater.

=item -Flags

Specify how the database will be opened/created. The valid flags are:

B<DB_CREATE>

Create any underlying files, as necessary. If the files do not already
exist and the B<DB_CREATE> flag is not specified, the call will fail.

B<DB_NOMMAP>

Not supported by BerkeleyDB.

B<DB_RDONLY>

Opens the database in read-only mode.

B<DB_THREAD>

Not supported by BerkeleyDB.

B<DB_TRUNCATE>

If the database file already exists, remove all the data before
opening it.

=item -Mode

Determines the file protection when the database is created. Defaults
to 0666.

=item -Cachesize

=item -Lorder

=item -Pagesize

=item -Env

When working under a Berkeley DB environment, this parameter specifies
the B<BerkeleyDB::Env> object to use.

Defaults to no environment.

=item -Encrypt

If present, this parameter will enable encryption of all data before
it is written to the database. This parameter must be given a hash
reference. The format is shown below.

    -Encrypt => { -Password => "abc", Flags => DB_ENCRYPT_AES }

Valid values for the Flags are 0 or C<DB_ENCRYPT_AES>.

This option requires Berkeley DB 4.1 or better.

=item -Txn

TODO.

=back

=head1 COMMON DATABASE METHODS

All the database interfaces support the common set of methods defined
below.

All the methods below return 0 to indicate success.

=head2 $status = $db->db_get($key, $value [, $flags])

Given a key (C<$key>) this method reads the value associated with it
from the database. If it exists, the value read from the database is
returned in the C<$value> parameter.

The B<$flags> parameter is optional. If present, it must be set to B<one>
of the following values:

=over 5

=item B<DB_GET_BOTH>

When the B<DB_GET_BOTH> flag is specified, B<db_get> checks for the
existence of B<both> the C<$key> B<and> C<$value> in the database.

=item B<DB_SET_RECNO>

TODO.

=back

In addition, the following value may be set by bitwise OR'ing it into
the B<$flags> parameter:

=over 5

=item B<DB_RMW>

TODO

=back

The variant C<db_pget> allows you to query a secondary database:

    $status = $sdb->db_pget($skey, $pkey, $value);

using the key C<$skey> in the secondary db to look up C<$pkey> and C<$value>
from the primary db.

=head2 $status = $db->db_put($key, $value [, $flags])

Stores a key/value pair in the database.

The B<$flags> parameter is optional. If present, it must be set to B<one>
of the following values:

=over 5

=item B<DB_APPEND>

This flag is only applicable when accessing a B<BerkeleyDB::Recno>
database.

TODO.

=item B<DB_NOOVERWRITE>

If this flag is specified and C<$key> already exists in the database,
the call to B<db_put> will return B<DB_KEYEXIST>.

=back

=head2 $status = $db->db_del($key [, $flags])

Deletes a key/value pair in the database associated with C<$key>.
If duplicate keys are enabled in the database, B<db_del> will delete
B<all> key/value pairs with key C<$key>.

The B<$flags> parameter is optional and is currently unused.
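
The B<db_put>, B<db_get> and B<db_del> methods described above can be
combined as in the sketch below. It is only an illustration: the file
name C<fruit.db> and the keys and values are assumptions made up for
the example, and a Hash database is used simply because any of the
database classes would do.

```perl

    use strict;
    use warnings;
    use BerkeleyDB;

    # Hypothetical scratch file, used only by this sketch.
    my $filename = "fruit.db";
    unlink $filename;

    my $db = BerkeleyDB::Hash->new(
        -Filename => $filename,
        -Flags    => DB_CREATE,
    ) or die "Cannot open $filename: $! $BerkeleyDB::Error\n";

    # Store a pair; a status of 0 means success.
    my $status = $db->db_put("apple", "red");
    die "db_put failed: $status\n" if $status != 0;

    # With DB_NOOVERWRITE an existing key is left alone
    # and DB_KEYEXIST is returned instead.
    my $exists = $db->db_put("apple", "green", DB_NOOVERWRITE);
    print "apple is already present\n" if $exists == DB_KEYEXIST;

    # Read the value back; it is returned in the second argument.
    my $value;
    $status = $db->db_get("apple", $value);
    print "apple -> $value\n" if $status == 0;

    # Delete the key; a later db_get then reports DB_NOTFOUND.
    $db->db_del("apple");
    my $gone;
    my $after = $db->db_get("apple", $gone);
    print "apple has been deleted\n" if $after == DB_NOTFOUND;

    undef $db;           # closes the database
    unlink $filename;    # tidy up the scratch file

```

Comparing the status numerically against 0, C<DB_KEYEXIST> or
C<DB_NOTFOUND> keeps the three outcomes apart without parsing any
error text.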

=head2 $status = $env->stat_print([$flags])

Prints statistical information.

If the C<MsgFile> option is specified the output will be sent to the
file. Otherwise output is sent to standard output.

This option requires Berkeley DB 4.3 or better.

=head2 $status = $db->db_sync()

If any parts of the database are in memory, write them to the database.

=head2 $cursor = $db->db_cursor([$flags])

Creates a cursor object. This is used to access the contents of the
database sequentially. See L<CURSORS> for details of the methods
available when working with cursors.

The B<$flags> parameter is optional. If present it must be set to B<one>
of the following values:

=over 5

=item B<DB_RMW>

TODO.

=back

=head2 ($flag, $old_offset, $old_length) = $db->partial_set($offset, $length) ;

TODO

=head2 ($flag, $old_offset, $old_length) = $db->partial_clear() ;

TODO

=head2 $db->byteswapped()

TODO

=head2 $db->type()

Returns the type of the database. The possible return codes are B<DB_HASH>
for a B<BerkeleyDB::Hash> database, B<DB_BTREE> for a B<BerkeleyDB::Btree>
database and B<DB_RECNO> for a B<BerkeleyDB::Recno> database. This method
is typically used when a database has been opened with
B<BerkeleyDB::Unknown>.

=head2 $bool = $env->cds_enabled();

Returns true if the Berkeley DB environment C<$env> has been opened in
CDS mode.

=head2 $bool = $db->cds_enabled();

Returns true if the database C<$db> has been opened in CDS mode.

=head2 $lock = $db->cds_lock();

Creates a CDS write lock object C<$lock>.

It is a fatal error to attempt to create a cds_lock if the Berkeley DB
environment has not been opened in CDS mode.

=head2 $lock->cds_unlock();

Removes a CDS lock.
The destruction of the CDS lock object automatically
calls this method.

Note that if multiple CDS lock objects are created, the underlying write
lock will not be released until all CDS lock objects are either explicitly
unlocked with this method, or the CDS lock objects have been destroyed.

=head2 $ref = $db->db_stat()

Returns a reference to an associative array containing information about
the database. The keys of the associative array correspond directly to the
names of the fields defined in the Berkeley DB documentation. For example,
in the DB documentation, the field B<bt_version> stores the version of the
Btree database. Assuming you called B<db_stat> on a Btree database the
equivalent field would be accessed as follows:

    $version = $ref->{'bt_version'} ;

If you are using Berkeley DB 3.x or better, this method will work with
all database formats. When DB 2.x is used, it only works with
B<BerkeleyDB::Btree>.

=head2 $status = $db->status()

Returns the status of the last C<$db> method called.

=head2 $status = $db->truncate($count)

Truncates the database and returns the number of records deleted
in C<$count>.

=head2 $status = $db->compact($start, $stop, $c_data, $flags, $end);

Compacts the database C<$db>.

All the parameters are optional - if you only want to make use of some of
them, use C<undef> for those you don't want. Trailing unused parameters can
be omitted. For example, if you only want to use the C<$c_data> parameter to
set the C<compact_fillpercent>, write your code like this

    my %hash;
    $hash{compact_fillpercent} = 50;
    $db->compact(undef, undef, \%hash);

The parameters operate identically to the C equivalent of this method.
The C<$c_data> needs a bit of explanation - it must be a hash reference.
The values of the following keys can be set before calling C<compact> and
will affect the operation of the compaction.

=over 5

=item * compact_fillpercent

=item * compact_timeout

=back

The following keys, along with associated values, will be created in the
hash reference if the C<compact> operation was successful.

=over 5

=item * compact_deadlock

=item * compact_levels

=item * compact_pages_free

=item * compact_pages_examine

=item * compact_pages_truncated

=back

You need to be running Berkeley DB 4.4 or better if you want to make use of
C<compact>.

=head2 $status = $db->associate($secondary, \&key_callback)

Associate C<$db> with the secondary DB C<$secondary>.

New key/value pairs inserted to the database will be passed to the callback
which must set its third argument to the secondary key to allow lookup. If
an array reference is set, multiple secondary keys will be associated
with the primary database entry.

Data may be retrieved from the secondary database using C<db_pget> to also
obtain the primary key.

Secondary databases are maintained automatically.

=head2 $status = $db->associate_foreign($secondary, callback, $flags)

Associate a foreign key database C<$db> with the secondary DB
C<$secondary>.

The second parameter must be a reference to a sub or C<undef>.

The C<$flags> parameter must be either C<DB_FOREIGN_CASCADE>,
C<DB_FOREIGN_ABORT> or C<DB_FOREIGN_NULLIFY>.

When the flags parameter is C<DB_FOREIGN_NULLIFY> the second parameter is a
reference to a sub of the form

    sub foreign_cb
    {
        my $key = \$_[0];
        my $value = \$_[1];
        my $foreignkey = \$_[2];
        my $changed = \$_[3] ;

        # for ...
        # set $$value and set $$changed to 1

        return 0;
    }

    $foreign_db->associate_foreign($secondary, \&foreign_cb, DB_FOREIGN_NULLIFY);

=head1 CURSORS

A cursor is used whenever you want to access the contents of a database
in sequential order.
A cursor object is created with the C<db_cursor> method.

A cursor object has the following methods available:

=head2 $newcursor = $cursor->c_dup($flags)

Creates a duplicate of C<$cursor>. This method needs Berkeley DB 3.0.x or better.

The C<$flags> parameter is optional and can take the following value:

=over 5

=item DB_POSITION

When present this flag will position the new cursor at the same place as the
existing cursor.

=back

=head2 $status = $cursor->c_get($key, $value, $flags)

Reads a key/value pair from the database, returning the data in C<$key>
and C<$value>. The key/value pair actually read is controlled by the
C<$flags> parameter, which can take B<one> of the following values:

=over 5

=item B<DB_FIRST>

Set the cursor to point to the first key/value pair in the
database. Return the key/value pair in C<$key> and C<$value>.

=item B<DB_LAST>

Set the cursor to point to the last key/value pair in the database. Return
the key/value pair in C<$key> and C<$value>.

=item B<DB_NEXT>

If the cursor is already pointing to a key/value pair, it will be
incremented to point to the next key/value pair and return its contents.

If the cursor isn't initialised, B<DB_NEXT> works just like B<DB_FIRST>.

If the cursor is already positioned at the last key/value pair, B<c_get>
will return B<DB_NOTFOUND>.

=item B<DB_NEXT_DUP>

This flag is only valid when duplicate keys have been enabled in
a database.
If the cursor is already pointing to a key/value pair and the key of
the next key/value pair is identical, the cursor will be incremented to
point to it and their contents returned.

=item B<DB_PREV>

If the cursor is already pointing to a key/value pair, it will be
decremented to point to the previous key/value pair and return its
contents.

If the cursor isn't initialised, B<DB_PREV> works just like B<DB_LAST>.

If the cursor is already positioned at the first key/value pair, B<c_get>
will return B<DB_NOTFOUND>.

=item B<DB_CURRENT>

If the cursor has been set to point to a key/value pair, return their
contents.
If the key/value pair referenced by the cursor has been deleted, B<c_get>
will return B<DB_KEYEMPTY>.

=item B<DB_SET>

Set the cursor to point to the key/value pair referenced by B<$key>
and return the value in B<$value>.

=item B<DB_SET_RANGE>

This flag is a variation on the B<DB_SET> flag. As well as returning
the value, it also returns the key, via B<$key>.
When used with a B<BerkeleyDB::Btree> database the key matched by B<c_get>
will be the shortest key (in length) which is greater than or equal to
the key supplied, via B<$key>. This allows partial key searches.
See ??? for an example of how to use this flag.

=item B<DB_GET_BOTH>

Another variation on B<DB_SET>. This one returns both the key and
the value.

=item B<DB_SET_RECNO>

TODO.

=item B<DB_GET_RECNO>

TODO.

=back

In addition, the following value may be set by bitwise OR'ing it into
the B<$flags> parameter:

=over 5

=item B<DB_RMW>

TODO.

=back

=head2 $status = $cursor->c_put($key, $value, $flags)

Stores the key/value pair in the database.
The position that the data is
stored in the database is controlled by the C<$flags> parameter, which
must take B<one> of the following values:

=over 5

=item B<DB_AFTER>

When used with a Btree or Hash database, a duplicate of the key referenced
by the current cursor position will be created and the contents of
B<$value> will be associated with it - B<$key> is ignored.
The new key/value pair will be stored immediately after the current
cursor position.
Obviously the database has to have been opened with B<DB_DUP>.

When used with a Recno ... TODO

=item B<DB_BEFORE>

When used with a Btree or Hash database, a duplicate of the key referenced
by the current cursor position will be created and the contents of
B<$value> will be associated with it - B<$key> is ignored.
The new key/value pair will be stored immediately before the current
cursor position.
Obviously the database has to have been opened with B<DB_DUP>.

When used with a Recno ... TODO

=item B<DB_CURRENT>

If the cursor has been initialised, replace the value of the key/value
pair stored in the database with the contents of B<$value>.

=item B<DB_KEYFIRST>

Only valid with a Btree or Hash database. This flag is only really
used when duplicates are enabled in the database and sorted duplicates
haven't been specified.
In this case the key/value pair will be inserted as the first entry in
the duplicates for the particular key.

=item B<DB_KEYLAST>

Only valid with a Btree or Hash database. This flag is only really
used when duplicates are enabled in the database and sorted duplicates
haven't been specified.
In this case the key/value pair will be inserted as the last entry in
the duplicates for the particular key.

=back

=head2 $status = $cursor->c_del([$flags])

This method deletes the key/value pair associated with the current cursor
position. The cursor position will not be changed by this operation, so
any subsequent cursor operation must first initialise the cursor to
point to a valid key/value pair.

If the key/value pair associated with the cursor has already been
deleted, B<c_del> will return B<DB_KEYEMPTY>.

The B<$flags> parameter is not used at present.

=head2 $status = $cursor->c_count($cnt [, $flags])

Stores the number of duplicates at the current cursor position in B<$cnt>.

The B<$flags> parameter is not used at present. This method needs
Berkeley DB 3.1 or better.

=head2 $status = $cursor->status()

Returns the status of the last cursor method as a dual type.

=head2 $status = $cursor->c_pget() ;

See C<db_pget>

=head2 $status = $cursor->c_close()

Closes the cursor B<$cursor>.

=head2 Cursor Examples

TODO

Iterating from first to last, then in reverse.

examples of each of the flags.

=head1 JOIN

Join support for BerkeleyDB is in progress. Watch this space.

TODO

=head1 TRANSACTIONS

Transactions are created using the C<txn_begin> method on L<BerkeleyDB::Env>:

    my $txn = $env->txn_begin;

If this is a nested transaction, supply the parent transaction as an
argument:

    my $child_txn = $env->txn_begin($parent_txn);

Then in order to work with the transaction, you must set it as the current
transaction on the database handles you want to work with:

    $db->Txn($txn);

Or for multiple handles:

    $txn->Txn(@handles);

The current transaction is given by BerkeleyDB each time to the various BDB
operations.
In the C API it is required explicitly as an argument to every
operation.

To commit a transaction call the C<txn_commit> method on it:

    $txn->txn_commit;

and to roll back call C<txn_abort>:

    $txn->txn_abort;

After committing or aborting a child transaction you need to set the active
transaction again using C<Txn>.

=head1 Berkeley DB Concurrent Data Store (CDS)

The Berkeley DB I<Concurrent Data Store> (CDS) is a lightweight locking
mechanism that is useful in scenarios where transactions are overkill.

=head2 What is CDS?

The Berkeley DB CDS interface is a simple lightweight locking mechanism
that allows safe concurrent access to Berkeley DB databases. Your
application can have multiple reader and write processes, but Berkeley DB
will arrange it so that only one process can have a write lock against the
database at a time, i.e. multiple processes can read from a database
concurrently, but all write processes will be serialised.

=head2 Should I use it?

Whilst this simple locking model is perfectly adequate for some
applications, it will be too restrictive for others. Before deciding on
using CDS mode, you need to be sure that it is suitable for the expected
behaviour of your application.

The key features of this model are

=over 5

=item *

All write operations are serialised.

=item *

A write operation will block until all reads have finished.

=back

There are a few attributes of your application that you need to be
aware of before choosing to use CDS.

Firstly, if your application needs either recoverability or transaction
support, then CDS will not be suitable.

Next, what ratio of read operations to write operations will your
application have?

If it is carrying out mostly read operations, and very few writes, then CDS
may be appropriate.

What is the expected throughput of reads/writes in your application?

If your application does 90% writes and 10% reads, but on average you only
have a transaction every 5 seconds, then the fact that all writes are
serialised will not matter, because there will hardly ever be multiple
write processes blocking.

In summary CDS mode may be appropriate for your application if it performs
mostly reads and very few writes or there is a low throughput. Also, if
you do not need to be able to roll back a series of database operations if
an error occurs, then CDS is ok.

If any of these is not the case you will need to use Berkeley DB
transactions. That is outside the scope of this document.

=head2 Locking Used

Berkeley DB implements CDS mode using two kinds of lock behind the scenes -
namely read locks and write locks. A read lock allows multiple processes to
access the database for reading at the same time. A write lock will only
get access to the database when there are no read or write locks active.
The write lock will block until the process holding the lock releases it.

Multiple processes with read locks can all access the database at the same
time as long as no process has a write lock. A process with a write lock
can only access the database if there are no other active read or write
locks.

The majority of the time the Berkeley DB CDS mode will handle all locking
without your application having to do anything. There are a couple of
exceptions you need to be aware of though - these will be discussed in
L<Safely Updating a Record> and L<Implicit Cursors> below.

A Berkeley DB Cursor (created with C<< $db->db_cursor >>) will hold a
lock on the database until it is either explicitly closed or destroyed.
This means the lock has the potential to be long lived.

By default Berkeley DB cursors create a read lock, but it is possible to
create a cursor that holds a write lock, thus

    $cursor = $db->db_cursor(DB_WRITECURSOR);

Whilst either a read or write cursor is active, it will block any other
process that wants to write to the database.

To avoid blocking problems, only keep cursors open as long as they are
needed. The same is true when you use the C<db_cursor> method or the
C<cds_lock> method.

For full information on CDS see the "Berkeley DB Concurrent Data Store
applications" section in the Berkeley DB Reference Guide.

=head2 Opening a database for CDS

Here is the typical signature that is used when opening a database in CDS
mode.

    use BerkeleyDB ;

    my $env = new BerkeleyDB::Env
                  -Home   => "./home" ,
                  -Flags  => DB_CREATE| DB_INIT_CDB | DB_INIT_MPOOL
      or die "cannot open environment: $BerkeleyDB::Error\n";

    my $db  = new BerkeleyDB::Hash
                -Filename       => 'test1.db',
                -Flags          => DB_CREATE,
                -Env            => $env
      or die "cannot open database: $BerkeleyDB::Error\n";

or this, if you use the tied interface

    tie %hash, "BerkeleyDB::Hash",
                -Filename       => 'test2.db',
                -Flags          => DB_CREATE,
                -Env            => $env
      or die "cannot open database: $BerkeleyDB::Error\n";

The first thing to note is that you B<MUST> always use a Berkeley DB
environment if you want to use locking with Berkeley DB.

Remember that, apart from the actual database files you explicitly create
yourself, Berkeley DB will create a few behind the scenes to handle locking
- they usually have names like "__db.001". It is therefore a good idea to
use the C<-Home> option, unless you are happy for all these files to be
written in the current directory.

Next, remember to include the C<DB_CREATE> flag when opening the
environment for the first time. A common mistake is to forget to add this
option and then wonder why the application doesn't work.

Finally, it is vital that all processes that are going to access the
database files use the same Berkeley DB environment.

=head2 Safely Updating a Record

One of the main gotchas when using CDS is if you want to update a record in
a database, i.e. you want to retrieve a record from a database, modify it
in some way and put it back in the database.

For example, say you are writing a web application and you want to keep a
record of the number of times your site is accessed in a Berkeley DB
database. So your code will have a line of code like this (assume, of
course, that C<%hash> has been tied to a Berkeley DB database):

    $hash{Counter} ++ ;

That may look innocent enough, but there is a race condition lurking in
there. If I rewrite the line of code using the low-level Berkeley DB API,
which is what will actually be executed, the race condition may be more
apparent:

    $db->db_get("Counter", $value);
    ++ $value ;
    $db->db_put("Counter", $value);

Consider what happens behind the scenes when you execute the commands
above. Firstly, the existing value for the key "Counter" is fetched from
the database using C<db_get>. A read lock will be used for this part of the
update. The value is then incremented, and the new value is written back
to the database using C<db_put>. This time a write lock will be used.

Here's the problem - there is nothing to stop two (or more) processes
executing the read part at the same time. Remember multiple processes can
hold a read lock on the database at the same time. So both will fetch the
same value, let's say 7, from the database.
Both increment the value to 8
and attempt to write it to the database. Berkeley DB will ensure that only
one of the processes gets a write lock, while the other will be blocked. So
the process that happened to get the write lock will store the value 8 to
the database and release the write lock. Now the other process will be
unblocked, and it too will write the value 8 to the database. The result,
in this example, is we have missed a hit in the counter.

To deal with this kind of scenario, you need to make the update atomic. A
convenience method, called C<cds_lock>, is supplied with the BerkeleyDB
module for this purpose. Using C<cds_lock>, the counter update code can now
be rewritten thus (here C<$db> is the object returned by C<tie>):

    my $lk = $db->cds_lock() ;
    $hash{Counter} ++ ;
    $lk->cds_unlock;

or this, where scoping is used to limit the lifetime of the lock object

    {
        my $lk = $db->cds_lock() ;
        $hash{Counter} ++ ;
    }

Similarly, C<cds_lock> can be used with the native Berkeley DB API

    my $lk = $db->cds_lock() ;
    $db->db_get("Counter", $value);
    ++ $value ;
    $db->db_put("Counter", $value);
    $lk->cds_unlock;

The C<cds_lock> method will ensure that the current process has exclusive
access to the database until the lock is either explicitly released, via
the C<< $lk->cds_unlock() >> or by the lock object being destroyed.

If you are interested, all that C<cds_lock> does is open a "write" cursor.
This has the useful side-effect of holding a write-lock on the database
until the cursor is deleted. This is how you create a write-cursor

    $cursor = $db->db_cursor(DB_WRITECURSOR);

If you have instantiated multiple C<cds_lock> objects for one database
within a single process, that process will hold a write-lock on the
database until I<ALL> C<cds_lock> objects have been destroyed.
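
Putting the pieces of this section together, here is a minimal sketch of
a locked counter update. The directory F<./home>, the file F<counter.db>
and the helper C<bump_counter> are hypothetical names used only for
illustration; the handling of B<DB_NOTFOUND> for a counter that does not
exist yet is an assumption about what an application would want, not
something the module does for you.

    use strict ;
    use warnings ;
    use BerkeleyDB ;

    # "./home" and "counter.db" are hypothetical names for this sketch.
    my $home = "./home" ;
    mkdir $home unless -d $home ;

    my $env = new BerkeleyDB::Env
                -Home  => $home,
                -Flags => DB_CREATE | DB_INIT_CDB | DB_INIT_MPOOL
        or die "cannot open environment: $BerkeleyDB::Error\n" ;

    my $db = new BerkeleyDB::Hash
                -Filename => 'counter.db',
                -Flags    => DB_CREATE,
                -Env      => $env
        or die "cannot open database: $BerkeleyDB::Error\n" ;

    # Hypothetical helper: read, increment and write back one counter
    # while holding the CDS write lock for the whole update.
    sub bump_counter
    {
        my ($dbh, $key) = @_ ;

        my $lk = $dbh->cds_lock() ;

        my $value ;
        my $status = $dbh->db_get($key, $value) ;
        if ($status == DB_NOTFOUND) {
            $value = 0 ;            # first hit: start from zero
        }
        elsif ($status != 0) {
            die "db_get failed: $BerkeleyDB::Error\n" ;
        }

        $dbh->db_put($key, $value + 1) == 0
            or die "db_put failed: $BerkeleyDB::Error\n" ;

        # $lk is destroyed when the sub returns, releasing the lock.
        return $value + 1 ;
    }

    my $before = bump_counter($db, "Counter") ;
    my $after  = bump_counter($db, "Counter") ;
    print "counter went from $before to $after\n" ;

Keeping the lock object lexical to the helper means the write lock is
held only for the duration of one update.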

As with all write-cursors, you should try to limit the scope of the
C<cds_lock> to as short a time as possible. Remember the complete database
will be locked to other processes whilst the write lock is in place.

=head2 Cannot write with a read cursor while a write cursor is active

This issue is easier to demonstrate with an example, so consider the code
below. The intention of the code is to increment the values of all the
elements in a database by one.

    # Assume $db is a database opened in a CDS environment.

    # Create a write-lock
    my $lock = $db->db_cursor(DB_WRITECURSOR);
    # or
    # my $lock = $db->cds_lock();

    my $cursor = $db->db_cursor();

    # Now loop through the database, and increment
    # each value using c_put.
    while ($cursor->c_get($key, $value, DB_NEXT) == 0)
    {
        $cursor->c_put($key, $value+1, DB_CURRENT) == 0
            or die "$BerkeleyDB::Error\n";
    }

When this code is run, it will fail on the C<c_put> line with this error

    Write attempted on read-only cursor

The read cursor has automatically disallowed a write operation to prevent a
deadlock.

So the rule is -- you B<CANNOT> carry out a write operation using a
read-only cursor (i.e. you cannot use C<c_put> or C<c_del>) whilst another
write-cursor is already active.

The workaround for this issue is to just use C<db_put> instead of C<c_put>,
like this

    # Assume $db is a database opened in a CDS environment.

    # Create a write-lock
    my $lock = $db->db_cursor(DB_WRITECURSOR);
    # or
    # my $lock = $db->cds_lock();

    my $cursor = $db->db_cursor();

    # Now loop through the database, and increment
    # each value using db_put.
    while ($cursor->c_get($key, $value, DB_NEXT) == 0)
    {
        $db->db_put($key, $value+1) == 0
            or die "$BerkeleyDB::Error\n";
    }

=head2 Implicit Cursors

All Berkeley DB cursors will hold either a read lock or a write lock on the
database for the existence of the cursor. In order to prevent blocking of
other processes you need to make sure that they are not long lived.

There are a number of instances where the Perl interface to Berkeley DB
will create a cursor behind the scenes without you being aware of it. Most
of these are very short-lived and will not affect the running of your
script, but there are a few notable exceptions.

Consider this snippet of code

    while (my ($k, $v) = each %hash)
    {
        # do something
    }

To implement the "each" functionality, a read cursor will be created behind
the scenes to allow you to iterate through the tied hash, C<%hash>. While
that cursor is still active, a read lock will obviously be held against the
database. If your application has any other writing processes, these will
be blocked until the read cursor is closed. That won't happen until the
loop terminates.

To avoid blocking problems, only keep cursors open as long as they are
needed. The same is true when you use the C<db_cursor> method or the
C<cds_lock> method.

The locking behaviour of the C<values> or C<keys> functions, shown below,
is subtly different.

    foreach my $k (keys %hash)
    {
        # do something
    }

    foreach my $v (values %hash)
    {
        # do something
    }

Just as in the C<each> function, a read cursor will be created to iterate
over the database in both of these cases. Where C<keys> and C<values>
differ is the place where the cursor carries out the iteration through the
database.
Whilst C<each> carried out a single iteration every time it was
invoked, the C<keys> and C<values> functions will iterate through the
entire database in one go -- the complete database will be read into memory
before the first iteration of the loop.

Apart from the fact that a read lock will be held for the amount of time
required to iterate through the database, the use of C<keys> and C<values>
is B<not> recommended because it will result in the complete database being
read into memory.

=head2 Avoiding Deadlock with multiple databases

If your CDS application uses multiple database files, and you need to write
to more than one of them, you need to be careful you don't create a
deadlock.

For example, say you have two databases, D1 and D2, and two processes, P1
and P2. Assume you want to write a record to each database. If P1 writes
the records to the databases in the order D1, D2 while process P2 writes
the records in the order D2, D1, there is the potential for a deadlock to
occur.

This scenario can be avoided by either always acquiring the write locks in
exactly the same order in your application code, or by using the
C<DB_CDB_ALLDB> flag when opening the environment. This flag will make a
write-lock apply to all the databases in the environment.

Add example here

=head1 DBM Filters

A DBM Filter is a piece of code that is used when you I<always>
want to make the same transformation to all keys and/or values in a DBM
database. All of the database classes (BerkeleyDB::Hash,
BerkeleyDB::Btree and BerkeleyDB::Recno) support DBM Filters.

There are four methods associated with DBM Filters. All work
identically, and each is used to install (or uninstall) a single DBM
Filter. Each expects a single parameter, namely a reference to a sub.
The only difference between them is the place that the filter is
installed.

To summarise:

=over 5

=item B<filter_store_key>

If a filter has been installed with this method, it will be invoked
every time you write a key to a DBM database.

=item B<filter_store_value>

If a filter has been installed with this method, it will be invoked
every time you write a value to a DBM database.

=item B<filter_fetch_key>

If a filter has been installed with this method, it will be invoked
every time you read a key from a DBM database.

=item B<filter_fetch_value>

If a filter has been installed with this method, it will be invoked
every time you read a value from a DBM database.

=back

You can use any combination of the methods, from none, to all four.

All filter methods return the existing filter, if present, or C<undef>
if not.

To delete a filter pass C<undef> to it.

=head2 The Filter

When each filter is called by Perl, a local copy of C<$_> will contain
the key or value to be filtered. Filtering is achieved by modifying
the contents of C<$_>. The return code from the filter is ignored.

=head2 An Example -- the NULL termination problem.

Consider the following scenario. You have a DBM database that you need
to share with a third-party C application. The C application assumes
that I<all> keys and values are NULL terminated. Unfortunately when
Perl writes to DBM databases it doesn't use NULL termination, so your
Perl application will have to manage NULL termination itself. When you
write to the database you will have to use something like this:

    $hash{"$key\0"} = "$value\0" ;

Similarly the NULL needs to be taken into account when you are considering
the length of existing keys/values.
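
To make the point about lengths concrete, here is a small pure-Perl
sketch; no database is involved, and the variable names are purely
illustrative. It shows that the terminator counts towards C<length>, and
one way of stripping it again.

    use strict ;
    use warnings ;

    my $key    = "abc" ;
    my $stored = "$key\0" ;     # the form a C application expects

    # The trailing NULL is an ordinary byte as far as Perl is concerned,
    # so the stored form is one byte longer than the key itself.
    my $plain_length  = length $key ;
    my $stored_length = length $stored ;
    print "plain: $plain_length, stored: $stored_length\n" ;

    # Strip a single trailing NULL to get the Perl-side key back.
    (my $clean = $stored) =~ s/\0$// ;
    print "round trip ok\n" if $clean eq $key ;

Doing this by hand at every read and write is easy to get wrong.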

It would be much better if you could ignore the NULL terminations issue
in the main application code and have a mechanism that automatically
added the terminating NULL to all keys and values whenever you write to
the database and have them removed when you read from the database. As I'm
sure you have already guessed, this is a problem that DBM Filters can
fix very easily.

## nullFilter

Hopefully the contents of each of the filters should be
self-explanatory. Both "fetch" filters remove the terminating NULL,
and both "store" filters add a terminating NULL.

=head2 Another Example -- Key is a C int.

Here is another real-life example. By default, whenever Perl writes to
a DBM database it always writes the key and value as strings. So when
you use this:

    $hash{12345} = "something" ;

the key 12345 will get stored in the DBM database as the 5 byte string
"12345". If you actually want the key to be stored in the DBM database
as a C int, you will have to use C<pack> when writing, and C<unpack>
when reading.

Here is a DBM Filter that does it:

## intFilter

This time only two filters have been used -- we only need to manipulate
the contents of the key, so it wasn't necessary to install any value
filters.

=head1 Using BerkeleyDB with MLDBM

Both BerkeleyDB::Hash and BerkeleyDB::Btree can be used with the MLDBM
module. The code fragment below shows how to associate MLDBM with
BerkeleyDB::Btree. To use BerkeleyDB::Hash just replace
BerkeleyDB::Btree with BerkeleyDB::Hash.

    use strict ;
    use BerkeleyDB ;
    use MLDBM qw(BerkeleyDB::Btree) ;
    use Data::Dumper;

    my $filename = 'testmldbm' ;
    my %o ;

    # start from a fresh database file
    unlink $filename ;
    tie %o, 'MLDBM', -Filename => $filename,
                     -Flags    => DB_CREATE
                    or die "Cannot open database '$filename': $!\n";

See the MLDBM documentation for information on how to use the module
and for details of its limitations.

=head1 EXAMPLES

TODO.

=head1 HINTS & TIPS

=head2 Sharing Databases With C Applications

There is no technical reason why a Berkeley DB database cannot be
shared by both a Perl and a C application.

The vast majority of problems that are reported in this area boil down
to the fact that C strings are NULL terminated, whilst Perl strings
are not. See L<An Example -- the NULL termination problem.> in the DBM
FILTERS section for a generic way to work around this problem.

=head2 The untie Gotcha

TODO

=head1 COMMON QUESTIONS

This section attempts to answer some of the more common questions that
I get asked.

=head2 Relationship with DB_File

Before Berkeley DB 2.x was written there was only one Perl module that
interfaced to Berkeley DB. That module is called B<DB_File>. Although
B<DB_File> can be built with Berkeley DB 1.x, 2.x, 3.x or 4.x, it only
provides an interface to the functionality available in Berkeley DB
1.x. That means that it doesn't support transactions, locking or any of
the other new features available in DB 2.x or better.

=head2 How do I store Perl data structures with BerkeleyDB?

See L<Using BerkeleyDB with MLDBM>.

=head1 HISTORY

See the Changes file.


=head1 AVAILABILITY

The most recent version of B<BerkeleyDB> can always be found
on CPAN (see L<perlmod/CPAN> for details), in the directory
F<modules/by-module/BerkeleyDB>.

The official web site for Berkeley DB is F<http://www.oracle.com/technology/products/berkeley-db/db/index.html>.

=head1 COPYRIGHT

Copyright (c) 1997-2004 Paul Marquess. All rights reserved. This program
is free software; you can redistribute it and/or modify it under the
same terms as Perl itself.

Although B<BerkeleyDB> is covered by the Perl license, the library it
makes use of, namely Berkeley DB, is not. Berkeley DB has its own
copyright and its own license. Please take the time to read it.

Here are a few words taken from the Berkeley DB FAQ (at
F<http://www.oracle.com/technology/products/berkeley-db/db/index.html>) regarding the license:

    Do I have to license DB to use it in Perl scripts?

    No. The Berkeley DB license requires that software that uses
    Berkeley DB be freely redistributable. In the case of Perl, that
    software is Perl, and not your scripts. Any Perl scripts that you
    write are your property, including scripts that make use of Berkeley
    DB. Neither the Perl license nor the Berkeley DB license
    place any restriction on what you may do with them.

If you are in any doubt about the license situation, contact either the
Berkeley DB authors or the author of BerkeleyDB.
See L<"AUTHOR"> for details.

=head1 AUTHOR

Paul Marquess E<lt>pmqs@cpan.orgE<gt>.

=head1 SEE ALSO

perl(1), DB_File, Berkeley DB.

=cut