structure revision 256281
1This file describes the design, layouts, and file formats of a 2libsvn_fs_fs repository. 3 4Design 5------ 6 7In FSFS, each committed revision is represented as an immutable file 8containing the new node-revisions, contents, and changed-path 9information for the revision, plus a second, changeable file 10containing the revision properties. 11 12In contrast to the BDB back end, the contents of recent revision of 13files are stored as deltas against earlier revisions, instead of the 14other way around. This is less efficient for common-case checkouts, 15but brings greater simplicity and robustness, as well as the 16flexibility to make commits work without write access to existing 17revisions. Skip-deltas and delta combination mitigate the checkout 18cost. 19 20In-progress transactions are represented with a prototype rev file 21containing only the new text representations of files (appended to as 22changed file contents come in), along with a separate file for each 23node-revision, directory representation, or property representation 24which has been changed or added in the transaction. During the final 25stage of the commit, these separate files are marshalled onto the end 26of the prototype rev file to form the immutable revision file. 27 28Layout of the FS directory 29-------------------------- 30 31The layout of the FS directory (the "db" subdirectory of the 32repository) is: 33 34 revs/ Subdirectory containing revs 35 <shard>/ Shard directory, if sharding is in use (see below) 36 <revnum> File containing rev <revnum> 37 <shard>.pack/ Pack directory, if the repo has been packed (see below) 38 pack Pack file, if the repository has been packed (see below) 39 manifest Pack manifest file, if a pack file exists (see below) 40 revprops/ Subdirectory containing rev-props 41 <shard>/ Shard directory, if sharding is in use (see below) 42 <revnum> File containing rev-props for <revnum> 43 <shard>.pack/ Pack directory, if the repo has been packed (see below) 44 <rev>.<count> Pack file, if the repository has been packed (see below) 45 manifest Pack manifest file, if a pack file exists (see below) 46 revprops.db SQLite database of the packed revision properties 47 transactions/ Subdirectory containing transactions 48 <txnid>.txn/ Directory containing transaction <txnid> 49 txn-protorevs/ Subdirectory containing transaction proto-revision files 50 <txnid>.rev Proto-revision file for transaction <txnid> 51 <txnid>.rev-lock Write lock for proto-rev file 52 txn-current File containing the next transaction key 53 locks/ Subdirectory containing locks 54 <partial-digest>/ Subdirectory named for first 3 letters of an MD5 digest 55 <digest> File containing locks/children for path with <digest> 56 node-origins/ Lazy cache of origin noderevs for nodes 57 <partial-nodeid> File containing noderev ID of origins of nodes 58 current File specifying current revision and next node/copy id 59 fs-type File identifying this filesystem as an FSFS filesystem 60 write-lock Empty file, locked to serialise writers 61 txn-current-lock Empty file, locked to serialise 'txn-current' 62 uuid File containing the UUID of the repository 63 format File containing the format number of this filesystem 64 fsfs.conf Configuration file 65 min-unpacked-rev File containing the oldest revision not in a pack file 66 min-unpacked-revprop File containing the oldest revision of unpacked revprop 67 rep-cache.db SQLite database mapping rep checksums to locations 68 69Files in the revprops directory are in the hash dump format used by 70svn_hash_write. 71 72The format of the "current" file is: 73 74 * Format 3 and above: a single line of the form 75 "<youngest-revision>\n" giving the youngest revision for the 76 repository. 77 78 * Format 2 and below: a single line of the form "<youngest-revision> 79 <next-node-id> <next-copy-id>\n" giving the youngest revision, the 80 next unique node-ID, and the next unique copy-ID for the 81 repository. 82 83The "write-lock" file is an empty file which is locked before the 84final stage of a commit and unlocked after the new "current" file has 85been moved into place to indicate that a new revision is present. It 86is also locked during a revprop propchange while the revprop file is 87read in, mutated, and written out again. Note that readers are never 88blocked by any operation - writers must ensure that the filesystem is 89always in a consistent state. 90 91The "txn-current" file is a file with a single line of text that 92contains only a base-36 number. The current value will be used in the 93next transaction name, along with the revision number the transaction 94is based on. This sequence number ensures that transaction names are 95not reused, even if the transaction is aborted and a new transaction 96based on the same revision is begun. The only operation that FSFS 97performs on this file is "get and increment"; the "txn-current-lock" 98file is locked during this operation. 99 100"fsfs.conf" is a configuration file in the standard Subversion/Python 101config format. It is automatically generated when you create a new 102repository; read the generated file for details on what it controls. 103 104When representation sharing is enabled, the filesystem tracks 105representation checksum and location mappings using a SQLite database in 106"rep-cache.db". The database has a single table, which stores the sha1 107hash text as the primary key, mapped to the representation revision, offset, 108size and expanded size. This file is only consulted during writes and never 109during reads. Consequently, it is not required, and may be removed at an 110abritrary time, with the subsequent loss of rep-sharing capabilities for 111revisions written thereafter. 112 113Filesystem formats 114------------------ 115 116The "format" file defines what features are permitted within the 117filesystem, and indicates changes that are not backward-compatible. 118It serves the same purpose as the repository file of the same name. 119 120The filesystem format file was introduced in Subversion 1.2, and so 121will not be present if the repository was created with an older 122version of Subversion. An absent format file should be interpreted as 123indicating a format 1 filesystem. 124 125The format file is a single line of the form "<format number>\n", 126followed by any number of lines specifying 'format options' - 127additional information about the filesystem's format. Each format 128option line is of the form "<option>\n" or "<option> <parameters>\n". 129 130Clients should raise an error if they encounter an option not 131permitted by the format number in use. 132 133The formats are: 134 135 Format 1, understood by Subversion 1.1+ 136 Format 2, understood by Subversion 1.4+ 137 Format 3, understood by Subversion 1.5+ 138 Format 4, understood by Subversion 1.6+ 139 Format 5, understood by Subversion 1.7-dev, never released 140 Format 6, understood by Subversion 1.8 141 142The differences between the formats are: 143 144Delta representation in revision files 145 Format 1: svndiff0 only 146 Formats 2+: svndiff0 or svndiff1 147 148Format options 149 Formats 1-2: none permitted 150 Format 3+: "layout" option 151 152Transaction name reuse 153 Formats 1-2: transaction names may be reused 154 Format 3+: transaction names generated using txn-current file 155 156Location of proto-rev file and its lock 157 Formats 1-2: transactions/<txnid>/rev and 158 transactions/<txnid>/rev-lock. 159 Format 3+: txn-protorevs/<txnid>.rev and 160 txn-protorevs/<txnid>.rev-lock. 161 162Node-ID and copy-ID generation 163 Formats 1-2: Node-IDs and copy-IDs are guaranteed to form a 164 monotonically increasing base36 sequence using the "current" 165 file. 166 Format 3+: Node-IDs and copy-IDs use the new revision number to 167 ensure uniqueness and the "current" file just contains the 168 youngest revision. 169 170Mergeinfo metadata: 171 Format 1-2: minfo-here and minfo-count node-revision fields are not 172 stored. svn_fs_get_mergeinfo returns an error. 173 Format 3+: minfo-here and minfo-count node-revision fields are 174 maintained. svn_fs_get_mergeinfo works. 175 176Revision changed paths list: 177 Format 1-3: Does not contain the node's kind. 178 Format 4+: Contains the node's kind. 179 180Shard packing: 181 Format 4: Applied to revision data only. 182 Format 5: Revprops would be packed independently of revision data. 183 Format 6+: Applied equally to revision data and revprop data 184 (i.e. same min packed revision) 185 186# Incomplete list. See SVN_FS_FS__MIN_*_FORMAT 187 188 189Filesystem format options 190------------------------- 191 192Currently, the only recognised format option is "layout", which 193specifies the paths that will be used to store the revision files and 194revision property files. 195 196The "layout" option is followed by the name of the filesystem layout 197and any required parameters. The default layout, if no "layout" 198keyword is specified, is the 'linear' layout. 199 200The known layouts, and the parameters they require, are as follows: 201 202"linear" 203 Revision files and rev-prop files are named after the revision they 204 represent, and are placed directly in the revs/ and revprops/ 205 directories. r1234 will be represented by the revision file 206 revs/1234 and the rev-prop file revprops/1234. 207 208"sharded <max-files-per-directory>" 209 Revision files and rev-prop files are named after the revision they 210 represent, and are placed in a subdirectory of the revs/ and 211 revprops/ directories named according to the 'shard' they belong to. 212 213 Shards are numbered from zero and contain between one and the 214 maximum number of files per directory specified in the layout's 215 parameters. 216 217 For the "sharded 1000" layout, r1234 will be represented by the 218 revision file revs/1/1234 and rev-prop file revprops/1/1234. The 219 revs/0/ directory will contain revisions 0-999, revs/1/ will contain 220 1000-1999, and so on. 221 222Packing revisions 223----------------- 224 225A filesystem can optionally be "packed" to conserve space on disk. The 226packing process concatenates all the revision files in each full shard to 227create pack files. A manifest file is also created for each shard which 228records the indexes of the corresponding revision files in the pack file. 229In addition, the original shard is removed, and reads are redirected to the 230pack file. 231 232The manifest file consists of a list of offsets, one for each revision in the 233pack file. The offsets are stored as ASCII decimal, and separated by a newline 234character. 235 236Packing revision properties (format 5: SQLite) 237--------------------------- 238 239This was supported by 1.7-dev builds but never included in a blessed release. 240 241See r1143829 of this file: 242http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_fs_fs/structure?view=markup&pathrev=1143829 243 244 245Packing revision properties (format 6+) 246--------------------------- 247 248Similarly to the revision data, packing will concatenate multiple 249revprops into a single file. Since they are mutable data, we put an 250upper limit to the size of these files: We will concatenate the data 251up to the limit and then use a new file for the following revisions. 252 253The limit can be set and changed at will in the configuration file. 254It is 64kB by default. Because a pack file must contain at least one 255complete property list, files containing just one revision may exceed 256that limit. 257 258Furthermore, pack files can be compressed which saves about 75% of 259disk space. A configuration file flag enables the compression; it is 260off by default and may be switched on and off at will. The pack size 261limit is always applied to the uncompressed data. For this reason, 262the default is 256kB while compression has been enabled. 263 264Files are named after their start revision as "<rev>.<counter>" where 265counter will be increased whenever we rewrite a pack file due to a 266revprop change. The manifest file contains the list of pack file 267names, one line for each revision. 268 269Many tools track repository global data in revision properties at 270revision 0. To minimize I/O overhead for those applications, we 271will never pack that revision, i.e. its data is always being kept 272in revprops/0/0. 273 274Pack file format 275 276 Top level: <packed container> 277 278 We always apply data compression to the pack file - using the 279 SVN_DELTA_COMPRESSION_LEVEL_NONE level if compression is disabled. 280 (Note that compression at SVN_DELTA_COMPRESSION_LEVEL_NONE is not 281 a no-op stream transformation although most of the data will remain 282 human readable.) 283 284 container := header '\n' (revprops)+ 285 header := start_rev '\n' rev_count '\n' (size '\n')+ 286 287 All numbers in the header are given as ASCII decimals. rev_count 288 is the number of revisions packed into this container. There must 289 be exactly as many "size" and serialized "revprops". The "size" 290 values in the list are the length in bytes of the serialized 291 revprops of the respective revision. 292 293Writing to packed revprops 294 295 The old pack file is being read and the new revprops serialized. 296 If they fit into the same pack file, a temp file with the new 297 content gets written and moved into place just like an non-packed 298 revprop file would. No name change or manifest update required. 299 300 If they don't fit into the same pack file, i.e. exceed the pack 301 size limit, the pack will be split into 2 or 3 new packs just 302 before and / or after the modified revision. 303 304 In the current implementation, they will never be merged again. 305 To minimize fragmentation, the initial packing process will only 306 use about 90% of the limit, i.e. leave some room for growth. 307 308 When a pack file gets split, its counter is being increased 309 creating a new file and leaving the old content in place and 310 available for concurrent readers. Only after the new manifest 311 file got moved into place, will the old pack files be deleted. 312 313 Write access to revprops is being serialized by the global 314 filesystem write lock. We only need to build a few retries into 315 the reader code to gracefully handle manifest changes and pack 316 file deletions. 317 318 319Node-revision IDs 320----------------- 321 322A node-rev ID consists of the following three fields: 323 324 node_revision_id ::= node_id '.' copy_id '.' txn_id 325 326At this level, the form of the ID is the same as for BDB - see the 327section called "ID's" in <../libsvn_fs_base/notes/structure>. 328 329In order to support efficient lookup of node-revisions by their IDs 330and to simplify the allocation of fresh node-IDs during a transaction, 331we treat the fields of a node-rev ID in new and interesting ways. 332 333Within a new transaction: 334 335 New node-revision IDs assigned within a transaction have a txn-id 336 field of the form "t<txnid>". 337 338 When a new node-id or copy-id is assigned in a transaction, the ID 339 used is a "_" followed by a base36 number unique to the transaction. 340 341Within a revision: 342 343 Within a revision file, node-revs have a txn-id field of the form 344 "r<rev>/<offset>", to support easy lookup. The <offset> is the (ASCII 345 decimal) number of bytes from the start of the revision file to the 346 start of the node-rev. 347 348 During the final phase of a commit, node-revision IDs are rewritten 349 to have repository-wide unique node-ID and copy-ID fields, and to have 350 "r<rev>/<offset>" txn-id fields. 351 352 In Format 3 and above, this uniqueness is done by changing a temporary 353 id of "_<base36>" to "<base36>-<rev>". Note that this means that the 354 originating revision of a line of history or a copy can be determined 355 by looking at the node ID. 356 357 In Format 2 and below, the "current" file contains global base36 358 node-ID and copy-ID counters; during the commit, the counter value is 359 added to the transaction-specific base36 ID, and the value in 360 "current" is adjusted. 361 362 (It is legal for Format 3 repositories to contain Format 2-style IDs; 363 this just prevents I/O-less node-origin-rev lookup for those nodes.) 364 365The temporary assignment of node-ID and copy-ID fields has 366implications for svn_fs_compare_ids and svn_fs_check_related. The ID 367_1.0.t1 is not related to the ID _1.0.t2 even though they have the 368same node-ID, because temporary node-IDs are restricted in scope to 369the transactions they belong to. 370 371There is a lazily created cache mapping from node-IDs to the full 372node-revision ID where they are created. This is in the node-origins 373directory; the file name is the node-ID without its last character (or 374"0" for single-character node IDs) and the contents is a serialized 375hash mapping from node-ID to node-revision ID. This cache is only 376used for node-IDs of the pre-Format 3 style. 377 378Copy-IDs and copy roots 379----------------------- 380 381Copy-IDs are assigned in the same manner as they are in the BDB 382implementation: 383 384 * A node-rev resulting from a creation operation (with no copy 385 history) receives the copy-ID of its parent directory. 386 387 * A node-rev resulting from a copy operation receives a fresh 388 copy-ID, as one would expect. 389 390 * A node-rev resulting from a modification operation receives a 391 copy-ID depending on whether its predecessor derives from a 392 copy operation or whether it derives from a creation operation 393 with no intervening copies: 394 395 - If the predecessor does not derive from a copy, the new 396 node-rev receives the copy-ID of its parent directory. If the 397 node-rev is being modified through its created-path, this will 398 be the same copy-ID as the predecessor node-rev has; however, 399 if the node-rev is being modified through a copied ancestor 400 directory (i.e. we are performing a "lazy copy"), this will be 401 a different copy-ID. 402 403 - If the predecessor derives from a copy and the node-rev is 404 being modified through its created-path, the new node-rev 405 receives the copy-ID of the predecessor. 406 407 - If the predecessor derives from a copy and the node-rev is not 408 being modified through its created path, the new node-rev 409 receives a fresh copy-ID. This is called a "soft copy" 410 operation, as distinct from a "true copy" operation which was 411 actually requested through the svn_fs interface. Soft copies 412 exist to ensure that the same <node-ID,copy-ID> pair is not 413 used twice within a transaction. 414 415Unlike the BDB implementation, we do not have a "copies" table. 416Instead, each node-revision record contains a "copyroot" field 417identifying the node-rev resulting from the true copy operation most 418proximal to the node-rev. If the node-rev does not itself derive from 419a copy operation, then the copyroot field identifies the copy of an 420ancestor directory; if no ancestor directories derive from a copy 421operation, then the copyroot field identifies the root directory of 422rev 0. 423 424Revision file format 425-------------------- 426 427A revision file contains a concatenation of various kinds of data: 428 429 * Text and property representations 430 * Node-revisions 431 * The changed-path data 432 * Two offsets at the very end 433 434A representation begins with a line containing either "PLAIN\n" or 435"DELTA\n" or "DELTA <rev> <offset> <length>\n", where <rev>, <offset>, 436and <length> give the location of the delta base of the representation 437and the amount of data it contains (not counting the header or 438trailer). If no base location is given for a delta, the base is the 439empty stream. After the initial line comes raw svndiff data, followed 440by a cosmetic trailer "ENDREP\n". 441 442If the representation is for the text contents of a directory node, 443the expanded contents are in hash dump format mapping entry names to 444"<type> <id>" pairs, where <type> is "file" or "dir" and <id> gives 445the ID of the child node-rev. 446 447If a representation is for a property list, the expanded contents are 448in the form of a dumped hash map mapping property names to property 449values. 450 451The marshalling syntax for node-revs is a series of fields terminated 452by a blank line. Fields have the syntax "<name>: <value>\n", where 453<name> is a symbolic field name (each symbolic name is used only once 454in a given node-rev) and <value> is the value data. Unrecognized 455fields are ignored, for extensibility. The following fields are 456defined: 457 458 id The ID of the node-rev 459 type "file" or "dir" 460 pred The ID of the predecessor node-rev 461 count Count of node-revs since the base of the node 462 text "<rev> <offset> <length> <size> <digest>" for text rep 463 props "<rev> <offset> <length> <size> <digest>" for props rep 464 <rev> and <offset> give location of rep 465 <length> gives length of rep, sans header and trailer 466 <size> gives size of expanded rep; may be 0 if equal 467 to the length 468 <digest> gives hex MD5 digest of expanded rep 469 ### in formats >=4, also present: 470 <sha1-digest> gives hex SHA1 digest of expanded rep 471 <uniquifier> see representation_t->uniquifier in fs.h 472 cpath FS pathname node was created at 473 copyfrom "<rev> <path>" of copyfrom data 474 copyroot "<rev> <created-path>" of the root of this copy 475 minfo-cnt The number of nodes under (and including) this node 476 which have svn:mergeinfo. 477 minfo-here Exists if this node itself has svn:mergeinfo. 478 479The predecessor of a node-rev crosses both soft and true copies; 480together with the count field, it allows efficient determination of 481the base for skip-deltas. The first node-rev of a node contains no 482"pred" field. A node-revision with no properties may omit the "props" 483field. A node-revision with no contents (a zero-length file or an 484empty directory) may omit the "text" field. In a node-revision 485resulting from a true copy operation, the "copyfrom" field gives the 486copyfrom data. The "copyroot" field identifies the root node-revision 487of the copy; it may be omitted if the node-rev is its own copy root 488(as is the case for node-revs with copy history, and for the root node 489of revision 0). Copy roots are identified by revision and 490created-path, not by node-rev ID, because a copy root may be a 491node-rev which exists later on within the same revision file, meaning 492its offset is not yet known. 493 494The changed-path data is represented as a series of changed-path 495items, each consisting of two lines. The first line has the format 496"<id> <action> <text-mod> <prop-mod> <path>\n", where <id> is the 497node-rev ID of the new node-rev, <action> is "add", "delete", 498"replace", or "modify", <text-mod> and <prop-mod> are "true" or 499"false" indicating whether the text and/or properties changed, and 500<path> is the changed pathname. For deletes, <id> is the node-rev ID 501of the deleted node-rev, and <text-mod> and <prop-mod> are always 502"false". The second line has the format "<rev> <path>\n" containing 503the node-rev's copyfrom information if it has any; if it does not, the 504second line is blank. 505 506Starting with FS format 4, <action> may contain the kind ("file" or 507"dir") of the node, after a hyphen; for example, an added directory 508may be represented as "add-dir". 509 510At the very end of a rev file is a pair of lines containing 511"\n<root-offset> <cp-offset>\n", where <root-offset> is the offset of 512the root directory node revision and <cp-offset> is the offset of the 513changed-path data. 514 515All numbers in the rev file format are unsigned and are represented as 516ASCII decimal. 517 518Transaction layout 519------------------ 520 521A transaction directory has the following layout: 522 523 props Transaction props 524 next-ids Next temporary node-ID and copy-ID 525 changes Changed-path information so far 526 node.<nid>.<cid> New node-rev data for node 527 node.<nid>.<cid>.props Props for new node-rev, if changed 528 node.<nid>.<cid>.children Directory contents for node-rev 529 <sha1> Text representation of that sha1 530 531In FS formats 1 and 2, it also contains: 532 533 rev Prototype rev file with new text reps 534 rev-lock Lockfile for writing to the above 535 536In newer formats, these files are in the txn-protorevs/ directory. 537 538The prototype rev file is used to store the text representations as 539they are received from the client. To ensure that only one client is 540writing to the file at a given time, the "rev-lock" file is locked for 541the duration of each write. 542 543The two kinds of props files are all in hash dump format. The "props" 544file will always be present. The "node.<nid>.<cid>.props" file will 545only be present if the node-rev properties have been changed. 546 547The <sha1> files have been introduced in FS format 6. Their content 548is that of text rep references: "<rev> <offset> <length> <size> <digest>" 549They will be written for text reps in the current transaction and be 550used to eliminate duplicate reps within that transaction. 551 552The "next-ids" file contains a single line "<next-temp-node-id> 553<next-temp-copy-id>\n" giving the next temporary node-ID and copy-ID 554assignments (without the leading underscores). The next node-ID is 555also used as a uniquifier for representations which may share the same 556underlying rep. 557 558The "children" file for a node-revision begins with a copy of the hash 559dump representation of the directory entries from the old node-rev (or 560a dump of the empty hash for new directories), and then an incremental 561hash dump entry for each change made to the directory. 562 563The "changes" file contains changed-path entries in the same form as 564the changed-path entries in a rev file, except that <id> and <action> 565may both be "reset" (in which case <text-mod> and <prop-mod> are both 566always "false") to indicate that all changes to a path should be 567considered undone. Reset entries are only used during the final merge 568phase of a transaction. Actions in the "changes" file always contain 569a node kind, even if the FS format is older than format 4. 570 571The node-rev files have the same format as node-revs in a revision 572file, except that the "text" and "props" fields are augmented as 573follows: 574 575 * The "props" field may have the value "-1" if properties have 576 been changed and are contained in a "props" file within the 577 node-rev subdirectory. 578 579 * For directory node-revs, the "text" field may have the value 580 "-1" if entries have been changed and are contained in a 581 "contents" file in the node-rev subdirectory. 582 583 * For the directory node-rev representing the root of the 584 transaction, the "is-fresh-txn-root" field indicates that it has 585 not been made mutable yet (see Issue #2608). 586 587 * For file node-revs, the "text" field may have the value "-1 588 <offset> <length> <size> <digest>" if the text representation is 589 within the prototype rev file. 590 591 * The "copyroot" field may have the value "-1 <created-path>" if the 592 copy root of the node-rev is part of the transaction in process. 593 594Locks layout 595------------ 596 597Locks in FSFS are stored in serialized hash format in files whose 598names are MD5 digests of the FS path which the lock is associated 599with. For the purposes of keeping directory inode usage down, these 600digest files live in subdirectories of the main lock directory whose 601names are the first 3 characters of the digest filename. 602 603Also stored in the digest file for a given FS path are pointers to 604other digest files which contain information associated with other FS 605paths that are beneath our path (an immediate child thereof, or a 606grandchild, or a great-grandchild, ...). 607 608To answer the question, "Does path FOO have a lock associated with 609it?", one need only generate the MD5 digest of FOO's 610absolute-in-the-FS path (say, 3b1b011fed614a263986b5c4869604e8), look 611for a file located like so: 612 613 /path/to/repos/locks/3b1/3b1b011fed614a263986b5c4869604e8 614 615And then see if that file contains lock information. 616 617To inquire about locks on children of the path FOO, you would 618reference the same path as above, but look for a list of children in 619that file (instead of lock information). Children are listed as MD5 620digests, too, so you would simply iterate over those digests and 621consult the files they reference for lock information. 622