Oh Most High and Fragrant Emacs, please be in -*- text -*- mode!

##############################################################################
### The vast majority of this file is completely out-of-date as a result   ###
### of the ongoing work known as WC-NG.  Please consult that documentation ###
### for a more relevant and complete reference.                            ###
### (See the files in notes/wc-ng )                                        ###
##############################################################################


This is the library described in the section "The working copy
management library" of svn-design.texi. It performs local operations
in the working copy, tweaking administrative files and versioned data.
It does not communicate directly with a repository; instead, other
libraries that do talk to the repository call into this library to
make queries and changes in the working copy.

Note: This document attempts to describe (insofar as development is still
a moving target) the current working copy layout. For historic layouts,
consult the versioned history of this file (yay version control!).


The Problem We're Solving
-------------------------

The working copy is arranged as a directory tree, which, at checkout,
mirrors a tree rooted at some node in the repository. Over time, the
working copy accumulates uncommitted changes, some of which may affect
its tree layout. By commit time, the working copy's layout could be
arbitrarily different from the repository tree on which it was based.

Furthermore, updates/commits do not always involve the entire tree, so
it is possible for the working copy to go a very long time without
being a perfect mirror of some tree in the repository.


One Way We're Not Solving It
----------------------------

Updates and commits are about merging two trees that share a common
ancestor, but have diverged since that ancestor.
In real life, one of the trees comes from the working copy, the other
from the repository. But when thinking about how to merge two such
trees, we can ignore the question of which is the working copy and
which is the repository, because the principles involved are
symmetrical.

Why do we say symmetrical?

It's tempting to think of a change as being either "from" the working
copy or "in" the repository. But the true source of a change is some
committer -- each change represents some developer's intention toward
a file or a tree, and a conflict is what happens when two intentions
are incompatible (or their compatibility cannot be automatically
determined).

It doesn't matter in what order the intentions were discovered --
which has already made it into the repository versus which exists only
in someone's working copy. Incompatibility is incompatibility,
independent of timing.

In fact, a working copy can be viewed as a "branch" off the
repository, and the changes committed in the repository *since* then
represent another, divergent branch. Thus, every update or commit is
a general branch-merge problem:

  - An update is an attempt to merge the repository's branch into the
    working copy's branch, and the attempt may fail wholly or
    partially depending on the number of conflicts.

  - A commit is an attempt to merge the working copy's branch into
    the repository. The exact same algorithm is used as with
    updates, the only difference being that a commit must succeed
    completely or not at all. That last condition is merely a
    usability decision: the repository tree is shared by many
    people, so folding both sides of a conflict into it to aid
    resolution would actually make it less usable, not more. On the
    other hand, representing both sides of a conflict in a working
    copy is often helpful to the person who owns that copy.
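
The symmetry argued above can be illustrated with a minimal sketch.
It is not Subversion's algorithm: trees are modelled as plain dicts
mapping a path to its content, and the names merge_trees, update and
commit are hypothetical, chosen only for this illustration. The point
is that one merge routine serves both operations, and only the policy
on conflicts differs.

```python
def merge_trees(ancestor, ours, theirs):
    """Three-way merge of two trees that share a common ancestor.

    Each tree is a dict {path: content}; a missing path means the node
    does not exist in that tree.  Returns (merged, conflicts), where
    conflicting paths are left out of `merged`.
    """
    merged = {}
    conflicts = []
    for path in sorted(set(ancestor) | set(ours) | set(theirs)):
        base = ancestor.get(path)
        a = ours.get(path)
        b = theirs.get(path)
        if a == b:
            result = a              # both sides agree
        elif a == base:
            result = b              # only "theirs" changed this node
        elif b == base:
            result = a              # only "ours" changed this node
        else:
            conflicts.append(path)  # two incompatible intentions
            continue
        if result is not None:      # None means the node was deleted
            merged[path] = result
    return merged, conflicts


def update(ancestor, working_copy, repository):
    """An update may succeed partially: conflicts are only reported."""
    return merge_trees(ancestor, working_copy, repository)


def commit(ancestor, working_copy, repository):
    """A commit must succeed completely or not at all."""
    merged, conflicts = merge_trees(ancestor, repository, working_copy)
    if conflicts:
        raise ValueError("commit refused, conflicts in: " + ", ".join(conflicts))
    return merged
```

Because merge_trees treats its two inputs identically, swapping "ours"
and "theirs" yields the same result, which is the symmetry the text
describes.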

So below we consider the general problem of how to merge two trees
that have a common ancestor. The concrete tree layout discussed will
be that of the working copy, because this library needs to know
exactly how to massage a working copy from one state to another.


Structure of the Working Copy
-----------------------------

Working copy meta-information is stored in a single .svn/ subdirectory, in
the root of a given working copy. For the purposes of storage, directories
pulled in through the use of svn:externals are considered separate working
copies.

  .svn/wc.db                    /* SQLite database containing node metadata. */
       pristine/                /* Sharded directory containing base files. */
       tmp/                     /* Local tmp area. */
       experimental/            /* Data for experimental features. */
       shelves/                 /* Used by 1.10.x shelves implementation */
       entries                  /* Stub file. */
       format                   /* Stub file. */

`wc.db':
  A self-contained SQLite database containing all the metadata Subversion
  needs to track for this working copy. The schema is described by
  libsvn_wc/wc-metadata.sql.

`pristine':
  Each file in the working copy has a corresponding unmodified version in
  the .svn/pristine subdirectory. These files are stored by the SHA-1
  hash of their contents, sharded into 256 subdirectories based upon the
  first two characters of the hex expansion of the hash. In this way,
  multiple identical files can share the same pristine representation.

  Pristines are used for sending diffs back to the server, etc.

`experimental':
  Experimental (unstable) features store their data here.

`shelves':
  Subversion 1.10's "svn shelve" command stores shelved changes here.
  This directory is not used by any other minor release line.

`entries', `format':
  These stub files exist only to enable a pre-1.7 client to yield a clearer
  error message.
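
As a rough illustration of the `pristine' sharding scheme described
above, the sketch below derives a storage location from a file's
contents. The function name pristine_relpath is hypothetical, and the
exact file naming used by a real client (for example, any suffix on
the stored file) is not covered by this text, so the bare hash is used
here as an assumption.

```python
import hashlib
import posixpath


def pristine_relpath(contents: bytes) -> str:
    """Return an illustrative pristine location for `contents`.

    The shard directory is the first two hex characters of the SHA-1
    digest, which gives 16 * 16 = 256 possible subdirectories.
    Assumption: the stored file is named by the bare hex digest.
    """
    digest = hashlib.sha1(contents).hexdigest()
    return posixpath.join(".svn", "pristine", digest[:2], digest)
```

Two files with identical contents map to the same location, which is
how identical files end up sharing one pristine representation.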


How the client applies an update delta
--------------------------------------

Updating is more than just bringing changes down from the repository;
it's also folding those changes into the working copy. Getting the
right changes is the easy part -- folding them in is hard.

Before we examine how Subversion handles this, let's look at what CVS
does:

   1. Unmodified portions of the working copy are simply brought
      up-to-date. The server sends a forward diff, the client applies
      it.

   2. Locally modified portions are "merged", where possible. That
      is, the changes from the repository are incorporated into the
      local changes in an intelligent way (if the diff application
      succeeds, then no conflict, else go to 3...)

   3. Where merging is not possible, a conflict is flagged, and *both*
      sides of the conflict are folded into the local file in such a
      way that it's easy for the developer to figure out what
      happened. (And the old locally-modified file is saved under a
      temp name, just in case.)

It would be nice for Subversion to do things this way too;
unfortunately, that's not possible in every case.

CVS has a wonderfully simplifying limitation: it doesn't version
directories, so never has tree-structure conflicts. Given that only
textual conflicts are possible, there is usually a natural way to
express both sides of a conflict -- just include the opposing texts
inside the file, delimited with conflict markers. (Or for binary
files, make both revisions available under temporary names.)

While Subversion can behave the same way for textual conflicts, the
situation is more complex for trees. There is sometimes no way for a
working copy to reflect both sides of a tree conflict without being
more confusing than helpful.
How does one put "conflict markers" into
a directory, especially when what was a directory might now be a file,
or vice-versa?

Therefore, while Subversion does everything it can to fold conflicts
intelligently (doing at least as well as CVS does), in extreme cases
it is acceptable for the Subversion client to punt, saying in effect
"Your working copy is too out of whack; please move it aside, check
out a fresh one, redo your changes in the fresh copy, and commit from
that." (This response may also apply to subtrees of the working copy,
of course.)

Usually it offers more detail than that, too. In addition to the
overall out-of-whackness message, it can say "Directory foo was
renamed to bar, conflicting with your new file bar; file blah was
deleted, conflicting with your local change to file blah, ..." and so
on. The important thing is that these are informational only -- they
tell the user what's wrong, but they don't try to fix it
automatically.

All this is purely a matter of *client-side* intelligence. Nothing in
the repository logic or protocol affects the client's ability to fold
conflicts. So as we get smarter, and/or as there is demand for more
informative conflicting updates, the client's behavior can improve and
punting can become a rare event. We should start out with a _simple_
conflict-folding algorithm, though.


Text and Property Components
----------------------------

A Subversion working copy keeps track of *two* forks per file, much
like the way MacOS files have "data" forks and "resource" forks. Each
file under revision control has its "text" and "properties" tracked
with different timestamps and different conflict (reject) files. In
this vein, each file's status-line has two columns which describe the
file's state.

Examples:

  --  glub.c  --> glub.c is completely up-to-date.
  U-  foo.c   --> foo.c's textual component was updated.
  -M  bar.c   --> bar.c's properties have been locally modified.
  UC  baz.c   --> baz.c has had both components patched, but a
                  local property change is creating a conflict.
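
The two-column convention in the examples above can be read
mechanically. The following is a minimal sketch, not the client's
actual parser: the names StatusLine, MEANINGS and parse_status_line
are hypothetical, and the meanings table is inferred only from the
four examples shown here.

```python
from typing import NamedTuple


class StatusLine(NamedTuple):
    text: str    # first column: state of the file's text
    props: str   # second column: state of the file's properties
    path: str


# Assumption: meanings inferred from the examples, not an official table.
MEANINGS = {
    "-": "no change",
    "U": "updated",
    "M": "locally modified",
    "C": "conflict",
}


def parse_status_line(line: str) -> StatusLine:
    """Split a status line such as 'UC  baz.c' into its two columns."""
    columns, path = line.split(None, 1)
    if len(columns) != 2:
        raise ValueError("expected exactly two status columns: %r" % line)
    return StatusLine(columns[0], columns[1], path.strip())


def describe(status: StatusLine) -> str:
    """Render both forks' states in words."""
    return "%s: text %s, properties %s" % (
        status.path, MEANINGS[status.text], MEANINGS[status.props])
```

For instance, parsing "-M  bar.c" yields a text column of "-" and a
property column of "M", matching the description of bar.c above.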