Versioning Thoughts (in HTML)

David G. Durand (dgd@cs.bu.edu)
Wed, 5 Jun 1996 14:56:22 -0400


The following is a sort of little "position paper" on versioning in the
WWW. It's in HTML and it's longish, so you can also look at:

    http://cs-www.bu.edu/students/grads/dgd/HTML_versions.html

{Ed: I have included the HTML source directly into the mail message
for the Hypermail archive. - Jim}

<H1>Versioning and HTTP</H1>

<P>This note includes a number of points that reflect a somewhat different 
perspective about how and why versioning should be integrated into the WWW. 
At a few places, I will argue that less-restrictive assumptions be made to 
accommodate variant styles of versioning, and at a few other places, I will 
argue that more precise recommendations be promulgated to enhance 
meaningful interoperability.

<p>My personal agenda is that I'm interested in version control as a way to 
relax concurrency to allow write-anytime collaboration. I'm also 
interested in automatic merge tools that will let users manage such 
collaboration, and finding fundamental models of versioning that capture 
the widest range of possible editing behaviours, as a basis for 
implementing generalized systems. I am personally convinced that this is 
best done by tracking user operations (typically editing operations) and 
constructing versions as sets of non-interfering edits. This makes merging 
and distribution easy, at the expense of making the notion of version 
trees only one of a variety of styles.
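
<p>A minimal sketch of the "sets of non-interfering edits" idea, under 
loud assumptions: the <tt>Edit</tt> record, the overlap test in 
<tt>interferes</tt>, and the <tt>merge</tt> and <tt>render</tt> helpers 
are all hypothetical names invented for illustration, and the interference 
rule (overlapping spans, or the same starting offset) is a deliberate 
simplification, not the model described in the VTML paper.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Edit:
    """Hypothetical edit record: replace base[start:end] with text."""
    start: int  # offset into the base text
    end: int    # exclusive end offset; start == end is a pure insertion
    text: str   # replacement text


def interferes(a: Edit, b: Edit) -> bool:
    # Assumed rule: edits interfere when their spans overlap, or when they
    # start at the same offset (their relative order would be ambiguous).
    return a.start == b.start or (a.start < b.end and b.start < a.end)


def merge(v1: FrozenSet[Edit], v2: FrozenSet[Edit]) -> FrozenSet[Edit]:
    """Merge two versions by set union; refuse if any two edits interfere."""
    merged = v1 | v2
    ordered = sorted(merged, key=lambda e: (e.start, e.end))
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if interferes(a, b):
                raise ValueError(f"interfering edits: {a} and {b}")
    return merged


def render(base: str, version: FrozenSet[Edit]) -> str:
    """Apply a version's edits to the base text, rightmost edit first,
    so that earlier offsets stay valid while later ones are rewritten."""
    out = base
    for e in sorted(version, key=lambda e: e.start, reverse=True):
        out = out[:e.start] + e.text + out[e.end:]
    return out
```

<p>Because a version here is just a set, merging is order-independent, and 
a version tree is only one of several histories that could have produced 
the same set of edits.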

<h2>Versions in URLs</h2>

<p>Accordingly, I believe that version identifiers should be opaque to 
editing systems, and managed by servers.  The paper on "VTML" that Fabio 
Vitali and I wrote (referenced in the page for this group) identifies a few 
key notions for version management on the web.

<ul>

<li>A server must be able to 
serve up a "current version" of a document, as well as to serve up a 
particular version on demand.  The default version is server-determined and 
is supplied when several versions of a document exist, and no version 
parameter is specified on the request.

<li>An application should be able to determine what the version parameter 
of an URL is, to enable user decisions about whether to follow a link into 
the "current version" of a document, or into a particular version specified 
in the link, or even a specific version specified by the user.  </ul>

<p>These features allow the option of browsers that present a single 
up-to-date view of web sites, while still being able to reflect changes in 
documents that have been made after a link was created.  It could also 
enable intelligent bookmark management (which is, in principle, just a 
special case of link anchor management).  I have no objection to the 
<tt>;version=</tt> syntax proposed so far.
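
<p>As a rough illustration of how an application might find and replace 
the version parameter while treating its value as opaque, here is a small 
sketch. The helper names <tt>split_version</tt> and <tt>with_version</tt> 
and the <tt>example.org</tt> URLs are assumptions made for this example; 
the sketch also assumes the <tt>;version=</tt> parameter, when present, is 
the last thing before any fragment and that there is no query string.

```python
from typing import Optional, Tuple

MARKER = ";version="  # the parameter syntax discussed above


def split_version(url: str) -> Tuple[str, Optional[str]]:
    """Return (url_without_version, version); version is None if absent.

    The version value is never interpreted, only carried around.
    """
    # Set the fragment aside so it is not mistaken for part of the version.
    main, hash_sep, fragment = url.partition("#")
    base, marker, version = main.partition(MARKER)
    if not marker:
        return url, None
    return base + hash_sep + fragment, version


def with_version(url: str, version: Optional[str]) -> str:
    """Point url at a specific version, or at the server's current
    version when version is None."""
    base, _ = split_version(url)
    if version is None:
        return base
    main, hash_sep, fragment = base.partition("#")
    return main + MARKER + version + hash_sep + fragment
```

<p>A browser or bookmark manager built along these lines could offer the 
user a choice between the link as written, 
<tt>with_version(link, None)</tt> for the current version, or 
<tt>with_version(link, chosen)</tt> for a version the user picks.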

<h2>HTTP issues</h2>
<p>The use of HTTP headers to specify version information is acceptable, if 
they are not too restricted in their semantics.  As I don't have access to 
a PostScript printer right now, I must react to the direction of the 
discussion without being able to read 
the details of Jim's proposal, so pardon any <i>faux pas</i> I may 
inadvertently commit.

<p>Use of versioning operations should not depend on operations such as LOCK and 
UNLOCK. I, at least, am taking great pains to avoid the logical or 
practical necessity for such operations by making the free creation of 
variant versions (and their later merging, if desired) as easy as possible. 
I'd like it if we could find a specification for LOCK and UNLOCK such that a 
server like the one I am implementing will be able to work with editors 
that expect LOCK and UNLOCK.

<p>The semantics should not assume that there is a single predecessor 
version, or that if there are multiple predecessors, one of them is the 
"main" one.  The semantics should not assume that every derived version 
even has a meaningful predecessor version.  In my model, a user might want 
to designate a new "top-level version" for the result of a complicated 
merge involving many manual decisions about which changes to keep and how 
conflicts are to be resolved.
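
<p>One way to picture a history that satisfies these constraints is a graph 
in which each version records an unordered set of zero or more 
predecessors. The sketch below is only an illustration: the 
<tt>Version</tt> record, the <tt>ancestors</tt> and <tt>roots</tt> helpers, 
and the identifiers in the sample graph are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Set


@dataclass(frozen=True)
class Version:
    vid: str  # opaque identifier
    # Zero, one, or many predecessors. A frozenset has no order,
    # so no predecessor can be singled out as the "main" one.
    predecessors: FrozenSet[str] = frozenset()


def ancestors(graph: Dict[str, Version], vid: str) -> Set[str]:
    """Every version reachable through predecessor links, excluding vid."""
    seen: Set[str] = set()
    stack = list(graph[vid].predecessors)
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(graph[cur].predecessors)
    return seen


def roots(graph: Dict[str, Version]) -> Set[str]:
    """Versions with no predecessor at all ("top-level" versions)."""
    return {v.vid for v in graph.values() if not v.predecessors}


# Sample history: "m" merges two variants of "a"; "t" is a fresh
# top-level version declared after a heavily hand-edited merge.
sample = {
    v.vid: v
    for v in [
        Version("a"),
        Version("b", frozenset({"a"})),
        Version("c", frozenset({"a"})),
        Version("m", frozenset({"b", "c"})),
        Version("t"),
    ]
}
```

<p>In this sample, <tt>"t"</tt> has no recorded ancestry even though it 
was produced from earlier material, which is exactly the case a protocol 
requiring a predecessor would rule out.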

<p>It should be a server decision as to what version identifier should be 
assigned to a document revision when it is submitted.  This follows from 
the opaqueness of version parameters in URLs.  It should be a server 
decision (not mandated by the protocol) whether to accept a new revision.  
It should also be a server decision whether or not the "current version" is 
changed when a version is submitted.  Setting the current version should 
also be an available operation, subject to server-specific access and 
configuration policy.  I don't object to servers deciding to enforce a 
particular notion of consistency by refusing updates, but I don't want the 
protocol to require that from my server.
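
<p>The division of responsibility described above can be sketched as a 
small in-memory store in which every decision is a pluggable server-side 
policy. Everything here is an assumption for illustration: the 
<tt>VersionStore</tt> class, its method names, the two policy callbacks, 
and the choice of random hex strings as identifiers. Access control on 
<tt>set_current</tt> is left out.

```python
import uuid
from typing import Callable, Dict, Optional


class VersionStore:
    """Hypothetical server-side store; the policies are server choices,
    not something a protocol would mandate."""

    def __init__(
        self,
        accept: Callable[[bytes], bool] = lambda body: True,
        advance_current: Callable[[bytes], bool] = lambda body: True,
    ) -> None:
        self._versions: Dict[str, bytes] = {}
        self._current: Optional[str] = None
        self._accept = accept
        self._advance = advance_current

    def submit(self, body: bytes) -> Optional[str]:
        """Store a revision; return its server-assigned id, or None if
        the server's policy refuses the revision."""
        if not self._accept(body):
            return None
        vid = uuid.uuid4().hex  # opaque to clients
        self._versions[vid] = body
        # The very first revision becomes current; afterwards the
        # server's policy decides whether "current" moves.
        if self._current is None or self._advance(body):
            self._current = vid
        return vid

    def set_current(self, vid: str) -> None:
        """Explicitly choose which stored version is served by default."""
        if vid not in self._versions:
            raise KeyError(vid)
        self._current = vid

    def get(self, vid: Optional[str] = None) -> bytes:
        """Serve a specific version, or the current one when vid is None."""
        key = self._current if vid is None else vid
        if key is None:
            raise KeyError("no versions stored")
        return self._versions[key]
```

<p>A server that wants strict consistency can pass an <tt>accept</tt> 
policy that refuses updates; a server built for write-anytime 
collaboration can accept everything and leave the current version where it 
is until someone calls <tt>set_current</tt>.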

<h2>Edit tracking</h2>
<p>I'd like to see something like VTML in place to allow detailed version 
information to be propagated to intelligent clients if they want it. Fabio Vitali 
and I are already modifying VTML to remove some pointless 
HTML dependencies from it, and make it look more like a byte-stream 
revision system. I think that there is enough practical need to manage 
HTML documents, however, that addressing version control of HTML 
specifically would still be worthwhile.

<p>I'd like to discuss notions such as VTML as part of the overall 
approach to versioning on the web, thus creating a tripartite front for 
proper support: URL format, HTTP protocol, and <tt>Content-type:</tt>. 
These correspond, respectively, to the fundamental versioning notions of 
naming, access control, and differencing.

----------------------------------------------+----------------------------
  David Durand                 dgd@cs.bu.edu  | david@dynamicDiagrams.com
  Boston University Computer Science          | Dynamic Diagrams
  http://cs-www.bu.edu:80/students/grads/dgd/ | http://dynamicDiagrams.com/