Re: what's doable in Web version control

Larry Masinter (masinter@parc.xerox.com)
Sun, 9 Jun 1996 15:43:38 PDT


>   It seems critical to me that we support what Jim called "browsing
>   within a collection of entities" and that we do so without
>   requiring version-aware clients.  Why?  Because delivering versioned
>   content will be the most important product of our efforts, and our
>   plans can't rely on changing Netscape.

>    Of the various URL decoration proposals, only one satisfies this
>    dual requirement: having the version indicator embedded in the URL,
>    separated off by /'s.  With this, we can reuse the support already
>    in Web clients for handling relative URLs.

Actually, I think that none of them satisfy the requirements; the only
thing that's close is the one that puts all of the version information
at the beginning.

>   The only other possible solution, which is poor, is to have
>   version-aware servers support non version-aware clients by editing
>   the returned HTML on the fly, fixing up links with version info.
>   If anyone supports this solution, I'd like to hear it.

I think this is the only thing that will actually work.
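A minimal sketch of that kind of on-the-fly fix-up, assuming a
hypothetical URL scheme in which the version is the first path segment
(e.g. /v1.3/docs/a.html). The names add_version and rewrite_links and
the "/v" prefix are illustrative only, and a real server would want a
proper HTML parser rather than a regular expression:

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Matches double-quoted href/src attributes (a simplification of real HTML).
LINK_ATTR = re.compile(r'(href|src)="([^"]*)"', re.IGNORECASE)


def add_version(url, version):
    """Prefix a server-absolute path with the (hypothetical) /v<version> segment."""
    parts = urlsplit(url)
    # Leave links to other hosts alone. Relative links are also left alone:
    # they resolve against the base URL, which already carries the version.
    if parts.scheme or parts.netloc or not parts.path.startswith("/"):
        return url
    return urlunsplit(parts._replace(path="/v" + version + parts.path))


def rewrite_links(html, version):
    """Return html with every server-absolute link pinned to one version."""
    def fix(match):
        return '%s="%s"' % (match.group(1), add_version(match.group(2), version))
    return LINK_ATTR.sub(fix, html)
```

For example, rewrite_links('<a href="/docs/a.html">a</a>', "1.3")
yields a link to /v1.3/docs/a.html, so a client that knows nothing
about versions stays within the version it started browsing.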


Re PUT and all of the possibilities around it:

I think all of the requirements laid on "PUT" can be accomplished with
"POST", with some standards for the data that is posted. (E.g., a new
media type. Call it 'multipart/update'.)

>    1)  SCM systems such as CVS rely on state information, stored
>       on the client, to know what version of what documents are
>       being edited (and thus what will be "put" back).  CVS's
>       state information is fairly trivial, and could be embedded
>       into the HTML documents being edited.  But that doesn't work
>       so well if the entities aren't HTML.  When the user PUTs a
>       GIF, how will a version-aware Web server based on CVS recover
>       the state necessary to check the GIF in?

>       SCM systems that store the state in the server, such as
>       ClearCase, don't have this same problem, I think.  But make
>       no mistake about it: these version-aware Web servers are
>       quite stateful.

Yes, multipart/update should contain information about the location,
variant, entitytag (for validation), and prior version that's being
updated.
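To make that concrete, here is a rough sketch of building such a body
with Python's standard email package. multipart/update is only the
name proposed above, not a registered media type, and the Update-*
header names are invented for illustration:

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart


def build_update(location, variant, entity_tag, prior_version, content):
    """Return the bytes of a one-part multipart/update body (hypothetical format)."""
    body = MIMEMultipart("update")  # Content-Type: multipart/update
    # The entity travels as opaque bytes, so a GIF works as well as HTML.
    part = MIMEApplication(content, _subtype="octet-stream")
    # Hypothetical header names carrying the state the server needs.
    part["Update-Location"] = location
    part["Update-Variant"] = variant
    part["Update-Entity-Tag"] = entity_tag        # for validation
    part["Update-Prior-Version"] = prior_version  # version being updated
    body.attach(part)
    return body.as_bytes()
```

Because the state rides along with each part, the client does not need
to embed it in the document being edited.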

>    2)  Other SCM systems have fairly heavyweight client
>       implementations themselves, with a fat protocol between the
>       client and server.  For example, it is unlikely that a
>       version-aware Web client would be able to carry out all the
>       machinations necessary to be a ClearCase view (i.e. a client).

The protocol itself (POST) doesn't have to be 'fat', as long as you
get the body of the protocol (multipart/update) right.

>    3)  The picture is even less rosy with SCM systems that require
>       the client to have direct file access to the repository.  A
>       large chunk of the commercial SCM systems -- PVCS, MKS Source
>       Integrity, Continuus, Microsoft's SourceSafe -- are in this
>       boat.  I can't see any way they can be a backend to a version-aware
>       Web server without the Web server having to act as proxy to
>       client workspaces maintained on the server.

Well, the CGI that implements POST of a multipart/update will need to
have direct file access to the repository.

>    4)  Aside from architecture, the model varies wildly from one SCM
>       system to the next.  And as has been discussed, the lock-the-head
>       vs merge-into-trunk vs change-set models all need to be
>       accommodated.

These just change the data content of multipart/update in minor ways.

>       Going further, something that we (P3) support is atomic checkin
>       of multiple documents, because it allows you to move the repository
>       forward in whole chunks rather than a file-at-a-time.  Certainly
>       we think this is important for Web documents as well, and would
>       like to see multiple PUTs with a single COMMIT possible.

Well, a single POST of a multipart/update can be performed atomically.
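One way a server might get that all-or-nothing behavior, sketched with
an in-memory store: stage every part against a copy, validate each
part's prior version, and publish the copy only if all parts pass.
The Repository class, its integer version numbers, and the
(location, prior_version, content) tuples are assumptions made for the
example, not part of any proposal:

```python
class Conflict(Exception):
    """Raised when a part names a prior version that is no longer current."""


class Repository:
    def __init__(self):
        self._docs = {}  # location -> (version number, content)

    def version(self, location):
        """Current version number, or 0 if the location does not exist yet."""
        return self._docs.get(location, (0, b""))[0]

    def content(self, location):
        return self._docs[location][1]

    def post_update(self, parts):
        """Apply all (location, prior_version, content) parts, or none of them."""
        staged = dict(self._docs)  # work on a copy
        for location, prior_version, content in parts:
            current = staged.get(location, (0, b""))[0]
            if current != prior_version:
                # Nothing has been published yet, so the store is unchanged.
                raise Conflict(location)
            staged[location] = (current + 1, content)
        self._docs = staged  # publish every change in one step
```

A conflicting part anywhere in the POST leaves every document at its
old version, which is the "multiple PUTs with a single COMMIT"
behavior asked for above.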

> Version control.
>
>    All the wrinkles that make a simple PUT difficult are going to make
>    flowing full version control models over HTTP truly daunting.
>    It might be possible to come up with a limited set of operations that
>    make sense across all models, but the examples put forth so far --
>    compute the predecessor revision and show a version tree -- each only
>    make sense in a subset of the systems.

Most 'control' options can be done with POST, and different data
types. In fact, you probably could just use multipart/form-data.

> My flame-retardant personal opinion is that supporting GETs is well
> within the ability of this group, that PUTs will get mired for long
> enough that some de facto industry implementation will set the lead and
> thus simplify the range of models that need supporting, and that the
> rest of version control via HTTP will follow after that.  But I welcome
> contrary opinions, because in this case I'd be glad to be wrong.

It's hard to predict how things will go. There already are de facto
implementations. But it seems like most of the vendors are either
primarily client-only or server-only, so interoperability with others
is pretty important.  Personally, I think the problems aren't really
that hard, so it's mainly a matter of will to agree. I've found that
speculation about this kind of stuff is pretty useless; let's just get
on with the problem/solution discussion. OK?

Larry