Re: version management and relative links

David G. Durand (dgd@cs.bu.edu)
Sun, 9 Jun 1996 13:43:14 -0400


At 4:37 PM 6/8/96, Larry Masinter wrote:
>Let's say that you ask to view "version 3" of a HTML page. Do you want
>the version 3 of the GIF images that were included, too? Do you want
>all of the hyperlinks to be links to version 3, or continue to be
>links to the original material?

This is the core of the "configuration management problem". I think that it
will not be possible to define requirements on browsers so that they will
necessarily do the right thing: we can only try to provide help so that
they have the ability to do the right thing. For instance, the correct
version of a GIF to match version 3 of a document might turn out to be
version 2.3.1 of the GIF. This highlights one of the ways that I think
versioning-aware requests may be different in a configuration-managed
server.

>Is it possible to do this without rewriting the URLs contained within
>the material? If the material uses relative links, do you want to
>arrange it so that relative URLs don't have to be rewritten?

If the URLs contain an implicit "current version" reference, when no
version is specified, then versioning-unaware clients can function
correctly with relative links.

   For versioning-aware clients, I can imagine several possibilities: one
would be to include the full URL (including version) of the current
document, along with its source URL (without version information). This
will work for some simple versioning strategies, but will fail when the
same version of a document is part of several configurations.

   For example, take configurations C1 and C2. C1 contains documents D1v1
(document 1, version 1) and D2v1. Configuration C2 contains D1v1 and D2v2.
Now if D1 has a relative link to D2, a browser that wants to follow the
link cannot select D2v1 or D2v2 without knowing which configuration
containing D1v1 is the user's "current configuration".
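
   A minimal Python sketch of this ambiguity, using the C1/C2 example
above. The dictionary layout and the function name candidate_versions are
assumptions made purely for illustration; they are not part of any
proposal:

```python
# Hypothetical data model: each configuration maps a document to one version.
CONFIGURATIONS = {
    "C1": {"D1": "v1", "D2": "v1"},
    "C2": {"D1": "v1", "D2": "v2"},
}


def candidate_versions(source_doc, source_version, target_doc):
    """Return every version of target_doc that some configuration
    pairs with the given version of source_doc."""
    return sorted({
        members[target_doc]
        for members in CONFIGURATIONS.values()
        if members.get(source_doc) == source_version and target_doc in members
    })


# Knowing only that we are viewing D1v1, the link to D2 is ambiguous.
print(candidate_versions("D1", "v1", "D2"))  # ['v1', 'v2']
```

   More than one candidate comes back, so the browser needs some extra
piece of information (the current configuration) to choose.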

   Another strategy is to address configurations explicitly (but I would do
this as a secondary round of work) and have an opaque Configuration-ID
attribute of a resource that would be a magic cookie to the server,
allowing the selection of the desired version.
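
   As a rough sketch of how such a cookie might behave on the server side
(the names issue_configuration_id and select_version, the token format,
and the in-memory tables are all hypothetical):

```python
import secrets

# Hypothetical server-side state: configuration contents and issued cookies.
CONFIGURATIONS = {
    "C1": {"D1": "v1", "D2": "v1"},
    "C2": {"D1": "v1", "D2": "v2"},
}
_issued = {}  # opaque Configuration-ID -> configuration name


def issue_configuration_id(config_name):
    """Hand the client a token it cannot interpret, only echo back."""
    token = secrets.token_hex(8)
    _issued[token] = config_name
    return token


def select_version(configuration_id, document):
    """Pick the version of `document` in the configuration the cookie names.
    Returns None if the cookie or the document is unknown."""
    config_name = _issued.get(configuration_id)
    if config_name is None:
        return None
    return CONFIGURATIONS[config_name].get(document)


cookie = issue_configuration_id("C2")
print(select_version(cookie, "D2"))  # v2
```

   The client never parses the token; it only returns it when following a
relative link, and only the server knows which configuration it names.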

   A further possible strategy is one where the user, via browser settings
or explicit notification, is made aware that alternative versions of the
document exist, and selects among them. In the case of a
publication scenario, configuration management is better. However, if the
documents are separately controlled (D1v1 is a hostile commentary on D2v1,
let us say), then the version information about D2, and the chronological
relations between the three versions, may be highly significant. Note that
this kind of situation can arise even in a single server if there is
collaborative work being done.

   This is one reason that I think a "current version" strategy is
important -- it provides a simple solution for versioning-unaware browsers,
and a good default strategy for browsing, even with versioning-aware
browsers, when a user is not interested in the historical record.

   Versioning-aware editors, however, will need to have a choice of
strategies that include configuration management, but also include manual
or automatic overriding of versioning information presented in the URL.

   It's also worth noting that the relative URL problem is really an issue
for servers that are based on file-system technology. Given an integrated
link-database, version information in URLs might be generated by the
HTML-export process, and not recorded directly in the document in the way
we think of when picturing an emacs session on the raw HTML.

   This discussion has highlighted an important requirement:

    **** Versions should be extractable from the URL ****
    While the version string should be opaque, I think that there is some
virtue in being able to tell what part of the URL is identifying
information, and what is version information: for one thing this is
required to implement the "current version" strategy. If we don't do this,
then the version information is totally hidden in the URL, and special HTTP
queries become the only way to find out even the simplest version
relationships among URLs (such as the predicate
are-versions-of-the-same-object(x, y)).
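
   To make the requirement concrete, here is a small Python sketch. The
";version=" marker and the example.org URLs are assumptions chosen only for
illustration; no particular URL syntax has been agreed on. The version
string itself is never parsed, so it stays opaque:

```python
# Assumed convention (not from any specification): the version is an opaque
# string appended to the URL after the marker ";version=".
VERSION_MARKER = ";version="


def split_version(url):
    """Split a URL into (identifying part, version string or None)."""
    identity, marker, version = url.partition(VERSION_MARKER)
    return identity, (version if marker else None)


def are_versions_of_the_same_object(x, y):
    """True when two URLs differ at most in their version part."""
    return split_version(x)[0] == split_version(y)[0]


def current_version_url(url):
    """Drop the version so the server can supply its current version."""
    return split_version(url)[0]


a = "http://example.org/doc.html;version=3"
b = "http://example.org/doc.html;version=2.3.1"
print(are_versions_of_the_same_object(a, b))  # True
print(current_version_url(a))  # http://example.org/doc.html
```

   Both the predicate and the "current version" fallback work without any
extra HTTP query, which is the point of keeping the version separable.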

>In the MHTML working group (sending HTML in MIME), there was some
>design work at one time on a 'catalog' which might be included as a
>wrapper to a HTML page as part of multipart/related. The 'catalog'
>would consist of a set of renamings
>
>> Interpret the following (enclosed) HTML page, but within it,
>> whenever you see this URL
>>       "http://blah.blah.com/stuff"
>> substitute the following URL instead
>>        "cid:012312313ab@blah.com"

   I think you mean the MIME-SGML group and not the MIME-HTML group. On
that assumption:

   The CATALOG proposal is based on SGML-Open catalogs for mapping ISO
Public Text Identifiers to system access strings (like URLs). This is a
great way to do location-independent naming (since you can get the catalog
from anywhere), but it's not implemented as a URL-rewriting strategy.
Unfortunately the ISO public names don't currently include any
revision-control information. I would love to set up a convention for that
here, but I think that this is something that should be done after the
basic foundations are in place.

   The real problem with this is that CATALOGs only make sense in the
context of an SGML entity manager -- they map Public Identifiers (for SGML
"external entities") to storage locations. They do not map attribute values
(like HREFs). If a catalog file is the best approach we might ease
implementors' burdens by stealing some of the syntax, but a correct CATALOG
implementation would not function the way you suggest (i.e. as a URL
mapping).

>I'm wondering if we might presuppose a 'catalog-aware' browser rather
>than a 'version-aware' browser. A versioned resource would be
>delivered as a cataloged entity so that embedded URLs that pointed to
>or included versioned material could themselves be versioned.

   Perhaps URL-renamings should be part of the configuration layer of the
versioning stuff. Does anyone have any opinions on this?

>Larry

----------------------------------------------+----------------------------
  David Durand                 dgd@cs.bu.edu  | david@dynamicDiagrams.com
  Boston University Computer Science          | Dynamic Diagrams
  http://cs-www.bu.edu:80/students/grads/dgd/ | http://dynamicDiagrams.com/