WEBDAV Versioning and Variant Authoring Design Team October 1-2, 1998 FileNet, Costa Mesa, California A meeting of the WebDAV Versioning and Variant Authoring Design Team meeting was held October 1-2, 1998 at the offices of FileNet, in Costa Mesa, California. In attendance were Jim Amsden, Alan Babich, Geoff Clemm, Bruce Cragun, David Durand, Chuck Fay (1st day), Chris Kaler, Brad Sergeant, John Stracke, Jim Whitehead. Jim Whitehead chaired the meeting and recorded minutes. The agenda for the meeting was: October 1: * Discussion of versioning goals document, led by John Stracke (9AM - 5PM) October 2: * Discussion of versioning protocol document, led by Chris Kaler (9AM - 1PM) * Discussion of variant authoring goals and protocol (1:30PM - 4:00PM) Many participants had early plane connections, and were not present for the variant authoring discussion. Alan Babich, David Durand, John Stracke, and Jim Whitehead were the only participants in this discussion. ** Day 1 ** Discussion of RFC 2291. Since the versioning goals document was written as a set of changes to the versioning-specific sections of RFC 2291, "Requirements for a Distributed Authoring and Versioning Protocol for the World Wide Web", the meeting began with a review of RFC 2291 to verify that these goals are still relevant. One area of RFC 2291 which caused disagreement was the sentence in section 5.9.1.3, which states, "It is also possible for a single resource to participate in multiple version graphs." The Design Team argeed there should only be one version graph per versioned resource. However, since there are some scenarios (e.g. a subset of the graph, such as published revisions, or only the last N revisions) where a user might only be given visibility over part of the version graph, the capability to have multiple views of a version graph was viewed as desirable by most, but not all participants. The "client proposes, server disposes" model of a client submitting a version identifier, and the server then stating whether it used the submitted id, as described in section 5.9.2.8 of RFC 2291, met with near universal disagreement from the Design Team. The Design Team agreed to strike the first two sentences of 5.9.2.8. But, since RFC 2291 is already a published RFC, and hence difficult to modify, and since it is difficult to fully grasp the versioning goals since they are located in two different documents at present, the Design Team agreed to merge the versioning section of RFC 2291 into the next revision of the versioning goals document. The versioning goals document will clearly state which sections of RFC 2291 it is superceding. At the end of the discussion of RFC 2291, there was some discussion on variants, and whether the "is-variant-of" relationship should be a member of the version graph, as is the case in draft-whitehead-versioning-00. The Design Team agreed that the version graph should not contain "is-variant-of" relationships, and should concern itself only with versioning relationships. Furthermore, the Design Team agreed to separate variant authoring from versioning, and have a separate goals and protocol document for variant authoring. Discussion of Versioning Goals Document: The first item of discussion in the versioning goals document was the definitions section. The Design Team agreed to change the term "abstract versioned resource" to "versioned resource". Brad, Geoff, and Chris took an action item to develop definitions of configuration set, configuration, workspace, and change set. After discussing terminology, the Design Team focused on the goals. After some discussion, the Design Team agreed to drop goal 1, "It must be possible for the protocol defined to map reasonably well to most existing version repositories." While nobody on the design team disagreed with this goal, it was felt that this was not a direct, functional goal, and that the goals document should be concerned with functional goals only. The design team agreed with goal 2, "Every revision of an abstract versioned resource must itself be a resource, with its own URI." There was agreement on goals 3 ("It should be possible for a client to specify meaningful labels to apply to individual revisions, and to change the revision to which a label refers."), with minor wordsmithing needed. The Design Team agreed with goal 4 ("The labels and revision IDs within a Vgraph are names in a common namespace, in which each name must be unique."), except to note that the labels and revision IDs don't necessarily need to be in the same namespace -- there can be a separate namespace for version ids and labels. There was significant debate over goal 5, "Given a URI to the version graph, and an ID or label for a revision, it must be possible for a client to construct a URI which refers to that revision." While the participants acknowledged that having correct operation of relative URIs for the current, and previous revisions of a versioned resource was a desirable goal, goal 5 sounded too much like a solution, rather than a statement of a goal. Furthermore, there was discussion that since this approach uses URL string operations, it might encounter resistance within the working group. *** Ed note: I have no notes from discussion of goal 6. There was agreement on goal 7, ("The CM protocol must be an optional extension to the base versioning protocol"). During discussion of goal 8, ("Revisions are immutable: once a revision has been checked into the repository, the revision can never be deleted; and its contents and dead properties can never be changed.") there was discussion about separating the concepts of deletion and mutability. Administrators often decide to delete older portions of revision trees to conserve space, and hence though the revisions are immutable once they're checked-in, they may eventually be deleted. Mutability of properties was also discussed. The Design Team determined that there are two types of properties, "mutable", can be changed after check-in, and "immutible", cannot be modified after check-in. These categories are not guaranteed to correspond to "live" and "dead" properties. There was general agreement that an access control property would be mutible. However, some live properties, like "getcontentlength", would be expected to remain immutible, despite being live. This implies there are some properties which are "bound" in some sense to the body of the resource, and others which are not. Branching was discussed. Some Design Team members stated that document management systems typically do not presently support branching, and hence would not be eager to implement branching. As a result, the Design Team agreed to make branching an optional feature, but one that has full support within the protocol, since CM repositories find branching to be a vital, non-optional feature. After discussing the items in the initial goals document, Jim Amsden brought up a list of goals which he felt should be in the goals document. Jim Amsden agreed to send John Stracke a list of these goals for inclusion in the next rev. of the goals. In a separate discussion, the Design Team agreed to not have predecessor links come in from outside the versioned resource. John Stracke agreed to work towards having a new revision of the goals document by October 9. Before adjourning for the day, there was discussion of the when the next Design Team meeting should be held. It was tentatively agreed to hold the next meeting December 2-3, hosted by InterSolv. ** Day 2 ** Chris Kaler led a discussion of his draft which merged together draft-whitehead-webdav-versioning and draft-kaler-webdav-versioning, the September 28, 1998 revision of "Versioning and Variant Extensions to WebDAV." Chris used a small set of slides to lead the discussion. These slides are available on the Web at: http://www.ics.uci.edu/pub/ietf/webdav/versioning/dt_oct98/kaler/ There was discussion of support for downlevel clients (automatic versioning). One view on this capability was that it definitely adds complexity to the protocol, and this might make automatic versioning not worthwhile. There was agreement to continue investigating provision of versioning capabilities for downlevel clients. However, this capability should be re-examined once it is fully specified to ensure that the tradeoff between automatic versioning capability and added complexity is deemed worthwhile. Next, there was discussion on having an entry point in the HTTP URL namespace which is mapped to a particular member of a versioned resource. Some viewpoints expressed were: - We should use existing referential members for this capability. - The semantics of referential members don't meet our needs to have a reference which can track a label, or point to the latest member of a line of descent. - This sounds like a Vportal. - There are scalability problems with the Vportal approach across a configuration, if each Vportal can individually point to a different part of each version graph (i.e., if some point to label X, others point to label Y, others point to the latest on a line of descent, etc.) - A Vportal and a single element configuration sound like identical mechanisms to me. Do we really want both? - But, what about keeping the separation between the versioning layer and the configuration management layer? - It was noted that direct references might work if it was possible to set properties on them. There was no resolution to this discussion. Geoff Clemm agreed to write up a proposal for how the mapped version namespace should be arranged. There was a discussion on retrieving the version history via PROPFIND. No agreement was recorded. The Design Team agreed that the CHECKINOUT method needs to be broken up into several methods, to reflect a better separation of concerns. At the least, methods for CHECKIN, CHECKOUT, UNCHECKOUT all need to exist, and another method may need to be created to request enumerations of all checkouts active on a versioned resource. There was discussion on the semantics of checkout. Some issues which were identified are: - locking/reservation of a line of descent - the ability to specify a line of descent on checkout - auto-creation of a line of descent when branching occurs The Design Team agreed that CHECKOUT needs the following parameters: - do not branch - force branch - creation of a branch, if it occurs, is OK There was agreement that no MKBRANCH method is needed, that creation of a branch is implicit in CHECKOUT. Chris Kaler took an action item to start a thread on the mailing list concerning the semantics of checkout, and also on how the semantics of lock and checkout/checkin interact. Geoff Clemm will collect the ClearCase experience with checkout and mail it to the list. Merging capability was discussed. One approach discussed was having PUT pass the version it is derived from, and hence at checkin time the server will know which resource is the predecessor to the resource being checked-in. Another approach was to have a MERGE method, which could merge two (or more) resources. Another alternative was to allow PROPPATCH on a live history property, similar to the semantics of the GRAPHOP proposal. There was a long discussion on whether it should be possible to, after checkin, delete merge arcs. Some participants noted that, in practise, people make mistakes and want to back-out their merge. Others noted that it is possible to recover from a merge by creating new nodes and moving forward, but still leaving a record of the merge. No agreement was reached on this issue. The Design Team agreed to create a separate, public, archived, mailing list exclusively for the discussion of versioning issues. Jim Whitehead took an action item to create this list. The Design Team agreed to have one more revision of the versioning protocol document before submitting it as the -00 revision of draft-ietf-webdav-versioning. Chris Kaler agreed to create this revision by Nov. 2. Automatic mapping of a configuration to HTTP URL space was discussed. There was general agreement that this type of capability should be included in the protocol. There was agreement in favor of using MKCOL to create this mapping. Geoff agreed to write up a proposal on automatic mapping of a configuration to HTTP URL space. The Design Team agreed that the "Base-Version" header should be removed unless it is shown to be absolutely necessary. The Design Team agreed to move section 3.2.5 capability ("Checkin-Token -- Clients may desire the ability to track a set of changes as a unit.") into the section on configuration management. A participant noted that this capability is needed for change sets. One participant noted that there are i18n problems with the current specification of the Comments header, since there are no provisions for labeling the charset or language, or for handling multi-octet character sets. There was a discussion of the semantics of collections and containment. Jim Amsden will create a document on how to describe a version history using XML. He will consider RDF during this activity. Jim Amsden also volunteered to edit the goals document during John Stracke's sabbatical. At this point in the meeting, several participants left. Alan Babich, David Durand, John Stracke, and Jim Whitehead continued on, discussing variants. Variants. The Design Team began with a discussion of goals and definitions. There was discussion of the terms "rendition", or "internal variant", and "external variant". An internal variant, or rendition, shares the properties of the resource, and is typically mechanically derived. An example is a PDF rendition of a Word file. In both cases, the title will be the same, although the length and content type are not. In contrast, an external variant is one where a human is typically involved in the creation of the resource, such as a translation of a Web page from German to French. It was noted that while some types of variants are typically thought of as being mechanically derived, such as different formats of a word processing document, there may be exceptions where their creation is human-directed. No broad assumptions about which kinds of variant-creating transformations are always mechanically performed can be made, or should be embedded into the protocol. Goals that emerged during the discussion are: * It should be possible to discover what kinds of mechanically derived variants can be provided by the server, what kinds must be provided by the client (or else that type of variant will not be present), and what kinds may be provided by the client (e.g., where it is possible to manually override a server-provided mechanical transformation). * It should be possible to stop automatic server generation of specified classes of variants. * It should be possible to discover which variants can be read, and which ones can be written. * A PUT on a "variant control resource" (i.e., a resource where, by submitting a GET with Accept* headers, content negotiation takes place), should "do the right redirect", and store the body under the correct variant resource. * A PUT of a multipart/alternatives body to a variant control point should cause the server to burst out the multiple parts and store them at the correct variant resource. * It should be possible to set properties on variants independently. As a consequence, each variant has its own URI. * It should be possible to delete individual variants, at a specific point in (media-type, language, encoding, transfer) space. * It should be possible to request the server to package up all variants into a multipart/alternatives for transfer to the client. There was some discussion over whether the HTTP specification implies there is a single "variant control point" where content negotiation takes place, or whether every variant exists in a peer-to-peer relationship with every other variant, and content negotiation can take place at any variant. There was some agreement that there must be at least one control point, but there can be many control points. Control points don't have to be the URL where a specific variant exists. There was some discussion on whether it should be possible to perform variant authoring on a collection. There was agreement that this is desirable, due to the "home page authoring" scenario, where a home page exists in multiple variants (languages). There was a long discussion on how to reduce the combinatoric explosion which occurs when trying to enumerate all of the possible static and mechanically derived variants of a resource during discovery. One technique was to use HEAD with Accept* headers to discover the URL which can be authored for that specific point in variant space. A Not-Authorable header could indicate the variant is not human-authorable. There was some sentiment that variant authoring should be an optional set of features (i.e., support of versioning would not necessarily imply support for variant authoring), but no agreement on this issue was recorded. David Durand volunteered to edit the variant goals document. John Stracke volunteered, time permitting, to edit the variant authoring protocol document. *** Meeting Adjourned ***