Initial Dist. Auth. Requirements

Jim Whitehead (ejw@ics.uci.edu)
Thu, 29 Aug 1996 14:00:23 -0700


Here is my initial draft of requirements that should be present in
HTTP to support distributed authoring.  The intent is to only discuss
distributed authoring requirements in this document, and to not
discuss versioning requirements, as much as this is possible (locking,
edit notification, and relationships are overlapping issues, though.)

I've posted this to both the versioning and distributed authoring lists,
but my intent is to have discussion take place on the distributed authoring
list (<w3c-dist-auth-request@w3.org> to join).

I'm hoping to be able to package this up in IETF standard format and
send it off for comment by the HTTP WG later next week.

Many thanks to Yaron Goland, David Durand, Fabio Vitali, Christopher
Seiwald, and Judith Slein for their ifnromative posts which really
helped clarify and provide rationale for these requirements.

- Jim


Requirements on HTTP for Distributed Editing

Jim Whitehead, U.C. Irvine, <ejw@ics.uci.edu>
Rev. 0.1, August 29, 1996

Abstract

TBD.

1.0 INTRODUCTION

This document describes functionality which, if provided in the HyperText
Transfer Protocol (HTTP) Specification [Ref HTTP/1.1 Spec.], would support
the interoperability of tools which allow remote loading, editing and saving
(publishing) of various media types using HTTP. As much as possible, this
functionality is described without suggesting a proposed implementation,
since in general, there are many ways to perform the functionality within
the HTTP framework.

Much of the functionality described in this document stems from the
assumption that people performing distributed authoring only have access to
the objects they are editing via the HTTP protocol. This is in contrast to
the majority of current authoring practice, where there is access to the
underlying storage media, often via a shell or graphical user interface to a
filesystem. Authors need more than just remote control over their individual
documents: they need remote control over the namespace in which those
documents reside. Currently, authors control their namespace by interacting
directly with the underlying storage system, but when performing distributed
authoring this access is not available.

2.0 REQUIREMENTS

In the requirement descriptions below, the requirement will be stated,
followed by its rationale. If any current distributed authoring tools
currently implement the requirement, this is also mentioned. It is assumed
that "server" means "a program which receives and responds to HTTP
requests," and that "distributed authoring tool" or "intranet enabled tool"
means "a program which can retrieve a source entity via HTTP, allow editing
of this entity, and then save/publish this entity to a server using HTTP." A
"client" is "a program which issues HTTP requests and accepts responses."

  1. Source Retrieval. The source of any given entity should be retrievable
     via HTTP.

     There are many cases where the source entity stored on a server does
     not correspond to the actual entity transmitted in response to an HTTP
     GET. Current known cases are server side include directives, and SGML
     which is converted on the fly to HTML. There are many possible cases,
     such as automatic conversion of bitmap images into several variant
     bitmap media types (e.g. GIF, JPEG), and automatic conversion of an
     application's native media type into HTML. As an example of this last
     case, a word processor could store its native media type on a server
     which automatically converts it to HTML. A GET of this entity would
     retrieve the HTML. Retrieving the source of this entity would retrieve
     the word processor native entity.

  2. Locks. It should be possible, via HTTP, to restrict modification of an
     entity to a specific person, or list of persons. It should be possible
     to query for whether a given URL has any active modification
     restrictions, and if so, who currently has modification permission.
  3. Independence of locks. It should be possible to lock an entity without
     re-reading the entity, and without commiting to editing an entity.

     At present, HTTP does not provide any support for preventing two or
     more people from overwriting each other's modifications when they save
     to a given URL. Furthermore, there is no way for people to discover if
     someone else is currently making modifications to an entity. This is
     known as the "lost update problem," or the "overwrite problem." Since
     there can be significant cost associated with discovering and repairing
     lost modifications, preventing this problem is crucial for supporting
     distributed authoring. Furthermore, locking support is also a key
     component of many versioning schemes, a desirable capability for
     distributed authoring.

     An author may wish to lock an entire web of entities even though they
     are editing just a single entity, just to keep the other entities from
     changing. In this way, an author can ensure that if a local hypertext
     web is consistent in their distributed authoring tool, it will then be
     consistent when they write it to the server. Because of this, it should
     be possible to take out a lock without also causing transmission of the
     contents of an entity. Similarly, it should not be assumed that because
     an entity is locked, that it will necessarily be modified.

  4. Notification of Intention to Edit. It should be possible to notify the
     HTTP server that an entity is about to be edited by a given person. It
     should be possible to query the HTTP server for the list of people who
     have notified the server of their intent to edit an entity.

     Experience from configuration management systems has shown that people
     need to know when they are about to enter a parallel editing situation.
     Once notified, they either decide not to edit in parallel with the
     other authors, or they use out-of-band communication (face-to-face,
     telephone, etc.) to coordinate their editing to minimize the difficulty
     of merging their results. Notification is separate from locking, since
     a lock does not necesssarily imply an entity will be edited, and a
     notification of intention to edit does not carry with it any access
     restrictions. This capability is supportive of versioning, since a
     check-out is typically involves taking out a lock, making a
     notification of intention to edit, and getting the entity to be edited.

  5. Relationships. Via HTTP, it should be possible to create, query, and
     delete arbitrary typed relationships (links) between entities of any
     media type.

     Relationships (or links which are not necessarily navigable) between
     entities can be used for many purposes. Relationships support
     pushbutton printing of a multi-resource document in a prescribed order,
     jumping to the access control page for an entity, and quick browsing of
     related information, such as a table of contents, an index, a glossary,
     help pages, etc. While relationship support is provided by the HTML
     "LINK" element, this is limited only to HTML entities, and does not
     support bitmap image types, and other non-HTML media types.
     AOLpress, America Online, currently "allows pages to add toolbar
     buttons on the fly using the HTML 3.2 <LINK REL....> tag. For example,
     your page can add toolbar buttons that link to a home page, table of
     contents, index, glossary, copyright page, next page, previous page,
     help page, higher level page, or a bookmark in the document." (Source:
     http://www.aolpress.com/press/1.2features.html)

  6. Attributes. Via HTTP, it should be possible to create, query, and
     delete arbitrary attributes on entities of any media type.

     Attributes can be used to define fields (such as author, title,
     subject, organization) on resources of any media type, which can be
     used later in searches. Attributes also support the creation of catalog
     entries as a placeholder for an entity which are not available in
     electronic form, or which will be available later.

  7. List URL Hierarchy Level. A listing of all entities, along with their
     media type, and last modified date, which are located at a specific URL
     [ref RFC 1738] hierarchy level in an http URL scheme should be
     accessible via HTTP.

     In [ref RFC 1738] it states that, "some URL schemes (such as the ftp,
     http, and file schemes) contain names that can be considered
     hierarchical." Especially for HTTP servers which directly map all or
     part of their URL name space into a filesystem, it is very useful to
     get a listing of all resources located at a particular hierarchy level.
     This functionality supports "Save As..." dialog boxes, which provide a
     listing of the entities at a current hierarchy level, and allow
     navigation through the hierarchy. It also supports the creation of
     graphical visualizations (typically as a network) of the hypertext
     structure among the entities at a hierarchy level, or set of levels. It
     also supports a tree visualization of the entities and their hierarchy
     levels.
     AOLpress, America Online, currently supports "Save As..." dialog boxes,
     and graphical network visualization of a portion of a site's hypertext
     structure, which they term a "mini-web."
     FrontPage, Microsoft, also currently supports a graphical network
     visualization and additionally support a tree visualization of a
     portion of a site's structure.

  8. Make URL Hierarchy Level. Via HTTP, it should be possible to create a
     new URL hierarchy level in an http URL scheme.

     The ability to create containers to hold related entities supports
     management of a name space by packaging its members into small, related
     clusters. This utility of this capability is demonstrated by its wide
     implementation in recent operating systems. The ability to create a URL
     hierarchy level also supports the creation of "Save As..." dialog boxes
     with "New Level/Folder/Directory" capability, common in many
     applications.
     AOLpress, America Online, currently supports this capability through
     their "Save As..." dialog box, and their custom MKDIR method.

  9. Copy. Via HTTP, it should be possible to make a byte-for-byte duplicate
     of an entity without a client loading, then resaving the entity. This
     copy should leave an audit trail.

     There are many reasons why an entity might need to be duplicated, such
     as change of ownership, a precursor to major modifications, or to make
     a backup. In combination with delete functionality, copy can be used to
     implement rename and move capabilities, by performing a copy to a new
     name, and a delete of the old name. Due to network costs associated
     with loading and saving an entity, it is far preferable to have a
     server perform an entity copy than a client. If a copied entity records
     which entity it is a copy of, then it would be possible for a cache to
     avoid loading the copied entity if it already locally stores the
     original.

 10. Rename. Via HTTP, it should be possible to change the URL of an entity
     without a client loading, then resaving the entity under a different
     name.

     It is often necessary to change the name of an entity, for example due
     to adoption of a new naming convention, or if a typing error was made
     entering the name originally. Due to network costs, it is undesirable
     to perform this operation by loading, then resaving the entity,
     followed by a delete of the old entity. Ideally an HTTP server should
     record the rename operation, and issue a "301 Moved Permanently" status
     code for requests on the old URL. Note that moving an entity is
     considered the same function as renaming an entity.


--#--