Arcadia Papers: ABSTRACT
"Maintaining Distributed Hypertext Infostructures: Welcome to
MOMspider's Web",
by Roy Fielding in
Proceedings of the First International World-Wide Web Conference
(WWW94), Geneva, May 25-27, 1994.
(A subsequent version of this paper was published in Computer Networks and
ISDN Systems, 27(2), November 1994.)
Abstract
Most documents made available on the World-Wide Web can be considered part
of an infostructure -- an information resource database with a specifically
designed structure. Infostructures often contain a wide variety of
information sources, in the form of interlinked documents at distributed
sites, which are maintained by a number of different document owners
(usually, but not necessarily, the original document authors). Individual
documents may also be shared by multiple infostructures. Because it is
rarely static, the content of an infostructure is likely to change over
time and may diverge from the intended structure. Documents may be moved or
deleted, referenced information may change, and hypertext links
may be broken.
As it grows, an infostructure becomes complex and difficult to maintain.
Such maintenance currently relies upon the error logs of each server
(often never relayed to the document owners), the complaints of users
(often not seen by the actual document maintainers), and periodic manual
traversals by each owner of all the webs for which they are responsible.
Since thorough manual traversal of a web can be time-consuming and boring,
maintenance is rarely or inconsistently performed and the infostructure
eventually becomes corrupted. What is needed is an automated means for
traversing a web of documents and checking for changes which may require
the attention of the human maintainers (owners) of that web.
The Multi-Owner Maintenance spider (MOMspider) has been developed to at
least partially solve this maintenance problem. MOMspider can periodically
traverse a list of webs (by owner, site, or document tree), check each web
for any changes which may require its owner's attention, and build a special
index document that lists the attributes and connections of the web
in a form that can itself be traversed as a hypertext document. This paper
describes the design of MOMspider and how it was influenced by the nature
of distributed hypertext maintenance and requirements for the good behavior
of any web-traversing robot. It also includes discussion of the efficiency
requirements for maintaining world-wide webs and proposed changes to HTML
and HTTP to support distributed maintenance. The paper concludes with a
short description of MOMspider's future and pointers to its freeware
distribution site.
The Arcadia Project
<arcadia-www@ics.uci.edu>
Last modified: Tue Feb 28 11:28:47 1995