Advanced Collections Minutes - May 4, 1999

Attending: Judy Slein, Jason Crawford, Jim Whitehead, Geoff Clemm, Chuck Fay,
Tyson Chihaya

ACTION ITEMS

Jim: Finish spec revisions by the end of this week if possible.

All: Review the revised spec (03.3) for next meeting.

Geoff: Describe DELETE semantics on the WebDAV mailing list.

Jim: Add to Geoff's DELETE semantics a discussion of what the server may do to 
state when processing a DELETE request.

Chuck: Define a method that gets rid of the bits, and send it to the WebDAV mailing list.

Jim: Talk with some implementers about the proposed treatment of redirect references
to collections embedded in request-URIs (section 4.15 of the 03.2 spec).

DEFINITIONS: BINDINGS, MAPPINGS, RESOURCES

Bindings: A binding associates a name with a resource, where the name is a URL 
segment.

Mappings: A mapping associates an absolute URL with a resource.

Geoff: We may want some term for the relationship between a resource and a file
system object.  Proposal: "implements".  Jim: Let's avoid this concept.

Jim: The term "mapping" is very commonly used in a more general way than we are
proposing.  Do we want a different term for our use of "mapping"?  Let's keep it
for now, and use the term "associate" for the generic sense of map.

Jim W: Our notion of "mapping" turns out to be very useful for making the point
that adding a single binding can create multiple mappings.
Geoff: It's also handy in versioning to make the point that the server is 
responsible for maintaining mappings, but clients can control bindings.
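
As an illustration of Jim W's point, here is a minimal sketch in Python.  The class
and function names (Resource, Collection, url_mappings) are hypothetical and do not
come from any spec; the sketch also assumes the bindings form no cycles.  Bindings
are stored per collection, and URL mappings are derived by walking them.

```python
class Resource:
    """Hypothetical stand-in for any resource."""
    def __init__(self, name):
        self.name = name


class Collection(Resource):
    """A collection's state includes its bindings (URL segment -> resource)."""
    def __init__(self, name):
        super().__init__(name)
        self.bindings = {}

    def bind(self, segment, resource):
        self.bindings[segment] = resource


def url_mappings(root, prefix=""):
    """Derive {absolute path: resource} from the bindings below root.

    Assumes no cycles among bindings; a cycle would recurse forever.
    """
    mappings = {}
    for segment, resource in root.bindings.items():
        path = prefix + "/" + segment
        mappings[path] = resource
        if isinstance(resource, Collection):
            mappings.update(url_mappings(resource, path))
    return mappings


root = Collection("root")
docs = Collection("docs")
a, b = Resource("a"), Resource("b")
docs.bind("a.html", a)
docs.bind("b.html", b)
root.bind("docs", docs)
before = url_mappings(root)

# One new binding to the existing collection...
root.bind("papers", docs)
after = url_mappings(root)

# ...yields a new mapping for the collection and for each member below it.
new_paths = sorted(set(after) - set(before))
```

Here the client controls only the single binding "papers"; the extra mappings
follow from it and are the server's to maintain.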

Jim W: We are improving on our previous usage when we weren't distinguishing 
between mapping and binding.

Chuck: We do need to change the definition in the current specification (03.2).

Geoff: Checked the versioning specification to see that everything would work
properly with our definition of binding.  It looks ok.

Jim W: The definition of "resource" in RFC 2396 is a good one.  We should just
use it.

Judy: Likes the definition, but thinks that the examples that have been used in
recent e-mail are not consistent with it.  They treat the resource as a filename
or something that mediates between a URL and a file.  But the definition, and
especially the examples that accompany it, make it sound more like the file is
the resource, since it is what determines the response entity.  In other cases it
won't be a file but a service, but it's whatever determines what the response entity
will be.

Jason: The interesting cases are the ones where a single resource can be associated
with multiple files.  You can get these cases even without appealing to cgi scripts
or content negotiation, as the e-mail discussions show.

Geoff:  All that we need from a definition of "resource" is the assurance that when
you bind two URL segments to the same resource, the results of a PUT to one of them
are visible at the other.

Chuck: Does our use of "mapping" match the usage in RFC 2396? No. RFC 2396 talks
about "conceptual mapping", which is our association.

Let us always use "URL mapping", never just "mapping".

Agreed:  A binding is an association between a URL segment and a resource.  It is
part of the state of a collection.

Agreed:  We will always use the term "URL mapping", not just "mapping".  A URL 
mapping is an association between an absolute URL and a resource.

Agreed:  The spec will include a discussion of the relationship between mappings
and bindings, but it won't be in the terminology section.

SEMANTICS OF BIND

Jim: The open issues about BIND semantics concern what happens if you issue BIND to 
a URL that is already associated with a resource. Does that work or fail? People 
seem to prefer that it fail, so that you have to do UNBIND followed by BIND.  

Should the same thing happen for a collection? Yes.

Jim: Wants to confirm that we are all committed to defending the position that BIND
for collections should work the same as BIND for an ordinary resource.  Agreed.

Judy: Do we want to say anything about the cases where a server might not want to
allow a BIND?  In particular, this might include content negotiation cases and
content generated by a cgi script or other dynamically generated content.

Geoff: A server is always free to fail a BIND if it can't satisfy the semantics
of bindings that we specify, in particular ensuring that any PUT through any binding
will be visible through any GET on any binding to the same resource.

Jim: It might be better to wait and see whether people ask about this.  At least be
careful not to set any requirements on servers for these cases.  You can imagine 
with cgi that some servers might be willing to do the extra work to support bindings,
while others might not.

Geoff: We can say that if you do a GET with the same headers on two bindings to the
same resource, you should get back the same entity.  If we phrase it that way, then
content negotiation and cgi are encompassed.
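
Geoff's phrasing can be checked against a small sketch in Python.  Everything here
is hypothetical: the NegotiatedResource class, the paths, and the choice of
Accept-Language as the negotiated header are assumptions for illustration only.

```python
class NegotiatedResource:
    """Hypothetical resource whose response entity depends on a request header."""
    def __init__(self, variants):
        self.variants = dict(variants)  # language tag -> entity body

    def get(self, headers):
        lang = headers.get("Accept-Language", "en")
        return self.variants.get(lang, self.variants["en"])

    def put(self, headers, body):
        lang = headers.get("Content-Language", "en")
        self.variants[lang] = body


# Two paths bound to the same resource (flattened to full paths for brevity).
greeting = NegotiatedResource({"en": "hello", "fr": "bonjour"})
bindings = {"/a/greeting": greeting, "/b/salut": greeting}


def get(path, headers):
    return bindings[path].get(headers)


def put(path, headers, body):
    bindings[path].put(headers, body)


# Same headers on either binding give the same entity, even though the
# entity itself varies with the headers.
french = {"Accept-Language": "fr"}
same_before = get("/a/greeting", french) == get("/b/salut", french)

# A PUT through one binding is visible through a GET on the other.
put("/a/greeting", {"Content-Language": "fr"}, "salut")
seen_via_other = get("/b/salut", french)
```

A server that cannot keep this property for some dynamically generated resource
would, on Geoff's view, simply fail the BIND.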

We also have to include PROPFIND, for dead properties.  Can we say the same for all
live properties? Jim W: There might be some hitherto unknown live properties that
depend on a particular binding that would be exceptions.

UNBIND / DELETE / DESTROY

Jim W: What are the objections to defining DESTROY?

Judy: Was just interested in avoiding having to sort out the complexities of the
meaning of "resource" and what exactly happens if you request DESTROY for content
negotiated and dynamically generated content cases.

Geoff: Is no longer averse to defining DESTROY if it just means getting rid of any 
bindings created with BIND, with no commitments about garbage collection, etc.

Judy: I thought that bindings created with BIND were no different from any other
bindings (created with PUT or COPY or whatever).  DESTROY should get rid of all
bindings, not just ones created with BIND.

Jim: Whether or not we say there's no difference between bindings created with BIND 
and bindings created with PUT, etc., we have difficulties.  There will still be 
other mappings that are less well defined -- a server-created mapping that happened 
to go to the same place, some cgi bin script happens to go to the same place but we 
don't know about it.  (But these are mappings, not bindings -- they point to a 
different resource.)  We would have to define DESTROY so that we are non-committal 
on the other mappings.  Just require the server to get rid of all the bindings it 
knows about.

Judy: DESTROY should also work the same in an RFC 2518 collection as in an advanced
collection. A binding is created in a regular collection by PUT or COPY.  DESTROY
should remove all bindings to an object in a regular collection, just as it does
for advanced collections.  There is no essential difference between a regular
collection and an advanced collection.  It's just that advanced collections accept
BIND requests.  There are already bindings in regular collections.

Geoff: If we say that regular collections have bindings, but DELETE has different
semantics for regular collections than for advanced collections, this is confusing.
It's simpler to say that bindings only exist in advanced collections.

Jim: If these concepts had been available when RFC 2518 was being written, we would 
have defined collections this way. If DELETE is causing problems, let's go back and 
fix RFC 2518 so that WebDAV collections and advanced collections behave the same.

Geoff: Is it ok to say that DELETE in an advanced collection has clear semantics 
but in regular collection it does not?

Jim: DELETE is always guaranteed to unbind, and allow garbage collection.

Geoff: If there are multiple bindings, DELETE will only delete one binding.

Jim: We want DELETE to mean the same in both places.  We can revise RFC 2518 if 
necessary.  We can put binding semantics into WebDAV when it goes to draft.

Geoff: We will get protests from implementers.

Jim: Let's make the proposal on the WebDAV mailing list now.

Judy: Avoiding this controversy is one reason for leaving DELETE alone, and instead
defining a new UNBIND method.

Geoff: Doesn't want a client to have to try UNBIND, then discover it's not in an
advanced collection, then try DELETE.  This is one reason he wants to avoid defining
UNBIND, but instead give DELETE the semantics of UNBIND.

Jason: Can there be both ordinary collections and advanced collections on the same 
server?  Jim: You would probably want any given hierarchy to be advanced or not. 
Otherwise a client would have to do OPTIONS at every level.

Geoff: Two modules on the same server might have different capabilities.

Jim: Wants to roll the concept of bindings and consistent DELETE semantics into the
main spec, so that if both sorts of collections occur on the same server, there will
be more consistent behavior.

Most people won't be affected by rolling it back into the spec.

Jim: Where are we?

Geoff: Let's define DESTROY to mean delete all bindings.  Maybe we should use a 
different name for the method because people might expect to be able
to do resource management with a method called "DESTROY".  We could use a header on 
DELETE to make it delete all bindings.  The default would be to delete one binding.

Jim: Wants the destroy function, and thinks that implementing it as a header is fine.

Geoff: Let's float this.  Geoff will describe DELETE for the mailing list.

Jim: Proposed semantics of DELETE: without a header, the server MUST remove the 
binding associated with the URL mapping of the request-URI.  If the last binding is 
removed, the server may do state modification.  It's worth saying that -- otherwise,
people will raise questions.  State modification might not even be restricted to
the case where the last binding is being removed.  We saw some cases in e-mail where
it might be desirable to change file names based on removal of a binding. Jim will 
add something about state modification to Geoff's definition of DELETE.
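
The proposed DELETE semantics can be sketched in Python as follows.  This is only
an illustration under stated assumptions: no header name was agreed in the meeting,
so the all_bindings flag is a hypothetical stand-in for it, and collections are
modeled as plain dicts from URL segment to resource.

```python
def delete(server_collections, collection, segment, all_bindings=False):
    """Remove bindings and report how many bindings to the resource remain.

    server_collections: every collection the server knows about (list of dicts).
    collection, segment: the binding addressed by the request-URI.
    all_bindings: hypothetical stand-in for the not-yet-named DELETE header.
    """
    resource = collection[segment]  # KeyError here would correspond to a 404
    del collection[segment]         # default: remove only this one binding

    if all_bindings:
        # Remove every binding to the same resource that the server knows about.
        for coll in server_collections:
            for seg in [s for s, r in coll.items() if r is resource]:
                del coll[seg]

    remaining = sum(
        1 for coll in server_collections for r in coll.values() if r is resource
    )
    # When remaining == 0 the server MAY reclaim state; nothing here requires it.
    return remaining


doc, other = object(), object()
a = {"report": doc}
b = {"draft": doc, "other": other}

left_after_plain_delete = delete([a, b], a, "report")
left_after_delete_all = delete([a, b], b, "draft", all_bindings=True)
```

The sketch deliberately says nothing about mappings the server does not know
about, and nothing about what happens to the bits once no bindings remain.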

Chuck: If we are not providing DELETE / destroy for resource management, do we need
to provide some other method for doing that? 
A true destroy would not only make the resource inaccessible through HTTP / WebDAV,
but through any protocol or access method.

Geoff: Not all resources involve disk space.  What if you try to destroy some
cgi-bin derived thing? 
 
Chuck: Exclude the dynamic cases.  For static resources, a client might want to do 
resource management using the protocol.
  
Geoff: If so, not in advanced collections.  

Chuck: An author wants to use WebDAV, and wants to get rid of something he did 
yesterday, wants the bits to be gone.  The author wants to know for sure that the
bits are gone -- maybe it's a security issue, maybe the thing he wrote yesterday was
a mistake and an embarrassment.  He wants to take it out of circulation.  

Jim: It would be possible to define this, but people won't want it mandatory to 
implement.  

Chuck: Thought we were heading this direction when we started talking about defining
a destroy method.    

Jim: You can imagine WebDAV on a document management system, where everything is a 
binding.  You could remove all bindings but leave the resource lying around.  Then 
there's a difference between delete and destroy.  We could find language for this.  
Would clients use the command? Maybe.  

Chuck: Maybe a client would implement a recycle bin.  A user can move things to the
recycle bin initially, but when he says empty recycle bin, then he really wants the
things to be deleted -- get rid of it, it's gone, free up the space. 

Chuck: Is looking for a request that is defined by the protocol to fail unless the
server actually removed the bits.

What level of guarantee do we want the server to provide -- what would the details of
this be like?  What about replication, archives, backups, etc.?

Chuck: The copy the client was dealing with is gone.  This doesn't concern backups, 
archives, etc.  Then you know no one can use some other protocol to get at that copy,
at least.

MAY vs MUST for getting rid of persistent state.

Geoff: We don't want to hold up acceptance for a MUST here.

Chuck: It's not just about storage management.  It's about accessibility. The client 
application can't be sure that the resource is inaccessible on the current 
definition. Geoff: Thinks it will make people unhappy.  Jim: Even defining destroy
for a file system is difficult.  Do you have to zero out the bits so that a disk
maintenance utility can't be used to recover the data? Requiring that level of 
effort from the server would be unfortunate.
Chuck: Disregard extreme cases like Norton utilities. Just satisfy the typical user.

We know what delete all bindings means.  That's clear.

It bothers Chuck that we can't make a promise to the user about the content, only 
about the bindings.  Chuck will try to come up with a definition for a method that
would satisfy him and float it to the list.  

Jim: Do we want a separate UNBIND?  (Depends on how we end up defining DELETE)

Jim is willing to explore not having it.  Judy: What would be the difference
between UNBIND and DELETE without a header?  Jim: Maybe something about state
maintenance.  No, they would be the same.

BACKPOINTERS

Jim: Weren't backpointers really for direct references?  Do we need them now that
we only have redirect references?
The only use case is for client-controlled integrity, but people normally accept 
that redirect references could be dangling, so clients aren't likely to try to
maintain their integrity.  If in the future we add strong references, you won't need
clients to do integrity maintenance.

Judy: Assumed that backpointers were for both redirect and direct references.  It's
useful for both sorts of references to be able to find out what references there are
to a given resource.  There are other use cases besides the ones related to
integrity:  navigating up, finding related information by looking in all the
collections from which the target is referenced.

Judy: Actually backpointers are probably more important for bindings than for
redirect references.  Xerox's product won't use redirect references, but will use
bindings and needs some way to find out what bindings there are to a given resource.
If there's no way provided by the protocol, we'll do something non-standard.

Jim: Would we want to expand the DAV:references property to handle bindings as well
as redirect references? Or have two separate properties, one for redirect references
and one for bindings?  One approach might be easier for DASL searches.

Jason: Less interest in redirects than in bindings in the protocol.  

Jim: Maybe we want backpointers only to bindings.

Redirects are more likely to go off server; so a server is less likely to be able to
maintain a useful list of them in DAV:references, but for bindings it could do 
so.

Provisionally, DAV:references includes only bindings. We need to revisit this when
Jim Davis is present.

COLLECTIONS / REDIRECT REFERENCES

Jim: The behavior described in section 4.15 seems odd.  One of our goals for
redirect references was to allow sharing of resources in a way that would make use
of existing server functionality.  The server shouldn't have to do much new to 
support redirect references.  But allowing the use of request-URIs that have
redirect references to collections embedded in them would be an entirely new mechanism
for the server to implement.  It could also slow down GET processing.

Geoff: Suppose it was an advanced collection with a binding. In a file system, 
this would be a hard link, so there would be no difference to GET processing.

Judy: Are you suggesting that we not allow redirect references to collections?
Jim: No, you can create redirect references to collections, but you can't do a
GET through one of those redirect references.  It's just that
if you do a GET on a request-URI with a redirect reference in the middle, you
get a 404.

Jim is concerned about the performance of GET requests.

Geoff: The server would only go through this processing if, when it examined the
whole request-URI, it thought it would have to respond with 404. So this doesn't 
affect normal cases, and won't slow down processing of normal cases.  Only error 
cases are slowed down.
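
Geoff's point about where the cost falls can be sketched in Python.  The function
and data below are hypothetical, and the sketch assumes that an embedded redirect
reference is answered with a 302 whose Location is the reference's target plus the
remaining path; the minutes do not quote section 4.15, so that detail is an
assumption.

```python
def resolve(mappings, redirects, path):
    """Return (status, location) for a GET on path.

    mappings: set of paths that map to ordinary resources.
    redirects: dict from the path of a redirect reference to its target URL.
    """
    if path in mappings:
        return 200, None              # normal case: no extra work
    if path in redirects:
        return 302, redirects[path]   # the reference itself was requested

    # Only reached when the server would otherwise answer 404:
    # look for a redirect reference embedded in the request-URI.
    segments = path.strip("/").split("/")
    for i in range(1, len(segments)):
        prefix = "/" + "/".join(segments[:i])
        if prefix in redirects:
            rest = "/".join(segments[i:])
            return 302, redirects[prefix].rstrip("/") + "/" + rest
    return 404, None


mappings = {"/docs/a.html"}
redirects = {"/shared": "http://other.example/team"}

hit = resolve(mappings, redirects, "/docs/a.html")
embedded = resolve(mappings, redirects, "/shared/specs/x.html")
missing = resolve(mappings, redirects, "/nothing/here")
```

The prefix scan runs only on the path that would have ended in 404, which is the
sense in which only error cases are slowed down.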

Jim: A redirect to a collection creates a whole bunch of extra mappings? bindings? 
what?

Geoff: It's the same as for bindings. A redirect can create a whole set of mappings,
just as bind can create a whole set of mappings.  All the redirect mappings will 
return 302's, so you can create a whole tree of 302's with a single redirect to a
collection.

Jim: A redirect shouldn't affect the namespace below it.

Geoff: MOVE of a subtree is a normal, lightweight operation.

Jim: Will discuss with implementers how difficult this would be to do.

Geoff: left-to-right vs. right-to-left parsing of the request-URI.  Could a 
configuration file on the server cause you to jump over any redirects in the middle 
of a request-URI?  Can you predict where the configuration file wants to jump you?
Do we have to constrain the configuration file in some way in order to satisfy the
semantics?

COMMENTS ON SPEC 03.2

Chuck:

The overview in 4.2 says that HTTP and WebDAV already define "binding", but they
don't use that term anywhere.  This claim needs to be clarified or removed.

4.1.2 implies that referential integrity is guaranteed for bindings -- that it is
impossible to have a "dangling" binding.  But is this really true?  Couldn't a
cross-server binding, in particular, be broken?