Advanced Collections Minutes - May 4, 1999

Attending: Judy Slein, Jason Crawford, Jim Whitehead, Geoff Clemm, Chuck Fay, Tyson Chihaya

ACTION ITEMS

Jim: Finish spec revisions by the end of this week if possible.
All: Review the revised spec (03.3) for next meeting.
Geoff: Describe DELETE semantics on the WebDAV mailing list.
Jim: Add to Geoff's DELETE semantics a discussion of what the server may do to state when processing a DELETE request.
Chuck: Define a method that gets rid of the bits, send to WebDAV mailing list.
Jim: Talk with some implementers about the proposed treatment of redirect references to collections embedded in request-URIs (section 4.15 of the 03.2 spec).

DEFINITIONS: BINDINGS, MAPPINGS, RESOURCES

Bindings: A binding associates a name with a resource, where the name is a URL segment.
Mappings: A mapping associates an absolute URL with a resource.
Geoff: We may want some term for the relationship between a resource and a file system object. Proposal: "implements".
Jim: Let's avoid this concept.
Jim: The term "mapping" is very commonly used in a more general way than we are proposing. Do we want a different term for our use of "mapping"? Let's keep it for now, and use the term "associate" for the generic sense of map.
Jim W: Our notion of "mapping" turns out to be very useful for making the point that adding a single binding can create multiple mappings.
Geoff: It's also handy in versioning to make the point that the server is responsible for maintaining mappings, but clients can control bindings.
Jim W: We are improving on our previous usage, when we weren't distinguishing between mapping and binding.
Chuck: We do need to change the definition in the current specification (03.2).
Geoff: Checked the versioning specification to see that everything would work properly with our definition of binding. It looks ok.
Jim W: The definition of "resource" in RFC 2396 is a good one. We should just use it.
Judy: Likes the definition, but thinks that the examples that have been used in recent e-mail are not consistent with it. They treat the resource as a filename or something that mediates between a URL and a file. But the definition, and especially the examples that accompany it, make it sound more like the file is the resource, since it is what determines the response entity. In other cases it won't be a file but a service, but it's whatever determines what the response entity will be.
Jason: The interesting cases are the ones where a single resource can be associated with multiple files. You can get these cases even without appealing to cgi scripts or content negotiation, as the e-mail discussions show.
Geoff: All that we need from a definition of "resource" is the assurance that when you bind 2 URL segments to the same resource, the results of a put to one of them are visible at the other.
Chuck: Does our use of "mapping" match the usage in RFC 2396? No. RFC 2396 talks about "conceptual mapping", which is our association. Let us always use "URL mapping", never just "mapping".
Agreed: A binding is an association between a URL segment and a resource. It is part of the state of a collection.
Agreed: We will always use the term "URL mapping", not just "mapping". A URL mapping is an association between an absolute URL and a resource.
Agreed: The spec will include a discussion of the relationship between mappings and bindings, but it won't be in the terminology section.

SEMANTICS OF BIND

Jim: The open issues about BIND semantics concern what happens if you issue BIND to a URL that is already associated with a resource. Does that work or fail? People seem to prefer that it fail, so that you have to do UNBIND followed by BIND. Should the same thing happen for a collection? Yes.
Jim: Wants to confirm that we are all committed to defending the position that BIND for collections should work the same as BIND for an ordinary resource. Agreed.
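
A minimal sketch of the agreed definitions, offered only as an illustration: a binding is a (URL segment, resource) pair held in a collection's state, and URL mappings are what you get by walking bindings from the root. The class and function names (Resource, Collection, bind, url_mappings) are hypothetical and do not come from the spec, and signalling the preferred "BIND to an already-bound name fails" behavior with FileExistsError is an assumption. The walk assumes the bindings form no cycles.

```python
class Resource:
    """Whatever determines the response entity (a file, a service, ...)."""

    def __init__(self, content=b""):
        self.content = content


class Collection(Resource):
    """A collection; its bindings (URL segment -> resource) are part of its state."""

    def __init__(self):
        super().__init__()
        self.bindings = {}

    def bind(self, segment, resource):
        # Preferred behavior in the discussion: BIND to a name that is already
        # bound fails, so the client must UNBIND first. The same rule applies
        # whether `resource` is an ordinary resource or a collection.
        if segment in self.bindings:
            raise FileExistsError(f"segment {segment!r} is already bound")
        self.bindings[segment] = resource


def url_mappings(collection, prefix="/"):
    """Yield (absolute URL path, resource) for everything reachable by bindings.

    Assumes the binding graph has no cycles.
    """
    for segment, resource in collection.bindings.items():
        url = prefix + segment
        yield url, resource
        if isinstance(resource, Collection):
            yield from url_mappings(resource, url + "/")


# One extra binding to a collection yields several new URL mappings.
root, docs, report = Collection(), Collection(), Resource(b"v1")
root.bind("docs", docs)
docs.bind("report", report)
root.bind("alias", docs)
print(sorted(url for url, _ in url_mappings(root)))
```

Because both "/docs/report" and "/alias/report" map to the same Resource object, a change made through one URL is visible through the other, which is the one guarantee Geoff asks of the definition of "resource".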
Judy: Do we want to say anything about the cases where a server might not want to allow a BIND? In particular, this might include content negotiation cases and content generated by a cgi script or other dynamically generated content.
Geoff: A server is always free to fail a BIND if it can't satisfy the semantics of bindings that we specify, in particular ensuring that any PUT through any binding will be visible through any GET on any binding to the same resource.
Jim: It might be better to wait and see whether people ask about this. At least be careful not to set any requirements on servers for these cases. You can imagine with cgi that some servers might be willing to do the extra work to support bindings, while others might not.
Geoff: We can say that if you do a GET with the same headers on two bindings to the same resource, you should get back the same entity. If we phrase it that way, then content negotiation and cgi are encompassed. We also have to include PROPFIND, for dead properties. Can we say for all live properties?
Jim W: There might be some hitherto unknown live properties that depend on a particular binding that would be exceptions.

UNBIND / DELETE / DESTROY

Jim W: What are the objections to defining DESTROY?
Judy: Was just interested in avoiding having to sort out the complexities of the meaning of "resource" and what exactly happens if you request DESTROY for content negotiated and dynamically generated content cases.
Geoff: Is no longer averse to defining DESTROY if it just means getting rid of any bindings created with BIND, with no commitments about garbage collection, etc.
Judy: I thought that bindings created with BIND were no different from any other bindings (created with PUT or COPY or whatever). DESTROY should get rid of all bindings, not just ones created with BIND.
Jim: Whether or not we say there's no difference between bindings created with BIND and bindings created with PUT, etc., we have difficulties.
There will still be other mappings that are less well defined -- a server-created mapping that happened to go to the same place, some cgi bin script happens to go to the same place but we don't know about it. (But these are mappings, not bindings -- they point to a different resource.) We would have to define DESTROY so that we are non-committal on the other mappings. Just require the server to get rid of all the bindings it knows about.
Judy: DESTROY should also work the same in an RFC 2518 collection as in an advanced collection. A binding is created in a regular collection by PUT or COPY. DESTROY should remove all bindings to an object in a regular collection, just as it does for advanced collections. There is no essential difference between a regular collection and an advanced collection. It's just that advanced collections accept BIND requests. There are already bindings in regular collections.
Geoff: If we say that regular collections have bindings, but DELETE has different semantics for regular collections than for advanced collections, this is confusing. It's simpler to say that bindings only exist in advanced collections.
Jim: If these concepts had been available when RFC 2518 was being written, we would have defined collections this way. If DELETE is causing problems, let's go back and fix RFC 2518 so that WebDAV collections and advanced collections behave the same.
Geoff: Is it ok to say that DELETE in an advanced collection has clear semantics but in a regular collection it does not?
Jim: DELETE is always guaranteed to unbind, and allow garbage collection.
Geoff: If there are multiple bindings, DELETE will only delete one binding.
Jim: We want DELETE to mean the same in both places. We can revise RFC 2518 if necessary. We can put binding semantics into WebDAV when it goes to draft.
Geoff: We will get protests from implementers.
Jim: Let's make the proposal on the WebDAV mailing list now.
Judy: Avoiding this controversy is one reason for leaving DELETE alone, and instead defining a new UNBIND method.
Geoff: Doesn't want a client to have to try UNBIND, then discover it's not in an advanced collection, then try DELETE. This is one reason he wants to avoid defining UNBIND, but instead give DELETE the semantics of UNBIND.
Jason: Can there be both ordinary collections and advanced collections on the same server?
Jim: You would probably want any given hierarchy to be advanced or not. Otherwise a client would have to do OPTIONS at every level.
Geoff: Two modules on the same server might have different capabilities.
Jim: Wants to roll the concept of bindings and consistent DELETE semantics into the main spec, so that if both sorts of collections occur on the same server, there will be more consistent behavior. Most people won't be affected by rolling it back into the spec.
Jim: Where are we?
Geoff: Let's define DESTROY to mean delete all bindings. Maybe we should use a different name for the method because people might expect to be able to do resource management with a method called "DESTROY". We could use a header on DELETE to make it delete all bindings. The default would be to delete one binding.
Jim: Wants the destroy function, and thinks that implementing it as a header is fine.
Geoff: Let's float this. Geoff will describe DELETE for the mailing list.
Jim: Proposed semantics of DELETE: without a header, the server MUST remove the binding associated with the URL mapping of the request-URI. If the last binding is removed, the server may do state modification. It's worth saying that -- otherwise, people will raise questions. State modification might not even be restricted to the case where the last binding is being removed. We saw some cases in e-mail where it might be desirable to change file names based on removal of a binding. Jim will add something about state modification to Geoff's definition of DELETE.
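
The proposed DELETE semantics can be sketched roughly as follows. This is an illustration under stated assumptions, not spec text: the ToyServer class and its method names are hypothetical, the all_bindings flag stands in for the proposed DELETE header (which the minutes do not name), and resources are represented by opaque ids.

```python
class ToyServer:
    """Toy model: each collection maps URL segments to opaque resource ids."""

    def __init__(self, collections):
        # collections: {collection name: {segment: resource id}}
        self.collections = collections

    def bindings_to(self, resource_id):
        """All (collection, segment) bindings the server knows for a resource."""
        return [
            (name, segment)
            for name, segments in self.collections.items()
            for segment, rid in segments.items()
            if rid == resource_id
        ]

    def delete(self, collection, segment, all_bindings=False):
        """Model of the proposed DELETE.

        Default: the binding named by the request MUST be removed, and only
        that one. With all_bindings=True (hypothetical stand-in for the
        proposed header): every binding the server knows about is removed.
        Returns True when no bindings remain, i.e. when the server MAY
        modify or reclaim the resource's state.
        """
        resource_id = self.collections[collection].pop(segment)
        if all_bindings:
            for name, seg in self.bindings_to(resource_id):
                del self.collections[name][seg]
        return not self.bindings_to(resource_id)


server = ToyServer({"a": {"x": 1, "y": 1}, "b": {"z": 1, "w": 2}})
print(server.delete("a", "x"))                     # other bindings remain
print(server.delete("b", "z", all_bindings=True))  # all bindings removed
```

The return value only reports that the last known binding is gone; what the server then does to state is deliberately left open, matching the "may do state modification" wording above.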
Chuck: If we are not providing DELETE / destroy for resource management, do we need to provide some other method for doing that? A true destroy would not only make the resource inaccessible through HTTP / WebDAV, but through any protocol or access method.
Geoff: Not all resources involve disk space. What if you try to destroy some cgi-bin derived thing?
Chuck: Exclude the dynamic cases. For static resources, a client might want to do resource management using the protocol.
Geoff: If so, not in advanced collections.
Chuck: An author wants to use WebDAV, and wants to get rid of something he did yesterday, wants the bits to be gone. The author wants to know for sure that the bits are gone -- maybe it's a security issue, maybe the thing he wrote yesterday was a mistake and an embarrassment. He wants to take it out of circulation.
Jim: It would be possible to define this, but people won't want it mandatory to implement.
Chuck: Thought we were heading this direction when we started talking about defining a destroy method.
Jim: You can imagine WebDAV on a document management system, where everything is a binding. You could remove all bindings but leave the resource lying around. Then there's a difference between delete and destroy. We could find language for this. Would clients use the command? Maybe.
Chuck: Maybe a client would implement a recycle bin. A user can move things to the recycle bin initially, but when he says empty recycle bin, then he really wants the things to be deleted -- get rid of it, it's gone, free up the space.
Chuck: Is looking for a request that is defined by the protocol to fail unless the server actually removed the bits. What level of guarantee do we want the server to provide -- what would the details of this be like? What about replication, archives, backups, etc.
Chuck: The copy the client was dealing with is gone. This doesn't concern backups, archives, etc. Then you know no one can use some other protocol to get at that copy, at least.
MAY vs MUST for getting rid of persistent state.
Geoff: We don't want to hold up acceptance for a MUST here.
Chuck: It's not just about storage management. It's about accessibility. The client application can't be sure that the resource is inaccessible on the current definition.
Geoff: Thinks it will make people unhappy.
Jim: Even defining destroy for a file system is difficult. Do you have to zero out the bits so that a disk maintenance utility can't be used to recover the data? Requiring that level of effort from the server would be unfortunate.
Chuck: Disregard extreme cases like Norton utilities. Just satisfy the typical user. We know what delete all bindings means. That's clear. It bothers Chuck that we can't make a promise to the user about the content, only about the bindings. Chuck will try to come up with a definition for a method that would satisfy him and float it to the list.
Jim: Do we want a separate UNBIND? (Depends on how we end up defining DELETE.) Jim is willing to explore not having it.
Judy: What would be the difference between UNBIND and DELETE without a header?
Jim: Maybe something about state maintenance. No, they would be the same.

BACKPOINTERS

Jim: Weren't backpointers really for direct references? Do we need them now that we only have redirect references? The only use case is for client-controlled integrity, but people normally accept that redirect references could be dangling, so clients aren't likely to try to maintain their integrity. If in the future we add strong references, you won't need clients to do integrity maintenance.
Judy: Assumed that backpointers were for both redirect and direct references. It's useful for both sorts of references to be able to find out what references there are to a given resource. There are other use cases besides the ones related to integrity: navigating up, finding related information by looking in all the collections from which the target is referenced.
Judy: Actually backpointers are probably more important for bindings than for redirect references. Xerox's product won't use redirect references, but will use bindings and needs some way to find out what bindings there are to a given resource. If there's no way provided by the protocol, we'll do something non-standard.
Jim: Would we want to expand the DAV:references property to handle bindings as well as redirect references? Or have two separate properties, one for redirect references and one for bindings? One approach might be easier for DASL searches.
Jason: Less interest in redirects than in bindings in the protocol.
Jim: Maybe we want backpointers only to bindings. Redirects are more likely to go off server, so a server is less likely to be able to maintain a useful list of them in DAV:references, but for bindings it could do so.
Provisionally, DAV:references includes only bindings. We need to revisit this when Jim Davis is present.

COLLECTIONS / REDIRECT REFERENCES

Jim: The behavior described in section 4.15 seems odd. One of our goals for redirect references was to allow sharing of resources in a way that would make use of existing server functionality. The server shouldn't have to do much new to support redirect references. But allowing the use of request-URIs that have redirect references to collections embedded in them would be an entirely new mechanism for the server to implement. It could also slow down GET processing.
Geoff: Suppose it was an advanced collection with a binding. In a file system, this would be a hard link, so there would be no difference to GET processing.
Judy: Are you suggesting that we not allow redirect references to collections?
Jim: No, you can create redirect references to collections, but you can't do a GET through one of those redirect references. It's just that if you do a GET on a request-URI with a redirect reference in the middle, you get a 404. Jim is concerned about the performance of GET requests.
Geoff: The server would only go through this processing if, when it examined the whole request-URI, it thought it would have to respond with 404. So this doesn't affect normal cases, and won't slow down processing of normal cases. Only error cases are slowed down.
Jim: A redirect to a collection creates a whole bunch of extra mappings? bindings? what?
Geoff: It's the same as for bindings. A redirect can create a whole set of mappings, just as bind can create a whole set of mappings. All the redirect mappings will return 302's, so you can create a whole tree of 302's with a single redirect to a collection.
Jim: A redirect shouldn't affect the namespace below it.
Geoff: MOVE subtree is a normal operation, lightweight.
Jim: Will discuss with implementers how difficult this would be to do.
Geoff: Left-to-right vs. right-to-left parsing of the request-URI. Could a configuration file on the server cause you to jump over any redirects in the middle of a request-URI? Can you predict where the configuration file wants to jump you? Do we have to constrain the configuration file in some way in order to satisfy the semantics?

COMMENTS ON SPEC 03.2

Chuck: The overview in 4.2 says that HTTP and WebDAV already define "binding", but they don't use that term anywhere. This claim needs to be clarified or removed.
4.1.2 implies that referential integrity is guaranteed for bindings -- that it is impossible to have a "dangling" binding. But is this really true? Couldn't a cross-server binding, in particular, be broken?
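
Returning to the COLLECTIONS / REDIRECT REFERENCES discussion, Geoff's point that only would-be 404 responses pay for the extra processing can be sketched as below. This is an illustration under assumptions, not the behavior defined in section 4.15: the function name resolve, the representation of server state as a set of paths plus a dict of redirect references, the left-to-right scan, and the rule that the 302 Location is the redirect target with the remaining segments appended are all hypothetical.

```python
def resolve(path, resources, redirects):
    """Return (status, location) for a GET on `path`.

    resources: set of URL paths that name ordinary resources.
    redirects: dict mapping the URL path of a redirect reference to its target.
    """
    # Fast path: the whole request-URI names something directly.
    if path in redirects:
        return 302, redirects[path]
    if path in resources:
        return 200, None

    # Slow path, reached only when the answer would otherwise be 404:
    # scan left to right for a redirect reference embedded in the request-URI.
    segments = path.strip("/").split("/")
    for i in range(1, len(segments)):
        prefix = "/" + "/".join(segments[:i])
        if prefix in redirects:
            rest = "/".join(segments[i:])
            # Assumption: Location = target + remaining segments.
            return 302, redirects[prefix].rstrip("/") + "/" + rest
    return 404, None


resources = {"/a", "/a/b"}
redirects = {"/r": "http://other.example/coll/"}
for p in ["/a/b", "/r", "/r/x/y", "/nope/x"]:
    print(p, resolve(p, resources, redirects))
```

In this model a single redirect reference to a collection answers every URL beneath it with a 302, which is the "whole tree of 302's" Geoff describes, while requests that resolve normally never reach the scan.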