WebDAV Advanced Collections: April 6, 1999

Attending: Judy Slein, Jim Davis, Geoff Clemm, Jim Whitehead

ACTION ITEMS

Jim Whitehead: Draft semantics of bindings.
Geoff: Draft semantics of fixup operation on MOVE.
Judy: Mail to Max Rible asking for scenario justifying support for server-maintained orderings.
Jim Whitehead: Check on whether we can use a 506 response code for loops.

WHAT ARE DIRECT REFERENCES?

Geoff’s proposal was that they be URLs stored in a DAV:internalmembers property on collections. Geoff is willing to drop this proposal if no one wants to pursue it. Jim W wants to explore it. Something like a symbolic link. Separate the issues of what direct references are and whether to store them on the collection.
Problems about LOCK: if addressed to a direct reference, does that lock the collection also?
Geoff: It locks the target, and the server is responsible for making sure the URL stays bound to that resource.
Jim W: MOVE and DELETE operations get passed through to the target.
Geoff: No, we define an advanced collections delete that is a modification to the collection, not a modification to the member - redefine delete to be unbind. The same for move. Copy does affect the resource itself (same whether it’s a reference or not).
Jim D: If you copy a collection that contains a direct reference, then what?
Geoff: Just copy the reference.
Jim D: That’s unintuitive.
Jim D: Geoff’s proposal doesn’t do anything to resolve the hard problems. We still have to appeal to other arguments to resolve them. The user of the protocol can’t tell what references are.
Geoff: They can tell because some methods behave differently when applied to references than to resources, and some methods can’t be applied to references.
Jim D: If Geoff’s proposal offered a radical simplification, it would be worth pursuing. But it doesn’t. It will lead to other complicated discussions about what a binding is. It doesn’t let us say in any principled way why methods work the way they do.
Jim W: There is no compelling reason for one way or the other. We can have state as a resource or state on a collection. We can have things that act sort of like symbolic links (direct references as we’ve described them up till now), or things that act sort of like hard links. The Web is different from a file system - open / GET differs from the Unix file system. In Unix, open can’t give a redirect response. Redirect is a new thing. We are rediscovering the rationale for both hard and symbolic links. Define semantics of something like hard links. It doesn’t have to be in a collection. But allow our version of hard links to point to a directory. A hard link points to an underlying resource, and allows multiple URLs to be bound to a resource in such a way that MOVE doesn’t affect any other bindings. Hard links provide localized reliable binding. MOVE just changes the entry and doesn’t affect any other binding.
Jim D: It’s a bad idea to define references based on the Unix file system.
Geoff: We won’t do that. What we care about is the underlying notion that multiple collections can have members bound to the same resource.
Jim D: Don’t use the terms hard / soft links. Talk independently of Unix. There are also 2 other semantics to consider: Document Management Systems, and existing web servers (Apache). Original motivation: in current web servers there are analogs of direct and redirect references, but you could only create them out of band.
Geoff: The motivation for references is to provide the ability to reliably share a resource. When you change it, anyone using any reference to it will see the change.
Jim D: Both direct and redirect references have that property. What is the desired semantics when you MOVE the target of a reference? The spec today allows, but does not require, that the reftarget properties of references to it get fixed up.
Jim D: Just fix the gap in the protocol - when you MOVE a resource, the DAV:reftarget properties of all references to it MUST get updated to point to the new location.
Geoff: No, we need both semantics. There are cases where we don’t want fixup to happen (direct references), and cases where we do (bindings).
Jim W: What are the 2 scenarios?
Geoff: Activities and configurations should be unaffected by any moves or deletes to the same resources, but in other situations we do want resources to appear and disappear when they are deleted or moved from another place.
Jim D: If you want the reference to be left dangling, do COPY + DELETE.
Jim W: There are many possible policies, and it’s not clear which should be mandated. Existing systems may have one or another policy. So we stay neutral. Some people want to create dangling references from the moment of creation. We can get both semantics using MKREF with new headers. You can discover which semantics a reference has by examining DAV:refintegrity. Yes, on Geoff’s proposal that information would just be stored on a collection rather than on a resource.
Jim W: Let direct references be bindings because when a user creates a direct reference, he is trying to create a location in a namespace that is as much as possible like the target. The best you can do is make it just another URL. MOVE has down-level DAV semantics. If you create a binding, you pass in URL r and URL t for resource x. The server has an internal id for x, and binds r to that internal resource id. If someone does a MOVE through t, the server creates a resource with internal id y and binds t to it, and deletes x. It is gray in WebDAV what happens to r. There are two possible answers - fix up r to point to y, or let the r-to-x binding disappear.
Jim D: This is the same as the existing MKREF.
Jim W: There is no way to distinguish r from any other URL. There’s only one resource. All semantics are down-level semantics. They are indistinguishable to down-level clients.
Jim D: The rationale won’t be any more satisfactory than it is currently. From the user’s point of view, he says copy r and finds that the target got copied. That’s the opposite of the IETF consensus.
Jim W: At IETF, people were thinking in terms of the ontology where references are resources. If we present a different ontology, people’s intuitions may change.
Geoff: Ideally, we would give up the WebDAV definition of MOVE. MOVE is a rename.
Jim W: Once you have the notion of a binding, move is a change in one end of the relationship, copy adds a new binding. But it’s troublesome to redefine move / copy. We could create new move / copy methods. Don’t do that. People will accuse us of feature creep.
Geoff: No, don’t have new move / copy. Just explain the fixup stage - adjusting the bindings.
Jim W: Here’s a problem with the binding approach, though: If we replace direct references with bindings, we lose relative references. A binding is an absolute URL bound to a resource.
Jim D: If you move a collection, things will break.
Geoff: If relative URLs point downward in the hierarchy, that’s ok. It’s a problem if they go up and out.
Jim D: No, once the binding is complete, it would continue to point to the original target.
Jim W: Suppose URL t is created with PUT, and the internal resource id is x. Later someone creates a new binding, r, to t. Now someone moves collection c that contains t - then the server has to do fixup semantics. If you do a depth copy of collection c to new URL d, a new resource with id y gets created - what should fixup do?
Geoff: You avoid fixup if c/ is a binding to collection rc, and t is bound to x. When you move c to d, you just change the binding from c to rc to be d to rc. Then all members stay the same.
Jim D: But what if I wanted my references to become dangling?
Geoff: We need 2 kinds of references. We still need references that aren’t bindings; they refer to URLs. Bindings are direct by definition.
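Geoff's point about avoiding fixup can be illustrated with a small model. This is only a sketch under stated assumptions: the `BindingTable` class, its method names, and the ids "rc" and "x" are hypothetical and come from no WebDAV specification. It assumes each collection resource stores its own member-name-to-resource-id map, so that a MOVE of the collection only rewrites one entry in the parent.

```python
class BindingTable:
    """Hypothetical in-memory model: URL paths are bound to internal resource ids."""

    def __init__(self):
        self.root = {}      # top-level segment name -> resource id
        self.members = {}   # collection id -> {member name -> resource id}

    def _parent_map(self, path):
        # Walk down to the member map that holds the last path segment.
        segs = [seg for seg in path.strip("/").split("/") if seg]
        table = self.root
        for seg in segs[:-1]:
            table = self.members[table[seg]]
        return table, segs[-1]

    def resolve(self, path):
        table, name = self._parent_map(path)
        return table[name]

    def bind(self, path, rid):
        table, name = self._parent_map(path)
        table[name] = rid

    def unbind(self, path):
        table, name = self._parent_map(path)
        del table[name]

    def move(self, src, dst):
        # MOVE as a change to one binding: the resource itself is untouched.
        rid = self.resolve(src)
        self.unbind(src)
        self.bind(dst, rid)


table = BindingTable()
table.members["rc"] = {}     # collection resource rc, initially empty
table.bind("/c", "rc")       # c is a binding to collection rc
table.bind("/c/t", "x")      # t is bound to resource x inside rc
table.bind("/r", "x")        # r is a second binding to the same resource x

table.move("/c", "/d")       # only the c -> rc entry changes to d -> rc

print(table.resolve("/d/t"))  # x: members of rc stayed the same
print(table.resolve("/r"))    # x: the other binding is unaffected
```

In this model no per-member fixup is needed after the MOVE, because t was never recorded as an absolute URL. A reference that stores a URL (rather than a resource id) would still dangle or need fixup, which is the second kind of reference Geoff asks for.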
Jim W: If you do a copy and that just changes the binding, that’s not the semantics that are in RFC 2518.
Geoff: The fixup stage is intended to produce this.
Geoff: We need both kinds of members, so we define a new operation called BIND, different from move or copy.
Jim D: We could just put headers on MKREF. Fixup is different depending on whether you have a reference or a binding - we need two well-defined semantics - mkref and bind, or distinguish between the semantics with a header.
Jim D: There’s a requirement that references be able to carry properties. What do we say to people who care about that requirement?
Jim W: We can just say that trying to meet that goal led to a protocol with undesirable characteristics.
Geoff: In versioning, the only properties it was tempting to put on references were transitory and interesting only for the server.
Someone needs to write down the proposed semantics in detail.
Judy, Jim D: Separate the issue of treating references as bindings from the issue of incorporating them in a property of collections. Need implementations to see whether there are problems with one approach or the other.
Unstated requirement: Must DAV map to file system semantics? A mapping to file system semantics can be a goal, but don’t constrain the protocol to be the least common denominator of file systems.
Jim W: Web servers that map onto file systems (Apache) must have some module that does URL to file mapping. There must be a table or algorithm there; just add a new table entry, and have MOVE and COPY operate on the table - this shouldn’t be hard.
Jim D thinks this is not an easy implementation. Let’s see the semantics first.
Jim W will write a definition of a BIND method. Geoff will write a description of the fixup step operation.
Geoff: We need 3 types of references - bindings (associate a URL with a resource), referential members (associate a URL with a URL), and redirect members.
Drop direct references? Bindings don’t give relative references capability.
Semantics will include an example where operation on a binding differs from operation on references.
Jim D: Drop direct references - this is getting too complex.

SEPARATE SPEC FOR ORDERING

Jim D: Is there any advantage in separating ordered collections and publishing that spec now? It seems stable, whereas referencing is not.
Jim W: Keep them together. There are already 4 RFCs planned for WebDAV capabilities, plus DASL. Options are getting too complicated. Is there any benefit to getting ordered collections out sooner? There is more interest in references than in ordering. If we keep them together, we might get more people to implement ordering because they want referencing. But the spec has separate compliance classes for ordering and referencing because they are orthogonal. We could query the group. Web folders alphabetize the contents of collections, so they are imposing their own ordering on the client side.

SERVER-MAINTAINED ORDERINGS

Jim D: Why would you want this?
Jim W: A server-maintained alphabetical ordering makes sense.
Jim D: From the standpoint of the protocol, server-maintained is unordered. We have made it possible. Current spec says a request fails if it has a Position header on an unordered collection.
Jim W believes that Max only wants us to say that server-maintained orderings may exist, and say how operations act against server-maintained orderings. We don't want to get into issues about registering server-maintained orderings.
Jim D does not want to let the client select from a list of available server-maintained orderings - he sees no use case for this, and it's expensive to implement. It’s one thing for the server to declare how it will order, another for the client to be able to request a particular server-maintained ordering. Jim D doesn’t see a use case even for that.
Max is implementing a server. Ask him for a use case - advertise an ordering that it will use vs. let the client choose.
Judy will send a note to Max asking for a compelling use case and whether Max intends to implement something like this if we make it possible.

DO WE NEED BOTH DIRECT AND REDIRECT REFERENCES?

If clients are intelligent enough to follow a redirect and then perform all future operations on the target, isn’t that more efficient than direct references? Yes, if the client does that. There's a one-time loss, then use the target’s URL. But that's incorrect use of 302. Suppose we use 301 instead. There's a one-time cost, then the client operates on the target. Gets rid of complexity.
(Geoff: We do not want the client to assume the location of the target won’t change. We want 302 semantics, where we force the client to continue to use the reflector. If we also want a permanent redirect, it's ok to have that as well.)
From a document management perspective, we need direct references. We need to hide the target, need redirection to be done server-side. Flat namespace and faking the hierarchy. We’ll assess this case against bindings semantics. Geoff is willing to deal with relative references later.

LOOP DETECTION

Should we require servers to detect loops? If the server is providing a chunked response, it might never discover a loop. It might never abort. Let's say SHOULD.
Jim W: If we think there's enough justification for SHOULD, then let's say MUST. We should always start at MUST, and only back down to SHOULD if we know of some case where a stronger requirement would be a problem. We know it's easy to implement loop detection. Say MUST, and see if the community objects.
Decision: MUST.
Jim W: Will ask Roy about using a 506 response for this case.
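
The claim that loop detection is easy to implement can be sketched as follows. This is an illustration only, assuming the server can see an internal id for every resource and a member map for every collection; the function name `walk`, the `members` layout, and the "ok" / "loop" markers are hypothetical, and the marker stands in for whatever response code is eventually chosen. The walk tracks only the ids on the current path, so a resource that is merely bound in two places is not reported as a loop.

```python
def walk(members, root_id, root_path="/"):
    """Depth-first listing that reports a loop instead of descending forever.

    members: dict mapping a collection id to {member name -> resource id};
    ids absent from the dict are treated as non-collection resources.
    Yields (path, resource id, status) tuples; status is "ok" or "loop".
    """
    def visit(rid, path, ancestors):
        if rid in ancestors:
            # This collection is already on the current path: stop here.
            yield (path, rid, "loop")
            return
        yield (path, rid, "ok")
        for name, child in sorted(members.get(rid, {}).items()):
            child_path = path.rstrip("/") + "/" + name
            yield from visit(child, child_path, ancestors | {rid})

    yield from visit(root_id, root_path, frozenset())


# Collection a contains doc (resource x) and sub (collection b);
# b contains a member "up" bound back to a, which forms a loop.
members = {"a": {"doc": "x", "sub": "b"}, "b": {"up": "a"}}
for entry in walk(members, "a"):
    print(entry)
# ('/', 'a', 'ok')
# ('/doc', 'x', 'ok')
# ('/sub', 'b', 'ok')
# ('/sub/up', 'a', 'loop')
```

Because the generator emits each entry as it goes, the same check works when the response is streamed in chunks: the loop is noticed before the server descends into the repeated collection, not after the full listing is built.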