Re: Entity Tags

David G. Durand (dgd@cs.bu.edu)
Thu, 1 Aug 1996 18:25:49 -0500


>David Durand writes:
>>So if the Entity tag is the same, the document is byte-for-byte identitical.
>>If the the Entity tag is "Weak", then it may not be byte-for-byte identical,
>>if the tags are the same, but the server is saying that it doesn't care,
>>and caching would still be acceptable. I didn't have a chance to check out
>>how this affects content-type negotiation, but I got the impression that each
>>type would have its own entity tag.
>>
>>   It seems to me that a versioning system provides the right information to
>>calculate this stuff correctly (and is under obligation to do so), but that
>>ETags don't solve many of our problems for us (though they do provide a nice
>>way to check if the "current version" has changed or not).

>My question (from San Mateo) was whether the notion of "current version"
>for cacheability is so different from the notion of "current version" for
>versioning that we need an additional layer beyond the ETags.  (and in
>the rather narrow scope of preventing conflicting overwrites)

    The problem with the notion of current version (as Fabio and I
conceptualized it in our draft) is that it's the version you get with _no_
Version-ID specified. So we would still need to use HTTP's mechanism to
handle the caching aspect. The requirements for cacheability are very
strong. This is a special case of some versioning systems approximate
version references

   I think that while versioning solves the identity problem well enought
to create ETags (more details later on), even the simplest versioning has
other requirements (like tracking relationships) that ETags identity test
does not suffice to provide. So I think there are versioning-specific
requirements that need to be met separately from ETags.

    They're pretty orthogonal anyway: you can have ETags (via MD5, say)
without versioning, even if good ETags come cheap when you have a
versioning system in place.

   [[[ Note that for some systems that allow "active versions," only
"frozen" or "committed" versions satisfy the immutability requirement for
cacheability (immutability being the versioning approach to byte-for-byte
identity). Our functional requirements draft tries to forestall this kind
of problem, by asking that servers never send different states of the same
resource under the same Version-ID. Of course a different resource might
later share the same URL, but there's nothing we can do about that. ]]]

>I imagine the most common system would be a "major/minor" sort of thing,
>where minor revisions were bumped on every change (and hence equivalent
>to strong tags), and major revisions were bumped when the content had
>changed to the point where the provider no longer wished it to be cached
>(and hence equivalent to weak tags).


    I think this is correct. This seems to me like it could be a candidate
for a "Weak" ETag. Of course a versioning system always has a strong ETag
available in the form of the complete version identifier. Given that we
always have a stronger ID available, I'd avoid ever creating a weak ETag.

 I'm not sure if we want to really accomodate this at the version protocol
level. This kind of facility seems to be most important in managing
configurations -- so that we might make the HTTP level simple, and leave
the subtleties to more sophisticated clients.

   Partly this reflects a prejudice that I think I share with the HTTP 1.1
authors -- dislike of Weak identity verification.

>
>I admit this is a very naive view of versioning, but I'm not sure that
>it helps to add complexity.  Can anyone post a scenario where it is
>important to decouple cacheability from the version id?

If version-Ids are held to denote immutable objects, there is no essential
reason to decouple it. The one problem is that the HTTP 1.1 document seemed
to be saying that the ETags should be unique across all documents on a
server -- that might imply that we would need to use (Resource-ID,
Version-ID) pairs as ETags, rather than just Version-IDs. This should not
be a problem.

   To sum up, I think your simplistic view is basically right, and we
should make sure that it stays as right as possible.

   I am assuming that Content-type negotiation can lead to more than on
ETag for the "same" resource, if it is available in several forms with
different Content-types. If this is not true and anyone knows, could they
post to the list?

    -- David

----------------------------------------------+----------------------------
  David Durand                 dgd@cs.bu.edu  | david@dynamicDiagrams.com
  Boston University Computer Science          | Dynamic Diagrams
  http://cs-www.bu.edu:80/students/grads/dgd/ | http://dynamicDiagrams.com/