Re: publish WCIP version 01
Hi, Fred, this is excellent. Attached is the updated doc, also linked at
http://content-signaling.org/
so people can always see the up-to-date one. (yet another invalidation problem ;-)
Comments (revision log) inline.
At 11:14 AM 3/2/01 -0500, Fred Douglis wrote:
I have a problem with the term "dynamic data". To me, dynamic data is
something that changes all the time, such as a stock quote, and is
inherently uncachable. "Frequently-changing" data is more
appropriate, as long as the rate of access dominates the rate of
change -- the observation from the DOCP work among others.
So, I would search and destroy references to "caching dynamic data"
and make it clear that you mean "frequently-changing" or maybe
"semi-dynamic" data. Alternatively, one might redefine "dynamic data"
to be absolutely clear about this distinction, something I've tried to
do below, both when the term is introduced and in the Defs section.
I know. "dynamic content/data" is a fuzzy term. Outside of the scope of WCIP, I view this term as referring to:
1. frequently changing content (and yet popular)
2. personalized content
3. dynamically computed content (computed based on frequently changing data and/or personal info).
WCIP is solving #1 with the trajectory that if the proxy computes dynamic content, WCIP can be used to keep the underlying data up-to-date, blah blah ...
Anyway, back to the scope of the draft: we should probably keep things simple and non-controversial. So, I have replaced this term with "frequently changing content".
You changed reliable multicast to IP multicast and claimed that
reliable delivery wasn't necessary because of the volume IDs, but in
3.3 it still says message delivery MUST be reliable. Should that be
changed?
changed the bullet to "Delivery: delivery SHOULD be real-time in that the average latency should be comparable to the network round-trip time from the sender to the receiver. It's RECOMMENDED that the delivery be reliable, full duplex, and in sequence (wrt. the sender) to achieve good performance, although it's not required."
Related work seems incomplete. I know not everything need be included, but for
example, it's incestuous to include [4] and [5] but not earlier work on using
volumes for invalidation -- my own incestuous suggestion there is:
done.
@InProceedings{cohen98,
  author    = "Edith Cohen and Balachander Krishnamurthy and Jennifer Rexford",
  title     = "Improving End-to-End Performance of the {W}eb Using Server Volumes and Proxy Filters",
  booktitle = "Proceedings of the ACM SIGCOMM conference",
  year      = "1998",
  month     = sep,
  pages     = "241--253",
  note      = "\url{http://www.research.att.com/~bala/papers/sigcomm98.ps.gz}",
}
I also think the draft uses the first person too much -- lots of
"we's" in there, or "let's lay out", or ...
fixed.
Abstract
done.
Table of Content
1. Introduction
got rid of dynamic content. Only use "frequently changing content" and "dynamically computed content".
Dynamic content is quickly becoming a significant percentage of the
strike "the" at end
sorry, what do you mean? delete "the"?
TTL
fixed
strong cache consistency, and yet "poll every time" is costly. So
cite Gwertzmann & Seltzer here?
done.
the content provider usually sets a very short expiration time or
a content provider
done.
s/the/a/g
tried to nail down as many of these as I can.
The two modes are merely the two extremes of a continuum,
characterized by how soon the server proactively sends
updates/heartbeats and how soon the proxy revalidates the volume. The
sooner the revalidation, the quicker the objects are invalidated; this
results in better consistency but also more load on the server and
proxy. Regardless of the mode, the same messages are exchanged
between the invalidation server and the caching proxies, whose format
is defined by an "ObjectVolume" XML DTD [forward ref]. Each round of
message exchange, whether initiated by the server or the client, is a
process of "volume synchronization" and results in an up-to-date view
of the object volume. Based on the up-to-date view, the proxy can
provide freshness guarantees to all the objects in the volume.
done. thanks!
WCIP-related
fixed.
Dynamic Content
Web resources that change "frequently," where the definition of
"frequent" depends on the access rate and desired consistency
guarantees. [Or something to this effect...]
deleted references to dynamic content. Only use "frequently changing content" and "dynamically computed content".
Strike "besides"
deleted "besides".
capitalize "invalidation"
fixed.
Revalidation Interval
A property of the client-driven mode. The invalidation client
initiates volume synchronization with the invalidation server, when
the "last synchronization time" was "revalidation interval" ago. The
interval SHOULD be smaller than the freshness guarantees of all the
objects in the object volume, to avoid unnecessary cache misses.
smaller, or no greater than?
If network propagation delay is 0, it's "no greater than". Otherwise, the proxy can use some spare room.
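To make the "spare room" point concrete, here is a minimal sketch (not from the draft) of how a proxy might pick its revalidation interval. The function name, the delay_budget parameter, and the idea of subtracting a fixed allowance are assumptions for illustration only.

```python
def pick_revalidation_interval(freshness_guarantees, delay_budget=0.0):
    """Return a revalidation interval (seconds) for one object volume.

    freshness_guarantees: freshness guarantee of each object, in seconds.
    delay_budget: assumed allowance for network and server delay, in seconds.
    """
    if not freshness_guarantees:
        raise ValueError("volume has no objects")
    # The tightest guarantee in the volume bounds the interval.
    tightest = min(freshness_guarantees)
    # With zero delay the interval may equal the tightest guarantee
    # ("no greater than"); otherwise it has to be strictly smaller.
    interval = tightest - delay_budget
    if interval <= 0:
        raise ValueError("delay budget leaves no room to revalidate in time")
    return interval
```

With a zero delay budget the result equals the tightest guarantee; any positive budget makes it strictly smaller, which is the case the draft's "smaller" wording covers.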
could -> can
in a timely fashion
been able
fixed them.
the object right away as HTTP revalidation could result in an
indication that the object is "Not Modified".
done.
The invalidation server picks the heartbeat interval while the
invalidation client picks the revalidation interval. Both of them
SHOULD be smaller than any of the freshness guarantees of the
no larger?
better smaller to leave room for server and network delay.
"one can"
for the DTD
An invalidation
start to send
fixed.
(4) Reliability: message delivery MUST be reliable, full duplex,
and in sequence (wrt. the sender). Moreover, delivery SHOULD be
real-time in that the average latency should be comparable to
the network round-trip time from the sender to the
receiver.
Still true given the multicast change?
Nope, changed to "Delivery: delivery SHOULD be real-time in that the average latency should be comparable to the network round-trip time from the sender to the receiver. It's RECOMMENDED that the delivery be reliable, full duplex, and in sequence (wrt. the sender) to achieve good performance, although it's not required".
later in
fixed.
The channel relay point may have multiple clients subscribed to the
same invalidation channel. It in turn only subscribes once to the
original invalidation server. By multiplicatively relaying channel
multiplicatively?
Why not "hierarchically"?
It refers to the multicast-ish nature of the relay, which takes in one stream and sends out multiple. I'll change it. It's the second time someone complained about it. ;-(
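The one-stream-in, many-streams-out behaviour of a relay point can be sketched roughly as follows. This is only an illustration: the ChannelRelay class, the callback signatures, and the subscribe_upstream hook are hypothetical names, not anything defined by WCIP.

```python
class ChannelRelay:
    """Relay point: subscribes upstream once per channel, fans out to clients."""

    def __init__(self, subscribe_upstream):
        # subscribe_upstream(channel, callback) is an assumed hook that
        # registers this relay with the original invalidation server.
        self._subscribe_upstream = subscribe_upstream
        self._clients = {}  # channel name -> list of delivery callbacks

    def subscribe(self, channel, deliver):
        if channel not in self._clients:
            # First local subscriber: open the single upstream subscription.
            self._clients[channel] = []
            self._subscribe_upstream(channel, self._on_message)
        self._clients[channel].append(deliver)

    def _on_message(self, channel, message):
        # One incoming message is copied to every subscribed client.
        for deliver in self._clients.get(channel, []):
            deliver(message)
```

However many clients subscribe to the same channel, the original server sees a single subscriber, which is where the scaling benefit comes from.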
helps to scale
vice-versa
constructs
cite delta-encoding
an addition
of the event.
polling frequency
There is software [cite]
fixed.
some cases, an event described above may invalidate multiple URLs.
If the participating caching proxies are able to interpret such
events, the invalidation message may carry the description of the
event, instead of the list of invalidated URLs. This may be future
work.
This paragraph made no sense to me. I think what you mean is that
WCIP can be used to tell systems like AIDE (my own), URL-minder,
etc. about changes, and I fully agree -- and probably suggested this
in the first place. But the second sentence about invalidation is a
non-sequitur, and I don't understand the next sentence. How about:
If a database event triggers the invalidation of hundreds of objects, instead of listing all those objects and sending the list over to the proxies, the server may just describe the event itself to the proxies, provided that the proxies know how to interpret the event and figure out the affected objects on their own.
Does this clear things up? Maybe the above text should be put in as an example.
There is software providing user-level notification of changes to web
content [cite]. WCIP could potentially be used to permit agents to
subscribe to change notification, not for the purpose of cache
invalidation, but to notify users. Integrating such functionality may
be future work.
added. also, "E.g., a web crawler could subscribe to WCIP channels instead of crawling web sites periodically for object updates."
4.3 Discover Channels
...
Example:
Invalidated-By: wcip://www.cdn.com:777/allpolitics?proto=http
This used to be "cnn.com" and got changed to "cdn.com" yet later
references say cnn, and "allpolitics" seems specific to CNN. Are you
sure about this change?
Nope, don't know how it got changed. Changed it back.
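For illustration, a proxy could pull the pieces out of an Invalidated-By value with a standard URL parser. This is a sketch under assumptions: reading the path as the channel name and the "proto" query parameter as the content protocol is a guess from the example, not something the draft text quoted here spells out, and the www.cnn.com host is assumed from the "changed it back" note.

```python
from urllib.parse import urlsplit, parse_qs


def parse_invalidated_by(value):
    """Split a wcip:// channel URL into host, port, channel, and proto."""
    parts = urlsplit(value.strip())
    if parts.scheme != "wcip":
        raise ValueError("not a wcip channel URL: %r" % value)
    query = parse_qs(parts.query)
    return {
        "host": parts.hostname,
        "port": parts.port,
        # Assumption: the URL path names the invalidation channel.
        "channel": parts.path.lstrip("/"),
        # Assumption: "proto" names the protocol of the covered content.
        "proto": query.get("proto", [None])[0],
    }


info = parse_invalidated_by("wcip://www.cnn.com:777/allpolitics?proto=http")
```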
meantime
the latest
and compares
fixed.
However, if the volume indeed has changed, the invalidation server
MUST send back an ObjectVolume description with a base equal to or
smaller than 7. Here is an example:
I'm not that into IETF lingo, but I thought that a specific example
such as this wouldn't justify MUST rather than simply "must".
Thoughts?
does this help: "However, if the volume indeed has changed, the invalidation server sends back the journal of changes since version 7. The reply MUST have a base version equal to or smaller than the version in the synchronization request."
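A rough sketch of why "equal to or smaller" is enough: as long as the reply's base is not newer than the proxy's version, the proxy can skip the journal entries it already has. The dictionary fields and function names below are assumptions for illustration; the real message format is the ObjectVolume XML.

```python
def build_reply(journal, client_version):
    """Server side. journal: list of (version, url) sorted by version."""
    latest = journal[-1][0] if journal else 0
    if client_version >= latest:
        return None  # volume unchanged since the client's version
    changes = [(v, url) for v, url in journal if v > client_version]
    # Base equals the requested version here; a smaller base is also valid.
    return {"base": client_version, "version": latest, "changes": changes}


def apply_reply(local_version, reply):
    """Proxy side. Returns (new_version, urls_to_invalidate)."""
    if reply["base"] > local_version:
        # Changes between local_version and base would be missing.
        raise ValueError("reply base is newer than the local view")
    # Skip entries already reflected in the local view.
    urls = [url for v, url in reply["changes"] if v > local_version]
    return reply["version"], urls
```

A proxy at version 7 handles a reply with base 7 and a reply with base 5 identically; only a base above 7 would leave a gap in its view.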
too colloquial -- there are
skew
a URI
a filename
Why not just say SSL; isn't HTTPS redundant?
fixed.
9 Mogul, J.C.; Douglis, F.; Feldmann, A.; Krishnamurthy, B.,
"Potential benefits of delta encoding and data compression for
HTTP", ACM SIGCOMM 97 Conference.
10 Mogul, J.C.; Douglis, F.; Feldmann, A.; Krishnamurthy, B.,
"Potential benefits of delta encoding and data compression for
HTTP", ACM SIGCOMM 97 Conference.
Notice anything odd here?
nice catch. this is the result of last-minute changes.
Full Copyright Statement
Truncated?
Whoops, fixed.
Thanks!
Dan