Re: Fwd: Minutes / Notes



[ post by non-subscriber.  with the massive amount of spam, it is easy to miss
  and therefore delete posts by non-subscribers.  if you wish to regularly
  post from an address that is not subscribed to this mailing list, send a
  message to <listname>-owner@ops.ietf.org and ask to have the alternate
  address added to the list of addresses from which submissions are
  automatically accepted. ]

On Sun, 20 Jul 2003 18:04:41 -0400
"J. Noel Chiappa" <jnc@ginger.lcs.mit.edu> wrote:

>     > From: Keith Moore <moore@cs.utk.edu>
> 
>     > It is essential that TCP connections be able to survive link failures.
>     > If you don't have this, you basically have to re-implement much of TCP
>     > at a higher layer. (Typically, explicit acks for application data, the
>     > ability to detect a lack of acknowledgment and to retransmit
>     > unacknowledged application data, and to gracefully handle duplicate
>     > transmissions of application data.)
> 
> Some of this you have to do anyway (e.g. "explicit acks for application
> data,[and] the ability to detect a lack of acknowledgment") because the
> application at the far end may consume your data (so you get an ACK from the
> TCP layer), and then hang. TCP's happy, and if you're depending on TCP to
> give you an error you'll be waiting forever.

Mumble.

For some apps, you're exactly right, and for those it's the granularity of the
recovery that we're concerned about.  SMTP can tolerate broken connections,
but it generally does so by resending the entire message.  Similarly for FTP,
though some versions of FTP have a partial transfer restart capability.
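
To make the granularity point concrete, here is a rough sketch (illustrative
only, not taken from any SMTP or FTP implementation) comparing how many bytes
cross the wire when every broken connection restarts from byte 0 versus
resuming from the last offset received.  The function names and the
"failure_points" list are hypothetical, and the sketch assumes the receiver
keeps everything it got before each break.

```python
def bytes_sent_full_resend(size, failure_points):
    """Total bytes sent when every broken connection restarts from byte 0.

    failure_points lists how far each failed attempt got before breaking.
    """
    total = 0
    for failed_at in failure_points:
        total += failed_at          # everything sent on this attempt is wasted
    return total + size             # the final, successful attempt sends it all


def bytes_sent_with_restart(size, failure_points):
    """Total bytes sent when each new connection resumes at the last offset.

    failure_points lists the absolute offsets reached before each break,
    in increasing order.
    """
    offset = 0
    total = 0
    for failed_at in failure_points:
        total += failed_at - offset  # only the new bytes were sent
        offset = failed_at           # assumed: receiver kept what it received
    return total + (size - offset)   # the final attempt sends the remainder


size = 1000
breaks = [400, 700]                  # hypothetical offsets where connections broke
print(bytes_sent_full_resend(size, breaks))
print(bytes_sent_with_restart(size, breaks))
```

With restart, the total never exceeds the size of the transfer; with full
resend, each failure adds the cost of everything sent before it.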

Other apps depend on "hanging" being a sufficiently rare condition that the
app doesn't need to recover (or that "manual recovery" is sufficient).
However, we should be careful about assuming that address changes are
sufficiently rare that the resulting failures of TCP connections are also
inherently tolerable by these apps.

(For instance, we don't expect telnet or ssh to recover from broken TCP
connections, even though failure of such a connection can be costly.)

Really the question should not be whether TCP connections should be able to
survive link failures - because clearly there are some kinds of link failures
that cannot possibly be survived.  A better question is:  How long should an
application be able to expect a TCP connection to last without needing to 
implement its own ack/retransmit/duplicate-suppression logic?  My view is that
a TCP connection should not become a significant additional source of failure.
So it should be able to last as long as its endpoints are likely to stay "up", which
is on the order of months for UNIX boxes (and minutes for Windows :).  
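
For a sense of what "its own ack/retransmit/duplicate-suppression logic"
means in practice, here is a minimal sketch.  The class and method names are
hypothetical, timers are omitted, and a broken connection is simulated by
simply not delivering one ack; a real application would also need timeouts
to detect a missing acknowledgment.

```python
class Receiver:
    def __init__(self):
        self.delivered = []   # payloads handed to the application
        self.seen = set()     # sequence numbers already processed

    def on_message(self, seq, payload):
        # Duplicate suppression: process each sequence number only once,
        # but ack every copy so the sender can stop retransmitting.
        if seq not in self.seen:
            self.seen.add(seq)
            self.delivered.append(payload)
        return seq            # the application-level ack


class Sender:
    def __init__(self):
        self.next_seq = 0
        self.unacked = {}     # seq -> payload awaiting an application-level ack

    def queue(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = payload
        return seq

    def on_ack(self, seq):
        self.unacked.pop(seq, None)

    def pending(self):
        # Everything still here must be retransmitted on a new connection.
        return sorted(self.unacked.items())


sender, receiver = Sender(), Receiver()
for p in ["a", "b", "c"]:
    sender.queue(p)

# First connection: "a" and "b" arrive, but only the ack for "a" makes it
# back before the connection breaks.
for seq, payload in sender.pending()[:2]:
    ack = receiver.on_message(seq, payload)
    if seq == 0:
        sender.on_ack(ack)

# New connection: retransmit whatever is unacked.  "b" arrives a second
# time and is suppressed; "c" arrives for the first time.
for seq, payload in sender.pending():
    sender.on_ack(receiver.on_message(seq, payload))

print(receiver.delivered)
```

Even this toy version needs sequence numbers, a retransmit queue, and
receiver-side state, which is the "re-implementing TCP at a higher layer"
cost being discussed.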

Keith