
RE: survivability, rewriting



There is a middle ground between letting the application deal with the
preservation of established communications (handling locator discovery and
security itself) and letting all of this be performed by the multi-homing layer.
Some parts, such as locator discovery and security, seem to be pretty common
to all apps. Other parts, such as failure detection, seem to be very different.
So the common parts should be provided by the common multi-homing layer, and
the differing parts should be ULP-specific.
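
To make the split concrete, here is a minimal sketch of what I have in mind.
All names (MultihomingLayer, FailureDetector, verify, select) are hypothetical
and not taken from any multi6 proposal; locator discovery and the security
check are reduced to stand-ins. The common layer owns the locator set and the
verification step, while the ULP only supplies its own failure detector.

```python
from typing import Callable, List, Optional

# ULP-specific part: returns True if the given locator looks dead to the ULP.
FailureDetector = Callable[[str], bool]


class MultihomingLayer:
    """Common part (hypothetical): keeps the locator set and picks a usable one."""

    def __init__(self, locators: List[str], is_failed: FailureDetector):
        if not locators:
            raise ValueError("at least one locator is required")
        self.locators = list(locators)  # stand-in for locator discovery
        self.is_failed = is_failed      # supplied by the upper-layer protocol
        self.current = self.locators[0]

    def verify(self, locator: str) -> bool:
        # Stand-in for the security check; here it only tests set membership.
        return locator in self.locators

    def select(self) -> Optional[str]:
        """Keep the current locator unless the ULP reports it as failed."""
        if not self.is_failed(self.current):
            return self.current
        for candidate in self.locators:
            if self.verify(candidate) and not self.is_failed(candidate):
                self.current = candidate
                return candidate
        return None  # no usable locator left
```

A TCP-like ULP might pass a detector based on retransmission timeouts, while
a streaming ULP might pass one based on missing media packets; the layer
itself does not need to know which.
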
IMHO, the issue is how to make the interaction of those mechanisms
Regards, marcelo

> -----Original Message-----
> From: owner-multi6@ops.ietf.org [mailto:owner-multi6@ops.ietf.org] On
> Behalf Of Brian E Carpenter
> Sent: Friday, 31 October 2003 13:52
> To: Multi6 Mailing List
> Subject: Re: survivability, rewriting
>
>
> Pekka Savola wrote:
> >
> > On Fri, 31 Oct 2003, Brian E Carpenter wrote:
> > > > I agree with this, and I'd add that many applications can survive much
> > > > longer glitches than 5 seconds, and even TCP resets, by putting some
> > > > fairly trivial retry logic in the right place. [...]
> >
> > (I'm pretty sure you agree here, but playing the devil's advocate to
> > bring up an important point here..)
> >
> > Is it the business of the applications to put in this retry logic?
> >
> > No.
> >
> > If *every* application has to do this, we've failed.  If adding such
> > logic is deemed the best approach, it needs to be put somewhere else.
>
> That is what I would have said a few years ago. But the fact of life is
> that TCP resets do occur, and if you are building a business class
> application you will *not* allow that to cause an application-level
> failure. So all the business class applications that I know already
> have retry logic, and it was put there by programmers who wouldn't know
> a multihoming event if it hit them in the face.
>
> Actually it's just an extension of the fate sharing argument. If the host
> hasn't actually crashed and burned, it should try again at successively
> higher levels of the stack until things work again.
>
> That's why I've always rated transport survivability as only "nice to
> have" in multihoming.
>
>    Brian
>
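
For illustration only, the kind of "fairly trivial retry logic" discussed
above might look roughly like the sketch below. It assumes the request is
safe to repeat (idempotent) and that the peer replies and then closes the
connection; the function name, parameters and backoff values are made up for
this example. The connect and sleep arguments exist only so the logic can be
exercised without a network.

```python
import socket
import time


def request_with_retry(address, payload, attempts=4, base_delay=0.5,
                       timeout=5.0, connect=socket.create_connection,
                       sleep=time.sleep):
    """Send payload and return the reply, reconnecting on transport failure."""
    last_error = None
    for attempt in range(attempts):
        try:
            # Open a fresh connection each time: a reset one cannot be reused.
            with connect(address, timeout=timeout) as conn:
                conn.sendall(payload)
                conn.shutdown(socket.SHUT_WR)  # signal end of request
                chunks = []
                while True:
                    chunk = conn.recv(4096)
                    if not chunk:  # peer closed: reply is complete
                        break
                    chunks.append(chunk)
                return b"".join(chunks)
        except OSError as exc:  # includes resets, refusals and timeouts
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # exponential backoff
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```

Note that the application never learns why the connection broke. A TCP reset
caused by a multihoming event and one caused by anything else are handled by
the same few lines, which is the point being made in the quoted message.
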