
Re: multihomed host



On 10 Oct 2002, William Waites wrote:

>     Iljitsch> There is one difference that may be important sometimes,
>     Iljitsch> but inconsequential  at other  times: a host  can detect
>     Iljitsch> that an  interface is down. It  can't necessarily detect
>     Iljitsch> the link to an ISP is down.

> But the "physical  interface is down" failure mode  is relatively rare
> -- as soon  as an ethernet switch,  or ethernet to atm  or mpls bridge
> is thrown into  the mix, the "physical interface is  up but link layer
> is down"

That's the whole problem with switched networks: you can't detect failures
at layer 2, so you have to detect them at layer 3. (The problem could have
been at layer 3 in the first place, so you have to detect layer 3 problems
anyway.) Still, I wouldn't discount the usefulness of being able to detect
failures at layer 1: this is much, much faster than anything else.

> Is the correct place to implement detection of these types of failures
> in the link layer? But then  the providers have to play nicely, and it
> says  nothing  about problems  more  than one  hop  out  in the  ISP's
> network. Or beyond the ISP's AS even...

> I  guess I  am  not sure  where  in the  stack  the failure  detection
> belongs, or which types of failures are to be addressed here...

Failure detection is one of the major challenges in multihoming solutions
that aren't router-based. My thinking is that TCP should signal the IP
layer that there _may_ be a problem, and the IP layer should then either
act as if there is one or make sure first.
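
To make the "act as if there is, or make sure first" choice concrete, here
is a minimal sketch in Python. Everything in it is hypothetical: the
PathSelector class, the hint_possible_failure() call, and the probe
callback stand in for whatever the transport and IP layers of a real stack
would expose, and the path names are just labels.

```python
from typing import Callable, List


class PathSelector:
    """Hypothetical IP-layer helper choosing among several upstream paths."""

    def __init__(self, paths: List[str], probe: Callable[[str], bool],
                 verify_first: bool = True) -> None:
        if not paths:
            raise ValueError("need at least one path")
        self.paths = list(paths)
        self.probe = probe                # assumed: True if the path works
        self.verify_first = verify_first  # "make sure first" vs. "act as if"
        self.current = self.paths[0]

    def hint_possible_failure(self) -> str:
        """Called by the transport layer when it suspects trouble, for
        example after repeated retransmissions. Returns the path to use."""
        if self.verify_first and self.probe(self.current):
            return self.current           # false alarm: keep the path
        for candidate in self.paths:
            if candidate == self.current:
                continue
            # When verifying, only move to a path that answers the probe.
            if not self.verify_first or self.probe(candidate):
                self.current = candidate
                break
        return self.current


if __name__ == "__main__":
    alive = {"isp-a": False, "isp-b": True}
    selector = PathSelector(["isp-a", "isp-b"], probe=lambda p: alive[p])
    print(selector.hint_possible_failure())
```

With verify_first=False the selector switches on the hint alone, which is
faster but reacts to false alarms; with verify_first=True it pays for a
probe before giving up on a path that may still be fine.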

In any event, it has to be end-to-end. Unfortunately, this doesn't mean
you can leave it out at the lower layers: failure detection at layer 4 is
very slow (TCP timeout...), and you don't want to have to go through it
time and time again when going back to the same destination.
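
One way to avoid paying for the slow layer 4 detection over and over is to
remember the verdict per destination for a while. The sketch below is only
an illustration under assumed names: FailureCache, its methods, and the
60-second lifetime are all made up for the example and not taken from any
existing stack.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class FailureCache:
    """Hypothetical cache of (destination, path) pairs that recently failed."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds   # assumed lifetime of a failure verdict
        self.clock = clock       # injectable so the cache can be tested
        self._failed: Dict[Tuple[str, str], float] = {}

    def mark_failed(self, destination: str, path: str) -> None:
        """Record that reaching `destination` over `path` timed out."""
        self._failed[(destination, path)] = self.clock()

    def is_suspect(self, destination: str, path: str) -> bool:
        """True while a recorded failure is younger than the TTL."""
        stamp = self._failed.get((destination, path))
        if stamp is None:
            return False
        if self.clock() - stamp > self.ttl:
            del self._failed[(destination, path)]   # verdict expired
            return False
        return True

    def choose_path(self, destination: str,
                    paths: List[str]) -> Optional[str]:
        """Return the first path with no recent failure, or None."""
        for path in paths:
            if not self.is_suspect(destination, path):
                return path
        return None


if __name__ == "__main__":
    cache = FailureCache(ttl_seconds=60.0)
    cache.mark_failed("192.0.2.10", "isp-a")
    print(cache.choose_path("192.0.2.10", ["isp-a", "isp-b"]))
```

The expiry matters: without it a path that failed once would never be
tried again, even after the outage is over.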