
RE: Preserving established communications (was RE: about draft-nordmark-multi6-noid-00)



> In terms of the time to failover I think all we know is that the shorter
> it can be the better, but we don't know how much we are willing to "pay"
> to reach any particular time limit (for different types of failures which
> occur with different probabilities).

Yes, I agree; this is the question that we have to answer.

I guess that the approach could be to see how much could be achieved with
the different solutions and then see if it is worth it.

> I don't think it is that black and white. The routing system
> could propagate bad news faster than good news

I am not sure if this is true...
I am no routing expert, but I think that withdrawing a route is inherently
slower than advertising a new route, at least in path vector algorithms.
The problem is that, to detect that there is no route to a certain
destination, all the alternative routes with increasing AS path length are
tried, which is inherently slow. This would be similar to the
count-to-infinity problem, but with different AS paths. This implies that
the time to converge is proportional to the number of ASes in the network.
At least that is what I understood from the Labovitz paper, but I may be
misunderstanding some issues here.

However, I guess that when the system is looking for a new route (i.e.
reconverging), multiple routes with increasing AS path length are advertised,
which can be used as a hint to infer that something may be wrong (which I
guess was your proposal about the churn rate detection, right?)
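
To make that hint a bit more concrete, here is a minimal sketch in Python.
It assumes the observer can see the AS path length of each successive
update for one prefix. The function name and the threshold min_steps are
made up for illustration and are not taken from any draft or BGP
implementation.

```python
def looks_like_path_exploration(path_lengths, min_steps=3):
    """Return True if the last `min_steps` updates each announced a
    strictly longer AS path than the update before it.

    path_lengths: AS path lengths of successive updates for one prefix,
    oldest first. min_steps is an arbitrary, illustrative threshold.
    """
    if len(path_lengths) < min_steps + 1:
        return False  # not enough history to say anything
    recent = path_lengths[-(min_steps + 1):]
    # Compare each update with its predecessor.
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))
```

Such a detector would only give a hint: a run of longer and longer paths
can also show up for other reasons, so it would have to be combined with
other inputs.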

> > This doesn't mean that the routing system is broken, it just means that
> > the transistorizes of the routing system to reach a stable view are
> > filtered and that during that transistorizes the view of the topology by
> > the routing system will not be accurate during that period.
>
> What do you mean by "transistorizes"? Dictionaries say:
> Date: circa 1952
> : to equip (a device) with transistors

Oops... sometimes the spell checker does some funny things. I meant
transitory, but never mind, I guess you already see what I meant.

ULP hints:


> Conceptually it would make sense to view it as the host/ULP observing the
> need to try a different locator, but the routing system (through feedback
> mechanisms like locator rewriting and perhaps others) influence the order
> in which locators are tried.

[...]

>
> Host would try a different destination locator and routing system would
> rewrite the source locator hence they would be complimentary I think.
>

Well, I still don't see this very clearly.
Let's try with an example.

Recycling an old figure, we have:


            +----+
        ----|ISPA|_             +----+
       /    +----+ \_+------+  _|ISPC|_
 +----+              |      |_/ +----+ \__+----+
 | mh1|             _|      |            _| mh2|
 +----+     +----+_/ |      |_          / +----+
       \____|ISPB|   +------+ \_+----+_/
            +----+              |ISPD|
                                +----+

We are using packet rewriting at site border routers. This means that the
site exit router connecting a multihomed site to ISPX will rewrite the
source address of packets, replacing the prefix it contains with PX::. Is
this correct?

Suppose that mh1 has an established TCP connection with mh2. They are
currently using PA:mh1 and PC:mh2 as locators and the communication flows
through ISPA and ISPC.
The link between ISPC and the Internet breaks.
It takes BGP 180 seconds to detect the outage, so routing remains
unchanged during that period, i.e. packets flowing from mh2's site to an
address containing PA are forwarded through ISPC.
There are two cases to consider here: one where the internal routing of
mh2's site routes packets sent to PB:: through ISPC, and one where it routes
them through ISPD. I guess that the problematic case is the first one, so
let's assume that packets flowing from mh2 to PB:: are also carried through
ISPC.

Now, when the outage occurs, one or both of the communicating nodes can
start to retransmit.
Let's consider that mh1 is the one whose TCP times out and starts to do
some retransmissions. Then TCP at mh1 tells the shim layer at mh1 that
something is wrong.
Then the shim layer changes the destination address that is being used in
the communication from PC::mh2 to PD::mh2. In this way the communication is
rerouted to the alternative ISP. Good!

Now, suppose that mh2 also obtains a hint that something is wrong. This
could be because TCP at mh2 times out, or because packets start arriving
with a different source address (this last case cannot be guaranteed,
because it depends on the internal routing in mh1). So the mh2 shim layer
starts using an alternative destination address, switching from PA:mh1 to
PB:mh1.
However, this does not solve the problem, since packets addressed to PB::
are also routed through ISPC.
The problem here is that TCP detects the problem but has no means to
communicate it to mh2's routing system, which still believes that a route
through ISPC is available.
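
The asymmetry between the two hosts can be checked with a toy model of the
figure. This is only a sketch: the tables and function names are made up
for illustration, and the exit choices for mh1 (everything leaves through
ISPA) are an assumption, not something stated above.

```python
# Which ISPs currently have a working link to the Internet (ISPC is down).
ISP_UP = {"ISPA": True, "ISPB": True, "ISPC": False, "ISPD": True}

# The ISP that delegated each prefix; packets addressed to that prefix
# enter the destination site through this ISP.
ISP_OF_PREFIX = {"PA": "ISPA", "PB": "ISPB", "PC": "ISPC", "PD": "ISPD"}

# Destination-based exit selection inside each site. The mh2 table is the
# problematic case from the text; the mh1 table is an assumption.
EXIT_BY_DEST = {
    "mh1": {"PC": "ISPA", "PD": "ISPA"},
    "mh2": {"PA": "ISPC", "PB": "ISPC"},
}

def path_works(site, dest_prefix):
    """A packet gets through if both the exit ISP of the sending site and
    the ISP owning the destination prefix are up."""
    exit_isp = EXIT_BY_DEST[site][dest_prefix]
    entry_isp = ISP_OF_PREFIX[dest_prefix]
    return ISP_UP[exit_isp] and ISP_UP[entry_isp]

def first_working_locator(site, dest_prefixes):
    """Try destination locators in order, as the shim layer would, and
    return the first prefix that gets through, or None."""
    for prefix in dest_prefixes:
        if path_works(site, prefix):
            return prefix
    return None

print(first_working_locator("mh1", ["PC", "PD"]))  # PD
print(first_working_locator("mh2", ["PA", "PB"]))  # None
```

In this model mh1 recovers by switching to PD, while mh2 finds no working
locator, because changing the destination address never changes its exit
ISP.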

I guess that if we decide to place a failure detection mechanism in the
hosts themselves, we have to provide means to let the host force the routing
system to select the path, or at least the exit router.

This could be achieved with source address based routing. The problem here
is that this is not compatible with source address rewriting by exit
routers: either the source address is fixed and the exit ISP is determined
by it, or the exit ISP is fixed and the source address is rewritten to make
it consistent with the selected ISP, but not both.

One option would be that for packets with the rewrite ok bit not set, source
address based routing is applied, and when the rewrite ok bit is set then
destination address based routing is used.
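
A minimal sketch of that option for mh2's site, assuming the intended rule
is: bit not set means the source address is kept and decides the exit ISP,
bit set means the destination decides the exit ISP and the exit router
rewrites the source prefix. The Packet type, the tables and the forward
function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

# Hypothetical tables for mh2's site: which ISP delegated which prefix.
ISP_BY_PREFIX = {"PC": "ISPC", "PD": "ISPD"}
PREFIX_BY_ISP = {isp: prefix for prefix, isp in ISP_BY_PREFIX.items()}

@dataclass
class Packet:
    src_prefix: str
    dst_prefix: str
    rewrite_ok: bool

def forward(pkt, exit_by_dest):
    """Return (exit ISP, source prefix the packet leaves the site with)."""
    if pkt.rewrite_ok:
        # Destination address based routing; the exit router rewrites the
        # source prefix to match the ISP it connects to.
        exit_isp = exit_by_dest[pkt.dst_prefix]
        return exit_isp, PREFIX_BY_ISP[exit_isp]
    # Source address based routing; the host-chosen source is kept and
    # determines the exit ISP.
    exit_isp = ISP_BY_PREFIX[pkt.src_prefix]
    return exit_isp, pkt.src_prefix

# Problematic internal routing from the example: everything exits via ISPC.
exit_by_dest = {"PA": "ISPC", "PB": "ISPC"}
print(forward(Packet("PD", "PB", rewrite_ok=False), exit_by_dest))  # ('ISPD', 'PD')
print(forward(Packet("PD", "PB", rewrite_ok=True), exit_by_dest))   # ('ISPC', 'PC')
```

With the bit cleared, the shim layer at mh2 can steer packets out through
ISPD just by choosing PD as source prefix, which is the control the host
was missing in the example.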

(Another hint can be the reception of ICMP error messages.)

This is a simple example, since no interaction with routing based mechanisms
(churn level detector and so on) is assumed. Maybe we can try a more
complex one in the next round :-)

regards, marcelo