Re: about draft-nordmark-multi6-noid-00
On 26 Oct 2003, at 12:12, marcelo bagnulo wrote:
Suppose that mh1 initiates a communication with mh2, so mh1 selects
PA:mh1 as initial source address, i.e. initial locator and AID1, and
also selects PC:mh2 as destination address and AID2.
The initial packet must be sent with the "rewrite ok" bit not set.
Suppose now that the internal routing of the site mh1 is such that
packets sent to PC:: are forwarded through ISPB. This means that the
packet generated by mh1 described above will be discarded by ingress
filtering in ISPB because its source address is incompatible and
cannot be rewritten.
There are solutions possible that allow rewriting all packets. However,
IPv6 as we know it today does not. So we need to solve this problem
anyway. I think source address dependent routing makes the most sense
here.
the internal routing system
knows that it has to forward the packet through ISPA, since there is no
route available through ISPB. However, the actual packet has PB:mh1 as
source address, making the packet incompatible with ingress filtering.
The
packet then will be dropped.
When an ISP fails, it is of course extremely important to have traffic
rerouted over the other one. Today, we do this using BGP. We can still
do that, but it has the disadvantage that packets for every destination
only flow over one ISP. If that ISP is incapable of delivering the
packets, there is trouble (although this should be rare if we do BGP
rather than just a default). Less important, but still annoying: we
have to depend on BGP's limited best path selection capabilities.
If we do source address dependent routing on the other hand, a host can
utilize both ISP links for the same destination at the same time. Since
the destination address in the packet is then no longer relevant, we
don't need to run BGP either.
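
To make the idea concrete, here is a minimal sketch of source address
dependent routing at a site exit. It is only an illustration: the two
prefixes are documentation addresses standing in for PA and PB, and
the names EXITS and select_exit are made up for this example.

```python
import ipaddress

# Hypothetical prefixes: documentation space standing in for PA and PB.
EXITS = {
    ipaddress.ip_network("2001:db8:a::/48"): "ISPA",
    ipaddress.ip_network("2001:db8:b::/48"): "ISPB",
}

def select_exit(source: str) -> str:
    """Pick the exit ISP from the packet's source address alone.

    The destination address is deliberately ignored, so the packet
    always leaves through the ISP whose ingress filter accepts it.
    """
    addr = ipaddress.ip_address(source)
    for prefix, isp in EXITS.items():
        if addr in prefix:
            return isp
    raise ValueError(f"no exit for source address {source}")
```

With this, a packet sourced from 2001:db8:a::1 always goes out via
ISPA, whatever its destination is.
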
Of course now the multihoming system in the host must detect
unreachability and initiate a rehoming event when there is a problem in
the path over one ISP.
The internal routing system reconverges and packets are no
longer routed through ISPA but they are routed through ISPB. Since the
"rewrite ok" would be set at this moment of the communication, then
packets
that are originated by mh1 with PA:mh1 as source address are rewritten
by
the exit router so that PB:mh1 is included as source address. Now mh2
learns that it has to use PB:mh1 as destination address because this
is the address contained in the packets that it is receiving. Is that
ok?
The source address used by the other side is a good hint when there is
no real preference as to which destination address for the other side
is preferred. However, each side must be able to jump destination
addresses as it is possible that failures occur somewhere along the way
where neither side knows that something is wrong so there is no change
of source address.
Also, there is the case of asymmetric reachability, where there are two
links that only work in one direction so packets from A to B must use
different addresses than packets from B to A. This can happen with
radio links. A weaker case, where there is still symmetric
connectivity but the asymmetric connectivity has higher bandwidth,
is common with one-way satellite links.
Now my question is how long does this mechanism take to react? i.e.
the response time. In order to preserve established communications,
the interruption shouldn't be too long.
We probably want to monitor transport layer events to see whether
rehoming is needed. This could be pretty fast in most cases.
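
A rough sketch of what that could look like in the host follows. It
assumes the transport layer reports acknowledgements and
retransmission timeouts to the multihoming layer; the class name
LocatorSelector, its methods and the threshold of three timeouts are
invented for this example and do not come from the draft.

```python
class LocatorSelector:
    """Switch locator pairs after several consecutive timeouts."""

    def __init__(self, locator_pairs, max_timeouts=3):
        if not locator_pairs:
            raise ValueError("need at least one locator pair")
        self.pairs = list(locator_pairs)   # (source, destination) tuples
        self.max_timeouts = max_timeouts   # hypothetical threshold
        self.current = 0
        self.timeouts = 0

    def active(self):
        """Return the locator pair currently in use."""
        return self.pairs[self.current]

    def on_ack(self):
        """Progress was made on the current path: reset the counter."""
        self.timeouts = 0

    def on_timeout(self):
        """Record a retransmission timeout; return True if we rehomed."""
        self.timeouts += 1
        if self.timeouts >= self.max_timeouts:
            self.current = (self.current + 1) % len(self.pairs)
            self.timeouts = 0
            return True
        return False
```

Each side runs this independently, so the two directions can end up
on different locator pairs.
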
The problem here is that intradomain routing is pretty slow detecting
outages.
Yes, the worst case is pretty pathetic. Fortunately, the worst case
isn't the most common one.
Suppose that ISPA is running BGP
with its upstream provider, then if i am correct, the value
recommended in
the BGP spec to detect an outage between peers is 90 seconds.
Correct, but Cisco in their infinite wisdom doubled this to 180
seconds. But you can set this much lower. I used to recommend 15
seconds but some BGP implementations are too braindead to work well
with that. Not many problems with 30 seconds, though.
Note that in many cases, the sessions will go down much faster because
BGP monitors the interface the session runs over.
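
For a feel of the numbers, here is a back-of-the-envelope sketch of
how long hold timer expiry takes for the values mentioned above. It
assumes keepalives are sent every one third of the hold time, which
is the ratio the BGP spec suggests, and it ignores timer jitter; the
function names are made up for this example.

```python
def keepalive_interval(hold_time: int) -> int:
    """Keepalive interval in seconds, assuming one third of the hold time."""
    return hold_time // 3

def detection_window(hold_time: int) -> tuple:
    """(best, worst) seconds until the hold timer expires after a failure.

    Worst case: the failure happens right after a keepalive arrived.
    Best case: it happens just before the next keepalive was due.
    """
    return (hold_time - keepalive_interval(hold_time), hold_time)

for hold in (30, 90, 180):
    best, worst = detection_window(hold)
    print(f"hold {hold}s: session declared down after {best} to {worst}s")
```
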
This means that the interruption in the communication will be longer
than 90 seconds. IMHO this is not good enough to preserve established
communications. Am i missing something?
Isn't the TCP timeout 240 seconds? Sessions should survive, though I'm
not sure the user is that patient.
My point is that the usage of the routing system to determine which
locator
to use and to preserve established communication is not a good option
because:
I agree with your points here.
It is then my conclusion that path outage detection and locator
selection have to be performed by the end hosts themselves and not by
the routing system.
Indeed.
Iljitsch