Re: about reachability detection draft
OK, the long-awaited reply:
On 16-jul-2005, at 17:49, marcelo bagnulo braun wrote:
In section 2 it is stated that:
- In the second model, a host can only detect problems in the receiving
direction so it must depend on the correspondent to detect problems
in the other direction
[this is what I called forced bidirectional or FBD in the slides]
I think it is important to consider one step further then. I mean,
what can a host possibly do when he detects an outage in the
incoming path? If a host detects an outage in the outgoing path, he
can change the address pair that he is using to send packets and
see if it solves the problem, but what can he do if he detects an
outage in the incoming path? I guess that the only option would be
to notify the correspondent node about the failure so that the
correspondent uses an alternative path (Note that we are assuming
that the case of unidirectional connectivity is possible)
So i guess that this mode needs an additional message informing the
failure, which needs to be taken into account when comparing the
options.
Well, the idea is that when there is actually a failure in the active
path, it's likely that there will be failures in one or more of the
backup paths too. So in my opinion, it doesn't make sense to switch
to a new path blindly. So we must first test any path we may want to
switch to.
Now one of two things can happen:
1. the candidate path doesn't work either -> timeout, try another
2. the candidate path works -> we can communicate with the correspondent
If there is any connectivity left, eventually we'll end up in
situation 2, and the fact that we're sending the correspondent these
test packets (which are different from the regular ones that we send
when we think there is still connectivity, see below) tells the
correspondent that something is wrong, so the correspondent starts
sending test packets in our direction too.
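A minimal sketch of the "test before switching" loop described above, in Python. The names `explore`, `probe` and the one-second timeout are hypothetical and only illustrate the two cases; this is not something the draft specifies.

```python
from typing import Callable, Iterable, Optional, Tuple

# (local address, remote address); the labels are purely illustrative.
AddressPair = Tuple[str, str]


def explore(
    candidates: Iterable[AddressPair],
    probe: Callable[[AddressPair, float], bool],
    timeout: float = 1.0,  # assumed value, not taken from the draft
) -> Optional[AddressPair]:
    """Return the first candidate pair whose probe is answered, else None.

    `probe` is a caller-supplied function (hypothetical) that sends one
    test packet on the pair and returns True if a reply arrives within
    `timeout` seconds.
    """
    for pair in candidates:
        if probe(pair, timeout):
            return pair  # case 2: the candidate path works
        # case 1: timeout, try another candidate
    return None  # no connectivity left on any candidate
```

The caller only switches to a pair that `explore` returns, so a path is never selected blindly.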
The reason I think we need two types of test packets is that when
there is a large number of address pairs with unknown reachability
and/or RTT, we need to do extra work. Suppose A sends (for instance)
a probe A2 -> B3, which makes it to B, but there is only connectivity
from B to A over B1 -> A4. It's unlikely that B1 -> A4 is the first
pair B tries, so B needs to include reports about what it got from A
in all of its packets. And we need a reasonable level of
authentication too, because we haven't previously established that
these addresses indeed belong to the correspondent.
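To illustrate the A2 -> B3 / B1 -> A4 example, here is a rough Python sketch in which every probe carries a report of all probes the sender has received so far. The `Probe` and `Host` classes are hypothetical, and authentication is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

Pair = Tuple[str, str]  # (source address, destination address)


@dataclass(frozen=True)
class Probe:
    pair: Pair               # pair this probe is sent on
    seen: Tuple[Pair, ...]   # pairs on which the sender has received probes


@dataclass
class Host:
    received: List[Pair] = field(default_factory=list)
    confirmed_outgoing: Set[Pair] = field(default_factory=set)

    def make_probe(self, pair: Pair) -> Probe:
        # Every probe reports everything received so far, not just the
        # most recent packet, because earlier replies may have been lost.
        return Probe(pair, tuple(self.received))

    def handle(self, probe: Probe) -> None:
        self.received.append(probe.pair)
        # The peer saw our probes on these pairs, so they work outbound.
        self.confirmed_outgoing.update(probe.seen)


a, b = Host(), Host()
b.handle(a.make_probe(("A2", "B3")))   # A2 -> B3 reaches B
b.make_probe(("B2", "A1"))             # B's first try is lost on the way
a.handle(b.make_probe(("B1", "A4")))   # B1 -> A4 gets through
print(a.confirmed_outgoing)
```

Because the report travels in every probe B sends, A learns that A2 -> B3 works as soon as any one of B's probes arrives.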
On the other hand, when A1 <-> B1 is working happily, there is no
need to use such a complex protocol: we are only testing one pair in
each direction, and the correspondent has been authenticated earlier.
If we wanted we could even use pings to determine whether this still
works. (Well, sort of...)
I guess it would also make sense to state how those mechanisms
behave when there are no outgoing packets? i mean, i guess that in
any of the modes signaling is suppressed, right? however, i guess
that the hosts assume that the address pair is reachable, right?
With correspondent unreachability detection (first mechanism in the
draft) the transport hints would accompany the packets, so packets =
no, hints = no -> no action. Only packets = yes and hints = no
requires reachability probes.
In the forced bidirectional communication we only send probes when we
sent data recently but didn't receive data recently, so no probes in
this case either.
Note that some transports and some applications explicitly keep the
session alive with periodic traffic.
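The two probe triggers above can be sketched in Python roughly as follows. The function names and the 10-second value for "recently" are assumptions made for illustration; the draft does not fix them here.

```python
from typing import Optional


def cud_should_probe(packets_sent: bool, hints_received: bool) -> bool:
    """Correspondent unreachability detection: probe only when we sent
    packets but the transport gave no positive hints in return."""
    return packets_sent and not hints_received


def fbd_should_probe(
    now: float,
    last_sent: Optional[float],
    last_received: Optional[float],
    window: float = 10.0,  # assumed meaning of "recently", in seconds
) -> bool:
    """Forced bidirectional communication: probe only when we sent data
    recently but did not receive data recently."""
    sent_recently = last_sent is not None and now - last_sent <= window
    received_recently = (
        last_received is not None and now - last_received <= window
    )
    return sent_recently and not received_recently
```

In both cases an idle session (nothing sent) yields no probes, which matches the "no action" case above.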
Additionally, i guess that there is other information that needs
to be taken into account when detecting reachability, such as ICMP
error messages, address deprecation, lower-layer information (i
know that you state that you assume that addresses are available,
but what happens if an address of the currently used address pair
is deprecated? or if the associated interface goes down?)
It makes sense to send a probe when there is an ICMP error (have to
rate limit this, though). Deprecation is irrelevant, as we can
continue to use deprecated addresses. Lower layer events are also a
reason to do a reachability test, I think. A more interesting case is
when an address is removed from the system. I don't remember which
session it was, but in Paris someone was talking about how systems
keep using addresses they no longer have because upper layers still
use those addresses. IMO this is a feature: if I unplug my Ethernet
from my PowerBook and turn on my Wi-Fi, I get the same address on a
different interface and my sessions are still alive. Under Windows,
things like this kill your sessions immediately.
Another issue that may be of interest is what happens with an
address (pair) after it becomes unreachable? i mean, is it used in
subsequent address pair explorations? is it put in quarantine?
The way I see it there is an ordered list of address pairs. The more
probes fail to make it to the other side the lower the address pair
ends up on the list, I imagine. :-)
We need to think about the situation where a fast primary link fails
and we switch to a slow backup, though. Presumably, we'll want to
switch back to the fast primary address pair when possible.
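One possible way to picture the ordered list, as a Python sketch: pairs are ranked by failed-probe count, with the original preference order breaking ties. Resetting the count after a successful probe, so that a recovered fast primary moves back to the top, is my own assumption and not something settled above. `PairList` is a hypothetical name.

```python
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[str, str]


class PairList:
    """Address pairs ordered by how many probes have failed on them."""

    def __init__(self, pairs: Sequence[Pair]) -> None:
        # Initial order expresses preference (e.g. fast primary first).
        self._preference: Dict[Pair, int] = {p: i for i, p in enumerate(pairs)}
        self._failures: Dict[Pair, int] = {p: 0 for p in pairs}

    def record_failure(self, pair: Pair) -> None:
        self._failures[pair] += 1  # pair sinks lower on the list

    def record_success(self, pair: Pair) -> None:
        self._failures[pair] = 0   # assumed policy: forgive on success

    def ordered(self) -> List[Pair]:
        return sorted(
            self._preference,
            key=lambda p: (self._failures[p], self._preference[p]),
        )
```

With this policy, a later successful probe on the primary is enough to make it the first choice again.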
In its essence, address pair exploration is very simple: just send
probes using every possible address pair, wait for something to come
back and possibly consider the round trip time.
I guess we agree that you are oversimplifying the issue here :-)
Well, you know me. (-:
I mean, the complexity is not only due to the amount of probe
packets that are needed, but also because of unidirectional
connectivity. I think it is very important to express such
difficulty. I mean, even with two address pairs, the problem can be
quite complex, because replies need to carry information, not only
about the particular incoming packet, but also about other
previously received probes, in order to allow the transmitter to
determine if previous probes have successfully arrived. I think it
is important to describe this problem and possible approaches to
deal with it.
You're absolutely right. I think I said something about these
subjects but it has to be fleshed out in detail.
Finally, in the security considerations section, i think that there
is a closely related problem that perhaps needs to be presented here,
namely flooding protection. I mean, the path exploration exchange
can be used for identifying working address pairs but also for
preventing the shim from being used for flooding attacks. In order
to enable the path exploration exchange to be used for this, you
need to include some additional information in the exchange, some
information that identifies the shim context, so that the receiver
of a packet of the address pair exploration process can determine
if this is one of its own established sessions that is being
genuinely rehomed or if this is a flooding attack.
Yes.
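As a very rough illustration of the context check, assuming the exchange carries a random per-context tag (the `ContextTable` class and the 8-byte tag are hypothetical, not a proposed packet format):

```python
import secrets
from typing import Set


class ContextTable:
    """Tracks tags of established shim contexts on the receiving host."""

    TAG_BYTES = 8  # assumed size, chosen only for the example

    def __init__(self) -> None:
        self._tags: Set[bytes] = set()

    def establish(self) -> bytes:
        # A random tag agreed on when the context is set up.
        tag = secrets.token_bytes(self.TAG_BYTES)
        self._tags.add(tag)
        return tag

    def accept_probe(self, tag: bytes) -> bool:
        # Only answer exploration packets that name one of our own
        # established contexts; anything else is dropped silently.
        return tag in self._tags
```

A random tag like this only helps against attackers who cannot observe the session's packets, so it complements rather than replaces the authentication mentioned earlier.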
Iljitsch