
Re: host-centric draft



On 20-feb-04, at 18:37, marcelo bagnulo wrote:

for instance, the options that you mentioned are missing, could
you expand on this?

The two layer 3 network that you mention, would this be like a multihomed
host?

The idea is that you build two separate layer 3 networks so that the host gets to make a choice in which network they inject the packet and then there is no complexity moving packets from the part that talks to ISP A to the part that talks to ISP B. The simple way to do this would be to have two network interfaces on each host, where one NIC has an address from ISP A and talks to routers that eventually connect to ISP A, while the other NIC has an address from ISP B and only talks to routers connected to ISP B. A slightly more complex way to do this would be to have the host use different virtual interfaces over a single NIC (using VLANs or some such).


The problem that rerouting could invalidate an earlier source address
choice isn't mentioned.

Could you expand on this too? perhaps with an example?

Host X wants to talk to remote host Y. X sends packets with source addresses compatible with ISP A to a router that delivers the packet to ISP A and life is good, despite the fact that the router doesn't do source address based routing. Now the link to ISP A fails, so the router reroutes the packets to Y over ISP B. But now the source addresses used by this session are incompatible with the way the packet is routed. Life is not so good, especially if this was a legacy session that doesn't support changing addresses during the session.


Now in this example the problem isn't too bad because the session would also have failed if the site was single homed to ISP A. But if the site uses BGP to determine the best path, it may also reroute traffic over a different ISP even if the first ISP is still available. So then we're breaking sessions that would have worked without multihoming. That can't be what we want. So any kind of dynamic best path selection can't be used unless ALL sessions can change addresses.
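
To make the X/Y example concrete, here is a minimal sketch of the compatibility check. The prefixes, the ISP labels and the function name are made up for illustration (the prefixes come from the IPv6 documentation range); it only shows that a source address chosen for ISP A stops matching once the router moves the traffic to ISP B.

```python
import ipaddress

# Hypothetical provider-assigned prefixes, taken from the documentation range.
ISP_PREFIXES = {
    "ISP-A": ipaddress.ip_network("2001:db8:a::/48"),
    "ISP-B": ipaddress.ip_network("2001:db8:b::/48"),
}


def source_matches_exit(source, exit_isp):
    """Return True if the source address lies in the prefix of the exit ISP."""
    return ipaddress.ip_address(source) in ISP_PREFIXES[exit_isp]


# Host X picked a source address from ISP A's prefix when the session started.
session_source = "2001:db8:a::10"

# While the router sends the traffic out through ISP A, the address fits.
before_reroute = source_matches_exit(session_source, "ISP-A")  # True

# After the router reroutes over ISP B, the same address no longer fits,
# and a legacy session has no way to switch to an ISP B address.
after_reroute = source_matches_exit(session_source, "ISP-B")  # False
```

The same mismatch appears when the reroute is caused by BGP best path selection rather than by a failure, which is the case that breaks sessions that would otherwise have kept working.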

I'm unsure why we would want to inject BGP information into hosts.

I don't think that injecting BGP routing table has been considered in the
draft (but i may be forgetting something)

What's all the talk about BGP in the draft about then?


Reachability yes/no isn't very useful as the actual reachability status
of the other side is in almost all cases hidden by aggregation. So
basically this only indicates whether the ISP in question is available.
Determining preference based on BGP information is also a relatively
futile endeavor

Well,
as i see it, there are two systems that can be aware of reachability: the
routing system or the end host itself

The end host can discover reachability by itself, simply trying to send
packets to the other end. This is a more reliable way to discover
reachability, since the host is actually reaching the target.

Right.


The other option is to use the routing system. As you mention, the
information provided by the routing system is aggregated, so it hides some
information. However, the routing system has the information in advance, while the host has to discover it when it needs it.

Our current interdomain routing system doesn't deliver actual reachability information. It is _extremely_ rare for an entire ISP / address block from an ISP to be down. And it's well-established that BGP is very slow to propagate the fact that a prefix is no longer reachable. So I would strongly advise against taking "reachable yes/no" information from BGP. And as explained above, doing path selection based on BGP properties is also problematic.


So the information obtained by the host is more accurate, but the routing
system provides it faster, i guess.

So the question is: can we detect the actual reachability using probes fast enough that we don't feel we also need to look in the routing tables? If end-to-end is 30 seconds while routing table is 30 microseconds, then sure, I agree that the latter is useful. If end-to-end is 3 seconds, maybe having to wait this long some of the time is better than importing all this BGP complexity. If it's 300 milliseconds I'm sure it is.


I don't really agree. Applications need to try all addresses for a
correspondent. Some do this today, most don't. And this needs to be
done in a smart way, a four minute timeout between trying successive
addresses isn't acceptable.

Well, rfc 3484 already states that
"Well-behaved applications SHOULD iterate through the list of
addresses returned from getaddrinfo() until they find a working
address."

But i agree with you, something about this can be included here.
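
As an illustration of what iterating through the getaddrinfo() results with a short timeout might look like in an application, here is a minimal sketch. The function name connect_any, the 3 second default and the resolve/make_socket parameters are assumptions added for the example (the last two only exist so the loop can be exercised without a network); nothing here is taken from the draft or from RFC 3484.

```python
import socket


def connect_any(host, port, per_address_timeout=3.0,
                resolve=socket.getaddrinfo, make_socket=socket.socket):
    """Try each address returned for host in order; return the first
    socket that connects, or raise the last error seen."""
    last_error = None
    for family, socktype, proto, _canon, sockaddr in resolve(
            host, port, type=socket.SOCK_STREAM):
        sock = make_socket(family, socktype, proto)
        # Assumed short per-address timeout, so a dead address costs
        # seconds rather than the minutes a default TCP connect can take.
        sock.settimeout(per_address_timeout)
        try:
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            # This address did not work; remember why and try the next one.
            last_error = exc
            sock.close()
    raise last_error or OSError("no addresses returned for %r" % (host,))
```

An application would call connect_any("www.example.com", 80) and use the returned socket as usual. Note that this only walks the destination addresses; which source address is used for each attempt is still left to the stack.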

The other point about this is when considering retrying with different
source addresses. If you do this sequentially, i think that it would be up to the app to retry.

However if you do it in parallel, i think that the ip layer could simply
generate multiple packets and send them in parallel, so no need to change
the apps.

Agree. But then, we probably want to try multiple destination addresses at the same time too, which can't be done without changes. And we probably want to have a way to signal to the other side "these are all really the same session, don't be worried about the session setup being aborted for the alternative ones".
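
A rough sketch of what trying all source/destination combinations at the same time could look like follows. It is an application-level illustration only: the function name first_working_pair and the attempt callback are hypothetical, and the signalling towards the other side ("these are all really the same session") is not modelled at all.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from itertools import product


def first_working_pair(sources, destinations, attempt):
    """Run attempt(src, dst) for every pair in parallel and return the
    first pair for which it returns True, or None if every pair fails.

    attempt is a caller-supplied (hypothetical) function that tries to
    set up the session with that address pair."""
    pairs = list(product(sources, destinations))
    if not pairs:
        return None
    # Leaving the with-block waits for the remaining attempts to finish,
    # so attempt() should carry its own short timeout.
    with ThreadPoolExecutor(max_workers=len(pairs)) as pool:
        futures = {pool.submit(attempt, s, d): (s, d) for s, d in pairs}
        for fut in as_completed(futures):
            try:
                if fut.result():
                    return futures[fut]
            except OSError:
                # This pair failed; keep waiting for the others.
                continue
    return None
```

With two local addresses and two addresses for the correspondent this launches four attempts at once, which is exactly where the other side needs to know that the three abandoned ones belong to the same session.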