
Re: Multihoming by IP Layer Address Rewriting (MILAR)



On Tue, 4 Sep 2001, Iljitsch van Beijnum wrote:

> On Tue, 4 Sep 2001, Peter Tattam wrote:
> 
> > > Doing this in the SYN/ACK handshake has the disadvantage that you have to
> > > do it over and over again for each TCP session. One popular application
> > > comes to mind that uses many short lived TCP sessions...
> 
> > There is the chance that it can be made independent of the TCP session but
> > still tied to the sessions so that it may only need to be done once, and could
> > also work for other connection protocols. BTW, my current tcp proposal does
> > piggy back on the syn/ack so it's only just a little extra baggage.  It's more
> > expensive to send extra packets than to piggy back on an existing packet IMHO.
> 
> I'm not worried about something you might have to do once every hour or
> so: that may take a few packets. But these days many Web pages have more
> than 20 pictures. That could mean 20 TCP sessions, each negotiating all
> addresses all over again. That's not the best way to do it. But you could
> cache this information, of course.

As the address exchange is controlled by the sender, I see no reason why a TCP
stack should not use a cached value while any sessions exist in the established
state.  Any other states would need careful thought about whether the cached
information is reliable enough.  Of course the API should be able to control the
desired behaviour if necessary.  

e.g. 

1) normal behaviour (exchange addresses using cache if available)
2) IPv4 behaviour (single address) 
3) normal behaviour with no caching.
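The three behaviours above could be modelled roughly like this (a minimal Python sketch; the `AddressCache` class, the TTL value, and the mode names are all hypothetical illustrations, not part of any proposal):

```python
from enum import Enum
import time

class AddrExchange(Enum):
    NORMAL = 1      # exchange addresses, use cache if available
    IPV4_STYLE = 2  # single address, no alternative-address exchange
    NO_CACHE = 3    # exchange addresses, but never reuse cached sets

class AddressCache:
    """Cache of alternative-address sets learned from earlier sessions."""
    def __init__(self, ttl=3600.0):
        self.ttl = ttl
        self._entries = {}  # primary address -> (alternatives, timestamp)

    def store(self, primary, alternatives):
        self._entries[primary] = (tuple(alternatives), time.monotonic())

    def lookup(self, primary, mode):
        # Single-address and no-caching modes never consult the cache.
        if mode is not AddrExchange.NORMAL:
            return None
        entry = self._entries.get(primary)
        if entry is None:
            return None
        alts, stamp = entry
        # Only trust the cache while the entry is fresh; sessions in the
        # established state would refresh the timestamp on each exchange.
        if time.monotonic() - stamp > self.ttl:
            del self._entries[primary]
            return None
        return alts
```

A new session to a known primary address would then call `lookup()` first and only fall back to a full address exchange on a miss.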

BTW, web browsers these days are getting a little smarter by using keep-alive
HTTP connections - typically no more than 5-10 simultaneous connections.  I
recently added support for them in my web server.

> 
> > > I think we should definitely not rule out the DNS for this, unless we're
> > > absolutely sure this will not work.
> 
> > In my experience, it's much easier to misconfigure DNS.  There are a lot of
> > clueless admins out there who would not understand the implications of getting
> > it wrong.  Even I'm guilty of stuffing up DNS delegations and SOA's in recent
> > times :)
> 
> Sure it's easy to misconfigure the DNS. But it's easy to do a lot of
> things wrong. We're not trying to build a Word clone, where pressing
> random keys on the keyboard has to result in the creation of a perfect
> business letter.
> 
> > > And even if the DNS can't be trusted, it could still be a valuable source
> > > of "hints" that can be validated through some more secure means later.
> 
> > I start to get nervous whenever I hear about needing a secure channel to
> > exchange information.  If the system can operate without requiring rigorous
> > security, then it is one step more able to withstand a security attack than a
> > system that requires it.
> 
> Security is a hot topic, with just about every IP weakness known and
> unknown to man being exploited today. We have to be sure the protocols we
> come up with aren't easily fooled by someone falsifying header
> information. But I agree with you: we shouldn't require all kinds of
> strong crypto in simple protocols.
> 
> > > The major drawback of ICMP or TCP solutions is that they can only work
> > > when the first address is reachable at the beginning of the session.
> 
> > The DNS should have the initial set for connecting end for starters anyway.
> 
> In my proposal: not really. The alternative addresses can't be listed as
> valid A or AAAA records, since if these addresses can also receive regular
> traffic, the destination host can't easily determine if the destination
> address must be rewritten or not. On the other hand, if this is deemed
> very desirable, it could be done at the cost of keeping extra state and/or
> checking the TCP checksum with and without rewriting the destination
> address.
> 
> The IP layer could discover the extra IP addresses through cooperation
> with the resolver library, bypassing the regular DNS -> application -> TCP
> -> IP path the destination address follows when creating application layer
> sessions so the socket API doesn't have to change.

So what are you suggesting? New RRs to list alternative addresses?  Would this
not be similar to reinventing AAAA records with priorities?

Also, how do you plan to deal with stale DNS cache issues?  If you flush DNS
caches too quickly you would degenerate into getting information directly from
the site.   Why not just get the information directly from the site?  A single
packet exchange could carry quite a lot of information.
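To illustrate how much a single packet exchange could carry, here is a rough Python sketch of packing a set of alternative IPv4 addresses into one option-style payload (the wire format here is invented purely for illustration, not taken from any draft):

```python
import socket
import struct

def pack_alt_addresses(addrs):
    """Pack a list of IPv4 alternative addresses into one option-style
    payload: a 1-byte count followed by 4 bytes per address."""
    payload = struct.pack("!B", len(addrs))
    for a in addrs:
        payload += socket.inet_aton(a)
    return payload

def unpack_alt_addresses(payload):
    """Recover the address list from a packed payload."""
    count = struct.unpack_from("!B", payload, 0)[0]
    return [socket.inet_ntoa(payload[1 + 4 * i:5 + 4 * i])
            for i in range(count)]
```

Three alternative addresses fit in 13 bytes, so even a generous address set is cheap to carry in a single exchange.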

With DNS only a single poorly configured server in the cache path to the
destination would play havoc with any address determination.

One of the reasons DNS works fairly well now is that, in general, name-to-address
mapping & vice versa is relatively long lived.  I typically see TTLs
varying from 5 minutes up to 3 or more days.  I don't have numbers on what the
current distribution of TTLs is like, but my guess is that 1 day would be about
normal.  And generally, the TTLs don't reflect how static a lot of the
information in the DNS is.

When we are talking about routability, that is likely to change on a per-minute
basis and you would want the change to be propagated globally as soon as
possible.  The more I think about using DNS to do the job, the less I am
convinced that it can.  I am of course assuming that you want to convey the
best destination available via the DNS - forgive me if I misunderstand what
you're saying.

> 
> > It doesn't tell the other end what addresses you might be coming from though.
> 
> Good point. But this information could easily be carried in an IP option.

surprise surprise - that's what I'm proposing :)

> 
> > > > choosing alternative addresses when a failure occurs and knowing when
> > > > to start choosing an alternative address.  We don't have much empirical
> > > > experience with this and we are only guessing at the right way to do this.
> 
> > > One system could be to periodically cycle through all available addresses
> > > and gather statistics for each. After a while, the system would know which
> > > addresses are best.
> 
> > Depends if it's a passive or active approach to gather stats.  If it involves
> > extra traffic just for the sake of gathering statistics I don't think it will
> > be popular.  The passive approach would rely on traffic already taking place
> > which I had already hinted at requiring doing in my proposal.
> 
> On hosts the passive approach would be appropriate. Changing to the next
> address periodically and keeping a copy of the RTT values for each address
> would probably do it. Maybe for routers that rewrite many sessions, the
> overhead of sending some extra packets is worth it.
> 
> If we want to be really ambitious, we could come up with a TCP
> implementation that could actively take advantage of having multiple
> destination addresses. This TCP would load balance the traffic over the
> different destination addresses, exploiting different bandwidth and delay
> properties of the different paths.

Umm.  Nope.  I don't think this will fly with the TCP people if done on an
arbitrary basis.  It's sure to screw around with RTT averages and muck up the
retry mechanisms.  You only want to send a packet by an alternative path as a
last resort, unless you are very sure that it will be better.
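One way to probe alternatives without mucking up the primary path's RTT averages is to keep a separate smoothed RTT per destination address, so samples from one path never pollute another's estimate. A hypothetical Python sketch (the class name is an assumption; the 1/8 smoothing factor is borrowed from the classic TCP srtt calculation):

```python
class PathMonitor:
    """Keep an independent smoothed RTT per destination address so that
    probing an alternative path never disturbs the primary path's
    RTT average."""
    def __init__(self, addresses, alpha=0.125):
        self.alpha = alpha                       # classic TCP srtt gain
        self.srtt = {a: None for a in addresses}  # None = never measured

    def sample(self, addr, rtt):
        """Fold one RTT sample for one address into that address's srtt."""
        old = self.srtt[addr]
        if old is None:
            self.srtt[addr] = rtt
        else:
            self.srtt[addr] = (1 - self.alpha) * old + self.alpha * rtt

    def best(self):
        """Return the measured address with the lowest smoothed RTT."""
        measured = {a: s for a, s in self.srtt.items() if s is not None}
        return min(measured, key=measured.get) if measured else None
```

Because each address has its own estimator, an occasional probe of an alternative path leaves the retransmission timer of the current path untouched.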


In my recent draft update, I suggested the idea of sending concurrent packets
to all possible destinations at the start of the SYN exchange.  Basically you'd
have an election to see which path wins & pick that.  The downside is that the
duplicate traffic could send some people ballistic.  In a TCP retry scenario,
which would suggest that the current path may not be optimal, you could pull
the same trick to see if another path turns out to be better.
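The election idea could be sketched as concurrent probes where the first successful responder wins (a Python sketch; `probe` is a stand-in for sending a SYN to one candidate address and blocking until the SYN/ACK arrives or the attempt fails):

```python
import concurrent.futures

def elect_path(addresses, probe):
    """Probe all candidate destination addresses concurrently and return
    the first one that answers, mimicking a SYN sent to every address at
    session start.  `probe` is a caller-supplied function that blocks
    until the given address responds, or raises on failure."""
    with concurrent.futures.ThreadPoolExecutor(
            max_workers=len(addresses)) as pool:
        futures = {pool.submit(probe, a): a for a in addresses}
        for fut in concurrent.futures.as_completed(futures):
            if fut.exception() is None:
                return futures[fut]  # first successful responder wins
    return None  # every candidate failed
```

The duplicate-traffic cost is visible here too: every candidate gets a probe, and all but the winner's are wasted.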

The issue of load balancing is a thorny one & I'm not sure if any host-based
multihoming is going to be able to do that easily. I have to confess I can only
guess at how load balancing is done with BGP. Perhaps those in the know might
enlighten me.

Another thought comes to mind.

If we consider the possibility of hosts within a site "colluding" when a nearby
provider path is down, it might not take very many packets at all for most of
the hosts in a site to shift their traffic pattern even before the individual
connections start timing out.  The connections could even send gratuitous ACKs
ahead of time.  It probably only takes a few to start the avalanche, but you
would want to be damned sure that it couldn't be triggered by an attacker.
Controlled multicast might be a good notification mechanism for this.  Hmm.. we
have something already in the way of router advertisements - why not use them
properly.

At the other end, if an ack for sent data in a TCP connection comes via a
different path, it is a strong hint that the current path might not be the best
and to start hunting for better paths.  This is a kind of advance warning that
something has happened at the other end.
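Detecting that hint could be as simple as comparing the source address of each incoming ACK against the address the session was established with (a hypothetical sketch; representing packets as `(source_address, is_ack)` tuples is an assumption made for illustration):

```python
def path_change_hint(expected_peer, packets):
    """Scan incoming packets and return the source address of the first
    ACK that arrives from an address other than the one the session was
    established with -- a strong hint that the peer has shifted paths.
    `packets` is any iterable of (source_address, is_ack) tuples."""
    for src, is_ack in packets:
        if is_ack and src != expected_peer:
            return src  # signal: start hunting for a better path
    return None  # no hint seen; current path still looks fine
```

On a hit, the stack would start probing alternatives rather than waiting for retransmission timeouts.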

This of course assumes that we are talking about symmetric network paths.
Where traffic comes in via one provider and goes out by another it might be a
waste of time.


> 
> Iljitsch
> 
> 

Peter

--
Peter R. Tattam                            peter@trumpet.com
Managing Director,    Trumpet Software International Pty Ltd
Hobart, Australia,  Ph. +61-3-6245-0220,  Fax +61-3-62450210