
Re: Transport multihoming



On Fri, 25 Oct 2002, Iljitsch van Beijnum wrote:

> On Fri, 25 Oct 2002, Peter Tattam wrote:
> 
> > > As I see it, the reason to have the multihoming functionality inside one
> > > or more transport protocols is that the transport layer has end-to-end
> > > knowledge that makes it possible to make better multihoming decisions.
> 
> > My recent thoughts are that it need not be tied directly to the TCP protocol,
> > but can instead be done at the IP layer of the host stack.  However the TCP and
> > IP layers should be aware of each other in much the same way that PMTU is
> > facilitated.  If done that way, there would be benefits from caching the
> > multihoming information and sharing it over several connections.
> 
> This may or may not be desirable; the best way to handle this is
> probably some capabilities negotiation where each end tells the other: I
> can do this and this and that and it applies to:
> 
> 1. All hosts living in this prefix: ...
> 2. This host
> 3. This protocol (TCP, UDP, ...)
> 4. This session
> 
> > Also because
> > of the strong aggregation, the cached information would be able to build a
> > multihoming hint tree more efficiently than would a flat list of IP addresses.
> > For example, if a major link from one aggregator showed a problem, it would
> > provide a hint that all aggregations from that provider would be
> > inaccessible and the stack could use this intelligence in advance.
> 
> Hm, not sure if this would work very well as many different networks are
> behind a single PA block, some of which may be down, some of which may
> be operational. Also, I don't think many hosts will be donning a full
> routing table. However, it would be nice to have. I've been thinking
> about a protocol to share such information between hosts a while back,
> maybe this would be a nice addition later on.
> 
> > > Would it be possible to have a modified TCP talk to a non-modified TCP
> > > through some kind of "mudem" (multihomer/demultihomer), without loss of
> > > the core multihoming functionality, and without the "mudem" having to
> > > keep long-term state?
> 
> > Depends what you mean by long term.  Please clarify.
> 
> If a host sets up a connection that passes through a "mudem" this box
> may intercept the setup packet and do some capability negotiation with
> the other end. During this negotiation, there must be state in the mudem
> box. That would be acceptable. But when the session is established, it
> should be possible for the mudem to go down and come back up, or for
> another to take over, without breaking the session. So for running
> sessions, there must not be any state that can't be recovered by looking
> at the session = state that would last as long as the session = long
> time.
> 
> > This is perhaps close to what I alluded to by a decoupling process.  What is
> > fundamental to a reliable working solution using my concepts is that if the
> > prefix replacement is decoupled, it must still be done in a secure way so that
> > protocols like TCP which might depend on address immutability can have an iron
> > clad guarantee that the address selection is valid.
> 
> I'm not sure what kind of bad things you want to protect against. If
> someone has full access to the TCP packets, there is nothing you can do
> anyway. (Other than SSL/IPSec, that is, but then they can still break
> the session.)
> 
> > It is clear that moving the process out of the host kernel environment into an
> > ancillary processor environment would imply using some kind of secure control
> > protocol to ensure that addresses are dealt with correctly, and this adds a
> > degree of complication which I believe to be excessive.
> 
> Hm, there is one thing that would be bad: a host thinks it talks with
> host X, while in fact it talks to host Y. But that could happen with NAT
> or some other man-in-the-middle thingy anyway, so is it worth the
> trouble protecting against this? We could build something into the
> protocol but obviously a corrupted multihoming processing box wouldn't
> necessarily implement this and the fact that you _don't_ see such a box
> doesn't mean anything.

The real danger in all this is connection hijacking.  The set of addresses
negotiated must be kept immutable and also free from contamination during the
negotiation process.  Having a network in between the host and the mudem would
increase the risk of this contamination.

> 
> > I strongly suggest that such multihoming be restricted to prefix replacement
> > only, and not arbitrary address replacement, as there will be significant
> > advantage in exploiting the implied tree structure imposed by the strong
> > aggregation.
> 
> I think it would be good to be able to replace the full address, that
> way everyone can use it, no matter how limited their connectivity.
> 
> What I envision is an IP option that makes it possible to negotiate
> capabilities at the start of a / the first session. This option can
> either be added:
> 
> 1. by the transport protocol without involvement from the IP layer
> 2. by the IP layer without involvement from the transport protocol
> 3. by the IP layer at the request of the transport protocol
>    (= we need to extend the interface)
> 4. by some external entity
> 
> Which of 1 - 3 would you say is best? Don't forget this will probably
> have to be implemented for several transport protocols.

I think 2.  Then it is available to all transport protocols.  Maybe 4 if the
issues I outlined before can be adequately dealt with.

As I said in another thread, selection of the source address is also something
that has been largely overlooked.  Host-based MH possibly has the advantage of
being better placed to get this right than an external entity.

> 
> It would of course also be possible to do this in a TCP option, but that
> complicates things like the TCP checksum in the case of 4 and doesn't
> address non-TCP protocols.

As discussed at the Tokyo meeting, there are limitations in the TCP header size
which may preclude this being done there.  Having it at the IP layer kind of
makes sense, and having the gleaned information cached beyond the life of
connections not only benefits multiple connections to the same host but should
also aid connectionless protocols.

At Tokyo, I think Steve Deering made a suggestion that we make this an IP layer
feature available to all protocols but it was so fresh in my mind that I had
not thought through all the possibilities.

So my latest thought would be that we could do it at the IP layer in a similar
way to that described in my preliminary draft.  
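
To make the prefix replacement discussed earlier in this thread concrete, here
is a minimal sketch in Python.  It is not taken from the draft; the function
name replace_prefix and the example prefixes (documentation range 2001:db8::/32)
are assumptions for illustration only.  It keeps the low-order bits of an
address and swaps in the bits of a different aggregate.

```python
import ipaddress


def replace_prefix(address, new_prefix):
    """Return `address` with its leading bits replaced by `new_prefix`.

    Hypothetical helper: the bits covered by the new prefix are overwritten,
    the remaining (subnet and interface) bits are left untouched.
    """
    addr = ipaddress.IPv6Address(address)
    net = ipaddress.IPv6Network(new_prefix)
    # hostmask has ones in every bit position NOT covered by the prefix.
    low_bits = int(addr) & int(net.hostmask)
    return ipaddress.IPv6Address(int(net.network_address) | low_bits)


# Same host, reached through a second provider's /48.
print(replace_prefix("2001:db8:a:1::42", "2001:db8:b::/48"))
```

Because only the prefix changes, a stack could key its cached hints on the
prefix rather than on individual addresses.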

For TCP connections, the prefix/address set negotiation would be done at
SYN/ACK time, and likewise for any other protocols that have the same SYN/ACK
semantics.

For connectionless protocols, the IP layer would need to rely on cached state
to do its thing.  In such a scenario, it may be important to set an upper time
limit on the cached information being valid, the information being updated by
incoming traffic.  A nonce for the negotiation would have to be supplied by
both ends in this case.
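
A rough sketch of what such cached state might look like, in Python.  All names
here (MhCache, MhEntry, max_age, the 16-byte nonce size, the 600 second
default) are assumptions for illustration, not part of any draft.  An entry
expires once no valid incoming traffic has refreshed it within max_age, and
only traffic carrying the peer's nonce counts as a refresh.

```python
import hmac
import secrets
import time
from dataclasses import dataclass


@dataclass
class MhEntry:
    prefixes: frozenset   # negotiated prefix set, immutable once stored
    local_nonce: bytes    # nonce this host supplied during negotiation
    peer_nonce: bytes     # nonce the peer supplied during negotiation
    last_seen: float      # time of the last valid incoming packet


class MhCache:
    """Per-peer multihoming state with an upper time limit (hypothetical)."""

    def __init__(self, max_age=600.0, clock=time.monotonic):
        self.max_age = max_age
        self.clock = clock
        self.entries = {}

    def store(self, peer, prefixes, peer_nonce):
        # Record the result of a completed negotiation with `peer`.
        entry = MhEntry(frozenset(prefixes), secrets.token_bytes(16),
                        peer_nonce, self.clock())
        self.entries[peer] = entry
        return entry

    def lookup(self, peer):
        # Return the entry, or None if it is missing or older than max_age.
        entry = self.entries.get(peer)
        if entry is None:
            return None
        if self.clock() - entry.last_seen > self.max_age:
            del self.entries[peer]
            return None
        return entry

    def touch(self, peer, peer_nonce):
        # Incoming traffic refreshes the entry only if the nonce matches.
        entry = self.lookup(peer)
        if entry is None or not hmac.compare_digest(entry.peer_nonce, peer_nonce):
            return False
        entry.last_seen = self.clock()
        return True
```

A connectionless sender would call lookup() before choosing a destination
prefix and fall back to renegotiation when it returns None.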

If a nonce were to be added to the IP MH options, this could be made larger
than the TCP sequence number thereby increasing the security from seq number
attacks.
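
As a back-of-the-envelope illustration in Python: the TCP sequence number is 32
bits wide, while the nonce sizes used below (64 and 128 bits) are assumptions,
since no size has been proposed here.  The calculation only covers blind
guessing of a uniformly random value and ignores the TCP receive window, which
in practice makes the TCP case easier for an attacker than the figure suggests.

```python
from fractions import Fraction

TCP_SEQ_BITS = 32  # width of the TCP sequence number field


def blind_guess_probability(bits, attempts=1):
    """Chance that one of `attempts` distinct guesses hits a random
    value of `bits` bits (hypothetical helper for illustration)."""
    space = 2 ** bits
    return Fraction(min(attempts, space), space)


for bits in (TCP_SEQ_BITS, 64, 128):
    p = blind_guess_probability(bits, attempts=1_000_000)
    print(f"{bits:3d} bits: {float(p):.3e} after 1,000,000 guesses")
```

Each extra bit of nonce halves the attacker's chance for the same number of
packets sent.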

Another possibility is that we define a MH connection protocol which
establishes the MH characteristics of the connection between the two hosts. It
would be analogous to a TCP session and would have the same degree of security
as a TCP session.  It could be done directly at the IP layer or be an
ancillary protocol on top of IP.  It would imply that there would be host-host
state information kept somewhere in the IP stack, but it is possible that this
could be distributed amongst the control blocks associated with each connection.

My logic is that if this type of multihoming is done, state information has to
be kept.  That state is likely to be rather fine grained, and it would very
likely be a bad idea to place it in a single box serving a whole organization,
whether that be a router or mudem or whatever.  The connection states are on
the hosts; the MH state would closely map to those states and could best be
managed on the hosts.

What we don't want is this thing looking like one humungous NAT box - in the
long term, that just won't scale, and it introduces another point of failure to
consider in the traffic flow.


> 
> Iljitsch
> 
> 

Peter

--
Peter R. Tattam                            peter@trumpet.com
Managing Director,    Trumpet Software International Pty Ltd
Hobart, Australia,  Ph. +61-3-6245-0220,  Fax +61-3-62450210