
Re: shim6 @ NANOG (forwarded note from John Payne)



On 27-feb-2006, at 16:22, Erik Nordmark wrote:

2) The first hit is of critical importance to content providers (many of whom couldn't legitimately justify a /32). Hunting through DNS to
find a locator that works won't fly.

Right. So how do we solve this? Second shim? Shim-before-syn?

It is very hard *under* the current socket API, since the connect() and sendto() calls do not know that there are alternative addresses to try.

But if the application is using some middleware that has the equivalent of a connect_to_name(), then it isn't hard to implement that API on top of the socket API by using non-blocking connect() and trying different addresses relatively quickly (instead of waiting for a minute or so until TCP times out, it could try a second connect on a different socket after a few seconds).
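The non-blocking trick described above could be sketched roughly as follows (a Python stand-in for what such middleware might do on top of the plain socket API; the function name, the 0.3-second stagger and the overall timeout are illustrative choices, not anything specified anywhere):

```python
import selectors
import socket
import time

def connect_by_name(host, port, stagger=0.3, timeout=10.0):
    # Sketch of a connect-by-name call built on the plain socket API:
    # start a non-blocking connect() to the first address, and if it
    # hasn't completed after `stagger` seconds, race the next candidate
    # on a second socket instead of waiting out the full TCP timeout.
    candidates = [(ai[0], ai[4]) for ai in
                  socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)]
    sel = selectors.DefaultSelector()
    pending = []
    deadline = time.monotonic() + timeout
    next_launch = time.monotonic()
    i = 0
    while time.monotonic() < deadline and (i < len(candidates) or pending):
        if i < len(candidates) and time.monotonic() >= next_launch:
            family, addr = candidates[i]
            s = socket.socket(family, socket.SOCK_STREAM)
            s.setblocking(False)
            s.connect_ex(addr)              # returns immediately
            sel.register(s, selectors.EVENT_WRITE)
            pending.append(s)
            i += 1
            next_launch = time.monotonic() + stagger
        for key, _ in sel.select(timeout=0.05):
            s = key.fileobj
            sel.unregister(s)
            pending.remove(s)
            if s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0:
                for other in pending:       # a winner: drop the rest
                    sel.unregister(other)
                    other.close()
                sel.close()
                s.setblocking(True)
                return s
            s.close()                       # this candidate failed
    sel.close()
    raise OSError("no reachable address for %s" % host)
```

The important property is that the application never blocks on a dead address for a full TCP timeout; a failed or slow candidate just loses the race.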

True enough, and I certainly hope that frameworks, libraries and the like that implement a connect-by-name type call do this. However, this isn't the only way to solve this. One very interesting feature that I discovered this summer when finally upgrading to a more recent version of FreeBSD was the address selection policy mechanism (see the ip6addrctl command, Windows also has it as netsh interface ipv6 show/set prefixpolicy). With this, it's possible to control address preferences system-wide. I'm not sure how it's implemented exactly, but apparently this policy is applied somewhere between the DNS resolver and the application.

One way to put this mechanism to good use, when a DNS lookup for AAAA records returns more than one address, is to go out and check which addresses are alive and/or apply local, remote and in-the-middle traffic engineering policies. Since the application hasn't committed to any particular address yet, this should work very well. The only downside is that all of this takes an extra round trip.

I imagine that we can send out some kind of quick reachability check / information request to the other side, maybe even towards all the addresses returned by the DNS. I still have to read up on the R1/R2 stuff, but I suspect this could be an R1, without necessarily completing the full handshake at this time.
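In very rough form, such a parallel reachability check might look like this (a Python sketch; the UDP probe here is only a stand-in for whatever shim6 message would actually be used, and the port, payload and one-second wait are all made up for illustration):

```python
import select
import socket
import time

def probe_addresses(addrs, port, payload=b"probe", wait=1.0):
    # Fire a quick probe (a stand-in for a shim6 R1-style message;
    # the real protocol would use its own packet format) at every
    # address the DNS returned, then collect whichever ones answer
    # within `wait` seconds.
    socks = {}
    for addr in addrs:
        fam = socket.AF_INET6 if ':' in addr else socket.AF_INET
        s = socket.socket(fam, socket.SOCK_DGRAM)
        s.setblocking(False)
        s.sendto(payload, (addr, port))
        socks[s] = addr
    alive = []
    deadline = time.monotonic() + wait
    while socks:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        ready, _, _ = select.select(list(socks), [], [], remaining)
        for s in ready:
            addr = socks.pop(s)
            try:
                s.recvfrom(1500)
                alive.append(addr)          # this address answered
            except OSError:
                pass                        # e.g. error queued on the socket
            s.close()
    for s in socks:                         # the rest never answered
        s.close()
    return alive
```

Since all the probes go out at once, the cost is one round trip regardless of how many addresses the DNS returned.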

Since we're waiting for information to come back from the correspondent anyway, we can use this time to query a policy server that's local to the site or is located at one or more of the site's ISPs. If the servers don't waste time, the answer for this should be back by the time the correspondent answers anyway.
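The overlap between the two lookups could be expressed as simply as this (a sketch only; the policy server, its query interface and the ranking it returns are entirely hypothetical here, so both lookups are passed in as plain callables):

```python
from concurrent.futures import ThreadPoolExecutor

def select_address(candidates, probe, policy_rank):
    # Run the reachability probe and the (hypothetical) policy-server
    # query concurrently, so the policy answer is back by the time the
    # correspondent answers.  `probe` returns the set of addresses that
    # responded; `policy_rank` returns a best-first list of addresses.
    with ThreadPoolExecutor(max_workers=2) as ex:
        alive_future = ex.submit(probe, candidates)
        rank_future = ex.submit(policy_rank, candidates)
        alive = alive_future.result()
        ranking = rank_future.result()
    for addr in ranking:                # best policy choice that's alive
        if addr in alive:
            return addr
    return None
```

The point is only the structure: neither lookup waits for the other, so the slower of the two round trips bounds the total delay.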

Another advantage of checking reachability at this stage is that we can easily avoid problems caused by ingress filtering: simply run the reachability check towards the correspondent with the Cartesian product of all local and remote addresses at the same time. The probes with an ingress-filtered source address shouldn't even take up any non-local bandwidth if the egress routers also perform this filtering.
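The Cartesian-product probing could look something like this (again a rough Python sketch with a made-up UDP probe; a real implementation would send the appropriate shim6 messages and then wait for answers as above):

```python
import itertools
import socket

def probe_pairs(local_addrs, remote_addrs, port, payload=b"probe"):
    # Send one probe per (source, destination) address pair, binding
    # each probe explicitly to its source address.  Pairs the local
    # stack refuses outright (wrong address family, unroutable source,
    # filtered by the host itself) fail immediately and are skipped.
    sent = []
    for src, dst in itertools.product(local_addrs, remote_addrs):
        fam = socket.AF_INET6 if ':' in dst else socket.AF_INET
        s = socket.socket(fam, socket.SOCK_DGRAM)
        try:
            s.bind((src, 0))                # force this source address
            s.sendto(payload, (dst, port))
            sent.append((src, dst))
        except OSError:
            pass                            # e.g. filtered/unroutable pair
        finally:
            s.close()
    return sent
```

Whichever (source, destination) pairs then produce an answer are exactly the paths that survive both directions of filtering.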