
Re: shim6 @ NANOG (forwarded note from John Payne)



On 27-feb-2006, at 16:22, Erik Nordmark wrote:

>>> 2) The first hit is of critical importance to content providers
>>> (many of whom couldn't legitimately justify a /32). Hunting
>>> through DNS to find a locator that works won't fly.

>> Right. So how do we solve this? Second shim? Shim-before-syn?

> It is very hard *under* the current socket API, since the connect()
> and sendto() calls do not know that there are alternative addresses
> to try.

> But if the application is using some middleware that has the
> equivalent of a connect_to_name(), then it isn't hard to implement
> that API on top of the socket API by using non-blocking connect()
> and trying different addresses relatively quickly (instead of
> waiting for a minute or so until TCP times out, it could try a
> second connect on a different socket after a few seconds).
True enough, and I certainly hope that frameworks, libraries and the
like that implement a connect-by-name type call do this.
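Something along these lines, maybe (just a sketch on top of the
standard socket API; connect_to_name() itself and the 3-second
per-address timeout are made up for illustration):

#include <sys/socket.h>
#include <sys/select.h>
#include <netdb.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Try each address for host:service in turn with a non-blocking
 * connect() and a short per-address timeout, instead of letting a
 * single TCP connect time out for a minute or more. */
static int connect_to_name(const char *host, const char *service)
{
    struct addrinfo hints, *res, *ai;
    int fd = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        fcntl(fd, F_SETFL, O_NONBLOCK);

        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                        /* connected right away */
        if (errno == EINPROGRESS) {
            fd_set wfds;
            struct timeval tv = { 3, 0 }; /* illustrative timeout */
            FD_ZERO(&wfds);
            FD_SET(fd, &wfds);
            if (select(fd + 1, NULL, &wfds, NULL, &tv) == 1) {
                int err = 0;
                socklen_t len = sizeof err;
                getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len);
                if (err == 0)
                    break;                /* handshake completed */
            }
        }
        close(fd);                        /* failed or timed out: next */
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;  /* connected (still non-blocking) socket, or -1 */
}

Erik's variant, racing a second connect on a different socket while
the first one is still pending, would be a refinement of the same
loop.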
However, this isn't the only way to solve the problem. One very
interesting feature that I discovered this summer when finally
upgrading to a more recent version of FreeBSD is the RFC 3484
address selection policy mechanism (see the ip6addrctl command;
Windows also has it as netsh interface ipv6 show/set prefixpolicy).
With this, it's possible to control address preferences system-wide.
I'm not sure how it's implemented exactly, but apparently the policy
is applied somewhere between the DNS resolver and the application.
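For example (the prefix, precedence and label values here are made
up; check the ip6addrctl(8) and netsh documentation for the exact
syntax):

# show the current policy table
ip6addrctl show

# prefer destinations under 2001:db8:42::/48: precedence 60
# outranks the default ::/0 entry at 40 in the RFC 3484 table
ip6addrctl add 2001:db8:42::/48 60 1

# roughly the same thing on Windows
netsh interface ipv6 set prefixpolicy 2001:db8:42::/48 60 1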
One way to put this mechanism to good use when a DNS lookup for AAAA
records returns more than one address is to go out and check which
addresses are alive, and/or to apply local, remote and in-the-middle
traffic engineering policies. Since the application hasn't committed
to any particular address yet, this should work very well. The only
downside is that all of this takes an extra round trip.
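As a sketch of what such a check between the resolver and the
application could look like (hypothetical helper; it uses plain TCP
connects as the liveness test since there's no shim6 probe to send
yet, and error handling is trimmed):

#include <sys/socket.h>
#include <sys/select.h>
#include <netdb.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>

/* Race a non-blocking connect() to every address at once and return
 * the first addrinfo entry whose handshake completes within
 * 'seconds', or NULL if none does. */
static struct addrinfo *first_alive(struct addrinfo *res, int seconds)
{
    int fds[64], nfds = 0, maxfd = -1;
    struct addrinfo *cand[64];
    fd_set wfds;
    FD_ZERO(&wfds);

    for (struct addrinfo *ai = res; ai && nfds < 64; ai = ai->ai_next) {
        int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        fcntl(fd, F_SETFL, O_NONBLOCK);
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) < 0 &&
            errno != EINPROGRESS) {
            close(fd);
            continue;
        }
        fds[nfds] = fd;
        cand[nfds] = ai;
        nfds++;
        FD_SET(fd, &wfds);
        if (fd > maxfd)
            maxfd = fd;
    }

    struct timeval tv = { seconds, 0 };
    struct addrinfo *winner = NULL;
    if (select(maxfd + 1, NULL, &wfds, NULL, &tv) > 0) {
        for (int i = 0; i < nfds; i++) {
            int err = 0;
            socklen_t len = sizeof err;
            if (FD_ISSET(fds[i], &wfds) &&
                getsockopt(fds[i], SOL_SOCKET, SO_ERROR, &err, &len) == 0 &&
                err == 0) {
                winner = cand[i];
                break;
            }
        }
    }
    for (int i = 0; i < nfds; i++)
        close(fds[i]);  /* probes only; real data uses a fresh socket */
    return winner;
}

A resolver shim could call this on the getaddrinfo() result and hand
the application the list reordered so the address that answered
first comes out on top.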
I imagine we could send out some kind of quick reachability check /
information request to the other side, maybe even towards all the
addresses returned by the DNS. I still have to read up on the R1/R2
stuff, but presumably this could be an R1, without necessarily
completing the full handshake at this point.
Since we're waiting for information to come back from the
correspondent anyway, we can use this time to query a policy server
that's local to the site or located at one or more of the site's
ISPs. If those servers don't waste time, their answer should be back
by the time the correspondent replies.
Another advantage of checking reachability at this stage is that we
can easily work around ingress filtering: simply try the
reachability check towards the correspondent with the Cartesian
product of all local and remote addresses at the same time. The
probes with an ingress-filtered source address shouldn't even take
up any non-local bandwidth if the egress routers also perform this
filtering.
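In code, the Cartesian product part could look something like this
(UDP and the one-byte payload are stand-ins for whatever
reachability message shim6 ends up defining):

#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Fire one probe per (local, remote) address pair.  Returns the
 * number of probe sockets stored in fds[]; the caller would
 * select()/recvfrom() on them to see which pairs got a reply. */
static int probe_all_pairs(const struct sockaddr_in6 *local, int nlocal,
                           const struct sockaddr_in6 *remote, int nremote,
                           int *fds, int maxfds)
{
    int n = 0;
    for (int i = 0; i < nlocal && n < maxfds; i++) {
        int fd = socket(AF_INET6, SOCK_DGRAM, 0);
        if (fd < 0)
            continue;
        /* bind() pins the source address, so this socket tests one
         * specific source against every remote address */
        if (bind(fd, (const struct sockaddr *)&local[i],
                 sizeof local[i]) < 0) {
            close(fd);
            continue;
        }
        for (int j = 0; j < nremote; j++)
            sendto(fd, "?", 1, 0,
                   (const struct sockaddr *)&remote[j],
                   sizeof remote[j]);
        fds[n++] = fd;
    }
    return n;
}

Pairs whose source address gets ingress filtered simply never
produce a reply, which is exactly the information we want.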