Re: shim6 @ NANOG (forwarded note from John Payne)



Jason,

On 27-feb-2006, at 8:02, Jason Schiller (schiller@uu.net) wrote:

As far as TE relates to IPv6, I have concerns that the TE requirements are not fully defined. Based on past discussions in the working group, and
after reading some IPv6 publications that discuss how BGP based TE is
used, it appears people think the only valid TE concerns are 1. How to
load up all links, and 2. How to make failover work. In reality there are
more requirements!

"Requirement" is a very big word. We should be careful not to use it unless the issue under discussion is absolutely essential for the solution under consideration. In my opinion, many of the examples you provide don't qualify. In essence, you're saying a new IPv6 mechanism should provide any and all capabilities that exist today. And obviously the application people say the same thing, as do the transport people and, not in the least, the security people. Since all these sets of requirements can't be fulfilled at the same time, that means either we throw up our hands and go home, or we downgrade some requirements to "good to have, but not essential" and come to terms with having to give up some stuff that we'd really like to have. Obviously what's left has to be worthwhile.

All in all it doesn't matter why people want a backup link or how it is
useful.  The only thing that matters is that people desire this
functionality.

Again, I disagree. The trouble today is that pretty much everything that CAN be done IS done. Upgrading everything that's possible today to the level of requirement makes the problem impossible to solve. And in many cases, there are ways to meet the underlying need other than just delivering the same mechanisms that exist today. Basically, we're trying to invent screws in a world that only knows nails. Show people a screwdriver and they're going to say that it's way too light and the wrong shape to be useful. Show them a screw, and they'll object that you can't possibly hammer that in.

In addition, the IPv6 multihoming solution should not remove the tools transit ASes currently have in IPv4-style multihoming. This doesn't
mean transit ASes simply override the downstream TE preferences.  A
transit AS attempts to move the packet closer to the destination. If the
destination is more than one AS away, then it may be reached through
multiple neighbor ASes.  In this case the transit AS may have some TE
choices.

Could you provide some examples of mechanisms that you consider important in day-to-day traffic management?

Now obviously most ISPs are multihomed themselves so they want to do
their own TE. That's completely legitimate. However, that doesn't mean
that they get to decide how their customers can do TE.

It is not only important for transit ASes to be able to traffic engineer their traffic to and from other ASes as I have described previously. This
can also be used to accommodate both the source outbound TE and the
destination inbound TE when they conflict.

For example say cust1 and cust2 are both multihomed to Sprint and
UUNET.  Say cust1 wants to use UUNET as the primary inbound and
outbound link because it is higher performance. Say cust2 wants to use
the Sprint link as the primary inbound and outbound link due to cost
savings. In this case cust1 can have a default route to UUNET and a high cost default route to Sprint. Cust1 can advertise their network to UUNET with the default local preference. Cust1 can advertise their network to Sprint with a community to lower the local preference. Cust2 will do the
opposite.  Cust2 will have a default route to Sprint and a high cost
default route to UUNET. Cust2 will advertise their network to Sprint with the default local preference. Cust2 will advertise their network to UUNET
with a community to lower the local preference.

Ok.

In this case traffic from cust1 will get forwarded to UUNET due to the
more preferred default route. UUNET will honor the community from cust2
to lower the local preference on the route it learns directly from
cust2.  Traffic will be forwarded to Sprint, which will pass it along to
Cust2.

Right.

The opposite will happen in the reverse direction. Cust2 will forward the traffic to Sprint. Sprint will honor the community from cust1 to lower the local preference on the route it learns directly from cust1. Traffic
will be forwarded to UUNET, which will pass it along to Cust1.

Sure.

Both cust1 and cust2 inbound and outbound TE policies can be honored even
if they conflict with each other.
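The forwarding in this example can be traced with a small model. This is only a sketch: the local preference numbers are hypothetical, local preference is the only BGP decision step modeled, and each provider's table contains just the routes the example implies.

```python
from dataclasses import dataclass

# Hypothetical local-preference values; real networks choose their own.
CUSTOMER_PREF = 100  # default for routes learned directly from a customer
PEER_PREF = 90       # routes learned from a peer
LOWERED_PREF = 80    # customer route tagged with a "lower my preference" community


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    local_pref: int


def best_next_hop(rib, prefix):
    """Pick the route with the highest local preference (the only step modeled)."""
    matching = [r for r in rib if r.prefix == prefix]
    return max(matching, key=lambda r: r.local_pref).next_hop


# Routes each provider holds in the cust1/cust2 example.
ribs = {
    "UUNET": [
        Route("cust1", "cust1", CUSTOMER_PREF),
        Route("cust2", "cust2", LOWERED_PREF),   # cust2 asked UUNET to lower it
        Route("cust2", "Sprint", PEER_PREF),
    ],
    "Sprint": [
        Route("cust2", "cust2", CUSTOMER_PREF),
        Route("cust1", "cust1", LOWERED_PREF),   # cust1 asked Sprint to lower it
        Route("cust1", "UUNET", PEER_PREF),
    ],
}

# Each customer's preferred (low-cost) default route.
default_route = {"cust1": "UUNET", "cust2": "Sprint"}


def path(src, dst):
    """Follow the customer's default route, then each provider's best route."""
    hops = [src, default_route[src]]
    while hops[-1] != dst:
        hops.append(best_next_hop(ribs[hops[-1]], dst))
    return hops


print(path("cust1", "cust2"))
print(path("cust2", "cust1"))
```

With these values, traffic in each direction leaves over the sender's preferred provider and arrives over the receiver's preferred provider.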

But what's the problem here? The same thing can easily be accomplished by cust1 advertising addresses they got from UUNET and also setting a higher priority default route to UUNET, while cust2 does the same with Sprint. The only difference is that in this case, when cust2 sets up a session towards cust1, it can ignore the higher priority UUNET address and connect to the lower priority Sprint address instead. But I don't see any reason for them to do so.

I think what we have here is a disconnect between what's going on in
the wg (and the multi6 design teams) and what's visible from the
outside.

Maybe there is a disconnect.  Maybe my concerns have been addressed by
private off-list communications. If that is the case, can someone address the concern that the current "IPv4 style" TE functionality is needed for
IPv6 multihoming?  Can someone define what they think are the TE
requirements (as far as I can tell they are lacking)? Can someone explain how each of the current "IPv4 style" functionalities will be supported in
shim6?

It's not that concrete. But we're certainly not ignoring TE.

Maybe someone can come to the next NANOG and set things straight
on a panel?

Maybe the next time the NANOG meeting is held in a place where asbestos is still legal? :-)

It is my understanding that DNS may be an incomplete list of locators, or may have multiple sets of locators for multiple round robin hosts. Shim6 will exchange the complete locator set for a single host. I assume if you want TE to work, then in addition to exchanging locators, you will need to
exchange inbound TE preferences.  This also means you may want to
immediately switch over to use the shim so that your TE works prior to a
failure.

We'll have to see how that plays out in practice. Personally, I wouldn't be too happy about that because it makes everything much less transparent and if we get a shim header in all shimmed packets this uses up bandwidth unnecessarily.

Right. So there is precedent for storing state for 30000 instances of
"something". Servers are getting a lot faster and memory is getting
cheaper so adding a modest amount of extra state for longer lived
associations shouldn't be problematic.

The impression I get from content providers is that it is non-trivial to
support the added state from shim6.

I would be interested to learn how they know this.

Yes, computers are getting faster,
and memory cheaper, but these systems may negatively impact their business
models as power consumption, cooling requirements, and rack space
consumption increase.  But I won't try to speak authoritatively on this
issue, and will instead defer to some of the larger content providers.

I don't think this will be an issue at all. Let's see how things look when the protocols start to take their final shape.

Let me speak for myself and speculate a bit: what we should do is
have multihomed sites publish SRV (or very similar) records with two
values: a "strong" value that allows primary/backup mechanisms, and a
"weak" value that allows things like 60% of all sessions should go to
this address and 40% to that one.

Then, before a host sets up a session it consults a local policy
server that adds local preferences to the remote ones and also
supplies the appropriate source address that goes with each
destination address. New mechanisms to distribute this information
have been proposed in the past, but there is already a service that
is consulted before the start of most sessions, so it makes sense to
reuse that service. (No prizes for guessing what service I'm getting
at.)
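How a host might act on the two published values can be sketched as follows. This is only an illustration under assumptions: the record layout, the field names, and the rule that a lower "strong" value is more preferred (borrowed from how SRV priority works) are not part of any shim6 specification, and the addresses are documentation prefixes.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Locator:
    address: str
    strong: int  # assumed: lower is more preferred, like SRV priority
    weak: int    # assumed: relative share of sessions, like SRV weight


def candidates(locators, reachable):
    """Keep the reachable locators from the most preferred 'strong' group."""
    usable = [loc for loc in locators if loc.address in reachable]
    if not usable:
        return []
    best = min(loc.strong for loc in usable)
    return [loc for loc in usable if loc.strong == best]


def choose(locators, reachable, rng=random):
    """Weighted random pick inside the preferred group (weights must be positive)."""
    group = candidates(locators, reachable)
    if not group:
        return None
    return rng.choices(group, weights=[loc.weak for loc in group], k=1)[0]


# Hypothetical site: two primaries sharing load 60/40, plus one backup.
site = [
    Locator("2001:db8:a::1", strong=10, weak=60),
    Locator("2001:db8:b::1", strong=10, weak=40),
    Locator("2001:db8:c::1", strong=20, weak=100),
]
everything_up = {loc.address for loc in site}

print(choose(site, everything_up).address)
print(choose(site, {"2001:db8:c::1"}).address)
```

The strong value gives primary/backup behavior, since the backup is used only when no primary is reachable, while the weak value splits new sessions among the primaries.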

I think using SRV records for TE is overloading the DNS function.

I know some people object to that, but I find it hard to take such objections seriously. As long as the information can be cached like other DNS info there should be no impact at all on DNS operation.

However I do have some concerns about TE preferences in DNS. First, DNS
records tend to be cached for scalability, and to reduce traffic.  TE
policy may need to change quickly to reflect a topology change.  This
seems problematic.

The way I see it, the information in the DNS should remain mostly stable regardless of outages and the like.

Secondly, I am concerned that DNS may only allow for the end point to
indicate their inbound preferences, but what about the cumulative TE
preferences of all of the ASes in each of the different paths between the
source and destination,  how is this reflected in DNS based TE?

If we implement this as I outlined above, an ISP could run a TE-aware DNS server that has a BGP feed to import traditional BGP TE info.

    A ----- C
  / |       | \
X   |       |  Y
  \ |       | /
    B ----- D

So suppose X wants to set up a session towards Y. Suppose that Y's preference is 40% of its traffic through C and 60% through D. If X then asks the DNS server at A for Y's addresses and SRV info, A will return Y(C) and Y(D) along with the 40/60 info, as published by Y. But A will also add the information that Y(C) is reachable over 2 hops while Y(D) is 3 hops. I'm not sure yet how exactly we can integrate this information, but X could divide 40 by 2 = 20 and 60 by 3 = 20 and then toss a virtual coin.

If X also queries B, the info here would be 40 / 3 = 13 (rounded) and 60 / 2 = 30.
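The arithmetic above can be written out as a short sketch. Dividing the published weight by the hop count is just the ad hoc rule from this example, not a worked-out proposal, and the function names are made up for illustration.

```python
import random


def combine(published, hops):
    """Divide each locator's published weight by its hop count."""
    return {loc: published[loc] / hops[loc] for loc in published}


def pick(weights, rng=random):
    """Weighted random choice: the 'virtual coin'."""
    locators = list(weights)
    return rng.choices(locators, weights=[weights[l] for l in locators], k=1)[0]


published = {"Y(C)": 40, "Y(D)": 60}                 # as published by Y
via_a = combine(published, {"Y(C)": 2, "Y(D)": 3})   # hop counts seen from A
via_b = combine(published, {"Y(C)": 3, "Y(D)": 2})   # hop counts seen from B

print(via_a)
print(via_b)
print(pick(via_a))
```

Seen from A the two locators end up with equal weight, while seen from B the path through D is favored by more than two to one.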

Since the TE info added by the DNS servers would probably be at /32 resolution, it could change often even though the regular published DNS info could be fairly static. The only way this wouldn't work so well is when a host starts the mother of all file transfers and then something happens that requires a TE intervention. The file transfer would continue without receiving any new TE input.

All the ASes that are advertising 55,000 more specifics to the global
Internet routing table will care when you take this TE tool away from them
and the traffic loading on their links changes.

We're not taking anything away... IPv4 can continue without changes until either it blows up or it has outlived its usefulness.

Unfortunately this is incompatible with hop-by-hop forwarding for
outgoing traffic from the customer. Obviously this can be solved both
today and with shim6 using MPLS or similar.

Inter-domain MPLS with customers or peers is not a tool most large
networks are comfortable with. This needs to be solved inside of shim6.

I think asking shim6 to support non hop-by-hop policies is too much to ask. You can't do this today either.