[RRG] Re: TRRP Waypoint Routers
On Mon, Feb 25, 2008 at 2:56 AM, Robin Whittle <rw@firstpr.com.au> wrote:
> > Let's say we have a source IP at 126.0.0.1 and he wants to
> > talk to me in the swamp at 199.33.224.1. So, he sends the packet
> > out. Call it packet A. There is no BGP route that covers
> > 199.33.224.1,
>
> Except perhaps as noted later regarding WRs advertising their
> prefixes in BGP.
Hi Robin,
This is an open issue in TRRP which I talk about in
http://bill.herrin.us/network/trrp-holey.html . BGP within the AS
where TRRP is operating can't accept a short BGP prefix which covers a
long TRRP prefix. That is, it can't accept a route in which TRRP punches
a hole. The TRRP prefixes aren't present in BGP so they can't override
the shorter prefix. If those short prefixes aren't filtered from the
routing table where TRRP is present, the packets will never find their
way to a TRRP ITR.
One consequence of this problem is that if you have a /28 carved out
of someone's PA space, you can't advertise it via TRRP. The shorter
BGP prefix for the PA space will override it. This is debatably a
good thing since it allows ISPs to keep control of their address
space.
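To make the holey route problem concrete, here is a minimal Python sketch.
It assumes plain longest-prefix-match forwarding with a default route that
leads to the ITR. The covering prefix 199.33.0.0/16 and the next-hop labels
are hypothetical, chosen only for illustration.

```python
import ipaddress

def next_hop(dest, bgp_table):
    """Longest-prefix match: the most specific covering route wins."""
    addr = ipaddress.ip_address(dest)
    matches = [(net, hop) for net, hop in bgp_table.items() if addr in net]
    net, hop = max(matches, key=lambda m: m[0].prefixlen)
    return hop

DEFAULT = ipaddress.ip_network("0.0.0.0/0")
COVERING = ipaddress.ip_network("199.33.0.0/16")  # hypothetical short BGP prefix

# Short prefix left in the table: it beats the default, so the ITR is bypassed.
unfiltered = {DEFAULT: "ITR", COVERING: "remote AS"}
# Short prefix filtered out: the packet follows the default to the ITR.
filtered = {DEFAULT: "ITR"}

hop_unfiltered = next_hop("199.33.224.1", unfiltered)
hop_filtered = next_hop("199.33.224.1", filtered)
```

With the covering prefix present the packet goes to the remote AS; with it
filtered the packet reaches the ITR, where the TRRP mapping can apply.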
There are at least two solutions to the holey route problem.
Solution #1: an option in the BGP announcement indicates that ASes
which support TRRP should filter this prefix.
Solution #2: a list of affected prefixes is compiled and periodically
downloaded by each ITR. The ITR then announces any of these prefixes
which are shorter than the filtering cutoff into the local AS's BGP so
that for the local AS they override the remote announcements.
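A rough Python sketch of the selection step in solution #2. The /24 cutoff
and the example list are assumptions, and so is reading "shorter than the
filtering cutoff" as "prefix length not exceeding the cutoff". Downloading
the list and injecting the routes into BGP are not shown.

```python
import ipaddress

def prefixes_to_announce(affected, cutoff_len=24):
    """Pick the affected prefixes the ITR would announce into the local AS.

    cutoff_len is an assumed filtering cutoff; prefixes longer than it would
    not be carried in BGP anyway, so they are skipped.
    """
    return [p for p in affected
            if ipaddress.ip_network(p).prefixlen <= cutoff_len]

# Hypothetical downloaded list of prefixes in which TRRP punches a hole.
affected = ["199.33.224.0/23", "192.0.2.16/28"]
announce = prefixes_to_announce(affected)
```

Here only 199.33.224.0/23 is selected; the /28 is longer than the assumed
cutoff and is left out.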
As near as I can figure, any map-encap system which plans to co-exist
with BGP (that being necessary for incremental deployability) will
have to address a version of the holey route problem.
> > so packet A follows 0.0.0.0/0 to the nearest TRRP ITR.
>
> I assume this is an igp default route, so the ITR is located inside
> some end-user or ISP network. If you have such ITRs advertising to
> the routers of other ASes in the DFZ, then you will be supporting
> traffic from non-upgraded networks, just as with Ivip's "anycast
> ITRs in the core/DFZ" or LISP's "Proxy Tunnel Routers".
The terminology gets a little fuzzy here because TRRP uses the routing
system in a way that's different. It'll be an iBGP default route,
never an OSPF route. The IGP default should lead to the nearest BGP
router but the default route in the BGP router should lead to the
nearest ITR. I want the packet to pass through one BGP router before
it gets to the ITR because if a prefix for the destination is in BGP
then the packet should never go to an ITR.
Generally you'll want to announce that default to your customers (more
traffic = more cash), you won't want to announce it to your upstreams
(more traffic = you pay more cash) and you might want to announce it
to specific peers if you need to better balance your traffic or have
some other use for it.
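As a sketch of that announcement policy in Python (the relationship labels,
function name and peer names are hypothetical, not part of TRRP):

```python
def announce_default(relationship, neighbor, opted_in_peers=frozenset()):
    """Decide whether to announce the ITR default route to a neighbor."""
    if relationship == "customer":
        return True                        # more traffic = more cash
    if relationship == "upstream":
        return False                       # more traffic = you pay more cash
    if relationship == "peer":
        return neighbor in opted_in_peers  # only peers chosen by the operator
    raise ValueError(f"unknown relationship: {relationship}")

to_customer = announce_default("customer", "cust-1")
to_upstream = announce_default("upstream", "transit-1")
to_peer = announce_default("peer", "peer-1", opted_in_peers={"peer-1"})
```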
> > 148.129.75.8 doesn't have a MAP for 199.33.224.1 either. However,
> > I have a private waypoint set up for all of 199.33.224.0/23 in
> > "generous" mode at 71.246.241.146 (which is also within globally
> > routable space). It accepts GRE as one of its formats. I have
> > made arrangements with 148.129.75.8 to keep this knowledge in his
> > cache. Essentially, I push this knowledge to him.
>
> My first critique is about security. How can 148.129.75.8 know that
> your WR 71.246.241.146 is authorised by you, the person to whom
> these 512 IP addresses 199.33.224.0/23 of TRRP-mapped address have
> been in some way assigned?
He looks it up at 23.waypoint.224.33.199.v4.trrp.arpa. Except I have
an error in the (unfinished) document and that needs to be
23.224.waypoint.33.199.v4.trrp.arpa instead because
224.33.199.v4.trrp.arpa has already been delegated to me and I'm not
allowed to expand the netmask beyond 23.
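For illustration, a Python sketch that builds the corrected lookup name. The
rule is inferred from this single example (netmask, then the partially
covered octet, then "waypoint", then the fully covered octets reversed), so
treat it as an assumption. Octet-aligned prefixes are deliberately left out
because their format isn't given here.

```python
import ipaddress

def waypoint_name(prefix, suffix="v4.trrp.arpa"):
    """Build a waypoint lookup name for a non-octet-aligned IPv4 prefix."""
    net = ipaddress.ip_network(prefix)
    full, rem = divmod(net.prefixlen, 8)
    if rem == 0:
        raise ValueError("octet-aligned prefixes are not covered by this sketch")
    octets = str(net.network_address).split(".")
    # Netmask, the partially covered octet, then the designator ...
    labels = [str(net.prefixlen), octets[full], "waypoint"]
    # ... followed by the fully covered octets in reverse order.
    labels += reversed(octets[:full])
    labels.append(suffix)
    return ".".join(labels)

name = waypoint_name("199.33.224.0/23")
# name == "23.224.waypoint.33.199.v4.trrp.arpa"
```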
I ran into this same problem with the NM entries and solved it there.
The format is messier than I'd like. I've been thinking about just
creating additional hierarchies: nm.v4.trrp.arpa and
waypoint.v4.trrp.arpa instead of asking the code to figure out the
right place to put the "nm" and "waypoint" designators. It may also be
more practical to just require the nm and waypoint entries to be on
8-bit boundaries.
Anyway, the short version is: he knows it's authorized because he
looks it up from a part of the hierarchy that either pulls from my DNS
server or pulls from the DNS server for the RIR which assigned me the
addresses.
> But what if your /23 is actually split into multiple micronets, and
> some of their ETRs are nowhere near your WR? The system would still
> work, but could involve longer paths.
In the very worst case it could involve paths longer than the
lookup-response-send cycle at the ITR. If I know that my /23 is
organized as far-away micronets then I shouldn't create a waypoint for
it... I should rely on the /8 waypoint to query and cache my EID ETRs
directly.
> All you have to do is tell me that these WRs must be, or should
> typically be anycast, so multiple such WRs doing the same job are
> scattered around the Net, advertising the same prefix - and give me
> a name for this technique (maybe "Anycast Waypoint Routers in the
> DFZ") and I would say that TRRP is potentially incrementally
> deployable, at least in this important respect, with ways of
> ensuring relatively short paths and good load sharing between these WRs.
They -can- be deployed in this manner. It's up to a cost-benefit
analysis at the operator and/or RIR level. If the gain is worth the
cash and the process isn't too unwieldy, it'll be done this way. If
not, it'll be done one of the many other ways that TRRP makes
possible.
This kind of "let the operators build it the way that suits them best"
approach is, I think, one of the unique and most valuable aspects of
TRRP compared to its competitors. A successful TRRP deployment doesn't
depend on foreordaining the operations architecture.
> What about traffic volume levels? Right now, RIRs probably charge
> you and Google the same for X amount of address space. But if
> Google really gives their WRs a hammering, the RIRs should be
> charging Google according to their higher traffic volume.
The gigantic portals like Google gain a fairly obvious benefit from a
push-like feed from the RIRs covering some or all of that middle level
of the hierarchy so that their lookup depth is reduced from 2 to 1.
After a day of caching, the odds are they gain no benefit from even
trying to use the waypoint system let alone overloading it.
This conveniently puts the cost of the system on the folks using the
system instead of on "everybody else" like with BGP.
> >> Do you have estimates for the delay times?
> >
> > My SWAG is that the initial round trip will be 1.5 to 2 times the
> > normal round trip with some single-digit percentage taking long
> > enough to recognize no gain versus bare TRRP.
>
> I don't clearly understand this.
>
> If you have a single WR for each /8, then some ITRs are going to be
> on the other side of the Earth with respect to it, and so are the
> ETRs they are trying to send packets to. Worst case delay times
> could be long unless you have an elaborate anycast network.
Most US traffic stays in the US. Most Chinese traffic stays in China.
Most /8's are allocated by region.
Bare TRRP will delay the round trip of the first packet in a series of
connections by a factor of 3 or more. Waypoints should usually boost
the speed to a delay factor of 2 or less and the vast majority of the
time it'll get the packet through before bare TRRP's hold-for-query
algorithm could. Some very small percentage of the time the waypoint
path will be asymptotically long as you describe. Those cases will
gain no benefit from waypoints. Barring an incredibly bad
implementation of waypoints, those cases simply won't rely on the
waypoint system.
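To put numbers on those factors, a small Python sketch. The 100 ms normal
round trip is a hypothetical figure; the factors 3 and 2 are the bounds
quoted above (3 is a lower bound for bare TRRP, 2 an upper bound for the
usual waypoint case).

```python
def first_packet_delay_ms(normal_rtt_ms, factor):
    """First-packet round trip as a multiple of the normal round trip."""
    return normal_rtt_ms * factor

normal_rtt_ms = 100  # hypothetical normal round trip

bare_min_ms = first_packet_delay_ms(normal_rtt_ms, 3)      # bare TRRP: 3x or more
waypoint_max_ms = first_packet_delay_ms(normal_rtt_ms, 2)  # waypoint: usually 2x or less

# Minimum saving on the first round trip in the usual case.
saving_min_ms = bare_min_ms - waypoint_max_ms
```

With these inputs bare TRRP takes at least 300 ms for the first round trip
and the waypoint path usually takes at most 200 ms.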
Regards,
Bill Herrin
--
William D. Herrin herrin@dirtside.com bill@herrin.us
3005 Crane Dr. Web: <http://bill.herrin.us/>
Falls Church, VA 22042-3004