
Attractiveness of LISP-NERD, was Re: [RRG] Is 12 bytes really so scary?

To: Robin Whittle <rw@firstpr.com.au>
Subject: Attractiveness of LISP-NERD, was Re: [RRG] Is 12 bytes really so scary?
From: Iljitsch van Beijnum <iljitsch@muada.com>
Date: Sat, 15 Dec 2007 17:51:42 +0100
Cc: Routing Research Group list <rrg@psg.com>
In-reply-to: <475A567F.7070407@firstpr.com.au>
References: <475A567F.7070407@firstpr.com.au>

Robin & others,

I'm going to hijack this thread to unleash my musings based on what I took away from the IETF meeting and RRG sessions last week and the RiNG meeting the past two days. Please note that I'm departing from established LISP terminology and protocol bits here and there; it's the big picture that I'm interested in.

I'll be covering several subjects, so please at least skim the whole message.

On 8 dec 2007, at 9:31, Robin Whittle wrote:

An ITR-ETR scheme with real-time (a few seconds ideally) push
distribution of mapping data only needs 12 bytes for each mapping
change, for IPv4:

4 bytes  Micronet start address
4 bytes  Length
4 bytes  ETR address
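
To make the 12-byte figure concrete, here is a minimal sketch of how such a record could be packed. The field order follows the list above; the big-endian encoding and the names pack_mapping and unpack_mapping are my own assumptions for illustration, not part of any proposal.

```python
import ipaddress
import struct

# Assumed layout: three unsigned 32-bit fields in network byte order.
RECORD = struct.Struct("!III")

def pack_mapping(start: str, length: int, etr: str) -> bytes:
    """Pack micronet start address, length and ETR address into 12 bytes."""
    return RECORD.pack(
        int(ipaddress.IPv4Address(start)),
        length,
        int(ipaddress.IPv4Address(etr)),
    )

def unpack_mapping(data: bytes):
    """Reverse of pack_mapping; returns (start, length, etr)."""
    start, length, etr = RECORD.unpack(data)
    return str(ipaddress.IPv4Address(start)), length, str(ipaddress.IPv4Address(etr))

# Documentation addresses (RFC 5737) used purely as example values.
record = pack_mapping("192.0.2.0", 256, "198.51.100.1")
print(len(record))
```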

-- [NERD (-like) prefix distribution format] --

Hm, in database class they taught us to never have any information in a record that isn't 1-to-1 mapped to the primary key. I.e., assuming one ETR is going to serve multiple prefixes (and possibly a good number of them), it's better to introduce a layer of indirection here. Also, we routing types tend to work with prefix lengths. So for the prefixes:

address prefix
prefix length (7 bits)
ETR index (n * m bits)
ETR preference (n * o bits)

Since we are required to have 64-bit interface identifiers in all currently usable IPv6 address space, we can limit the prefixes for IPv6 to 64 bits. Or maybe 48 or 56. Today IPv4 is de facto 24 bits but let's assume we may want to use the full 32 bits.

Now if we order the prefixes, obviously, each one will tend to have many bits in common with the previous one. Also, we'll probably want to split the data into smaller pieces, for reasons of transmission, update and signature calculation efficiency and to allow parallelization. So then we get a packet like this:

1 data format version
2 address family identifier
3 common prefix bits for all prefixes in this packet
repeat as necessary:
4 remaining prefix bits
5 prefix length
repeat as necessary:
6 ETR index
7 ETR preference
end repeats
8 signature
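
A rough sketch of fields 3 to 5, assuming prefixes are handled as bit strings: the shared leading bits are factored out once per packet and each prefix then carries only its remaining bits plus its length. The function name split_common_prefix and the bit-string representation are illustrative assumptions, not a wire format.

```python
import ipaddress

def split_common_prefix(prefixes):
    """Return (common_bits, [(remaining_bits, prefix_length), ...]) for sorted prefixes."""
    nets = sorted(ipaddress.ip_network(p) for p in prefixes)
    width = nets[0].max_prefixlen
    # Keep only the significant bits of each prefix, as a string of '0'/'1'.
    bits = [format(int(n.network_address), f"0{width}b")[:n.prefixlen] for n in nets]
    common = bits[0]
    for b in bits[1:]:
        i = 0
        while i < min(len(common), len(b)) and common[i] == b[i]:
            i += 1
        common = common[:i]  # shrink to the bits shared so far
    entries = [(b[len(common):], n.prefixlen) for b, n in zip(bits, nets)]
    return common, entries

# Documentation prefixes (RFC 5737) used purely as example values.
common, entries = split_common_prefix(
    ["192.0.3.0/24", "192.0.2.0/24", "192.0.2.128/25"]
)
print(len(common), entries)
```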

If we ignore fields 1, 2, 3 and 8, and assume an average of 16 bits for field 4, 8 for 5, 24 for 6 and 8 for 7, with an average of 2.5 ETRs per prefix, that's 104 bits per prefix. Let's say 112 bits = 14 bytes, with a few more bits to find the end of the ETR list.
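
The arithmetic behind those numbers, spelled out as a small sketch. The constant names are mine; the values are the averages assumed above, and the 8 extra bits are simply the difference between 104 and the rounded 112.

```python
REMAINING_PREFIX_BITS = 16   # field 4
PREFIX_LENGTH_BITS = 8       # field 5
ETR_INDEX_BITS = 24          # field 6
ETR_PREFERENCE_BITS = 8      # field 7
ETRS_PER_PREFIX = 2.5        # assumed average
END_OF_LIST_BITS = 8         # allowance for marking the end of the ETR list

bits_per_prefix = (
    REMAINING_PREFIX_BITS
    + PREFIX_LENGTH_BITS
    + ETRS_PER_PREFIX * (ETR_INDEX_BITS + ETR_PREFERENCE_BITS)
)
padded_bits = bits_per_prefix + END_OF_LIST_BITS
bytes_per_prefix = padded_bits / 8

print(bits_per_prefix, padded_bits, bytes_per_prefix)
```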

But that doesn't give you the infrastructure you need to quickly search through all of this. So we need more bits for that. DRAM speed probably isn't going to be a huge concern immediately because you basically only have to look up this stuff when there is no active mapping state, but if you need a large number of memory cycles to get at it you could hit a bottleneck as the database grows. But efficient lookup at the cost of size won't work too well either. I'm guesstimating that we can reach a sweet spot somewhere between 16 and 20 bytes per prefix.

The ETRs can probably go into a simple array. We can compress the initial bits there too if we care, but for IPv6 the address identifier bits won't compress unless we mandate a bit pattern for exactly that reason. This list could be extremely large if people run private ETRs but very small if they use their ISP's ETRs. Let's assume 0.5 IPv4 ETRs per prefix and 0.25 IPv6 ETRs per prefix with no compression (0.5 x 4 bytes + 0.25 x 16 bytes) = 6 more bytes per prefix.

At 26 bytes and 250k prefixes the prefix table would be about 6.5 MB. So we can probably afford low order 10^8 prefixes with current technology (~ 8 GB RAM) for the mapping database. Whether a mapping state cache for a network with that many PI prefixes could work is a different question.
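
For anyone who wants to play with the assumptions, here is the sizing estimate as a short sketch. It takes the upper end of the 16 to 20 byte guess for the prefix entry and treats 8 GB as 8 x 10^9 bytes; both choices, and all the names, are mine.

```python
PREFIX_ENTRY_BYTES = 20        # upper end of the 16-20 byte guess
IPV4_ADDR_BYTES = 4
IPV6_ADDR_BYTES = 16
IPV4_ETRS_PER_PREFIX = 0.5     # assumed average
IPV6_ETRS_PER_PREFIX = 0.25    # assumed average

# Share of the ETR array attributable to each prefix, uncompressed.
etr_bytes_per_prefix = (
    IPV4_ETRS_PER_PREFIX * IPV4_ADDR_BYTES
    + IPV6_ETRS_PER_PREFIX * IPV6_ADDR_BYTES
)
bytes_per_prefix = PREFIX_ENTRY_BYTES + etr_bytes_per_prefix

table_bytes = bytes_per_prefix * 250_000          # roughly today's table size
capacity = int(8 * 10**9 // bytes_per_prefix)     # prefixes that fit in ~8 GB

print(etr_bytes_per_prefix, bytes_per_prefix, table_bytes, capacity)
```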

Bandwidth, RAM and CPU power are cheap.

Not quite. You can get a decent amount of all three for cheap. But a lot is still expensive.

-- [Classification of LISP-NERD] --

LISP-NERD is a "push" system, but with each ITR asking for updates
from one of multiple relatively centralised servers.  At least there
are no packet delays or dropping.

Not entirely. I would say LISP-NERD is an optimistic hybrid push/pull system: you push out the mapping database, but discover/pull in the reachability status of ETRs, where you assume reachability rather than wait for the reachability status information to become available. This is a powerful combination. With multihoming, each individual link tends to be up 98% or more of the time, so with LISP-NERD rather than a true pull system you get to deliver 98% of the initial packets efficiently.
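
A toy sketch of what I mean by "optimistic": the ITR holds the pushed mappings, treats every ETR as reachable until it learns otherwise, and only then falls back to the next one. The class and method names are hypothetical, and "lower preference value wins" is an assumption of this sketch.

```python
class OptimisticMapper:
    """Pushed mapping table plus lazily discovered ETR reachability."""

    def __init__(self, mappings):
        # mappings: prefix -> list of (preference, etr); lower preference wins (assumed).
        self.mappings = mappings
        self.known_down = set()  # empty at start: every ETR is assumed reachable

    def select_etr(self, prefix):
        """Return the most preferred ETR not known to be down, or None."""
        for _, etr in sorted(self.mappings[prefix]):
            if etr not in self.known_down:
                return etr
        return None

    def report_unreachable(self, etr):
        self.known_down.add(etr)

    def report_reachable(self, etr):
        self.known_down.discard(etr)

# Documentation addresses (RFC 5737) used purely as example values.
itr = OptimisticMapper({"192.0.2.0/24": [(20, "203.0.113.1"), (10, "198.51.100.1")]})
first_choice = itr.select_etr("192.0.2.0/24")   # sent immediately, no lookup delay
itr.report_unreachable("198.51.100.1")
fallback = itr.select_etr("192.0.2.0/24")
print(first_choice, fallback)
```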

But there isn't a concept of a
multicast-like replication system to fan the data out efficiently.

You can easily build something like this with off-the-shelf technology (HTTP...) as long as you use locator space for all of this so there is no circular dependency.

LISP-ALT sounds nifty, but there are all sorts of delays in the
paths taken across the world in this global query server system too.
One level of aggregation may be a router in the USA, the next
in Japan, the next in the Netherlands.  The query packets traverse
the long tunnels between these routers, and have to make it all the
way back along the same tunnels.  There's no clear way of
authenticating the ETRs which are the authoritative query servers
for particular micronets, so the whole ALT overlay network needs to
be tied together carefully, manually, according to business
relationships.  That makes it expensive to administer and costly to
change.

And it uses existing BGP to boot, so there are no guarantees we won't see all the unpleasantness we see in the current global routing table in the ALT overlay network. However...

-- [LISP-NERD can be viewed as conceptually similar to LISP-ALT] --

If at some point it becomes undesirable to give each LISP-NERD ITR a copy of the full ID/loc database, you can split this database up into parts where different boxes handle different parts of the address space. Each box creates aggregate routes to attract the traffic for the part of the address space that it handles. If it's a small number of boxes, you can just put 2, 4 or 8 in each datacenter, but at some point that starts becoming unmanageable, too. So at this point, you stop distributing all the information to wherever the packets are, and start transporting the packets to where the information is. At this point, the LISP-NERD ITRs become a lot like the LISP-ALT overlay network. Adding caching-only ITRs closer to the source of the traffic is left as an exercise for the reader.
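
A minimal sketch of the split, assuming each box announces one aggregate and a packet is steered to the box with the longest matching aggregate. The aggregates, box names and the function box_for are invented for illustration only.

```python
import ipaddress

# Hypothetical split of the IPv4 space over two boxes, one aggregate each.
AGGREGATES = {
    ipaddress.ip_network("0.0.0.0/1"): "box-a",
    ipaddress.ip_network("128.0.0.0/1"): "box-b",
}

def box_for(destination: str):
    """Return the box whose aggregate attracts traffic for this destination."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in AGGREGATES if addr in net]
    if not matches:
        return None
    # Longest match wins, as it would with ordinary aggregate routes.
    return AGGREGATES[max(matches, key=lambda net: net.prefixlen)]

print(box_for("192.0.2.1"), box_for("10.0.0.1"))
```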

Yet when the customer decides on a new ISP, they need to
change the IP address and the administrator of their ETRs.

Just go to your address provider, tell them the new addresses for your ETRs (either yours that have new addresses or your new ISP's), pay a small processing fee, wait for the update to propagate throughout the network in due course. (Having this happen every hour seems reasonable.)



