
Attractiveness of LISP-NERD, was Re: [RRG] Is 12 bytes really so scary?



Robin & others,

I'm going to hijack this thread to unleash my musings based on what I took away from the IETF meeting and RRG sessions last week and the RiNG meeting the past two days. Please note that I'm departing from established LISP terminology and protocol bits here and there; it's the big picture that I'm interested in.

I'll be covering several subjects, so please at least skim the whole message.

On 8 dec 2007, at 9:31, Robin Whittle wrote:

An ITR-ETR scheme with real-time (a few seconds ideally) push
distribution of mapping data only needs 12 bytes for each mapping
change, for IPv4:

4 bytes  Micronet start address
4 bytes  Length
4 bytes  ETR address

-- [NERD (-like) prefix distribution format] --

Hm, in database class they taught us to never have any information in a record that isn't 1-to-1 mapped to the primary key. I.e., assuming one ETR is going to serve multiple prefixes (and possibly a good number of them), it's better to introduce a layer of indirection here. Also, we routing types tend to work with prefix lengths. So for the prefixes:

address prefix
prefix length (7 bits)
ETR index (n * m bits)
ETR preference (n * o bits)

Since we are required to have 64-bit interface identifiers in all currently usable IPv6 address space, we can limit the prefixes for IPv6 to 64 bits. Or maybe 48 or 56. Today IPv4 is de facto 24 bits but let's assume we may want to use the full 32 bits.

Now if we order the prefixes, obviously, each one will tend to have many bits in common with the previous one. Also, we'll probably want to split the data into smaller pieces, for reason of transmission, update and signature calculation efficiency and to allow parallelization. So then we get a packet like this:

1 data format version
2 address family identifier
3 common prefix bits for all prefixes in this packet
repeat as necessary:
    4 remaining prefix bits
    5 prefix length
    repeat as necessary:
        6 ETR index
        7 ETR preference
end repeats
8 signature
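
To make fields 3, 4 and 5 a bit more concrete, here is a minimal Python sketch of how a sorted batch of prefixes could be split into shared leading bits plus per-prefix remainders. The function names and the tuple layout are my own illustration, not part of any LISP or NERD specification, and the sketch assumes all prefixes in a batch belong to one address family.

```python
import ipaddress

def common_prefix_bits(networks):
    """Count the leading bits shared by every network in the batch.

    The result is capped at the shortest prefix length so that the shared
    part never extends past the end of any individual prefix.
    """
    width = networks[0].max_prefixlen
    first = int(networks[0].network_address)
    shared = min(n.prefixlen for n in networks)
    for n in networks[1:]:
        diff = first ^ int(n.network_address)
        # Leading bits that are identical between the two addresses.
        shared = min(shared, width - diff.bit_length())
    return shared

def split_batch(prefix_strings):
    """Return (shared bit count, shared bits, [(remaining bits, prefix length)])."""
    networks = sorted(ipaddress.ip_network(p) for p in prefix_strings)
    width = networks[0].max_prefixlen
    shared = common_prefix_bits(networks)
    common = int(networks[0].network_address) >> (width - shared)
    records = []
    for n in networks:
        remaining_len = n.prefixlen - shared
        # Prefix bits after the shared part, right-aligned.
        remaining = (int(n.network_address) >> (width - n.prefixlen)) & ((1 << remaining_len) - 1)
        records.append((remaining, n.prefixlen))
    return shared, common, records

shared, common, records = split_batch(["192.0.2.0/24", "192.0.3.0/24", "192.0.0.0/23"])
```

The number of remaining bits per record is not stored separately because it follows from the prefix length minus the shared bit count.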

If we ignore fields 1, 2, 3 and 8, and assume an average of 16 bits for field 4, 8 for field 5, 24 for field 6 and 8 for field 7, with an average of 2.5 ETRs per prefix, that's 16 + 8 + 2.5 * (24 + 8) = 104 bits per prefix. Let's say 112 bits = 14 bytes, with a few more bits to find the end of the ETR list.
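
The per-prefix estimate can be checked with a few lines of Python. The constant names are mine, and the 8 bits for the end-of-list marker are an assumption: it is simply the difference between the 104 bits computed and the 112 bits rounded to.

```python
# Average field widths in bits, as assumed in the estimate.
REMAINING_PREFIX_BITS = 16   # field 4
PREFIX_LENGTH_BITS = 8       # field 5
ETR_INDEX_BITS = 24          # field 6
ETR_PREFERENCE_BITS = 8      # field 7
AVG_ETRS_PER_PREFIX = 2.5

# Assumption: "a few more bits" to find the end of the ETR list is taken as 8.
END_OF_LIST_BITS = 8

def bits_per_prefix(etrs_per_prefix=AVG_ETRS_PER_PREFIX):
    """Average bits per prefix record, excluding packet-level fields 1, 2, 3 and 8."""
    per_prefix = REMAINING_PREFIX_BITS + PREFIX_LENGTH_BITS
    per_etr = ETR_INDEX_BITS + ETR_PREFERENCE_BITS
    return per_prefix + etrs_per_prefix * per_etr

raw_bits = bits_per_prefix()
padded_bytes = (raw_bits + END_OF_LIST_BITS) / 8
print(raw_bits, padded_bytes)
```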

But that doesn't give you the infrastructure you need to quickly search through all of this. So we need more bits for that. DRAM speed probably isn't going to be a huge concern immediately because you basically only have to look up this stuff when there is no active mapping state, but if you need a large number of memory cycles to get at it you could hit a bottleneck as the database grows. But efficient lookup at the cost of size won't work too well either. I'm guesstimating that we can reach a sweet spot somewhere between 16 and 20 bytes per prefix.
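
As an illustration of the size versus lookup-cost trade-off, here is a deliberately naive longest-prefix-match sketch in Python: one hash table per prefix length, probed from longest to shortest. This is not a proposal for the actual data structure, the class and method names are hypothetical, and it assumes one table per address family.

```python
import ipaddress

class MappingTable:
    """Naive longest-prefix match: one dict per prefix length.

    Assumption: a table holds prefixes of a single address family only.
    """

    def __init__(self):
        self.by_length = {}  # prefix length -> {network address as int: ETR list}

    def add(self, prefix, etrs):
        net = ipaddress.ip_network(prefix)
        self.by_length.setdefault(net.prefixlen, {})[int(net.network_address)] = etrs

    def lookup(self, address):
        addr = ipaddress.ip_address(address)
        value = int(addr)
        width = addr.max_prefixlen
        # Probe the longest prefixes first so the most specific mapping wins.
        for length in sorted(self.by_length, reverse=True):
            shift = width - length
            key = (value >> shift) << shift  # mask off the host bits
            hit = self.by_length[length].get(key)
            if hit is not None:
                return hit
        return None

table = MappingTable()
table.add("10.0.0.0/8", [(1, 10)])    # (ETR index, preference)
table.add("10.1.0.0/16", [(2, 10)])
```

Each lookup costs up to one probe per distinct prefix length in the table, which is the kind of memory-cycle count that starts to matter as the database grows.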

The ETRs can probably go into a simple array. We can compress the initial bits there too if we care, but for IPv6 the interface identifier bits won't compress unless we mandate a bit pattern for exactly that reason. This list could be extremely large if people run private ETRs but very small if they use their ISP's ETRs. Let's assume 0.5 IPv4 ETRs per prefix and 0.25 IPv6 ETRs per prefix with no compression: 0.5 * 4 + 0.25 * 16 = 6 more bytes per prefix.

At 26 bytes and 250k prefixes the prefix table would be 6.5 MB. So we can probably afford a few times 10^8 prefixes with current technology (~ 8 GB RAM) for the mapping database. Whether a mapping state cache for a network with that many PI prefixes could work is a different question.
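
For anyone who wants to play with the numbers, here is a small Python sketch of the memory estimate. It takes the upper end of the 16 to 20 byte range for the searchable prefix record and adds the share of the ETR array; the constant and function names are mine, and 8 GB is taken as 8 * 10^9 bytes.

```python
LOOKUP_BYTES_PER_PREFIX = 20    # upper end of the 16-20 byte guess
IPV4_ETRS_PER_PREFIX = 0.5
IPV6_ETRS_PER_PREFIX = 0.25
IPV4_ADDRESS_BYTES = 4
IPV6_ADDRESS_BYTES = 16

def etr_bytes_per_prefix():
    """Share of the uncompressed ETR array attributed to each prefix."""
    return (IPV4_ETRS_PER_PREFIX * IPV4_ADDRESS_BYTES
            + IPV6_ETRS_PER_PREFIX * IPV6_ADDRESS_BYTES)

def bytes_per_prefix():
    return LOOKUP_BYTES_PER_PREFIX + etr_bytes_per_prefix()

def table_bytes(prefix_count):
    return prefix_count * bytes_per_prefix()

def prefixes_that_fit(ram_bytes):
    return int(ram_bytes // bytes_per_prefix())

capacity = prefixes_that_fit(8 * 10**9)  # assumption: ~8 GB RAM, decimal gigabytes
```

With these inputs the capacity comes out at roughly 3 * 10^8 prefixes.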

Bandwidth, RAM and CPU power are cheap.

Not quite. You can get a decent amount of all three for cheap. But a lot is still expensive.

-- [Classification of LISP-NERD] --

LISP-NERD is a "push" system, but with each ITR asking for updates
from one of multiple relatively centralised servers.  At least there
are no packet delays or dropping.

Not entirely. I would say LISP-NERD is an optimistic hybrid push/pull system: you push out the mapping database, but discover/pull in the reachability status of ETRs, and you assume reachability rather than wait for the reachability status information to become available. This is a powerful combination. With multihoming, each individual link tends to be up 98% or more of the time, so with LISP-NERD rather than a true pull system you get to deliver 98% of the initial packets efficiently.
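
The "optimistic" part can be sketched in a few lines of Python: an ITR picks an ETR from the pushed mapping right away and only avoids ETRs it has already learned are unreachable. The function name is hypothetical, and treating a lower preference value as more preferred is my assumption, not something defined here.

```python
def choose_etr(etrs, known_down):
    """Pick an ETR optimistically.

    etrs:       list of (locator, preference) pairs from the pushed database.
    known_down: set of locators already discovered to be unreachable.

    Any ETR not in known_down is assumed reachable, so the first packet can
    be sent without waiting for reachability information.
    Assumption: a lower preference value means more preferred.
    """
    candidates = [etr for etr in etrs if etr[0] not in known_down]
    if not candidates:
        return None
    return min(candidates, key=lambda etr: etr[1])[0]

mapping = [("192.0.2.1", 2), ("198.51.100.1", 1)]
first_choice = choose_etr(mapping, known_down=set())
fallback = choose_etr(mapping, known_down={"198.51.100.1"})
```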

But there isn't a concept of a
multicast-like replication system to fan the data out efficiently.

You can easily build something like this with off-the-shelf technology (HTTP...) as long as you use locator space for all of this so there is no circular dependency.

LISP-ALT sounds nifty, but there are all sorts of delays in the
paths taken across the world in this global query server system too.
One level of aggregation may be a router in the USA, the next
in Japan, the next in the Netherlands.  The query packets traverse
the long tunnels between these routers, and have to make it all the
way back along the same tunnels.  There's no clear way of
authenticating the ETRs which are the authoritative query servers
for particular micronets, so the whole ALT overlay network needs to
be tied together carefully, manually, according to business
relationships.  That makes it expensive to administer and costly to
change.

And it uses existing BGP to boot, so there are no guarantees we won't see all the unpleasantness we see in the current global routing table in the ALT overlay network. However...

-- [LISP-NERD can be viewed as conceptually similar to LISP-ALT] --

If at some point it becomes undesirable to give each LISP-NERD ITR a copy of the full ID/loc database, you can split this database up into parts where different boxes handle different parts of the address space. Each box creates aggregate routes to attract the traffic for the part of the address space that it handles. If it's a small number of boxes, you can just put 2, 4 or 8 in each datacenter, but at some point that starts becoming unmanageable, too. So then you stop distributing all the information to wherever the packets are, and start transporting the packets to where the information is. At this point, the LISP-NERD ITRs become a lot like the LISP-ALT overlay network. Adding caching-only ITRs closer to the source of the traffic is left as an exercise for the reader.
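
A minimal Python sketch of the split, assuming a power-of-two number of boxes that each take an equal slice of the IPv4 space and announce that slice as an aggregate. The box names and function names are made up for illustration; a real deployment would presumably size the slices by load rather than evenly.

```python
import ipaddress

def assign_aggregates(box_names, space="0.0.0.0/0"):
    """Give each box an equal-sized aggregate covering part of the address space.

    Assumption: the number of boxes is a power of two (2, 4, 8, ...).
    """
    count = len(box_names)
    if count < 1 or count & (count - 1):
        raise ValueError("number of boxes must be a power of two")
    extra_bits = count.bit_length() - 1
    slices = ipaddress.ip_network(space).subnets(prefixlen_diff=extra_bits)
    return dict(zip(box_names, slices))

def box_for(address, assignment):
    """Return the box whose aggregate attracts traffic for this address."""
    addr = ipaddress.ip_address(address)
    for box, aggregate in assignment.items():
        if addr in aggregate:
            return box
    return None

assignment = assign_aggregates(["box-a", "box-b", "box-c", "box-d"])
```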

Yet when the customer decides on a new ISP, they need to
change the IP address and the administrator of their ETRs.

Just go to your address provider, tell them the new addresses for your ETRs (either your own ETRs that have new addresses or your new ISP's), pay a small processing fee, and wait for the update to propagate throughout the network in due course. (Having this happen every hour seems reasonable.)


--
to unsubscribe send a message to rrg-request@psg.com with the
word 'unsubscribe' in a single line as the message text body.
archive: <http://psg.com/lists/rrg/> & ftp://psg.com/pub/lists/rrg