Attractiveness of LISP-NERD, was Re: [RRG] Is 12 bytes really so scary?
Robin & others,
I'm going to hijack this thread to unleash my musings based on what I
took away from the IETF meeting and RRG sessions last week and the
RiNG meeting the past two days. Please note that I'm departing from
established LISP terminology and protocol bits here and there; it's
the big picture that I'm interested in.
I'll be covering several subjects, so please at least skim the whole
message.
On 8 dec 2007, at 9:31, Robin Whittle wrote:
> An ITR-ETR scheme with real-time (a few seconds ideally) push
> distribution of mapping data only needs 12 bytes for each mapping
> change, for IPv4:
>   4 bytes Micronet start address
>   4 bytes Length
>   4 bytes ETR address
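For reference, the 12-byte record quoted above can be sketched with Python's struct module. The field order, the network byte order and the example addresses are assumptions; the quoted text only gives the three 4-byte fields.

```python
import ipaddress
import struct

# Hypothetical wire layout: three unsigned 32-bit fields, network byte order.
RECORD = struct.Struct("!III")

def pack_mapping(start: str, length: int, etr: str) -> bytes:
    """Pack one IPv4 mapping change: micronet start, length, ETR address."""
    return RECORD.pack(int(ipaddress.IPv4Address(start)), length,
                       int(ipaddress.IPv4Address(etr)))

def unpack_mapping(data: bytes):
    """Reverse of pack_mapping; returns (start, length, etr)."""
    start, length, etr = RECORD.unpack(data)
    return (str(ipaddress.IPv4Address(start)), length,
            str(ipaddress.IPv4Address(etr)))

record = pack_mapping("192.0.2.0", 256, "198.51.100.1")
print(len(record))  # 12
```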
-- [NERD (-like) prefix distribution format] --
Hm, in database class they taught us to never have any information in
a record that isn't 1-to-1 mapped to the primary key. I.e., assuming
one ETR is going to serve multiple prefixes (and possibly a good
number of them), it's better to introduce a layer of indirection here.
Also, we routing types tend to work with prefix lengths. So for the
prefixes:
address prefix
prefix length (7 bits)
ETR index (n * m bits)
ETR preference (n * o bits)
Since we are required to have 64-bit interface identifiers in all
currently usable IPv6 address space, we can limit the prefixes for
IPv6 to 64 bits. Or maybe 48 or 56. Today IPv4 is de facto 24 bits but
let's assume we may want to use the full 32 bits.
Now if we order the prefixes, obviously, each one will tend to have
many bits in common with the previous one. Also, we'll probably want
to split the data into smaller pieces, for reason of transmission,
update and signature calculation efficiency and to allow
parallelization. So then we get a packet like this:
1 data format version
2 address family identifier
3 common prefix bits for all prefixes in this packet
repeat as necessary:
  4 remaining prefix bits
  5 prefix length
  repeat as necessary:
    6 ETR index
    7 ETR preference
end repeats
8 signature
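A rough sketch of how fields 3, 4 and 5 could be derived from a sorted batch of prefixes, in Python. The function names and the example prefixes are made up for illustration, the batch is assumed to hold a single address family, and the ETR fields and signature are left out.

```python
import ipaddress

def common_prefix_bits(networks):
    """Number of leading bits shared by every prefix in the batch (field 3)."""
    width = networks[0].max_prefixlen
    first = int(networks[0].network_address)
    bits = min(n.prefixlen for n in networks)
    while bits > 0:
        shift = width - bits
        if all(int(n.network_address) >> shift == first >> shift
               for n in networks):
            break
        bits -= 1
    return bits

def split_prefixes(prefixes):
    """Return (shared bit count, [(remaining bits, bit count, prefix length)])."""
    networks = sorted(ipaddress.ip_network(p) for p in prefixes)
    shared = common_prefix_bits(networks)
    width = networks[0].max_prefixlen
    entries = []
    for n in networks:
        remaining_len = n.prefixlen - shared
        # Drop the host part, then keep only the bits after the shared part.
        significant = int(n.network_address) >> (width - n.prefixlen)
        remaining = significant & ((1 << remaining_len) - 1)
        entries.append((remaining, remaining_len, n.prefixlen))
    return shared, entries

shared, entries = split_prefixes(
    ["192.0.2.0/24", "192.0.3.0/24", "192.0.2.128/25"])
```

With these three example prefixes, 23 bits are shared, so only one or two bits per prefix remain to be sent in field 4.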
If we ignore 1, 2, 3 and 8, assume an average of 16 bits for 4, 8 for
5, 24 for 6 and 8 for 7, with an average of 2.5 ETRs per prefix,
that's 104 bits per prefix, let's say 112 = 14 bytes with a few more
bits to find the end of the ETR list.
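The estimate above, spelled out in Python (the constant names are mine, the numbers are the ones from the paragraph):

```python
# Average field sizes in bits, taken from the estimate above.
REMAINING_PREFIX_BITS = 16   # field 4
PREFIX_LENGTH_BITS = 8       # field 5
ETR_INDEX_BITS = 24          # field 6
ETR_PREFERENCE_BITS = 8      # field 7
AVG_ETRS_PER_PREFIX = 2.5

bits_per_prefix = (REMAINING_PREFIX_BITS + PREFIX_LENGTH_BITS
                   + AVG_ETRS_PER_PREFIX
                   * (ETR_INDEX_BITS + ETR_PREFERENCE_BITS))
print(bits_per_prefix)   # 104.0

# Rounded up to 112 bits to leave room for end-of-list markers.
print(112 // 8)          # 14
```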
But that doesn't give you the infrastructure you need to quickly
search through all of this. So we need more bits for that. DRAM speed
probably isn't going to be a huge concern immediately because you
basically only have to look up this stuff when there is no active
mapping state, but if you need a large number of memory cycles to get
at it you could hit a bottleneck as the database grows. But efficient
lookup at the cost of size won't work too well either. I'm
guesstimating that we can reach a sweet spot somewhere between 16 and
20 bytes per prefix.
The ETRs can probably go into a simple array. We can compress the
initial bits there too if we care, but for IPv6 the address identifier
bits won't compress unless we mandate a bit pattern for exactly that
reason. This list could be extremely large if people run private ETRs
but very small if they use their ISP's ETRs. Let's assume 0.5 IPv4
ETRs per prefix and 0.25 IPv6 ETRs per prefix with no compression = 6
more bytes per prefix.
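To make the indirection concrete, here is a minimal Python sketch: ETRs sit once in a flat array and each prefix only carries (ETR index, preference) pairs. The table contents, the "lower preference wins" rule and the naive longest-prefix search are all assumptions for illustration, not a proposal for the actual lookup structure.

```python
import ipaddress

# ETRs are stored once; prefixes refer to them by index. Example data only.
etrs = [ipaddress.ip_address("198.51.100.1"),
        ipaddress.ip_address("203.0.113.7")]

# Prefix table: network -> list of (ETR index, preference).
prefix_table = {
    ipaddress.ip_network("192.0.2.0/24"): [(0, 10), (1, 20)],
    ipaddress.ip_network("192.0.2.128/25"): [(1, 10)],
}

def lookup(destination: str):
    """Longest-prefix match, then resolve ETR indexes to addresses."""
    addr = ipaddress.ip_address(destination)
    for length in range(addr.max_prefixlen, -1, -1):
        candidate = ipaddress.ip_network(f"{addr}/{length}", strict=False)
        entries = prefix_table.get(candidate)
        if entries is not None:
            # Assumption: a lower preference value means "try this first".
            ranked = sorted(entries, key=lambda e: e[1])
            return [(etrs[index], pref) for index, pref in ranked]
    return []
```

Trying every prefix length is fine for a sketch, but it is exactly the kind of "large number of memory cycles" lookup that a real implementation would replace with a trie or similar.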
At 26 bytes and 250k prefixes the prefix table would be 6.5 MB. So we
can probably afford a low multiple of 10^8 prefixes with current
technology (~8 GB RAM) for the mapping database. Whether a mapping
state cache for a network with that many PI prefixes could work is a
different question.
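The back-of-the-envelope numbers can be checked like this. I'm taking the upper end of the 16-20 byte guess for the prefix entry and decimal gigabytes for the RAM; both are assumptions.

```python
IPV4_ADDR_BYTES = 4
IPV6_ADDR_BYTES = 16

# ETR array share per prefix: 0.5 IPv4 plus 0.25 IPv6 ETRs, uncompressed.
etr_bytes_per_prefix = 0.5 * IPV4_ADDR_BYTES + 0.25 * IPV6_ADDR_BYTES

# Upper end of the 16-20 byte guess for the searchable prefix entry.
entry_bytes = 20
bytes_per_prefix = entry_bytes + etr_bytes_per_prefix

table_bytes = bytes_per_prefix * 250_000
prefixes_in_8gb = 8e9 / bytes_per_prefix

print(etr_bytes_per_prefix)               # 6.0
print(table_bytes)                        # 6500000.0
print(round(prefixes_in_8gb / 1e8, 1))    # 3.1
```

So 250k prefixes come to 6.5 million bytes (about 6.2 MiB), and 8 GB holds roughly 3 x 10^8 prefixes at that size.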
> Bandwidth, RAM and CPU power are cheap.
Not quite. You can get a decent amount of all three for cheap. But a
lot is still expensive.
-- [Classification of LISP-NERD] --
> LISP-NERD is a "push" system, but with each ITR asking for updates
> from one of multiple relatively centralised servers. At least there
> is no packet delays or dropping.
Not entirely. I would say LISP-NERD is an optimistic hybrid push/pull
system: you push out the mapping database, but discover/pull in the
reachability status of ETRs where you assume reachability rather than
wait for the reachability status information to become available. This
is a powerful combination. With multihoming, each individual link
tends to be up 98% or more of the time, so with LISP-NERD rather than
a true pull system you get to deliver 98% of the initial packets
efficiently.
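The optimistic part can be sketched as follows: an ETR is assumed reachable until something says otherwise, so the first packet never waits for reachability information. The function name, the data shapes and the "lower preference wins" rule are assumptions for illustration.

```python
def choose_etr(etrs, known_down):
    """Pick the most preferred ETR that is not known to be unreachable.

    etrs: list of (address, preference); lower preference wins (assumption).
    known_down: set of addresses already learned to be unreachable.
    """
    for address, _preference in sorted(etrs, key=lambda e: e[1]):
        if address not in known_down:
            return address
    return None  # every ETR for this prefix is known to be down

mapping = [("198.51.100.1", 10), ("203.0.113.7", 20)]
first_choice = choose_etr(mapping, known_down=set())
fallback = choose_etr(mapping, known_down={"198.51.100.1"})
```

With nothing known to be down, the preferred ETR is used right away; the second ETR only comes into play once the first has been discovered to be unreachable.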
> But there isn't a concept of a
> multicast-like replication system to fan the data out efficiently.
You can easily build something like this with off-the-shelf technology
(HTTP...) as long as you use locator space for all of this so there is
no circular dependency.
> LISP-ALT sounds nifty, but there are all sorts of delays in the
> paths taken across the world in this global query server system too.
> One level of aggregation may be a router in the USA, the next
> in Japan, the next in the Netherlands. The query packets traverse
> the long tunnels between these routers, and have to make it all the
> way back along the same tunnels. There's no clear way of
> authenticating the ETRs which are the authoritative query servers
> for particular micronets, so the whole ALT overlay network needs to
> be tied together carefully, manually, according to business
> relationships. That makes it expensive to administer and costly to
> change.
And it uses existing BGP to boot, so there are no guarantees we won't
see all the unpleasantness we see in the current global routing table
in the ALT overlay network. However...
-- [LISP-NERD can be viewed as conceptually similar to LISP-ALT] --
If at some point it becomes undesirable to give each LISP-NERD ITR a
copy of the full ID/loc database, you can split this database up into
parts where different boxes handle different parts of the address
space. Each box creates aggregate routes to attract the traffic for
the part of the address space that it handles. If it's a small number
of boxes, you can just put 2, 4 or 8 in each datacenter, but at some
point that starts becoming unmanageable, too. So at this point, you
stop distributing all the information to wherever the packets are,
and start transporting the packets to where the information is. At
this point, the LISP-NERD ITRs become a lot like the LISP-ALT overlay
network. Adding caching-only ITRs closer to the source of the traffic
is left as an exercise for the reader.
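The split described above could look roughly like this in Python: carve the address space into equal aggregates, give each box one aggregate to announce, and send a packet to whichever box covers its destination. The box names are made up, and equal power-of-two splits are an assumption to keep the aggregates on prefix boundaries.

```python
import ipaddress

def assign_aggregates(space: str, boxes):
    """Split an address block into equal aggregates, one per box.

    The number of boxes must be a power of two so that every aggregate
    falls on a prefix boundary.
    """
    count = len(boxes)
    extra_bits = count.bit_length() - 1
    if count == 0 or 1 << extra_bits != count:
        raise ValueError("number of boxes must be a power of two")
    block = ipaddress.ip_network(space)
    return dict(zip(block.subnets(prefixlen_diff=extra_bits), boxes))

def box_for(destination: str, aggregates):
    """Return the box whose aggregate route attracts this destination."""
    addr = ipaddress.ip_address(destination)
    for aggregate, box in aggregates.items():
        if addr in aggregate:
            return box
    return None

aggregates = assign_aggregates(
    "0.0.0.0/0", ["box-a", "box-b", "box-c", "box-d"])
```

Here each of the four boxes announces a /2 and only needs the part of the ID/loc database that falls inside it.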
> Yet when the customer decides on a new ISP, they need to
> change the IP address and the administrator of their ETRs.
Just go to your address provider, tell them the new addresses for your
ETRs (either yours that have new addresses or your new ISP's), pay a
small processing fee, and wait for the update to propagate throughout
the network in due course. (Having this happen every hour seems
reasonable.)