[RRG] ALT + NERD is inelegant & inefficient, compared to APT or Ivip

To: Routing Research Group <rrg@psg.com>
Subject: [RRG] ALT + NERD is inelegant & inefficient, compared to APT or Ivip
From: Robin Whittle <rw@firstpr.com.au>
Date: Tue, 22 Jan 2008 15:05:05 +1100
Organization: First Principles
User-agent: Thunderbird 2.0.0.9 (Windows/20071031)

Is Tony Li the only person other than me who is critical of the
prospect of LISP-ALT and LISP-NERD running in parallel?

Everyone seems to agree that neither is sufficient on its own.  Yet
most folks seem to be assuming that the future map-encap scheme will
be based on, or resemble, LISP CONS or ALT - with its global query
network and some system to handle traffic packets while the ITR is
waiting for the response from the slow, unreliable, global query
network.

In responding to concerns about all ITRs caching (ALT) or about all
ITRs having the full feed of mapping information (NERD), Dino
proposed that LISP should be implemented with both systems at once.
Here are some quotes from Dino and Tony around 10 January:

http://psg.com/lists/rrg/2008/msg00113.html

DF >>  As you know, all the LISP mapping database
DF >>  mechanisms touches on all tradeoffs. We know how
DF >>  to do it each way, what's left is experimentation
DF >>  and a decision to pick one, or blend two.

TL >   ?  You missed my point: you pick all, simultaneously.
TL >   You let your particular working set characteristics in
TL >   your particular location select which particular approach
TL >   you use at that particular location.  All of the choices
TL >   need to be integrated so that they result in one clean
TL >   mechanism.


http://psg.com/lists/rrg/2008/msg00115.html

DF >>  We have already made that conclusion. You should have
DF >>  seen that at RRG.

TL >   No, what I saw was a total mess of mechanisms, with no
TL >   coherency.


http://psg.com/lists/rrg/2008/msg00118.html

DF >>  Is coherency in the eye of the beholder? Systems with
DF >>  radically different operating points are going to look
DF >>  radically different.
DF >>
DF >>  As long as each operates well, they don't interfere with
DF >>  one another, and there is a good algorithm/whatever for
DF >>  deciding which one to use in a given situation, is it a
DF >>  problem if they are very dissimilar?

TL >   Something that looks like three different systems still
TL >   looks like three different systems. What we're talking
TL >   here is a smooth continuum of solutions with a common
TL >   infrastructure. There was no dial that you could set for
TL >   "light", "normal", and "heavy duty".


Here is my understanding of deploying both ALT and NERD, with and
without the "Default Mapper" approach to handling packets for which
a caching ITR currently has no mapping information.

Some ITRs are full database NERD ITRs.  They receive ("push", though
in NERD the ITR initiates the downloads) the full mapping database
and updates to this database.  These full database ITRs (ITRDs in
Ivip parlance) correctly and immediately encapsulate every traffic
packet (one addressed to an EID) they receive.  The cost is an
expensive ITR with a substantial flow of mapping data to it at all
times, irrespective of the traffic it handles.

The remaining ITRs are caching ITRs (ITRCs in Ivip).  They can only
get mapping data some time (fractions of a second to several
seconds) after they request it from the global ALT network.  These
ITRs can in principle be smaller and cheaper and their communication
needs for mapping data are smaller than for a NERD ITR.  The
communication needs are in direct proportion to the range of EIDs of
the traffic they handle.

With the original LISP approach of not using a "Default Mapper", the
initial one or more traffic packets cannot be tunnelled directly to
the ETR - because the mapping reply has not yet been received.  With
ALT as originally defined, they are sent into the global ALT system
which functions both as a global (often or typically slow) query
server network and as a global (slower and more costly than direct
tunnelling) network for delivering these initial packets to the correct
ETR.

The trouble with this is that by the time these initial traffic
packets emerge from their circuitous travels in the ALT network,
they are likely to be delayed so much that they have zero or
negative value for the end-users who are trying to communicate.

If the "Default Mapper" approach is used, then the ALT network is
only for handling mapping queries, and the packets for which the
caching ITR has no mapping information will be sent to a "Default
Mapper" - an ITR which does know how to tunnel them.  This may or
may not involve longer paths (stretch) compared to direct tunnelling.
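
The forwarding decision just described can be sketched roughly as
follows.  This is only an illustration under my own assumptions: the
class name, the method names and the use of a plain dictionary as the
cache are hypothetical and are not taken from any LISP or Ivip
document.

```python
from ipaddress import ip_address, ip_network


class CachingITR:
    """Hypothetical caching ITR: tunnels directly on a cache hit,
    otherwise hands the packet to a Default Mapper and notes that a
    mapping query is needed."""

    def __init__(self, default_mapper):
        self.default_mapper = default_mapper  # address of a full database ITR
        self.cache = {}                       # EID prefix -> ETR address
        self.pending_queries = []             # destinations awaiting mapping

    def lookup(self, dst):
        addr = ip_address(dst)
        # Longest-prefix match over the cached EID prefixes.
        matches = [p for p in self.cache if addr in p]
        if not matches:
            return None
        return self.cache[max(matches, key=lambda p: p.prefixlen)]

    def forward(self, dst):
        etr = self.lookup(dst)
        if etr is not None:
            return ("tunnel", etr)
        # Cache miss: the packet is not dropped or held, it goes to
        # the Default Mapper while the mapping is requested.
        self.pending_queries.append(dst)
        return ("default-mapper", self.default_mapper)

    def install(self, prefix, etr):
        """Store a mapping reply in the cache."""
        self.cache[ip_network(prefix)] = etr
```

The first packet to a new destination goes via the Default Mapper;
once install() has stored the reply, later packets are tunnelled
directly to the ETR.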

By the way, the "Default Mapper" approach fits nicely with LISP's
"Proxy Tunnel Router" approach (which is identical to Ivip's
"anycast ITRs in the core").  The "Default Mapper" ITR can be one
and the same as the "Proxy Tunnel Router".

A "Default Mapper" / "Proxy Tunnel Router" ITR only makes sense if
it has the full database - so I assume these are NERD ITRs.  The ITR
functions as a "Proxy Tunnel Router" for packets originating in ISP
networks with no ITRs, and as a "Default Mapper" for packets sent
from an ISP with caching only ITRs.  An ITR could also be a "Default
Mapper" just for the ALT (caching) ITRs in a particular ISP network.

Without the "Default Mapper" / "Proxy Tunnel Router" approach, ALT
and NERD are entirely independent systems which do not interfere
with, depend upon or assist each other.

With the "Default Mapper" approach, ALT is used only for queries,
and NERD is used for some or many ordinary ITRs and for the "Default
Mappers" which caching (ALT) ITRs send their initial traffic packets
to.  So the ALT system depends on and derives benefits from the NERD
system.

However this ALT + NERD combination is inelegant and inefficient -
since with a little thought, a number of improvements can easily be
made.


The first obvious inefficiency is that ALT ITRs are going to rely on
the global (slow) ALT query system - sending their request around
the world (sometimes on a path longer than around the world) - even
when there is a NERD ITR much closer than the path the request
would take through the ALT network.

The NERD ITR has the full database.  So the first improvement is to
make that ITR, or a server at the same site, a query server.

Now the system resembles APT.  APT's Default Mappers are both query
servers and full database ITRs - but they only tunnel packets which
the other (caching) ITRs don't yet have mapping information for.
The traffic packet also functions as the mapping request (elegant
and efficient!), and assuming the Default Mapper (there can be
several of them) is close, the caching ITR will have the mapping
data within tens of milliseconds.  So only one or a few initial
traffic packets are handled by the Default Mapper.  This means the
Default Mapper can be a server, rather than a Big Iron router with
hardware FIB etc.
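
As a rough sketch of the "traffic packet is also the mapping request"
idea: the Default Mapper below returns both a tunnelling action for
the packet and a mapping reply for the caching ITR which sent it.
The names and the tuple formats are my own assumptions for
illustration, not APT's actual message formats.

```python
from ipaddress import ip_address, ip_network


class DefaultMapper:
    """Hypothetical full database Default Mapper."""

    def __init__(self, full_database):
        # full_database: {"prefix/len": "ETR address"}
        self.db = {ip_network(p): etr for p, etr in full_database.items()}

    def handle(self, dst, from_itr):
        """Handle one traffic packet sent by a caching ITR.

        Returns (tunnel_action, mapping_reply).  The packet itself is
        the query, so no separate request message is needed.
        """
        addr = ip_address(dst)
        matches = [p for p in self.db if addr in p]
        if not matches:
            return None, None          # destination is not an EID
        prefix = max(matches, key=lambda p: p.prefixlen)
        etr = self.db[prefix]
        tunnel_action = ("tunnel", etr)
        mapping_reply = (from_itr, str(prefix), etr)
        return tunnel_action, mapping_reply
```

Since only the first packet or two of a flow arrive here, the lookup
does not need to run at line rate, which is why a server could do the
job.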

APT is an improvement on LISP ALT+NERD in several respects:

1 - There is no need for a global (big, expensive, unreliable and
    often very slow) query network - all queries are handled by
    the local full database query server: the Default Mapper.

2 - No traffic packets are delayed excessively - in contrast to
    ALT + NERD without "Default Mappers" which relies on the ALT
    network to deliver the first packets.

However, the APT design can be improved upon to deal with these
problems:

1 - There is no support for traffic packets from non-upgraded
    networks as there is for Ivip or LISP ("Proxy Tunnel
    Routers").  So APT as it stands is not incrementally
    deployable.

2 - The caching ITRs can only use Default Mappers in their own
    ISP network.  Therefore, every upgraded network needs
    one or more Default Mappers and so a full feed of mapping
    information.

Firstly, we can split the Default Mapper functions of full database
query server (QSD) and full database ITR (ITRD) - to allow them to
be implemented on separate devices if this is desired.

Secondly we can remove the requirement that caching ITRs can only
access these things within their own ISP's network.

QSDs and ITRDs might be in the same device, or be in separate
devices at the same location - but it is easy to imagine an
architecture which does not tie the two things together.  Both need
a full feed of mapping data, so they may well be in the same
location or the same device.

Thirdly we can allow for the possibility that there be a second kind
of query server - one which caches the answers it receives from a
full database query server.  This is a caching query server (QSC).

Fourthly, we allow for the possibility of the sending host
performing ITR functions (ITFH).  The host cannot be behind NAT, but
this saves on specific ITR hardware and makes use of CPU power and
RAM where it is often freely available, without any extra hops.  It
is unlikely that an ITFH would be a full database ITR (but the
architecture should not rule this out), so we assume the ITFH needs
to send queries to a local full database query server (QSD), perhaps
via one or more QSCs.
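
A minimal sketch of the query path, assuming exact-match lookups on
single EID addresses to keep it short.  The class and method names
are hypothetical; the point is only that a QSC answers from its cache
where it can, and otherwise asks the next server upstream, which may
be another QSC or a QSD.

```python
class QSD:
    """Hypothetical full database query server."""

    def __init__(self, full_database):
        self.db = dict(full_database)   # EID -> ETR address
        self.queries = 0                # how many queries reached us

    def query(self, eid):
        self.queries += 1
        return self.db.get(eid)


class QSC:
    """Hypothetical caching query server in front of a QSD or QSC."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.cache = {}

    def query(self, eid):
        if eid not in self.cache:
            answer = self.upstream.query(eid)
            if answer is None:
                return None             # nothing to cache
            self.cache[eid] = answer
        return self.cache[eid]
```

An ITFH or ITRC would call query() on its nearest server.  With
edge = QSC(QSC(qsd)), repeated queries for the same EID reach the QSD
only once.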

With no restrictions on the locations of these devices, we allow
their deployment to proceed according to locally made decisions
about capital cost and mapping traffic (full feed or query and
response).  Combined with "anycast ITRs (ITRDs) in the core", we
have a highly flexible, fully integrated, elegant architecture which:

1 - Does not drop or delay packets.

2 - Does not require a global query server network, since there
    will be one or more full database query servers in reasonably
    close proximity.

3 - Does not need any "gleaning" or "piggybacking" system for
    conveying mapping information which presumably will soon be
    required, since every ITR either has the full database or is
    close to a query server which has the full database.

4 - Requires a global mapping database feed for a large number
    of ITRDs and QSDs - but the feed extends only as far as
    operators choose, not to every ITR.

This is the overall architecture of Ivip, as defined in July last year.

To this, Ivip adds:

1 - Make the mapping distribution *fast* - a few seconds at most
    from users to all ITRDs and QSDs.

2 - Add a mechanism by which QSDs can push mapping changes to ITRCs,
    ITFHs and QSCs which may be handling packets affected by the
    changes.
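
One way the second item could work is sketched below.  It assumes
that the QSD remembers which clients asked about each EID and
notifies exactly those clients when the mapping changes.  That
bookkeeping, and all the names here, are my assumption for
illustration; a real design would also need to expire the record of
interest once the client's cached entry has timed out.

```python
class NotifyingQSD:
    """Hypothetical QSD which pushes changes to interested clients."""

    def __init__(self, full_database):
        self.db = dict(full_database)   # EID -> ETR address
        self.interested = {}            # EID -> set of client names

    def query(self, eid, client):
        # Remember who asked, so they can be told about later changes.
        self.interested.setdefault(eid, set()).add(client)
        return self.db.get(eid)

    def update(self, eid, new_etr):
        """Apply a mapping change from the fast push feed.

        Returns the notifications to send, as a sorted list of
        (client, eid, new_etr) tuples.
        """
        self.db[eid] = new_etr
        clients = self.interested.get(eid, ())
        return sorted((client, eid, new_etr) for client in clients)
```

Only the ITRCs, ITFHs and QSCs which asked about the changed EID
receive anything; everyone else is unaffected.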

These two changes mean that the ITRs no longer need to test for
connectivity to ETRs, because the end-user can use their own
connectivity test system (multihoming monitoring system) to change
the mapping accordingly in real time.

So ITRs and ETRs are simpler and the whole system is modular, rather
than integrating reachability testing and multihoming service
restoration decisions into the map-encap scheme, as do LISP and APT.

This also enables the mapping data to be very compact - just a
description of the range of addresses and the address of the ETR.
See "Is 12 bytes really so scary?"

   http://psg.com/lists/rrg/2007/msg00806.html

The simpler mapping information is easier to push *fast*.
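
To give a feel for how small such a record is, here is one possible
12 byte IPv4 layout: start address, number of addresses in the range,
and ETR address, each as a 32 bit field.  This particular field
layout is an assumption made for the sake of the example; the
message linked above is the place to look for the actual proposal.

```python
import struct
from ipaddress import IPv4Address

# Assumed layout: three unsigned 32 bit fields in network byte order.
RECORD = struct.Struct("!III")


def pack_mapping(start, count, etr):
    """Encode (range start, range length, ETR address) as 12 bytes."""
    return RECORD.pack(int(IPv4Address(start)), count,
                       int(IPv4Address(etr)))


def unpack_mapping(data):
    """Decode a 12 byte record back into its three fields."""
    start, count, etr = RECORD.unpack(data)
    return str(IPv4Address(start)), count, str(IPv4Address(etr))
```

There are no priorities, weights or reachability bits to carry,
because the end-user's own monitoring system decides which ETR the
range is mapped to.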

These two changes also enable the system to be used for mobile
networks and hosts, for both IPv4 and IPv6, with no need for special
software in the correspondent hosts, and minimal extra software in
the mobile host.


To this, Ivip also adds:

3 - The source address of the encapsulated packets (outer source
    address) is the address of the sending host (inner source
    address) - rather than the ITR's address.

This enables the ETR to easily enforce any source address filtering
imposed at the border routers of the network it operates in - by
dropping any inner packet whose source address is not the same as
the outer source address.
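
The check itself is tiny.  A sketch, with a hypothetical function
name, assuming the ETR has already parsed both headers:

```python
from ipaddress import ip_address


def etr_accepts(outer_src, inner_src):
    """Return True if the decapsulated packet may be forwarded.

    The border routers have already filtered on the outer source
    address, so requiring the inner source address to be identical
    extends that filtering to the inner packet.
    """
    return ip_address(outer_src) == ip_address(inner_src)
```

Comparing parsed addresses rather than strings means that different
textual forms of the same IPv6 address still match.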

This should also enable Traceroute to operate over the tunnelled
portion of the path, if the sending host runs a modified Traceroute
program which can detect the replies which arrive with different
destination addresses (the ETR's rather than the destination host's).

Ivip doesn't directly support the ITR load splitting traffic to
multiple ETRs.  Load splitting needs to be done by having one or
more IP addresses mapped to one ETR, one or more IP addresses to the
next ETR etc. - with fine (real time) control by the user.

Ivip will also require some charging system to cover the cost of
sending out mapping changes for end-users who change their mapping
frequently, such as for mobility or fancy, dynamic, load-splitting TE.


I intend to revise the Ivip material into multiple, less monolithic,
IDs in the next month or so.  This will include a more detailed
description of the fast push system for mapping data, and of how I
propose to handle PMTUD and fragmentation problems.  The latter will
draw heavily on the on-list discussions with Fred Templin and
Iljitsch van Beijnum.

The result is more complex than ALT + NERD, or APT.  However, it is
a fully integrated system which I think is elegant and highly
flexible.

Ivip is intended to work fine with individual IP addresses (IPv4) or
/64s (IPv6) being mapped to different ETRs.

Ivip also fulfils Tony's desire for flexibility in deploying
caching and non-caching ITRs:

http://psg.com/lists/rrg/2008/msg00115.html

DF >>  You guys that are not favoring caches can other favor
DF >>  full tables and that can't scale to 10^10.
DF >>
DF >>  So do you have a proposal?

TL >   I don't fall into "you guys" because a) I'm in favor
TL >   of caches in some places and b) I'm against them in
TL >   others. I haven't noticed that opinion expressed by
TL >   anyone else yet. Yes, that's something of a proposal.


The Ivip description is longer than for ALT, NERD or APT because it
is more detailed and because Ivip tackles some problems the LISP and
APT developers have not yet attempted to handle, including
Traceroute, ETRs enforcing source address filtering, and especially
the PMTUD/fragmentation problem.

 - Robin   http://www.firstpr.com.au/ip/ivip/



