
[RRG] ALT's strong aggregation often leads to *very* long paths



Even assuming a global query server (and initial packet delivery)
network were desirable, which I think it is not, ALT suffers from
the problem that many paths from ITR to ETR will be extremely long -
longer than half-way round the Earth.

The paths will often be far longer than the most direct path via
BGP routers.  So a global ALT network would be very costly, slow and
unreliable for delivering initial traffic packets and map queries,
and likewise for returning the mapping response.  (It seems much
smarter to send the response directly to the ITR over the Internet -
which the ID refers to as the "underlying topology".  This is an
option mentioned in draft-fuller-lisp-alt-01.)

ALT mandates that the hierarchy of ALT routers be highly aggregated:

   LISP-ALT routers are deployed in a hierarchy which matches the
   EID prefix allocation hierarchy.  LISP-ALT routers at each level
   in the this hierarchy are responsible for aggregating all EID
   prefixes learned from LISP-ALT routers logically "below" them and
   advertising summary prefixes to the LISP-ALT routers logically
   "above" them.

   ... LISP-ALT uses existing BGP mechanisms to aggressively
   aggregate this information.

This is presumably impossible at the edge of the network, since the
locations of the ETRs for given EIDs are scattered around the place
with little or no correlation with geography.  Assuming that it is
desired to have short paths between the ETR and the lowest level of
ALT router, that lowest level of the ALT hierarchy will not be doing
much aggregation.

A very high degree of address aggregation is a core principle of ALT.

So the higher levels of the hierarchy will be highly aggregated.
Since there is little correlation between geography and IP address,
and since these ALT routers will be physically located according to
the operators who are (however determined) responsible for some
subset of the address space, the problem arises:

Because ALT mandates that the structure of its network be strongly
driven by address aggregation, the connections between one router
and the next in the hierarchy will have little or no relation to
geography.  Therefore, the average inter-router distance will be far
longer than for ordinary BGP routers, where the network structure is
based primarily on linking to geographic neighbours, and not at all
on address aggregation.

I imagine the following scenario would be common:

An ITR in Melbourne Australia needs to send an encapsulated traffic
packet or map request to an ETR in the Netherlands.  The IP
addresses of the ITR (150.101.162.123) and ETR (83.149.65.1) have no
correlation whatsoever.  This is typically the case for Internet
communications, except between some geographically neighbouring
hosts which happen to be on the same ISP or on ISPs which got
similar address ranges from the one RIR.

The sequence of ALT routers through which the packet travels is
something like this.  I am assuming each level of hierarchy
aggregates 3 bits, meaning each router has about 8 below it.  For
simplicity I am assuming the ETR advertises a /24 EID.  The problem
would be worse, due to more layers in the hierarchy, for longer EIDs
(which I think are desirable and often necessary) and of course for
IPv6.

Hierarchy  24 MSB Address                    Location
Level      bits

ITR      1001 0110  0110 0101  1010 0010     Melbourne

1        1001 0110  0110 0101  1010 0***     Melbourne

2        1001 0110  0110 0101  10** ****     Melbourne

3        1001 0110  0110 010*  **** ****     Sydney

4        1001 0110  0110 ****  **** ****     Sydney

5        1001 0110  0*** ****  **** ****     Somewhere-1

6        1001 01**  **** ****  **** ****     Somewhere-2

7        100* ****  **** ****  **** ****     Somewhere-3

7        010* ****  **** ****  **** ****     Somewhere-4

6        0101 00**  **** ****  **** ****     Somewhere-5

5        0101 0011  1*** ****  **** ****     Somewhere-6

4        0101 0011  1001 ****  **** ****     Somewhere-7

3        0101 0011  1001 010*  **** ****     Amsterdam

2        0101 0011  1001 0101  01** ****     Amsterdam

1        0101 0011  1001 0101  0100 0***     The Hague

ETR      0101 0011  1001 0101  0100 0001     The Hague
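
The prefixes in the table above can be reproduced mechanically.  The
following is a minimal sketch, assuming (as in the table) a /24 EID
and 3 bits of aggregation per level.  The function name
alt_hierarchy and its parameters are hypothetical, chosen only for
this illustration; nothing here comes from the ALT draft itself.

```python
import ipaddress


def alt_hierarchy(address, eid_len=24, bits_per_level=3):
    """Return [(level, network)] for the aggregates covering `address`.

    Level 1 is the aggregate directly above the EID prefix.  Each
    further level strips `bits_per_level` more bits, until no bits
    would be left.
    """
    levels = []
    prefix_len = eid_len - bits_per_level
    level = 1
    while prefix_len > 0:
        # strict=False masks off the host bits of the address for us.
        net = ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)
        levels.append((level, net))
        prefix_len -= bits_per_level
        level += 1
    return levels


itr_side = alt_hierarchy("150.101.162.123")   # Melbourne ITR
etr_side = alt_hierarchy("83.149.65.1")       # The Hague ETR

# Up the hierarchy from the ITR, then down it towards the ETR.
for level, net in itr_side:
    print(f"up   level {level}: {net}")
for level, net in reversed(etr_side):
    print(f"down level {level}: {net}")

print("ALT routers traversed:", len(itr_side) + len(etr_side))
```

Under the same 3-bit assumption, passing eid_len=32 gives ten
levels on each side rather than seven, which is the sense in which
longer EIDs make the problem worse.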

I am assuming some meshiness at the top levels of the hierarchy -
such as between the "100*" ALT router(s) and the "010*" router(s).
Without that, all queries would have to go up a further level to a
router or set of ALT routers for "1***", which would be very busy
indeed and would itself need to mesh with another set of ALT routers
for "0***".

Meshiness at lower levels would generally reduce path lengths -
since the packet could travel to a lower level ALT router which is
topologically close to the destination, rather than having to
traverse more ALT routers up and down the hierarchy.

However, this would involve a large number of GRE links between ALT
routers.  That is OK, since a GRE tunnel is not like paying for a
fibre link.  Taken to a logical extreme, all level 1 ALT routers
could be fully meshed, each with a GRE tunnel to all the other
level 1 ALT routers - then the higher levels wouldn't be needed.
But that would be fragile and hard to administer due to the hundreds
of thousands of GRE tunnels for each ALT router.
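
The tunnel counts for a full mesh follow from simple arithmetic: n
routers need n - 1 tunnels each and n(n - 1)/2 in total.  A small
sketch, in which the router counts are hypothetical figures picked
only to show how quickly the numbers grow:

```python
def full_mesh_tunnels(n_routers):
    """Return (tunnels per router, total tunnels) for a full mesh."""
    per_router = n_routers - 1
    total = n_routers * (n_routers - 1) // 2
    return per_router, total


# Hypothetical router counts, for illustration only.
for n in (8, 1_000, 300_000):
    per_router, total = full_mesh_tunnels(n)
    print(f"{n:>7} routers: {per_router:>7} tunnels each, {total:,} in total")
```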

I have made assumptions about the lower levels being geographically
close, due to a presumption that the ALT system would not be highly
aggregated at the lower levels.  That means the ALT router at level
1 in The Hague would be handling many unrelated aggregates, each
sparsely populated, according to whatever nearby ETRs advertised
their EIDs to it.

Alternatively, there could be strong aggregation at all levels of
the ALT hierarchy.  In that case, the problem I am discussing is
worse, since the level 1 ALT router could be anywhere in the world
with respect to the ETR in The Hague, or the Seychelles, or wherever
the network (or single host) with this EID is physically located
at the current time.

The big questions are:

1 - Where are these ALT routers "Somewhere-x"?

2 - What is the total geographical path length (and number
    of routers in each GRE tunnel) from Somewhere-1 to
    Somewhere-7?

Because the IP addresses are assigned to organisations with little
regard for geography, and since those organisations are going to be
running these ALT routers at their own sites, it is most likely that
the locations of these 7 ALT routers will not be highly correlated
in a geographical sense.  Maybe some are in Europe and some in the
USA, China or Singapore.

So the path length for the whole trip could be very long indeed -
longer than the longest path one could create on the Internet - due
to the need to tunnel back and forth between geographically
dispersed ALT routers at various levels of the hierarchy.
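
To get a feel for the numbers, here is a rough sketch which sums
great-circle distances along one possible route.  The locations
given for Somewhere-1 to Somewhere-7 are pure guesses, invented for
this illustration, and real tunnels follow the underlying BGP path,
so they would be longer than the great-circle figures.  The
coordinates are approximate city centres.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean radius; the Earth is treated as a sphere


def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(h)))


MELBOURNE = (-37.81, 144.96)
SYDNEY = (-33.87, 151.21)
AMSTERDAM = (52.37, 4.90)
THE_HAGUE = (52.08, 4.31)

# HYPOTHETICAL placements of Somewhere-1 .. Somewhere-7.
SOMEWHERE = [
    (1.35, 103.82),    # 1: Singapore
    (40.71, -74.01),   # 2: New York
    (39.90, 116.41),   # 3: Beijing
    (51.51, -0.13),    # 4: London
    (37.34, -121.89),  # 5: San Jose
    (1.35, 103.82),    # 6: Singapore
    (50.11, 8.68),     # 7: Frankfurt
]

# ITR, levels 1-4, Somewhere-1..7, levels 3-1, ETR (as in the table).
route = ([MELBOURNE] * 3 + [SYDNEY] * 2 + SOMEWHERE
         + [AMSTERDAM] * 2 + [THE_HAGUE] * 2)

total_km = sum(great_circle_km(a, b) for a, b in zip(route, route[1:]))
direct_km = great_circle_km(MELBOURNE, THE_HAGUE)

print(f"via ALT hierarchy: {total_km:,.0f} km")
print(f"direct:            {direct_km:,.0f} km")
```

With a different set of guesses the total changes, but any placement
which sends the packet between continents more than once or twice
will exceed the direct distance by a wide margin.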


I think a global query network is a really bad idea.  ALT is a poor
way of building a global query network - since it tries to impose
the mathematical purity of strong aggregation, without considering
how this will generally greatly extend the total geographic path length.

(I never really understood how CONS would work, but perhaps CONS
would be subject to this same critique, since its structure was also
to be based on strong address aggregation.)


My guess is that the attraction of ALT (CONS too?) is primarily one
of feel - the attraction of concepts which have not been fully
considered.  After two or so decades of frustration, battling to
help the BGP network run efficiently, by trying to impose address
aggregation on it - and largely or utterly failing - ALT is a
satisfying fresh start.

By administrative fiat, the ALT hierarchy is structured to strongly
aggregate addresses.

The trouble is, this makes things far slower, when implemented on a
global scale.  This will not show up in experiments, except on a
very large, genuinely geographically dispersed, prototype network.


The only workaround I can imagine would be to locate all the higher
level ALT routers in the same place - "ALT-Central".  Then the
system becomes a star-structured thing, with links out to ETRs and
ITRs all over the world.  There will be little or no correlation
between the EID addresses of those ETRs and geography.

So almost all queries and encapsulated traffic packets would leave
the ITR, travel to wherever ALT-Central is located, and then to the
ETR.  If both were 1/4 of the Earth's circumference from ALT-Central,
this would be a one-way trip half-way round the Earth, even if the
ITR and ETR are physically very close to each other.

This would make the responsiveness of the system dependent on how
far the ITR and ETR are from ALT-Central.  This would not be
acceptable to folks in Australia, New Zealand, Japan or Moscow - who
are far from ALT-Central's most likely location: the USA.
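
As a back-of-envelope sketch of what this distance means for delay:
the figure of about 200,000 km/s for light in fibre (roughly two
thirds of c) is an assumption I am adding here, and the result is a
lower bound which ignores indirect cable routes, queuing and router
processing.

```python
EARTH_CIRCUMFERENCE_KM = 40_075.0   # equatorial circumference
FIBRE_KM_PER_S = 200_000.0          # ASSUMPTION: roughly 2/3 of c


def one_way_delay_ms(itr_to_central_km, central_to_etr_km):
    """Minimum propagation delay ITR -> ALT-Central -> ETR, in ms."""
    path_km = itr_to_central_km + central_to_etr_km
    return path_km / FIBRE_KM_PER_S * 1000.0


quarter_km = EARTH_CIRCUMFERENCE_KM / 4
path_km = 2 * quarter_km
delay_ms = one_way_delay_ms(quarter_km, quarter_km)

print(f"path:  {path_km:,.0f} km")
print(f"delay: at least {delay_ms:.0f} ms one way")
```

The same delay applies even if the ITR and ETR are in the same city,
since only their distances from ALT-Central enter the calculation.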

I suppose there could be multiple ALT-Centrals, in Europe, the USA,
Japan etc. - all meshed with each other - but this would be really
expensive and ugly.


The only way I can imagine the new architecture operating without
dropping or unreasonably delaying traffic packets is to allow for a
flexible deployment of:

1 - Full database ITRs

2 - Caching ITRs close to:

3 - Full database query servers


  - Robin        http://www.firstpr.com.au/ip/ivip/




--
archive: <http://psg.com/lists/rrg/> & ftp://psg.com/pub/lists/rrg