
[RRG] Re: LISP-ALT's long path problem again



Detailed engineering discussions about LISP can take place on the lisp-interest@lists.civil-tongue.net alias.

Thanks,
Dino

On Jun 29, 2008, at 4:44 AM, Robin Whittle wrote:

Hi Dino,

I changed the topic from the "EXPLISP BOF at the Dublin IETF" thread:

 http://psg.com/lists/rrg/2008/msg01667.html
 http://psg.com/lists/rrg/2008/msg01674.html
 http://psg.com/lists/rrg/2008/msg01675.html

The latter part of your reply is a good example of how I think the
LISP-ALT team has generally failed to properly engage in debate
about the merits of the various proposals.  In this case, the
proposal being debated is yours.

Another problem is that I think you have generally not engaged in
proper debate about proposals which differ from yours.  I don't
think any of you have shown why it is necessary to use pure pull,
which is the assumption on which LISP-ALT is based.

I think you should at least argue the case for pure pull by pointing
out how impossible, undesirable or whatever you presumably think it
must be to use any alternative - such as APT's or Ivip's hybrid
push-pull approach.

To do this, you would need to propose some number of EID prefixes
and some overall rate of change of mapping of these EIDs.  Then you
would show why this is a realistic target and why this is impossible
to achieve with any hybrid push-pull proposal, or at least with APT
and Ivip.  For instance, you could argue the storage requirements
are impossible or undesirable to achieve.  Alternatively you could
argue that push would be impossibly or undesirably costly, fragile
or whatever.  The APT and Ivip proposals are detailed enough for you
to show in detail why both are unworkable for whatever number of
EIDs and whatever rate of change you think a map-encap proposal must
be able to scale to.


Back to your message:

The map request packet, which is perhaps an encapsulated initial
data packet, travels along the GRE tunnel to the nearest ALT
router.

There are Data Probes or Map-Request packets. A Data Probe is a
data packet that gets LISP encapsulated where the inner
destination address is copied to the outer destination address.

A Map-Request is a UDP packet where the destination address is the
destination EID and the source address is the locator of the
encapsulating ITR.

OK.  This is how I understood it.


There, over the GRE links to other ALT routers, each ALT router
forwards the packet according to its destination address
240.1.1.1.

Right.

The structure of this ALT network results in the packet arriving
at the correct ETR for this destination host.  The map reply
message goes back via the ordinary Internet (no GRE tunnels or
ALT routers) to the ITR.  Subsequent encapsulated data packets go
from the ITR to the ETR via the ordinary Internet too.

Right.

It's not clear to me how the 12.0.0.1 ITR has a tunnel to an ALT
router, unless the dotted line "Low Opex" indicates a tunnel,
without the need for a physical link to this ALT router.

The 12.0.0.1 ITR is a low-opex router and it has a configured GRE
tunnel to the ALT router.

OK.


The diagram makes it look easy, with a single GRE hop between the
first and the final ALT router.  However, in reality, it would be

And in reality it is that easy.  ;-)

It is easy in your prototype network since you only have a handful
of ALT routers.

In the real world with x00,000,000 EID prefixes, how many would you
need?  How many ALT routers would the Data Probe or Map Request
packet need to go through in order to get from the ITR to the ETR?

How aggregated is the ALT network to be?  Does any one router
aggregate just two routers below it (a 1 bit span), on average, or
16 (4 bits), or 1024 (10 bits)?  This has never been specified by
the LISP-ALT team.

If you stick to /24 as a maximum length of EID prefix (which I don't
recommend, since many end-user networks will be happy with less than
256 addresses and so will be happy with, for instance, 4 addresses
in a /28) then you have 24 bits to aggregate in the ALT network.
Let's say the top 6 bits are handled by a fully meshed set of
top-level ALT routers.  (These are bottlenecks, but that is another
discussion.)

Now you have potentially 2^24 ETRs to connect to some number of ALT
routers at the lowest level - level 0.  Say you have 4 bits of
aggregation there, so you have potentially 2^20 level 0 ALT routers,
each serving up to 16 ETRs.

Now you have decisions to make about connecting these 2^20 ALT
routers to each other via multiple higher levels of routers.  Let's
say we have the next level 1 handling 16 such level 0 routers.  This
is a 4 bit span in the aggregation hierarchy, so there are up to
2^16 such level 1 ALT routers.  Now let's say you span the rest of
the bits (16 to 6) in 3, 3 and 4 bit steps.

Level  Prefix  Bit span       Potential
       length  from previous  number of
               level          ALT routers
                              at this level
ETRs   24
  0    20      4              2^20
  1    16      4              2^16
  2    13      3              2^13
  3    10      3              2^10
  4     6      4              2^6  all meshed together
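
To make the arithmetic in this table easy to re-run with other bit spans, here is a minimal Python sketch.  The boundary list is just the imagined example above, not anything specified for LISP-ALT; the names BOUNDARIES and level_summary are made up for illustration, and levels are numbered consecutively from 0.

```python
# Prefix lengths at which aggregation happens, from the ETR level upward.
# ASSUMPTION: these are the imagined boundaries from the example table,
# not values specified by the LISP-ALT drafts.
BOUNDARIES = [24, 20, 16, 13, 10, 6]


def level_summary(boundaries):
    """Return one row per ALT level:
    (level, prefix_length, bit_span, potential_routers, children_per_router)."""
    rows = []
    pairs = zip(boundaries, boundaries[1:])
    for level, (lower, upper) in enumerate(pairs):
        span = lower - upper            # bits aggregated at this level
        rows.append((level, upper, span, 2 ** upper, 2 ** span))
    return rows


for level, plen, span, routers, fanout in level_summary(BOUNDARIES):
    print(f"level {level}: /{plen}  span {span} bits  "
          f"up to 2^{plen} = {routers} routers, {fanout} children each")
```

Changing BOUNDARIES to, say, [24, 14, 4] shows how wider spans trade fewer levels for many more neighbours per router.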

Let's say a packet needs to get from an ALT router which handles
11.0.0.0/20 to an ETR which handles the 130.0.0.0/24 EID prefix.
That ETR is only reachable via one ALT router (more for redundancy?
and if so, how??) which handles 130.0.0.0/20.

The path through the ALT network would be something like this:

ITR -->          ALT router for:  11.0.0.0/20
         up to  ALT router for:  11.0.0.0/16
         up to  ALT router for:  11.0.0.0/13
         up to  ALT router for:  11.0.0.0/10
         up to  ALT router for:   8.0.0.0/6
     across to  ALT router for: 128.0.0.0/6
       down to  ALT router for: 130.0.0.0/10
       down to  ALT router for: 130.0.0.0/13
       down to  ALT router for: 130.0.0.0/16
       down to  ALT router for: 130.0.0.0/20  --> ETR


This is 9 GRE tunnels to traverse between ALT routers, each involving
multiple physical routers and quite possibly very large distances,
such as across oceans or continents.
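
The walk up and down the hierarchy can be derived mechanically from the aggregation boundaries.  The following is a rough Python sketch using the standard ipaddress module; LEVELS and alt_path are hypothetical names, and the boundaries (20, 16, 13, 10, 6) are only my imagined example.  Strictly following those boundaries, the walk includes both the 128.0.0.0/6 top-level router and the 130.0.0.0/10 router below it.

```python
import ipaddress

# ASSUMPTION: imagined aggregation boundaries, lowest level first.
# The last entry is the fully meshed top level.
LEVELS = [20, 16, 13, 10, 6]


def alt_path(src, dst, levels=LEVELS):
    """List the aggregate prefixes (one per ALT router) a packet visits
    going from the router covering src to the router covering dst."""
    def agg(addr, plen):
        # strict=False masks off the host bits, e.g. 11.0.0.0/6 -> 8.0.0.0/6
        return ipaddress.ip_network(f"{addr}/{plen}", strict=False)

    dst_addr = ipaddress.ip_address(dst)
    up = []
    for plen in levels:
        net = agg(src, plen)
        up.append(net)
        if dst_addr in net:
            # Common ancestor found: descend from here towards dst.
            return up + [agg(dst, p) for p in reversed(levels) if p > plen]
    # No common ancestor: cross the top-level mesh, then descend.
    return up + [agg(dst, p) for p in reversed(levels)]


path = alt_path("11.0.0.0", "130.0.0.0")
for net in path:
    print(net)
print("tunnels between ALT routers:", len(path) - 1)
```

For two EIDs under the same /16, the same function returns a much shorter path, which is the case where the topology happens to match the address allocation.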

This is just my imagined example of actual bit spans per aggregation
level.  I have no idea what you propose for LISP-ALT.

I think you could help everyone make progress towards a scalable
routing solution by providing detailed examples of how you expect
your system to work for the very large (how large?) numbers of EIDs
you expect it to scale to.  You must have some very large number in mind,
like hundreds of millions (IPv4) or billions (IPv6) - otherwise you
wouldn't have a strong objection to the storage requirements of ITRs
and Query Servers in hybrid push-pull systems such as APT or Ivip.

The full argument about this long-path problem for LISP-ALT can be
found in the messages linked to from:

 http://www.firstpr.com.au/ip/ivip/lisp-links/#long_paths

The main defence was Scott's message:

 http://psg.com/lists/rrg/2008/msg00584.html

which contained some generalised arguments about why the paths
between ALT routers would generally not be as long as the critics
expect.  I think there needs to be a much more detailed exposition
of how the ALT network would really be structured before LISP-ALT
critics could be convinced this "long-paths" problem has an
agreeable solution.


common for the packet to have to ascend the highly aggregated ALT
hierarchy before it could descend the hierarchy towards the ALT

The ITR has a default 0.0.0.0/0 EID-prefix route so it knows to
forward all Data Probes or Map-Requests over the tunnel.

Yes - I understand the ITR has, in its ALT network section, a
default route to the GRE tunnel to whatever ALT router is nearest,
most convenient or whatever.  ITRs don't have an address in the ALT
network - only ETRs.  ITRs are required to send (map request / data
probe) packets into the ALT network, but not to receive any packets
from the ALT network.  ETRs only receive these packets from the ALT
network.  They do not send to the ALT network (unless, of course,
they are also an ITR).  ETRs send their mapping replies to the ITRs
via the ordinary Internet.

What you wrote doesn't address my point - that to get from an ALT
router which aggregates one part of the ALT address space to an ALT
router which handles another very different address, such as one
where the most significant bits are different, requires the packet
to ascend the hierarchy to some level where you have organised the
ALT routers as a mesh - and then to descend it again.

That can be a *lot* of GRE tunnels, which wouldn't be so bad if we
thought they were all relatively short.  However, since address
space is scattered all over the world, these ALT routers will be
scattered all over the place too, and it would not be uncommon for
several of these GRE tunnels to traverse the Pacific, or a large
country etc.  K. Sriram's diagram says it all:

 http://www.antd.nist.gov/~ksriram/strong_aggregation.png

router with a direct link to the correct ETR.  Also, each GRE
tunnel may involve traversing multiple physical routers.  This
long path problem was discussed:

Yes, and so what?

Other people understand my critique well and consider it to be a
serious problem with LISP-ALT.

The logical topology can be relatively congruent to the physical
topology

At the interdomain level, the physical topology of the Internet is
very poorly correlated with who has been assigned what address space.

I assume the company which runs an ALT router for some part of the
ALT address space is the company which has been assigned this
address range in the Internet.  If not, who is going to run all
these ALT routers?

Those companies are all over the place and there is very little
correlation between the addresses they use and their location in the
physical topology of the Internet.

So you have LISP ALT routers all over the place, and you form them
into a theoretically tight network with strong aggregation.  But
since they are all over the place, the GRE tunnels between ALT
routers at different levels (there are no GRE tunnels between ALT
routers at the same level, apart from the highest - fully meshed -
level) are frequently going to be *long*.

Since the ALT network is highly aggregated, there is no cut-through
path between the 11.0.0.0/20 ALT router and the 130.0.0.0/20 ALT router
- even if they were physically close to each other.  (If you had
lots of these cut-throughs, then each ALT router would have an
objectionable number of GRE tunnels to maintain, with each one
supporting a full BGP conversation with each of its GRE tunnel
neighbours to keep the ALT system working.)
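
As a rough illustration of why cut-throughs do not scale, this small Python sketch counts tunnel neighbours in a full mesh.  The router counts (2^6 at the top level, 2^20 at level 0) come from my imagined example only, and the function names are made up.

```python
def mesh_peers(n):
    """Tunnel neighbours each router needs in a full mesh of n routers."""
    return n - 1


def mesh_tunnels(n):
    """Total number of tunnels in a full mesh of n routers."""
    return n * (n - 1) // 2


# ASSUMPTION: router counts taken from the imagined example hierarchy.
top_level = 2 ** 6      # fully meshed top-level ALT routers
level_0 = 2 ** 20       # lowest-level ALT routers

print(mesh_peers(top_level), mesh_tunnels(top_level))   # 63 2016
print(mesh_peers(level_0))                               # 1048575
```

A mesh of 64 top-level routers is manageable, but meshing the lowest level would give every router over a million neighbours.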

but because it is logical we want to reap the benefits of it being
logical so we can aggregate EID-prefixes as best we can.

The ALT topology could be a topology where we use addressing based
on Geo location, but more importantly it is based on address
allocation rather than topological allocation.

I can't read this last part of your message in any coherent,
constructive way which addresses my critique.  It seems to be a set
of highly theoretical statements from which I can't reliably
construct an alternative understanding of the ALT network.

You and I both object to people avoiding low-level details and just
talking in abstract concepts too far removed from reality to put to
a proper test - too airy-fairy for anyone to develop a reliable
understanding from.

Please provide a detailed example of how you expect the ALT network
to work with some specified number of EIDs - hundreds of millions or
billions, I assume.  Then, please point out why this long-path
critique has less merit than I and quite a few others think it has.

I suggest you not be shy about doing it at the RRG.  I have asked for
an alternative list, and asked whether this sort of discussion should
move to the RAM list.  There has been no decisive response yet, so
where else can we discuss this?

If you think LISP is developed enough to warrant the IETF creating a
WG, I think you should be able to communicate a detailed response to
constructive critiques such as this.

I also think you should make it easy for people who want to
understand your proposal by establishing a single web page where
everything of importance to LISP-ALT can be found.

 - Robin



