
Re: [RRG] cache issues in LISP and CONS



> The issue regarding packet drops with LISP and LISP-CONS has been
> brought up a few times on the list.
I know, everyone tends to focus on corner cases. But please remember
this only applies to the first source in the source-site sending to the
first destination in the destination-site.
We have lived with this problem with ARP and ND for years. I know
it's a bit different and localized to a LAN, but server-based switch
networks are really large, with tens of thousands of end-hosts
attached, so this problem occurs more often than people would think,
and we have lived with it (because it isn't much of an issue IMO).
If we become too pedantic about solving this, it could lead to a very  
complicated design which will prohibit deployment as well.
So we have to carefully choose our poison.

> Basically, packets are dropped for every ITR cache miss.  Since CONS
> mapping requests may take a long time to be satisfied, this may result
> in unacceptable service.
If people think this is a show-stopper, we could recommend that
implementations queue a small number of packets. But that causes
unnecessary resource utilization in the implementation. Most (if not
all) IPv4/ARP implementations drop packets, but the IPv6-ND test suites
required queuing exactly one packet (and different Tahi tests indicated
queuing either the first one or the last one). So how does queuing
exactly one packet actually solve the problem?
Therefore, we have to watch what we ask for from a design.
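
To make the queuing trade-off concrete, here is a minimal Python sketch
of an ITR map-cache under three miss policies: drop everything, queue
only the first packet, or queue only the last one. It is an
illustration, not LISP code: the class ItrMapCache, its methods, the
burst() helper, and the exact-match lookup on a destination EID (a real
ITR would do a longest-prefix match on EID-prefixes) are all
assumptions made for the example.

```python
from collections import deque


class ItrMapCache:
    """Toy ITR map-cache (hypothetical; exact-match on destination EID)."""

    def __init__(self, queue_limit=0, keep="first"):
        self.mappings = {}              # destination EID -> RLOC
        self.pending = {}               # destination EID -> packets awaiting a mapping
        self.queue_limit = queue_limit  # 0 means drop on every miss
        self.keep = keep                # "first" or "last" when the queue is full
        self.dropped = 0

    def send(self, dest_eid, packet):
        """Return the (rloc, packet) pairs that can be forwarded right now."""
        rloc = self.mappings.get(dest_eid)
        if rloc is not None:
            return [(rloc, packet)]     # cache hit: encapsulate and forward
        queue = self.pending.setdefault(dest_eid, deque())
        if len(queue) < self.queue_limit:
            queue.append(packet)        # room left: hold the packet
        elif self.queue_limit > 0 and self.keep == "last":
            queue.popleft()             # evict the oldest, keep the newest
            queue.append(packet)
            self.dropped += 1
        else:
            self.dropped += 1           # no room (or no queue): drop
        return []

    def mapping_reply(self, dest_eid, rloc):
        """Install the mapping and release whatever was queued for it."""
        self.mappings[dest_eid] = rloc
        queue = self.pending.pop(dest_eid, deque())
        return [(rloc, packet) for packet in queue]


def burst(cache, dest_eid="eid-1", rloc="rloc-1", count=3):
    """Send `count` packets before the mapping arrives, then deliver it."""
    for i in range(1, count + 1):
        cache.send(dest_eid, f"p{i}")
    return cache.mapping_reply(dest_eid, rloc), cache.dropped
```

For a burst of three packets that arrives before the mapping reply,
both one-packet policies still lose two of the three; they only differ
in which packet survives. Only a deeper queue changes that, and it pays
for it with per-destination buffer state in the ITR.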

> Suggestions have been made to route packets on the old topology in the
> event of ITR cache misses. However, this leads to a major incremental
> deployment issue -- since LISP adopters will still need to maintain
> their routes in the old topology, there would be no reduction in the
> size of the global routing table.
The routes in the old topology are aggregatable RLOCs that map to
topology. The routes (EID-prefixes) used in the new topology are
highly aggregatable because they are based on allocation hierarchy. So
the PI prefixes that were in the old topology can become a smaller set
of routes in the new topology.
We are very close to testing this alternate topology idea, because it
requires really no new code development to make it happen.
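
The aggregation argument above can be illustrated with a short Python
sketch using the standard ipaddress module. The prefixes are made up
for the example (documentation and private ranges), and route_count()
is a hypothetical helper, not part of any LISP implementation. It only
shows that prefixes handed out along an allocation hierarchy collapse
into fewer routes, while scattered PI prefixes do not.

```python
import ipaddress


def route_count(prefixes):
    """Number of routes left after merging adjacent or overlapping prefixes."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return len(list(ipaddress.collapse_addresses(networks)))


# Hypothetical PI prefixes scattered across the address space:
# nothing can be merged, so each one costs a routing-table entry.
scattered_pi = ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"]

# Hypothetical EID-prefixes allocated from one block of the hierarchy:
# four contiguous /24s can be carried as a single /22.
hierarchical_eids = ["10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24"]

print(route_count(scattered_pi))       # one route per PI prefix
print(route_count(hierarchical_eids))  # one covering aggregate
```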
> I have not seen any other suggestions on how to handle this issue.
> Could this be a fundamental problem with the design, or are there other
> solutions?
The problem is not in the design. The problem is just a hard problem  
to solve.
Dino

--
to unsubscribe send a message to rrg-request@psg.com with the
word 'unsubscribe' in a single line as the message text body.
archive: <http://psg.com/lists/rrg/> & ftp://psg.com/pub/lists/rrg