
RE: [RRG] cache issues in LISP and CONS



    > From: <louise.burness@bt.com>

    > I thought the intention was somehow to have the mapping system partly
    > responsible for multi-homing (ID could map to multiple edge routers).
    > Surely if we need to introduce aggressive caching it would make
    > multi-home failover slow?

Ow. Good question! This margin is definitely too small to fully explore this!
But I think the answer is no, generally.

First, you have to distinguish between a couple of cases; ongoing
communications which were already in progress when the failure happens, and
new communications.


Caching will not have much impact on the ongoing communications cases - not
if the cached entries include mappings from EID's to RLOC sets, and not just
individual RLOC's.

I mean, whether there is caching or not, in either case there is an entity X
that is reachable via a number of RLOC's, and when one stops working, other
entities Y1...YN which are communicating with X need to find that out, so
they can switch to other RLOC's. Whether the Ys got those RLOC sets by
communicating directly with X's aCAR, or those RLOC sets were cached, there's
no difference.
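
To make the point above concrete, here is a minimal sketch, assuming a
mapping entry that holds the whole RLOC set for an EID. The class name
`MappingEntry`, its fields, and the example RLOC names are hypothetical,
not part of any LISP or CONS specification. The switch-over logic never
asks where the entry came from, which is why caching makes no difference
here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class MappingEntry:
    """One EID's mapping: the full RLOC set, not a single RLOC (hypothetical)."""
    eid: str
    rlocs: List[str]                               # RLOCs in preference order
    down: Set[str] = field(default_factory=set)    # RLOCs this sender believes are broken

    def pick_rloc(self) -> Optional[str]:
        """Return the first RLOC not marked down, or None if all are down."""
        for rloc in self.rlocs:
            if rloc not in self.down:
                return rloc
        return None

    def mark_down(self, rloc: str) -> None:
        """Record that this sender has found the given RLOC to be non-working."""
        self.down.add(rloc)


# Y holds this entry; whether it came directly from X's aCAR or from a
# cache, the failover steps below are identical.
entry = MappingEntry("eid-X", ["rloc-1", "rloc-2"])
first = entry.pick_rloc()     # RLOC in use before the failure
entry.mark_down(first)        # Y finds out that it stopped working
second = entry.pick_rloc()    # Y switches to another RLOC in the set
```

If the entry held only a single RLOC, `pick_rloc()` would return None
after the failure and Y would have to go back to the mapping system.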

Of course, then one has to decide whether each Ym is supposed to figure it out
on their own, or whether some other agent Q does it on their behalf, and then
notifies all the Ym, and if you do that how does Q know where all the Ym are,
yadda, yadda.

(See:

  http://ana-3.lcs.mit.edu/~jnc/tech/nsrg/multihoming_points.txt
  http://ana-3.lcs.mit.edu/~jnc/tech/nsrg/multihoming_more.txt

for a bit more thinking about what has to happen, and how to minimize
the overhead, in all that 'recovering from a failure' stuff.)


Caching might have an impact on the new communication cases - but only if one
assumes that on a temporary failure, the aCAR entries will be updated to
remove the 'broken' RLOC. If not, caching clearly has no effect.

I'm unsure that we will update the mapping database, any more than we update
the DNS in such cases now. Although I suppose we are talking about doing
mobility with this mapping system, so I suppose it's possible.

Also, if we do support multihoming failover for ongoing communication (above)
those mechanisms will also work in this case, to find out that there's a
non-working RLOC, and stop using it, so if we solve that one, we've
automatically solved this one (although we'll still have a performance hit,
as we'll have to go through the 'discover RLOC Rx1 isn't working' cycle).
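
A rough sketch of that performance hit, assuming the sender simply walks
the RLOC set in order and probes each one. The function name, the `probe`
callback, and the RLOC name Rx2 are assumptions for illustration; how a
real system would detect a dead RLOC is exactly the open question.

```python
from typing import Callable, Optional, Sequence, Tuple


def first_working_rloc(
    rlocs: Sequence[str],
    probe: Callable[[str], bool],
) -> Tuple[Optional[str], int]:
    """Return (first RLOC that answers the probe, number of failed probes).

    The failed-probe count stands in for the cost of each
    'discover this RLOC isn't working' cycle a new communication pays.
    """
    failed = 0
    for rloc in rlocs:
        if probe(rloc):
            return rloc, failed
        failed += 1
    return None, failed


# The mapping (cached or not) still lists Rx1, which is currently broken.
chosen, wasted = first_working_rloc(["Rx1", "Rx2"], lambda r: r != "Rx1")
```

Here the new communication still ends up on a working RLOC, but only
after one wasted discovery cycle on Rx1.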


This is just scratching the surface of a complex topic; I don't know if
anyone has analyzed in detail how we will support the various kinds of
multi-homing, and yes, that's something that should be done 'soon'. Again,
there are complex questions of performance versus complexity versus
overhead...


    > Also it would make it hard for the mapping system to provide traffic
    > engineering?

It depends on whether (and if so, how often) doing traffic engineering would
require changing the EID->RLOC binding(s). If so, I can imagine e.g. putting
smaller lifetimes on binding entries as a way to do this. Yes, that will
increase the overhead of operating the system, but TANSTAAFL.
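
As an illustration of the lifetime idea, here is a minimal sketch of a
cache whose entries expire, assuming each binding arrives with a lifetime
in seconds. `BindingCache` and its methods are hypothetical names, not an
interface from any mapping-system proposal.

```python
import time


class BindingCache:
    """EID -> RLOC-set cache where each entry carries a lifetime (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the clock can be faked
        self._entries = {}       # eid -> (tuple of RLOCs, expiry time)

    def put(self, eid, rlocs, lifetime_s):
        """Store a binding that is valid for lifetime_s seconds from now."""
        self._entries[eid] = (tuple(rlocs), self._clock() + lifetime_s)

    def get(self, eid):
        """Return the cached RLOC set, or None if it is absent or has expired."""
        item = self._entries.get(eid)
        if item is None:
            return None
        rlocs, expires = item
        if self._clock() >= expires:
            del self._entries[eid]   # expired: caller must ask the mapping system again
            return None
        return rlocs
```

With a smaller `lifetime_s`, an outdated binding is used for at most that
long after it was fetched, at the price of more frequent lookups.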

Note, however, that in this case it's probably not the end of the world if
someone keeps using an old EID->RLOC binding, unlike the 'busted link' case
above. There, keep using the old binding -> no communication. Here, keep
using the old binding -> traffic using that old binding for a while (until it
times out) takes a path other than the desired one.

<Repeat previous closing comment...>

	Noel
