RE: CPE router acting as host on its WAN interface (RE: draft-ietf-v6ops-ipv6-cpe-router-03.txt WGLC)
Wes,
We are getting closer, but there are still a couple of
things to clarify:
> -----Original Message-----
> From: Wes Beebee (wbeebee) [mailto:wbeebee@cisco.com]
> Sent: Wednesday, January 06, 2010 11:11 AM
> To: Templin, Fred L; Hemant Singh (shemant); Fred Baker (fred); v6ops@ops.ietf.org
> Cc: kurtis@kurtis.pp.se; rbonica@juniper.net
> Subject: RE: CPE router acting as host on its WAN interface (RE:
> draft-ietf-v6ops-ipv6-cpe-router-03.txt WGLC)
>
> > Not quite; the OS seems to clear *FIB entries* based on the setting of
> > the IsRouter flag in the neighbor cache entry corresponding to the
> > nexthop. The OS does not clear entries in the nbr cache.
>
> From RFC 4861:
>
> "Router Solicitations in which the Source Address is the unspecified
> address MUST NOT update the router's Neighbor Cache; solicitations
> with a proper source address update the Neighbor Cache as follows.
> ...
> Whether or not a Source Link-Layer
> Address option is provided, if a Neighbor Cache entry for the
> solicitation's sender exists (or is created) the entry's IsRouter
> flag MUST be set to FALSE."
>
> > But, if the CE router subsequently sends an NA message with the R bit
> > (i.e., the Router bit) set to TRUE, the SP router will set IsRouter in
> > the nbr cache entry to TRUE and the danger of FIB entry deletion is
> > averted.
>
> Well, the CE Router may need to receive an RA in order to know how to do
> address acquisition on its WAN interface (doing SLAAC/DHCP, etc.).
> Waiting for a periodic RA may not be feasible in some deployments, so a
> CE Router MAY send an RS in order to increase the chances of receiving
> an RA in a timely manner. We don't want to block CE Routers from ever
> sending RS's on their WAN interface.
I agree that the CE router will very likely need to
send a solicitation of some kind in order to receive a
more timely RA from the SP router.
> Garbage collecting the FIB entries based on IsRouter value in the
> Neighbor Cache is not specifically prohibited by RFC 4861 - so we're not
> talking about a non-compliance issue.
Right.
> From RFC 4861:
>
> "To limit the storage needed for the Destination and Neighbor Caches,
> a node may need to garbage-collect old entries. However, care must
> be taken to ensure that sufficient space is always present to hold
> the working set of active entries. A small cache may result in an
> excessive number of Neighbor Discovery messages if entries are
> discarded and rebuilt in quick succession. Any Least Recently Used
> (LRU)-based policy that only reclaims entries that have not been used
> in some time (e.g., ten minutes or more) should be adequate for
> garbage-collecting unused entries.
>
> A node should retain entries in the Default Router List and the
> Prefix List until their lifetimes expire. However, a node may
> garbage-collect entries prematurely if it is low on memory. If not
> all routers are kept on the Default Router list, a node should retain
> at least two entries in the Default Router List (and preferably more)
> in order to maintain robust connectivity for off-link destinations."
While this text is fine as quoted from the RFC, it does not
really apply to the issue we are discussing. We are concerned
with the case of an implementation garbage collecting FIB
entries based on the IsRouter setting in the neighbor cache
entry for the nexthop. This has nothing to do with memory
limitation, LRU, etc. - it is rather based on a policy
decision that routes not be allowed to use a non-router as
the next hop.
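To make the distinction concrete, here is a rough sketch in
Python of the kind of policy check I have in mind. It is not
taken from Linux or any other implementation; the names
(Neighbor, Route, purge_non_router_routes) are hypothetical,
and keeping routes whose next hop has no neighbor cache entry
at all is my own assumption.

```python
from dataclasses import dataclass


@dataclass
class Neighbor:
    """Hypothetical neighbor cache entry; only IsRouter matters here."""
    is_router: bool


@dataclass
class Route:
    """Hypothetical FIB entry: a prefix reached through a next-hop address."""
    prefix: str
    nexthop: str


def purge_non_router_routes(fib, nbr_cache):
    """Return the routes that survive a policy-based purge.

    A route is dropped when the neighbor cache entry for its next hop
    has IsRouter set to FALSE. There is no LRU or low-memory test here,
    and the neighbor cache itself is never modified.
    """
    kept = []
    for route in fib:
        nbr = nbr_cache.get(route.nexthop)
        # Assumption: a next hop with no cache entry is left alone.
        if nbr is not None and not nbr.is_router:
            continue
        kept.append(route)
    return kept


# The SP router's view after the CE router (fe80::ce) has sent an RS.
nbr_cache = {
    "fe80::ce": Neighbor(is_router=False),
    "fe80::2": Neighbor(is_router=True),
}
fib = [
    Route("2001:db8:1::/48", "fe80::ce"),
    Route("2001:db8:2::/48", "fe80::2"),
]
print([r.prefix for r in purge_non_router_routes(fib, nbr_cache)])
# prints ['2001:db8:2::/48']
```

Note that the route through the CE router disappears even
though both neighbor cache entries are still present and
nothing is short on space.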
> And, sending a gratuitous NA after an RA solely for the purpose of
> preventing Linux running on the SP from GC'ing the CE Router entry has
> the problems that you've already identified, and seems like a hack:
>
> > Two problems with this however. First, it requires the CE router to
> > send a gratuitous NA message. Secondly, the CE router has no way of
> > knowing if the SP router has received the NA message.
>
> I think the only other options are to say "don't GC if IsRouter is
> FALSE" to Linux, which may not be an option if you run out of space, or
> make sure that there's enough space that you don't GC more often than
> you'd expect traffic from the CE Router to keep the entries alive, which
> is already recommended by RFC 4861:
>
> "However, care must be taken to ensure that sufficient space is always
> present to hold the working set of active entries."
Again, this is not a memory limitation consideration; it is
a policy consideration. Linux is just one example of an OS
that seems to have adopted the policy of not allowing FIB
entries that use a non-router as the next hop. We don't
know what other implementations there are that might adopt
such a policy.
> I think we've analyzed the problem fully now. From a specification
> standpoint, I don't know what you want us to do.
It seems that there needs to be some way to either
prevent the SP router from setting the IsRouter flag
to FALSE when the CE temporarily acts as a host or
to reset the flag to TRUE when the CE begins acting
as a router. I'm not sure there is a way to do either
of these without involving the SP router.
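As a rough illustration of the flag transitions on the SP
router, here is a small Python sketch. It is a simplification
under stated assumptions, not a full RFC 4861 implementation:
the class and method names are hypothetical, the RS is assumed
to carry a Source Link-Layer Address option so that an entry
is created, and the NA handling ignores the Override flag and
link-layer address comparison.

```python
UNSPECIFIED = "::"


class SpNeighborCache:
    """Hypothetical SP-router neighbor cache tracking only IsRouter."""

    def __init__(self):
        self.is_router = {}  # neighbor address -> IsRouter flag

    def on_router_solicitation(self, src):
        # An RS from the unspecified address must not update the cache.
        if src == UNSPECIFIED:
            return
        # Otherwise the sender's entry is created or updated with
        # IsRouter set to FALSE.
        self.is_router[src] = False

    def on_neighbor_advertisement(self, src, r_bit):
        # An NA for which no entry exists is ignored in this sketch.
        if src not in self.is_router:
            return
        # IsRouter follows the R bit carried in the advertisement.
        self.is_router[src] = r_bit


sp = SpNeighborCache()
sp.on_router_solicitation("fe80::ce")      # CE acts as a host
print(sp.is_router["fe80::ce"])            # prints False
sp.on_neighbor_advertisement("fe80::ce", r_bit=True)  # gratuitous NA
print(sp.is_router["fe80::ce"])            # prints True
```

The window between the two calls is where a policy-based FIB
purge could strike, and nothing in the sketch tells the CE
router whether its NA was ever processed.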
> From a practical
> implementation standpoint, I think you know what your options are.
I'm not sure I understand this part. This is an
operational consideration for which we see one
implementation that may be affected but we have
no way of knowing what other implementations
could be affected.
Fred
fred.l.templin@boeing.com
> - Wes
>
> -----Original Message-----
> From: owner-v6ops@ops.ietf.org [mailto:owner-v6ops@ops.ietf.org] On
> Behalf Of Templin, Fred L
> Sent: Wednesday, January 06, 2010 12:08 PM
> To: Hemant Singh (shemant); Fred Baker (fred); v6ops@ops.ietf.org
> Cc: kurtis@kurtis.pp.se; rbonica@juniper.net
> Subject: RE: CPE router acting as host on its WAN interface (RE:
> draft-ietf-v6ops-ipv6-cpe-router-03.txt WGLC)
>
> Hemant,
>
> > -----Original Message-----
> > From: Hemant Singh (shemant) [mailto:shemant@cisco.com]
> > Sent: Wednesday, January 06, 2010 8:24 AM
> > To: Templin, Fred L; Fred Baker (fred); v6ops@ops.ietf.org
> > Cc: kurtis@kurtis.pp.se; rbonica@juniper.net
> > Subject: RE: CPE router acting as host on its WAN interface (RE:
> > draft-ietf-v6ops-ipv6-cpe-router-03.txt WGLC)
> >
> > Fred,
> >
> > It's a well-known problem in Linux that the OS incorrectly combined
> > the Neighbor Cache and the Destination cache causing data forwarding
> > failures and incorrect on-link assumptions. This problem you are
> > alluding to about the IsRouter is another bug in the Linux code as to
> > why the OS has FIB clearing entries in the Neighbor Cache?
>
> Not quite; the OS seems to clear *FIB entries* based on the setting of
> the IsRouter flag in the neighbor cache entry corresponding to the
> nexthop. The OS does not clear entries in the nbr cache.
>
> > The FIB is
> > the Prefix List, the Destination Cache, and the Default Router List;
> > the FIB should not touch the Neighbor Cache. I do grant you an OS can
> > independently garbage collect entries in the Neighbor Cache and the OS
> > is also not non-compliant for ND if the OS deletes entries in the
> > Neighbor Cache with IsRouter flag set to FALSE. Note ND RFC 4861 does
> > not say anything about garbage collecting entries in the Neighbor
> > Cache with IsRouter flag set to FALSE.
>
> No, I am not talking about garbage collecting *nbr cache* entries based
> on IsRouter; I am talking about garbage collecting *FIB entries* which
> can lead to loss of connectivity. I have said this a number of times
> now. Wes said it in his message, too.
>
> > Now, when anyone reports a bug to me, I try to ascertain the severity
> > of the bug. The issue you raise does not look severe to me. It's a
> > temporary problem that can fix itself.
>
> Fix itself how? Once the FIB entry is gone there would need to be some
> protocol for bringing it back and I don't see that specified anywhere.
> And, unless the nbr cache entry IsRouter flag gets set to TRUE, the FIB
> entry could just be garbage collected all over again resulting in the
> same loss of connectivity.
>
> > If an OS has this garbage
> > collection nuance and the Neighbor Cache entry is deleted, when the
> > next packet needs to be sent to the node whose entry was deleted in
> > the SP rtr, ND address resolution will take place and resolve the
> > address causing the Neighbor Cache to be populated again. ND also
> > specifies the packet be held in a queue till the packet's destination
> > is resolved - so the SP rtr is not likely to drop any packets.
>
> See above - it is FIB entry deletion and not nbr cache entry deletion
> that concerns me.
>
> > Wes already asked, what if the CE Rtr always sets the IsRouter flag in
> > ND messages where this flag is possible to be set and that should take
> > care of your Linux problem. If the CE Rtr sends an NA, the CE Rtr
> > will set the IsRouter flag to TRUE.
>
> I already said this both in an off-list message and more recently
> on-list. If the CE router sends an RS, then the SP router will set
> IsRouter in its nbr cache entry for the CE router to FALSE. But, if the
> CE router subsequently sends an NA message with the R bit (i.e., the
> Router bit) set to TRUE, the SP router will set IsRouter in the nbr
> cache entry to TRUE and the danger of FIB entry deletion is averted.
>
> Two problems with this however. First, it requires the CE router to send
> a gratuitous NA message. Secondly, the CE router has no way of knowing
> if the SP router has received the NA message.
>
> > Did we miss anything?
>
> Yes, but I think I clarified it above?
>
> Fred
> fred.l.templin@boeing.com
>
> > Thanks,
> >
> > Hemant