
Re: draft-aruns-ccamp-rsvp-restart-ext-00

To: Adrian Farrel <adrian@olddog.co.uk>
Subject: Re: draft-aruns-ccamp-rsvp-restart-ext-00
From: Reshad Rahman <rrahman@cisco.com>
Date: Sat, 06 Mar 2004 09:12:52 -0500
Cc: Nic Neate <Nic.Neate@dataconnection.com>, aruns@movaz.com, Movaz Networks - Louis Berger <lberger@movaz.com>, dimitri.papadimitriou@alcatel.be, ccamp@ops.ietf.org
In-reply-to: <024001c40379$2fb0d260$ece325da@Puppy>
References: <53F74F5A7B94D511841C00B0D0AB16F8028708CF@baker.datcon.co.uk> <024001c40379$2fb0d260$ece325da@Puppy>
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)

Hi all,

Comments inline.

Adrian Farrel wrote:

Hi Nic,

I've just read your draft-aruns-ccamp-rsvp-restart-ext-00 and it looks good.
In particular, we've been looking at using Restart for Fast Reroute LSPs for
some time and this draft provides everything that is needed (like recovering
the FAST_REROUTE, DETOUR, SENDER_TEMPLATE and ERO
objects from the downstream node when they are not available from upstream).


Good. This concern was also raised in Seoul, and I am pleased to hear that the draft
addresses these requirements.

However, I have a couple of concerns (not related to Fast Reroute).

 - Your draft doesn't tackle, and won't work for, simultaneous restart of
adjacent nodes.  This is a problem that is tackled by
draft-rahman-ccamp-rsvp-restart-extensions, so merging the two drafts in
some way may be the best way to resolve that.  I realize that the Aruns
draft aims to make Restart possible for nodes which cannot retrieve state
from the data plane, and in that case recovering from simultaneous restart
of adjacent nodes isn't easy.  I think including some further extensions for
nodes which can retrieve some state from the data plane would be
appropriate.


Retrieving state from the data plane only answers half of the problem. However, it is
certainly important to audit the recovered control plane information against the known
data plane state.

With regard to adjacent node failures and restarts, I believe there are actually
sufficient capabilities here. Perhaps the authors would like to include text to clarify
the procedures.

 - The back compatibility with RFC 3473 restart looks risky.  Draft Aruns
mandates that restarted nodes don't send Path Refreshes until either the
recovery period expires or a RecoveryPath is received from downstream.  In
the case that the downstream node only supports RFC 3473 restart (and so
doesn't send RecoveryPaths), it may well time out Path state at the same time
as or very soon after the recovery period expires.  Hence a dangerous timing
window is created.


You have something here.
However, section 9.5.3 of RFC 3473 does not say that the neighbor MUST discard state that
is not restored in the recovery time interval. Presumably it would simply recommence
waiting for state refresh and so would time out 3.5 refresh intervals after the end
of the recovery interval.
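
To make the two readings of the timing window concrete, here is a minimal sketch of the arithmetic. The function names and the `strict` flag are illustrative assumptions, not anything defined in RFC 3473 or the draft; the 3.5 multiplier is simply the figure quoted above, and all times are in milliseconds measured from the start of the recovery period.

```python
def path_state_timeout_ms(recovery_time_ms, refresh_interval_ms, strict=False):
    """Time at which a 3473-only neighbor would drop unrefreshed Path state.

    strict=True models a neighbor that discards state as soon as the
    recovery period ends; strict=False models the reading above, where
    the neighbor falls back to normal refresh timeout (3.5 intervals).
    """
    if strict:
        return recovery_time_ms
    return recovery_time_ms + 3.5 * refresh_interval_ms


def refresh_margin_ms(recovery_time_ms, refresh_interval_ms, strict=False):
    """Slack between the restarted node's first Path refresh and the timeout.

    Assumes the restarted node holds back its Path refresh until the
    recovery period expires because no RecoveryPath ever arrives.
    """
    first_refresh_ms = recovery_time_ms
    timeout_ms = path_state_timeout_ms(recovery_time_ms, refresh_interval_ms, strict)
    return timeout_ms - first_refresh_ms


if __name__ == "__main__":
    # Hypothetical values: 60 s recovery time, 30 s refresh interval.
    print(refresh_margin_ms(60_000, 30_000, strict=True))
    print(refresh_margin_ms(60_000, 30_000, strict=False))
```

Under the strict reading the margin is zero, which is the dangerous window Nic describes; under the lenient reading the margin is 3.5 refresh intervals.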

Some compromise may be introduced here by noting that RFC 3473 says that Path state SHOULD be
restored within 1/2 of the recovery time. So we could follow this logic and use the first
half of the time interval for the RecoveryPath message and the second half for backwards
compatible recovery.
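
One possible reading of this split, as a rough sketch: the restarting node waits for a RecoveryPath during the first half of the recovery time and, if none has arrived, falls back to sending ordinary Path refreshes during the second half. The phase names and the function itself are hypothetical and only illustrate the idea; nothing here is taken from the draft.

```python
def recovery_phase(elapsed_ms, recovery_time_ms):
    """Classify where a restarting node is within its recovery period.

    elapsed_ms is the time since the recovery period began.
    """
    if elapsed_ms < 0:
        raise ValueError("elapsed_ms must be non-negative")
    if elapsed_ms < recovery_time_ms / 2:
        # First half: hold Path refreshes and wait for a RecoveryPath.
        return "await_recovery_path"
    if elapsed_ms < recovery_time_ms:
        # Second half: assume a 3473-only neighbor and refresh as before.
        return "legacy_path_refresh"
    return "recovery_expired"


if __name__ == "__main__":
    # Hypothetical 60 s recovery time, sampled at three points.
    for t in (10_000, 45_000, 60_000):
        print(t, recovery_phase(t, 60_000))
```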

On the other hand, I would prefer that this new capability (support for RecoveryPath
message) was signaled in the Restart_Capabilities object so that the restarting node can
know whether to expect to receive a RecoveryPath or not.

It's a good idea for the restarting node to know whether it should expect RecoveryPath messages. There doesn't seem to be any room in the Restart_Cap object for this info, so it looks like we'd need an extension.
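
To illustrate why there is no room, here is a small sketch that packs a Restart_Cap object as I understand its layout from RFC 3473: a standard RSVP object header (Length, Class-Num 131, C-Type 1) followed by two 32-bit fields, Restart Time and Recovery Time, both in milliseconds. The helper name is an assumption for illustration, and the layout should be checked against the RFC before relying on it.

```python
import struct

RESTART_CAP_CLASS_NUM = 131  # as recalled from RFC 3473; verify before use
RESTART_CAP_C_TYPE = 1


def pack_restart_cap(restart_time_ms, recovery_time_ms):
    """Encode a Restart_Cap object in network byte order."""
    # Body: two unsigned 32-bit values, with no flags or reserved bits.
    body = struct.pack("!II", restart_time_ms, recovery_time_ms)
    # Header: 16-bit total length, 8-bit Class-Num, 8-bit C-Type.
    header = struct.pack("!HBB", 4 + len(body),
                         RESTART_CAP_CLASS_NUM, RESTART_CAP_C_TYPE)
    return header + body


if __name__ == "__main__":
    obj = pack_restart_cap(30_000, 60_000)
    print(len(obj), obj.hex())
```

Every bit of the body is taken by one of the two timers, so a "RecoveryPath supported" indication would need a new C-Type, a new object, or some other extension.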

I would also like to see a mechanism where each node could indicate on a per-LSP basis (e.g. at setup time) whether it would want the RecoveryPath message for that LSP after it restarts.

Regards,
Reshad.

As a potential solution to both problems I'd suggest that a restarting node
receiving a Path message with a recovery label should always forward it
immediately as well as it can, and include both a recovery label and (for
back compatibility) a suggested label.  Similarly, it should forward
RecoveryPath messages immediately as well as it can.  I'd be happy to
discuss any of this further.


This sounds very dangerous.
"As well as it can" may include path computation which may pick a path other than the one
previously in use. Hence the new Path message will be sent to a new neighbor. This
disaster is no better than the problem we are trying to solve.

Cheers,
Adrian
