
Re: Starting a discussion on graceful shutdown



Hi Adrian,

On Dec 7, 2004, at 10:38 AM, Adrian Farrel wrote:

Hi,

http://www.ietf.org/internet-drafts/draft-ali-ccamp-mpls-graceful-shutdown-00.txt was
discussed at the meeting in Washington DC, and one of the questions raised was whether or
not we already had (multiple!) mechanisms capable of performing the function described in
the draft.


Kireeti quite rightly suggested that we step back and make sure we understand the
requirements. This is my take on those requirements and I would appreciate it if the
authors of the draft joined in and other people on the list commented.


We wish to manage a link that needs to be taken out of service in some way (data plane
and/or control plane will be disrupted). The link concerned has active LSPs and we wish to
offer upstream LSRs (in particular the ingress) the opportunity to use make-before-break
to re-route the LSPs.



This is indeed the goal. Note that this is just one of many cases for which a graceful shutdown solution is desirable; the action (reroute of existing LSPs, avoidance of the network element for *new* LSPs, reoptimization, ...) may of course vary depending on the event.


In order to achieve this, we need to communicate to the upstream nodes.

Right. Note that this might be the impacted head-end LSRs, the directly upstream neighbors, *or* both, depending on the requested actions.


Should we choose
signaling or routing? Are there benefits that mean we should use both, or should we limit
to just one?



Both have pros and cons, and both are needed ... Routing is, in many cases, more efficient in terms of signaling overhead *but* is limited to a single area ... thus signaling is also required. The use of one or the other should IMO be specific to the graceful shutdown trigger and the expected actions.
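
To illustrate the single-area limitation, here is a minimal sketch (not taken from the draft). It assumes a routing advertisement reaches only head-end LSRs in the same IGP area as the affected link, so any head-end outside that area has to be told via signaling. The function name, the area identifiers and the "routing"/"signaling" labels are all hypothetical.

```python
def mechanisms_needed(link_area, head_end_areas):
    """Return the set of notification mechanisms needed to reach every head-end.

    Assumption: a routing (IGP) advertisement is only seen inside the area of
    the affected link; head-ends in any other area need a signaling message.
    """
    needed = set()
    for area in head_end_areas:
        if area == link_area:
            needed.add("routing")    # head-end sees the flooded advertisement
        else:
            needed.add("signaling")  # head-end is outside the flooding scope
    return needed


# Hypothetical example: the link is in area 0, head-ends sit in areas 0 and 1.
print(sorted(mechanisms_needed(0, [0, 0, 1])))
```

This only captures reachability; the choice may also depend on the trigger and on the action expected from the upstream nodes.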


There is another aspect to be considered. Should we also attempt to protect new path
computations from selecting the link that will be taken out of service?



Again, this depends heavily on the root cause ... Consider:
1) Link to be shut down: all LSPs (old and new) should be rerouted
2) Memory shortage on some equipment: only *new* LSPs should avoid the equipment in question
3) ....
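
The two cases listed above could be captured as a trigger-to-action table. This is only a sketch: the enum names and the mapping structure are hypothetical, and only the two triggers actually named above are included.

```python
from enum import Enum, auto


class Trigger(Enum):
    LINK_SHUTDOWN = auto()    # case 1: link to be shut down
    MEMORY_SHORTAGE = auto()  # case 2: memory shortage on some equipment


class Action(Enum):
    REROUTE_EXISTING_LSPS = auto()  # move established LSPs away
    AVOID_FOR_NEW_LSPS = auto()     # exclude the element from new path computations


# Hypothetical mapping of each trigger to the actions expected upstream.
ACTIONS_BY_TRIGGER = {
    Trigger.LINK_SHUTDOWN: {Action.REROUTE_EXISTING_LSPS, Action.AVOID_FOR_NEW_LSPS},
    Trigger.MEMORY_SHORTAGE: {Action.AVOID_FOR_NEW_LSPS},
}


def actions_for(trigger):
    """Look up the set of actions expected for a given shutdown trigger."""
    return ACTIONS_BY_TRIGGER[trigger]


for trigger in Trigger:
    print(trigger.name, sorted(a.name for a in actions_for(trigger)))
```

Further triggers would add rows to the table without changing the lookup.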


How should we consider the case of a node (data plane and/or control plane) being taken
out of service? Is a node simply a collection of links?



Good question ... it depends again on whether the issue is in the data plane or the control plane.

If a component link of a bundle is being taken out of service (and assuming other
component links are available) is this just an issue for the adjacent nodes or does it
need to be communicated more widely? If the downstream node decides to take the component
link out of service, how does it inform the upstream node?



I think this should be configurable, with the possibility of offering a notification mechanism.


Does it matter whether it is the control plane or the data plane that will be taken out of
service?


Opinions please.

I would like to see these issues and their answers captured clearly in a requirements
section of any draft. Would the authors of draft-ali-ccamp-mpls-graceful-shutdown-00.txt
be willing to take that on even though the end result might be that procedures other than
those they suggest will be selected?

Sure, we will be happy to write such a section.

Cheers.

JP.


Adrian