
Re: Two Drafts for Resilience of Control Plane





Igor

[snip]

1. Yes, I do have an independent dp (not cp as you say) failure detection
mechanism

not sure what's so specific here from a base SONET/SDH GMPLS control

2. Yes, I do negotiate with my neighbors not to drop state in case of cp
failure

ok

3. Yes, I want to control my LSP in such state from a *single* (LSP ingress)
point, because when an LSP is created by a user (via say UNI or management
plane) it is totally up to user to decide what to do with the LSP when CP
fails. The only way the user can control this is by sending another
request(s) to the LSP ingress controller (not to egress or any transit
controller because the user may not even know their IDs). In particular, the
user may decide to continue using the service (after all it is still
carrying its traffic), but in this case the user would want to maintain
control of the LSP (for example, he may want to get notifications about data
plane alarms detected on the nodes beyond point of CP failure, likewise, he
may want to modify the LSP's admin status, setup/holding priorities, etc.)
Alternatively, he may decide to delete or reroute it. Still in any case the
user wants to control/convey its decision only to the LSP ingress
controller.

you have many specific requirements, but i really don't see what's so specific here in terms of resiliency of existing cp mechanisms; you should probably find another name for this kind of feature, as using 'resiliency' is rather confusing

btw, if your data plane is up i am not sure which kind of alarms you may want to receive and/or whether anything prevents you today from receiving these alarms at the sender side?

4. No. I can not do this today because of the deficiency of RSVP - messages
on a contiguous LSP are sent on a hop-by-hop basis, meaning that any hop can
block the message
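
A minimal sketch of the hop-by-hop behaviour described in point 4, assuming a contiguous LSP whose signaling messages are processed by each controller in turn. The function name and the `control_plane_up` map are hypothetical and only illustrate the claim; they are not taken from any RSVP implementation.

```python
def deliver_hop_by_hop(path, control_plane_up):
    """Return the nodes that process a message sent hop by hop from path[0].

    `control_plane_up` maps a node name to False when its controller is
    down (hypothetical model); nodes missing from the map are assumed up.
    """
    reached = []
    for node in path:
        if not control_plane_up.get(node, True):
            # A failed controller can neither process nor forward the
            # message, so every node behind it is cut off as well.
            break
        reached.append(node)
    return reached


path = ["ingress", "transit-1", "transit-2", "egress"]
reached = deliver_hop_by_hop(path, {"transit-1": False})
```

With the controller of `transit-1` down, the message is processed only by `ingress`, even though the data plane of the whole LSP may still be carrying traffic.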

just a simple comment: are you sure that in such a degraded state the user is willing to perform so many different operations on its LSPs - i mean besides service availability and base maintenance operations? it would be nice to have a bit more feedback on this specific aspect

5. Yes, I do have a simple and totally backward compatible solution how to
overcome these deficiencies.

ok ... i think it is the right time for you to write a draft about it

note: don't forget to provide a complete motivation section, and to explain why RSVP mechanisms/others would not be sufficient, as this may seriously help in assessing the value of your proposal

Igor





Igor

----- Original Message ----- From: "dimitri papadimitriou" <dpapadimitriou@psg.com>
To: "Igor Bryskin" <ibryskin@movaz.com>
Cc: <dimitri.papadimitriou@alcatel.be>; "Drake, John E"
<John.E.Drake2@boeing.com>; "Zafar Ali (zali)" <zali@cisco.com>; "Igor
Bryskin" <i_bryskin@yahoo.com>; <drake@movaz.com>; "Kim Young Hwa"
<yhwkim@etri.re.kr>; <ccamp@ops.ietf.org>
Sent: Monday, October 31, 2005 11:52 AM
Subject: Re: Two Drafts for Resilience of Control Plane




igor -

Igor Bryskin wrote:


Dimitri,



igor - my two cents

RSVP over time has progressively borrowed mechanisms from "hard-state"
protocols; explicit deletion using PathTear is the most noticeable and
earliest example of this evolution!

but in any case, RSVP still relies on idempotent soft states that are
flushed when not refreshed after a certain time interval (or
self-maintained if previously negotiated). this prevents orphans in the
network (and so unused resources) and provides for resilience - hence
there is by no means a need to introduce an additional protocol mechanism
to trigger or not such an event via the "control plane" -
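
A minimal sketch of the soft-state behaviour described above, assuming the state lifetime bound and default values given in RFC 2205 (L >= (K + 0.5) * 1.5 * R, with K = 3 and R = 30 seconds). The `SoftStateTable` class and its method names are hypothetical and stand in for whatever a real implementation keeps per node.

```python
from dataclasses import dataclass, field

K = 3      # RFC 2205 suggested default; K-1 successive lost refreshes are tolerated
R = 30.0   # RFC 2205 default refresh period, in seconds


def state_lifetime(refresh_period: float = R, k: int = K) -> float:
    """Lower bound on state lifetime from RFC 2205: (K + 0.5) * 1.5 * R."""
    return (k + 0.5) * 1.5 * refresh_period


@dataclass
class SoftStateTable:
    """Hypothetical per-node table: LSP id -> time of the last Path refresh."""
    lifetime: float = field(default_factory=state_lifetime)
    last_refresh: dict = field(default_factory=dict)

    def refresh(self, lsp_id: str, now: float) -> None:
        # Idempotent: a repeated refresh only re-arms the timer.
        self.last_refresh[lsp_id] = now

    def tear(self, lsp_id: str) -> None:
        # Explicit deletion, the PathTear case.
        self.last_refresh.pop(lsp_id, None)

    def expire(self, now: float) -> list:
        # Implicit deletion: flush every state not refreshed within `lifetime`.
        stale = [lsp for lsp, t in self.last_refresh.items()
                 if now - t > self.lifetime]
        for lsp in stale:
            del self.last_refresh[lsp]
        return stale


table = SoftStateTable()
table.refresh("lsp-1", now=0.0)
table.refresh("lsp-2", now=100.0)
flushed_early = table.expire(now=150.0)
flushed_late = table.expire(now=200.0)
```

Nothing is flushed at t = 150 s because both states are still within their lifetime; at t = 200 s only `lsp-1`, which has not been refreshed since t = 0, is removed. This is the mechanism that keeps orphaned state from accumulating without any explicit teardown message.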


Refreshes are a useful mechanism, but only between neighbors that maintain
Hello communication. In this case the absence of Path refreshes is as good
an indication that the data plane must be destroyed as a received PathTear
message.


the base function of state refresh, and its usefulness, is independent of
hello adjacency maintenance (or any other control channel maintenance)



However, when a controller does not receive Path refreshes from a neighbor
it does not have any control plane communication with, it can assume
neither a problem in the data plane nor an intention to destroy it.

as the node did not negotiate any channel/node fault recovery (due in
part to the absence of a Hello adjacency with its neighbor) and if no
other independent control channel failure is provided (this is an add-on
of RFC3471/3), the simple absence of refresh is simply interpreted as
"implicit deletion"

you are mis-interpreting the following sentence of RFC3471

"   Note that these cases only apply when there are mechanisms to detect
  data channel failures independent of control channel failures."

there is no retro-fit on the use of Refreshes in absence of control
channel failure detection mechanism
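
A minimal sketch of the decision being argued here, under stated assumptions. The message above only covers the case where neither fault recovery was negotiated nor an independent failure detection mechanism exists; treating the mixed cases as implicit deletion too is an assumption of this sketch, as are the function and parameter names.

```python
from enum import Enum


class Action(Enum):
    DELETE_STATE = "implicit deletion"
    RETAIN_STATE = "retain state through the failure"


def on_refresh_timeout(recovery_negotiated: bool,
                       independent_failure_detection: bool) -> Action:
    """Decide what a node does when Path refreshes stop arriving.

    recovery_negotiated: channel/node fault recovery was agreed with the
        neighbor beforehand (hypothetical flag).
    independent_failure_detection: data channel failures can be detected
        independently of control channel failures (hypothetical flag).
    """
    if recovery_negotiated and independent_failure_detection:
        # Fault-handling case: the missing refreshes are attributed to a
        # control channel problem, so state is kept.
        return Action.RETAIN_STATE
    # Plain soft-state behaviour; the mixed cases landing here is an
    # assumption of this sketch, not something stated in the thread.
    return Action.DELETE_STATE


plain_rsvp = on_refresh_timeout(False, False)
with_fault_handling = on_refresh_timeout(True, True)
```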



Hence, as it was specified in RFC3471, it *must* maintain both control and
data plane states throughout the failure.

- d.


Igor




btw, the paragraph you mention in RFC3471 does not say "soft state
protocols do not work well for non-packet environments" this is your
interpretation;

ps: you are still free to make use of RFC3472 in case (as you were
apparently looking for something else ;-)

Igor Bryskin wrote:



John,






States are supposed to be destroyed on explicit signalling
message (e.g. PathTear or PathErr with the state removal
flag), but not because of the absence of refreshes.


[JD]

Igor,

Just to be clear, we are talking about RSVP here, and RSVP *is* a soft
state protocol.  Can you point to any RFC that supports your statements
above?

IB>> Oh, come on, John. You sound like you've been yourself in a dormant
state for a while :=). We've gone a long way since RFC2205. In RFC3471,
for example, there is a discussion of why GMPLS is needed and how it is
different from MPLS. One of the differences is the fact that soft state
protocols do not work well for non-packet environments. Here is from
RFC3471:



9.2. Fault Handling

   There are two new faults that must be handled when the control
   channel is independent of the data channel.  In the first, there is
   a link or other type of failure that limits the ability of
   neighboring nodes to pass control messages.  In this situation,
   neighboring nodes are unable to exchange control messages for a
   period of time.  Once communication is restored the underlying
   signaling protocol must indicate that the nodes have maintained
   their state through the failure.

What is more important is the reality of life: The customers simply say
that you cannot destroy a user service (or even force any traffic hits)
just because you have a problem in the control plane. If this does not fit
your soft-state paradigm, then "harden" your protocols or flush them down
the toilet and come up with something else if you want our business. After
all, if we provision the services via NMS, we do not have to destroy the
services if we have problems in the management network. It is that simple.



Igor






