RE: Two Drafts for Resilience of Control Plane

To: "Drake, John E" <John.E.Drake2@boeing.com>
Subject: RE: Two Drafts for Resilience of Control Plane
From: ibryskin@movaz.com
Date: Sat, 29 Oct 2005 10:19:21 -0400 (EDT)
Cc: ibryskin@movaz.com, dpapadimitriou@psg.com, dimitri.papadimitriou@alcatel.be, "Igor Bryskin" <i_bryskin@yahoo.com>, "Zafar Ali" <zali@cisco.com>, "Kim Young Hwa" <yhwkim@etri.re.kr>, ccamp@ops.ietf.org
In-reply-to: <626FC7C6A97381468FB872072AB5DDC8369711@XCH-SW-42.sw.nos.boeing.com>
References: <626FC7C6A97381468FB872072AB5DDC8369711@XCH-SW-42.sw.nos.boeing.com>
User-agent: SquirrelMail/1.4.1
John,

> Igor,
>
> You haven't convinced me that there is a real problem here that is not
> addressed by the combination of the existing GMPLS resilience mechanisms
> and a robust implementation of those mechanisms.

What exactly have I failed to convince you of?
a) That the example is a real problem? I am telling you that this is a
problem raised by customers; how much more real could it be?

b) That the problem cannot be solved by the existing machinery? I have
described the problem in detail; please tell me how I can solve it.

Have a great weekend.

Igor

>
> As I said yesterday, an I-D that enumerated all of the resilience
> mechanisms might be a useful thing, specifically as a primer.
>
> Thanks,
>
> John
>
>> -----Original Message-----
>> From: ibryskin@movaz.com [mailto:ibryskin@movaz.com]
>> Sent: Saturday, October 29, 2005 6:38 AM
>> To: Drake, John E
>> Cc: ibryskin@movaz.com; dpapadimitriou@psg.com;
>> dimitri.papadimitriou@alcatel.be; Igor Bryskin; Zafar Ali; Kim Young Hwa;
>> ccamp@ops.ietf.org
>> Subject: RE: Two Drafts for Resilience of Control Plane
>>
>> John,
>>
>> See in line.
>>
>> Igor
>>
>> > Igor,
>> >
>> > What you wrote was:
>> >
>> > "Suppose one or more signaling controllers managing some LSP went out of
>> > service leaving the LSP's data plane intact. As far as the user is
>> > concerned such LSP is perfectly healthy and operational.  Such situation
>> > could last for a considerable period of time."
>> >
>> > What part of this is *not* handled by RSVP graceful restart?
>> >
>> > In your subsequent e-mail, you then changed the problem statement to:
>> >
>> > ""Dead" controllers in my example *do not* come back for a considerable
>> > period of time. So there are no restarts here (graceful or not
>> > graceful)"
>>
>> Sorry, I don't see how I have changed the problem statement. I was and am
>> saying that while controllers are out of service for a considerable time
>> (a day? two days? a week?), the question is what to do with the active LSPs
>> associated with them. Let's consider an example:
>>
>>
>> A----B------C-----D
>> |                 |
>> E-----F-----H-----K
>>
>> Suppose we have an LSP A-B-C-D carrying user traffic, and the controller
>> managing node B goes out of service. The question is what to do with this
>> LSP until the controller comes back. The operator may decide to:
>> a)	simply not wait and delete the LSP. Normal LSP teardown (a PathTear
>> originated on the ingress controller) won't work, because the PathTear
>> won't make it to the controllers managing nodes C and D, leaving the
>> resources associated with the LSP (very expensive in the optical layer)
>> allocated and not available for other LSPs;
>> b)	reroute the LSP via mb4b onto the alternative path A-E-H-K-D. This
>> won't work for the same reason as in a);
>> c)	leave the LSP as it is and wait for the dead controller to be replaced
>> or repaired. This would mean the need to perform normal operations such
>> as monitoring data plane alarms, changing the LSP admin status (for
>> example, disabling alarms on all nodes), performing power monitoring and
>> equalization, and performing recovery operations in case of a fatal data
>> plane failure. Everything that depends on hop-by-hop signaling won't work
>> today.
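
Case a) in the A-B-C-D example can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not an RSVP-TE implementation: the function name propagate_path_tear and the list/set representation of the LSP and of the dead controllers are hypothetical, and the model assumes a teardown message is handled strictly hop by hop and cannot pass a controller that is out of service.

```python
def propagate_path_tear(path, dead_controllers):
    """Model a teardown sent hop by hop from the ingress of an LSP.

    Returns the nodes whose controllers processed the message. Propagation
    stops at the first controller that is out of service (an assumption of
    this toy model, matching the scenario described in the thread).
    """
    processed = []
    for node in path:
        if node in dead_controllers:
            break  # a dead controller can neither process nor forward
        processed.append(node)
    return processed


lsp = ["A", "B", "C", "D"]  # LSP from the example
dead = {"B"}                # controller managing node B is out of service

released = propagate_path_tear(lsp, dead)
# Live controllers downstream of the failure never see the teardown,
# so the resources they hold for the LSP stay allocated.
unreached = [n for n in lsp if n not in released and n not in dead]

print("released:", released)
print("never reached:", unreached)
```

With B's controller down, only A processes the teardown, while C and D are never reached, which is the stranded-resource situation described in a).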
>> Don't tell me that these problems are fabricated; they are real because
>> they are raised by the customers. Dimitri seems to understand the problem,
>> but he is saying that the CP in this case is hardly of any use. This, IMO,
>> is a dangerous statement for the future of the CP in non-packet
>> environments. The management plane aficionados will jump on it and say
>> that the management plane does not have such a problem: the NMS has direct
>> access to any NE on the network, so it can do all the necessary cleanup no
>> matter what happened. Customers will say: "Well, if there are situations
>> when the CP suddenly becomes useless and we have to use the management
>> plane anyway, why would we use the CP in the first place?"
>>
>> Fortunately, I believe that these problems can be solved entirely via the
>> CP by making it more resilient. Hence, CP resilience is a good direction to
>> work on within the CCAMP WG.
>>
>> Igor
>>
>> > If "Considerable period of time" is not equal to infinity, then there
>> > will be an RSVP graceful restart.  If a controller is really and truly
>> > dead, then presumably the operator will either replace it or re-assign
>> > its data-plane resources to another signaling controller.  In either
>> > case, there will then be an RSVP graceful restart.
>> >
>> > Thanks,
>> >
>> > John
>> >
>> >
>> >
>> >> -----Original Message-----
>> >> From: ibryskin@movaz.com [mailto:ibryskin@movaz.com]
>> >> Sent: Friday, October 28, 2005 1:00 PM
>> >> To: Drake, John E
>> >> Cc: ibryskin@movaz.com; dpapadimitriou@psg.com;
>> >> dimitri.papadimitriou@alcatel.be; Igor Bryskin; Zafar Ali; Kim Young Hwa;
>> >> ccamp@ops.ietf.org
>> >> Subject: RE: Two Drafts for Resilience of Control Plane
>> >>
>> >> John,
>> >>
>> >> I think you missed my point here. "Dead" controllers in my example *do
>> >> not* come back for a considerable period of time. So there are no restarts
>> >> here (graceful or not graceful) :=)
>> >>
>> >> Igor
>> >>
>> >> > What part of your problem, as stated below, is not handled by RSVP
>> >> > graceful restart?
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: ibryskin@movaz.com [mailto:ibryskin@movaz.com]
>> >> >> Sent: Friday, October 28, 2005 11:41 AM
>> >> >> To: Drake, John E
>> >> >> Cc: dpapadimitriou@psg.com; dimitri.papadimitriou@alcatel.be; Igor
>> >> >> Bryskin; Zafar Ali; Kim Young Hwa; ccamp@ops.ietf.org
>> >> >> Subject: RE: Two Drafts for Resilience of Control Plane
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> Here is one of the problems that I've been thinking about for a while:
>> >> >> control plane partitioned LSPs. Suppose one or more signaling controllers
>> >> >> managing some LSP went out of service leaving the LSP's data plane intact.
>> >> >> As far as the user is concerned such LSP is perfectly healthy and
>> >> >> operational. Such situation could last for a considerable period of time.
>> >> >> Do we need to manage such an LSP via the control plane? Sure, we must be
>> >> >> able to tear down such an LSP, perform mb4b rerouting, distribute alarms
>> >> >> between operational controllers, signal data plane faults and perform
>> >> >> recovery switchover, modify LSP status, etc. Can we do this today? No, but
>> >> >> with some (signaling) extensions the problem, I believe, is solvable. Is
>> >> >> this some artificial, "fabricated" problem? No, I think it is real. Does
>> >> >> it fall under the control plane resilience problem space? I believe it
>> >> >> does.
>> >> >>
>> >> >> Igor
>> >> >>
>> >> >> > I agree with Zafar and Dimitri.  If someone wanted to document the
>> >> >> > GMPLS control plane resiliency features, as was done for GMPLS
>> >> >> > addressing, that might be a useful activity.
>> >> >> >
>> >> >> >> -----Original Message-----
>> >> >> >> From: dimitri papadimitriou [mailto:dpapadimitriou@psg.com]
>> >> >> >> Sent: Friday, October 28, 2005 9:56 AM
>> >> >> >> To: Igor Bryskin
>> >> >> >> Cc: Zafar Ali (zali); Kim Young Hwa; ccamp@ops.ietf.org
>> >> >> >> Subject: Re: Two Drafts for Resilience of Control Plane
>> >> >> >>
>> >> >> >> igor -
>> >> >> >>
>> >> >> >> over time CCAMP came up with a set of mechanisms to improve control
>> >> >> >> plane resilience (RSVP and LMP GR upon channel/node failure); other WG
>> >> >> >> protocol work is also usable here (OSPF GR, etc.) ... on the other
>> >> >> >> side, mechanisms such as link bundling have built-in resilience
>> >> >> >> capabilities and most GMPLS control plane capabilities have been
>> >> >> >> designed so as to be independent of the control plane realisation
>> >> >> >> (in-band, out-of-band, etc.)
>> >> >> >>
>> >> >> >> so indeed i share the concern of Zafar: what could we do more here
>> >> >> >> than document these tools and provide our experience in using them;
>> >> >> >>
>> >> >> >> now, before stating there are (potential) problem(s) arising - would
>> >> >> >> you please be more specific on what these potential issue(s) and/or
>> >> >> >> problems are? (not related to policy/config. - note: all the issues
>> >> >> >> you have pointed out here below are simply policy/config specific but
>> >> >> >> none of them highlights a missing IP control plane resiliency feature)
>> >> >> >>
>> >> >> >> thanks,
>> >> >> >> - dimitri.
>> >> >> >>
>> >> >> >>
>> >> >> >> Igor Bryskin wrote:
>> >> >> >>
>> >> >> >> > Zafar,
>> >> >> >> >
>> >> >> >> > The problem arises when the control plane is decoupled
>> >> >> >> > from the data plane. The question is: do we need such
>> >> >> >> > decoupling in IP networks? Consider, for example, the
>> >> >> >> > situation when several parallel PSC data links are bundled
>> >> >> >> > together and controlled by a single control channel.
>> >> >> >> > Does it mean in this case that when the control
>> >> >> >> > channel fails all associated data links also fail? Do
>> >> >> >> > we need to reroute in this case LSPs that use the data
>> >> >> >> > links? Can we rely in this case on control plane
>> >> >> >> > indications to decide whether an associated data link
>> >> >> >> > is healthy or not (in other words, can we rely on RSVP
>> >> >> >> > Hellos or should we use, for example, BFD)? Should we
>> >> >> >> > be able to recover control channels without
>> >> >> >> > disturbing the data plane? I think control plane
>> >> >> >> > resilience is important for all layers. You are right,
>> >> >> >> > the Internet does work; however, we do need for some
>> >> >> >> > reason TE and (fast) recovery in IP as much as in
>> >> >> >> > other layers, don't we?
>> >> >> >> >
>> >> >> >> > Cheers,
>> >> >> >> > Igor
>> >> >> >> >
>> >> >> >> > --- "Zafar Ali (zali)" <zali@cisco.com> wrote:
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >>Hi All,
>> >> >> >> >>
>> >> >> >> >>I am unable to understand the problem we are trying
>> >> >> >> >>to solve or fabricate. My control network is IP based
>> >> >> >> >>and IP has proven resiliency (Internet *does* work), so
>> >> >> >> >>why would I want to take on the control plane resiliency
>> >> >> >> >>problem at a layer *above-IP* and complicate my
>> >> >> >> >>life? Did I miss something?
>> >> >> >> >>
>> >> >> >> >>Thanks
>> >> >> >> >>
>> >> >> >> >>Regards... Zafar
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >>________________________________
>> >> >> >> >>
>> >> >> >> >>	From: owner-ccamp@ops.ietf.org
>> >> >> >> >>[mailto:owner-ccamp@ops.ietf.org]
>> >> >> >> >>On Behalf Of Kim Young Hwa
>> >> >> >> >>	Sent: Friday, October 28, 2005 6:04 AM
>> >> >> >> >>	To: ccamp@ops.ietf.org
>> >> >> >> >>	Subject: Two Drafts for Resilience of Control Plane
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >>	Dear all,
>> >> >> >> >>
>> >> >> >> >>	I posted two drafts for the resilience of control plane.
>> >> >> >> >>	One is for requirements of the resilience of control plane, the
>> >> >> >> >>	other is for a protocol specification as a solution of that.
>> >> >> >> >>	These are now available at:
>> >> >> >> >>
>> >> >> >> >>	http://www.ietf.org/internet-drafts/draft-kim-ccamp-cpr-reqts-01.txt
>> >> >> >> >>	http://www.ietf.org/internet-drafts/draft-kim-ccamp-accp-protocol-00.txt
>> >> >> >> >>
>> >> >> >> >>	I want your comments.
>> >> >> >> >>
>> >> >> >> >>	Regards
>> >> >> >> >>
>> >> >> >> >>	Young.
>> >> >> >> >>
>> >> >> >> >>	===================================
>> >> >> >> >>	Young-Hwa Kim
>> >> >> >> >>	Principal Member / Ph.D
>> >> >> >> >>	BcN Research Division, ETRI
>> >> >> >> >>	Tel:     +82-42-860-5819
>> >> >> >> >>	Fax:    +82-42-860-5440
>> >> >> >> >>	e-mail: yhwkim@etri.re.kr
>> >> >> >> >>	===================================