
RE: Last call - RSVP problems



Jonathan,

The problem with being in meetings is that it took me a long
time to respond :-)

So, I'll try to provide a summary of the messages exchanged
as well as my understanding of our off-line discussion 
on the applicability of the RSVP and LMP Hello mechanisms.

First, an attempt at clarifying terminology:
Control channel: This is a logical channel over which IP 
control packets are transported between two network elements 
(for example a UNI-C and UNI-N in the case of the OIF UNI). 
A control channel can be realized in multiple ways, for example,
in-fiber using SONET/SDH overhead bytes, out-of-band over an 
Ethernet, etc. 
A pair of network elements may be connected by multiple control 
channels, used either concurrently or in a primary-backup 
configuration. Control channel failures can be detected using 
link management procedures, such as the LMP Hello mechanism. 

Signaling channel: This is a logical channel over which signaling 
messages are exchanged between two peers. The signaling channel 
is realized using (potentially multiple) control channels and 
its operation relies on processing of signaling messages by the 
respective peer signaling entities. 
A signaling channel failure is the result of failure of all control 
channels, or failure of a signaling entity. Signaling channel 
failures can be detected using the RSVP Hello mechanism (assuming,
of course, that RSVP is the signaling protocol used!).
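
To make the layering concrete, here is a rough sketch (Python) of how
a node might combine the two Hello mechanisms. It is only an
illustration under my own assumptions: PeerState, its fields and
diagnose() are hypothetical names, not taken from the LMP or RSVP-TE
drafts, and the Hello results are assumed to be supplied by the
respective protocol machinery.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PeerState:
    """Hypothetical per-neighbor view; all names are illustrative only."""
    # Control channel id -> True while LMP Hellos are still received on it.
    control_channels: Dict[str, bool] = field(default_factory=dict)
    # True while RSVP Hellos from the peer signaling entity are answered.
    rsvp_hello_ok: bool = True


def failed_control_channels(peer: PeerState) -> List[str]:
    """Control channels that LMP Hello currently reports as down."""
    return sorted(cc for cc, up in peer.control_channels.items() if not up)


def diagnose(peer: PeerState) -> str:
    """Use RSVP Hello to detect signaling channel failure, then LMP Hello
    state to narrow down where the problem is."""
    failed = failed_control_channels(peer)
    all_down = bool(peer.control_channels) and len(failed) == len(peer.control_channels)
    if peer.rsvp_hello_ok:
        if failed:
            return "signaling channel up; degraded control channels: " + ", ".join(failed)
        return "signaling channel up"
    if all_down:
        return "signaling channel down: all control channels failed"
    return "signaling channel down: signaling entity failure suspected"
```

With one of two control channels down and RSVP Hellos still answered,
diagnose() reports a degraded but working signaling channel; with RSVP
Hellos lost while LMP Hellos keep flowing, it points at the signaling
entity rather than at the control channels.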

Hope this helps,
Dimitrios

> -----Original Message-----
> From: Jonathan Lang [mailto:jplang@calient.net]
> Sent: Wednesday, May 30, 2001 2:08 PM
> To: 'Dimitrios Pendarakis'; Jonathan Lang; 'Fong Liaw'; 
> 'suresh Katukam'
> Cc: 'Jennifer Yates'; mpls@UU.NET; ccamp@ops.ietf.org
> Subject: RE: Last call - RSVP problems
> 
> 
> Dimitrios,
>   The luxury of being in meetings all morning is that I had the chance to
> read your email as well as the responses from David, Lyndon, and Eric before
> I respond.  Since I agree with their response, I will try to be brief.
> Using RSVP Hellos to detect RSVP module failures is in my mind the correct
> usage for them.  Control channel failures should be detected by mechanisms
> designed to detect them (e.g., LMP Hellos).  One scenario that carriers are
> interested in supporting is the notion of having parallel redundant control
> channels for signaling purposes.  We will need to support other mechanisms
> for detecting control channel failures in this case.
> 
> Thanks,
> Jonathan
> 
> > -----Original Message-----
> > From: Dimitrios Pendarakis [mailto:DPendarakis@tellium.com]
> > Sent: Wednesday, May 30, 2001 7:24 AM
> > To: 'Jonathan Lang'; 'Fong Liaw'; 'suresh Katukam'
> > Cc: 'Jennifer Yates'; mpls@UU.NET; ccamp@ops.ietf.org
> > Subject: RE: Last call - RSVP problems
> > 
> > 
> > Jonathan,
> > 
> > Could you please clarify why you think RSVP Hello doesn't work for 
> > parallel (redundant) control channels? Control channel redundancy 
> > is obviously important; however I don't think it relates to the 
> > usefulness of the RSVP hello mechanism, at least in the context of 
> > this discussion.
> > 
> > The whole issue is one of layering: LMP hello allows you to detect 
> > individual link (or control channel) failures; RSVP Hello allows
> > you to detect that you've lost contact with your signaling peer
> > for whatever reason (failure of all redundant control channels, 
> > failure of the RSVP daemon, etc.). Since they operate at different
> > levels they are to some extent complementary; once RSVP Hello tells
> > you that your peer is not reachable, you could use LMP Hellos to 
> > try to pinpoint where the problem occurred. 
> > 
> > For clarification, the "control channel failure" mentioned in
> > Fong's mail that started this discussion refers to a failure of the
> > signaling channel, i.e., inability to communicate with your signaling
> > peer, not failure of specific (individual) control channels. As such, 
> > I still believe RSVP Hello is a more accurate detection mechanism.
> > 
> > Please see some additional comments inline.
> > 
> > Thanks,
> > Dimitris
> > 
> > 
> > >   If you're using the RSVP Hello to detect the failure of the RSVP module
> > > only, then this is fine.  If you're using the RSVP Hello to detect (or
> > > rather, infer) control channel failures, I have a problem.  For starters, I
> > > don't think it works for parallel (redundant) control channels; if control
> > > channel redundancy is *not* important for SPs, then it's not as much of an
> > > issue, however, I think you're overloading the semantics of the RSVP Hello.
> > > 
> > >   See additional comments inline.
> > > 
> > > Thanks,
> > > Jonathan
> > > 
> > > > 
> > > > Hi Jonathan,
> > > > 
> > > > What needs to be detected here is the inability to send and 
> > > > receive signaling messages to and from your RSVP peer. In this 
> > > > respect, using RSVP Hello is a more accurate mechanism for 
> > > > detecting failures since it is part of the RSVP protocol and 
> > > > independent of the link layer used to realize the control 
> > > > channel. As you point out, if you prefer redundant control 
> > > > channels for resiliency, and want to rely on LMP only, you have
> > > > to correlate all LMP detected failures in order to declare a 
> > > > control channel failure.
> > > This is not correct.  LMP allows you to detect individual control channel
> > > failures, even when there are multiple (redundant) control channels.  The
> > > RSVP Hello cannot do this.
> > > 
> > 
> > Sure, as I pointed out above, RSVP Hello is not intended to detect 
> > individual link failures. At the risk of sounding repetitive, the
> > point here is that these are different mechanisms that serve 
> > different purposes. For example, if you have 3 (redundant) IP control 
> > channels and rely on LMP Hello only, in order to infer failure of the 
> > signaling channel with your peer, you would have to correlate LMP 
> > failures in all three channels. 
> > 
> > > > With RSVP Hello you would rely on a 
> > > > single set of event(s) to detect the failure.
> > > To be clear, the failure of the RSVP module.
> > 
> > Or simultaneous failure of all IP control channels...
> > 
> > > 
> > > > To go even further, what if all your control channels are working
> > > > fine, but your RSVP daemon has crashed, for whatever reason.
> > > > LMP hellos will keep going across but RSVP messages will not...
> > > This is where the RSVP Hello has its place.
> > > 
> > > > 
> > > > All this points to the (certainly non-trivial) problem of failure 
> > > > detection and correlation at different layers. I am not trying to
> > > > say that RSVP Hello is the perfect or only mechanism we should rely 
> > > > on, but it certainly has a place in addressing the problem.
> > > agreed.  see above comment.
> > > 
> > > > 
> > > > Thanks,
> > > > Dimitris
> > > > 
> > > > PS. Hopefully we are not too far apart so we might not have
> > > > to keep the discussion going for too long :-)
> > > > In any case I believe it has some relevance here as well.
> > > > 
> > > > 
> > > > > -----Original Message-----
> > > > > From: Jonathan Lang [mailto:jplang@calient.net]
> > > > > Sent: Friday, May 25, 2001 6:56 PM
> > > > > To: 'Dimitrios Pendarakis'; Jonathan Lang; 'Fong Liaw'; 'suresh
> > > > > Katukam'; v.sharma@ieee.org
> > > > > Cc: 'Jennifer Yates'; mpls@UU.NET; ccamp@ops.ietf.org
> > > > > Subject: RE: Last call - RSVP problems
> > > > > 
> > > > > 
> > > > > Dimitris,
> > > > >   The RSVP Hello Extension is intended to detect node failures and not
> > > > > necessarily link (control channel) failures.  In particular, the tunnel
> > > > > draft says, "It should be noted that node failure detection is not the same
> > > > > as a link failure detection mechanism, particularly in the case of multiple
> > > > > parallel unnumbered links."  What happens if you want redundant control
> > > > > channels for control-channel resiliency?  I believe you would still want a
> > > > > mechanism to detect control channel (link-layer) failures.
> > > > > 
> > > > >   In LMP, you can have multiple active control channels between a pair of
> > > > > nodes.  You can configure control channels over shared Ethernet and over
> > > > > SONET/SDH overhead, etc.  You send keep-alive Hellos over each control
> > > > > channel.  If you so desire, you can send all signaling messages over the
> > > > > Ethernet control channels.  You can also choose to send all remaining LMP
> > > > > messages over the SONET/SDH-based control channels.  Keeping RSVP Hellos as
> > > > > "optional" would be fine.  Making them mandatory doesn't seem like the right
> > > > > solution.
> > > > > 
> > > > > Thanks,
> > > > > Jonathan
> > > > > 
> > > > > P.S.  Should we move this discussion to the OIF list?
> > > > > 
> > > > > > -----Original Message-----
> > > > > > From: Dimitrios Pendarakis [mailto:DPendarakis@tellium.com]
> > > > > > Sent: Friday, May 25, 2001 3:16 PM
> > > > > > To: 'Jonathan Lang'; 'Fong Liaw'; 'suresh Katukam'; v.sharma@ieee.org
> > > > > > Cc: 'Jennifer Yates'; mpls@UU.NET; ccamp@ops.ietf.org
> > > > > > Subject: RE: Last call - RSVP problems
> > > > > > 
> > > > > > 
> > > > > > Hi Jonathan,
> > > > > > 
> > > > > > One point to consider is that the signaling control 
> > > > > > channel might be realized over a different link layer 
> > > > > > than LMP. For example, LMP might be running in-fiber
> > > > > > over SONET/SDH overhead bytes, while RSVP is running 
> > > > > > over an out-of-band network such as a shared Ethernet.
> > > > > > RSVP Hello allows independent detection of signaling 
> > > > > > channel failure, so it's a useful option to keep.
> > > > > > 
> > > > > > Thanks,
> > > > > > Dimitris
> > > > > > 
> > > > > > Dimitris Pendarakis
> > > > > > Tellium, Inc.
> > > > > > 
> > > > > > 
> > > > > > > -----Original Message-----
> > > > > > > From: Jonathan Lang [mailto:jplang@calient.net]
> > > > > > > Sent: Friday, May 25, 2001 5:30 PM
> > > > > > > To: 'Fong Liaw'; 'suresh Katukam'; v.sharma@ieee.org
> > > > > > > Cc: 'Jennifer Yates'; mpls@UU.NET; ccamp@ops.ietf.org
> > > > > > > Subject: RE: Last call - RSVP problems
> > > > > > > 
> > > > > > > 
> > > > > > > Fong,
> > > > > > > 
> > > > > > > 
> > > > > > > <snip>
> > > > > > > > Same as (2): for any proposal that removes the refresh mechanism, 
> > > > > > > > it is going to be difficult to prove that all cases are covered.
> > > > > > > > 
> > > > > > > > Instead, we (will) recommend the following in the OIF UNI document:
> > > > > > > >    
> > > > > > > >    Use RSVP Hello to detect control channel failure.
> > > > > > > Why wouldn't you use the LMP Hello to detect control channel failure?  This
> > > > > > > is exactly what it is designed for.  From draft-ietf-mpls-lsp-tunnel-08.txt,
> > > > > > > 
> > > > > > >   "This (RSVP Hello) mechanism is intended to be used when notification of
> > > > > > >    link layer failures is not available and unnumbered links are not used,
> > > > > > >    or when the failure detection mechanisms provided by the link layer are
> > > > > > >    not sufficient for timely node failure detection."
> > > > > > >  
> > > > > > > >    If a control channel failure is detected, LSP states 
> > > > > > > >    are maintained as if the node continues to receive 
> > > > > > > >    RSVP refresh messages from the failed control channel.
> > > > > > > >    The recommended Hello timer will be in the seconds range,
> > > > > > > >    instead of the ms range specified in the RSVP-TE draft.  
> > > > > > > > 
> > > > > > > >    If a control channel fails permanently, manual intervention 
> > > > > > > >    may be required. This is to be studied.
> > > > > > > > 
> > > > > > > > p.s The text is currently being drafted as we type.
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
>