
RE: Last call - RSVP problems



Dimitrios,
  The luxury of being in meetings all morning is that I had the chance to
read your email as well as the responses from David, Lyndon, and Eric before
responding.  Since I agree with their responses, I will try to be brief.
Using RSVP Hellos to detect RSVP module failures is, in my mind, the correct
usage for them.  Control channel failures should be detected by mechanisms
designed to detect them (e.g., LMP Hellos).   One scenario that carriers are
interested in supporting is the notion of having parallel redundant control
channels for signaling purposes.  We will need to support other mechanisms
for detecting control channel failures in this case.
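
To make the layering concrete, here is a minimal sketch of how the two
kinds of keep-alive could be combined.  It is only an illustration: the
names (PeerState, Diagnosis, channel_up, signaling_hello_ok) are
hypothetical and are not taken from the LMP or RSVP-TE drafts.  It assumes
one up/down flag per control channel (fed by a per-channel keep-alive such
as the LMP Hello) and one peer-level flag (fed by a signaling-level
keep-alive such as the RSVP Hello).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Diagnosis(Enum):
    HEALTHY = "all control channels up, signaling peer responding"
    DEGRADED = "some control channels down, signaling peer responding"
    ALL_CHANNELS_DOWN = "every control channel is down"
    SIGNALING_MODULE_DOWN = "channels up, signaling peer not responding"


@dataclass
class PeerState:
    # Hypothetical model, not from any draft:
    #   channel_up         -> per-channel keep-alive result (LMP Hello role)
    #   signaling_hello_ok -> peer-level keep-alive result (RSVP Hello role)
    channel_up: Dict[str, bool] = field(default_factory=dict)
    signaling_hello_ok: bool = True

    def diagnose(self) -> Diagnosis:
        any_up = any(self.channel_up.values())
        all_up = all(self.channel_up.values())
        # No channel is up: the peer is unreachable at the link layer,
        # whatever the signaling-level keep-alive reports.
        if not any_up:
            return Diagnosis.ALL_CHANNELS_DOWN
        # At least one channel is up but the peer is silent: this points
        # at the signaling module rather than at the control channels.
        # (Simplification: assumes signaling can use any channel that is up.)
        if not self.signaling_hello_ok:
            return Diagnosis.SIGNALING_MODULE_DOWN
        return Diagnosis.HEALTHY if all_up else Diagnosis.DEGRADED
```

With two redundant channels, one of which has failed, and the peer still
answering, PeerState({"ethernet": True, "sonet-dcc": False}, True).diagnose()
yields Diagnosis.DEGRADED: the per-channel keep-alive localizes the failure,
while the peer-level keep-alive alone would have reported nothing.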

Thanks,
Jonathan

> -----Original Message-----
> From: Dimitrios Pendarakis [mailto:DPendarakis@tellium.com]
> Sent: Wednesday, May 30, 2001 7:24 AM
> To: 'Jonathan Lang'; 'Fong Liaw'; 'suresh Katukam'
> Cc: 'Jennifer Yates'; mpls@UU.NET; ccamp@ops.ietf.org
> Subject: RE: Last call - RSVP problems
> 
> 
> Jonathan,
> 
> Could you please clarify why you think RSVP Hello doesn't work for 
> parallel (redundant) control channels? Control channel redundancy 
> is obviously important; however I don't think it relates to the 
> usefulness of the RSVP hello mechanism, at least in the context of 
> this discussion.
> 
> The whole issue is one of layering: LMP Hello allows you to detect 
> individual link (or control channel) failures; RSVP Hello allows
> you to detect that you've lost contact with your signaling peer
> for whatever reason (failure of all redundant control channels, 
> failure of the RSVP daemon, etc.). Since they operate at different
> levels they are to some extent complementary; once RSVP Hello tells
> you that your peer is not reachable, you could use LMP Hellos to 
> try to pinpoint where the problem occurred. 
> 
> For clarification, the "control channel failure" mentioned in
> Fong's mail that started this discussion refers to a failure of the
> signaling channel, i.e., inability to communicate with your signaling
> peer, not failure of specific (individual) control channels. As such, 
> I still believe RSVP Hello is a more accurate detection mechanism.
> 
> Please see some additional comments inline.
> 
> Thanks,
> Dimitris
> 
> 
> >   If you're using the RSVP Hello to detect the failure of the RSVP
> > module only, then this is fine.  If you're using the RSVP Hello to
> > detect (or rather, infer) control channel failures, I have a problem.
> > For starters, I don't think it works for parallel (redundant) control
> > channels; if control channel redundancy is *not* important for SPs,
> > then it's not as much of an issue, however, I think you're overloading
> > the semantics of the RSVP Hello.
> > 
> >   See additional comments inline.
> > 
> > Thanks,
> > Jonathan
> > 
> > > 
> > > Hi Jonathan,
> > > 
> > > What needs to be detected here is the inability to send and 
> > > receive signaling messages to and from your RSVP peer. In this 
> > > respect, using RSVP Hello is a more accurate mechanism for 
> > > detecting failures since it is part of the RSVP protocol and 
> > > independent of the link layer used to realize the control 
> > > channel. As you point out, if you prefer redundant control 
> > > channels for resiliency, and want to rely on LMP only, you have
> > > to correlate all LMP detected failures in order to declare a 
> > > control channel failure.
> > This is not correct.  LMP allows you to detect individual control
> > channel failures, even when there are multiple (redundant) control
> > channels.  The RSVP Hello cannot do this.
> > 
> 
> Sure, as I pointed out above, RSVP Hello is not intended to detect 
> individual link failures. At the risk of sounding repetitive, the
> point here is that these are different mechanisms that serve 
> different purposes. For example, if you have 3 (redundant) IP control 
> channels and rely on LMP Hello only, in order to infer failure of the 
> signaling channel with your peer, you would have to correlate LMP 
> failures in all three channels. 
> 
> > > With RSVP Hello you would rely on a 
> > > single set of event(s) to detect the failure.
> > To be clear, the failure of the RSVP module.
> 
> Or simultaneous failure of all IP control channels...
> 
> > 
> > > To go even further, what if all your control channels are working
> > > fine, but your RSVP daemon has crashed, for whatever reason.
> > > LMP hellos will keep going across but RSVP messages will not...
> > This is where the RSVP Hello has its place.
> > 
> > > 
> > > All this points to the (certainly non-trivial) problem of failure 
> > > detection and correlation at different layers. I am not trying to
> > > say that RSVP Hello is the perfect or only mechanism we should rely
> > > on, but it certainly has a place in addressing the problem.
> > agreed.  see above comment.
> > 
> > > 
> > > Thanks,
> > > Dimitris
> > > 
> > > PS. Hopefully we are not too far apart so we might not have
> > > to keep the discussion going for too long :-)
> > > In any case I believe it has some relevance here as well.
> > > 
> > > 
> > > > -----Original Message-----
> > > > From: Jonathan Lang [mailto:jplang@calient.net]
> > > > Sent: Friday, May 25, 2001 6:56 PM
> > > > To: 'Dimitrios Pendarakis'; Jonathan Lang; 'Fong Liaw'; 'suresh
> > > > Katukam'; v.sharma@ieee.org
> > > > Cc: 'Jennifer Yates'; mpls@UU.NET; ccamp@ops.ietf.org
> > > > Subject: RE: Last call - RSVP problems
> > > > 
> > > > 
> > > > Dimitris,
> > > >   The RSVP Hello Extension is intended to detect node failures and
> > > > not necessarily link (control channel) failures.  In particular, the
> > > > tunnel draft says, "It should be noted that node failure detection
> > > > is not the same as a link failure detection mechanism, particularly
> > > > in the case of multiple parallel unnumbered links."  What happens if
> > > > you want redundant control channels for control-channel resiliency?
> > > > I believe you would still want a mechanism to detect control channel
> > > > (link-layer) failures.
> > > > 
> > > >   In LMP, you can have multiple active control channels between a
> > > > pair of nodes.  You can configure control channels over shared
> > > > Ethernet and over SONET/SDH overhead, etc.  You send keep-alive
> > > > Hellos over each control channel.  If you so desire, you can send
> > > > all signaling messages over the Ethernet control channels.  You can
> > > > also choose to send all remaining LMP messages over the
> > > > SONET/SDH-based control channels.  Keeping RSVP Hellos as "optional"
> > > > would be fine.  Making them mandatory doesn't seem like the right
> > > > solution.
> > > > 
> > > > Thanks,
> > > > Jonathan
> > > > 
> > > > P.S.  Should we move this discussion to the OIF list??
> > > > 
> > > > > -----Original Message-----
> > > > > From: Dimitrios Pendarakis [mailto:DPendarakis@tellium.com]
> > > > > Sent: Friday, May 25, 2001 3:16 PM
> > > > > To: 'Jonathan Lang'; 'Fong Liaw'; 'suresh Katukam'; 
> > > > v.sharma@ieee.org
> > > > > Cc: 'Jennifer Yates'; mpls@UU.NET; ccamp@ops.ietf.org
> > > > > Subject: RE: Last call - RSVP problems
> > > > > 
> > > > > 
> > > > > Hi Jonathan,
> > > > > 
> > > > > One point to consider is that the signaling control 
> > > > > channel might be realized over a different link layer 
> > > > > than LMP. For example, LMP might be running in-fiber
> > > > > over SONET/SDH overhead bytes, while RSVP is running 
> > > > > over an out-of-band network such as a shared Ethernet.
> > > > > RSVP Hello allows independent detection of signaling 
> > > > > channel failure, so it's a useful option to keep.
> > > > > 
> > > > > Thanks,
> > > > > Dimitris
> > > > > 
> > > > > Dimitris Pendarakis
> > > > > Tellium, Inc.
> > > > > 
> > > > > 
> > > > > > -----Original Message-----
> > > > > > From: Jonathan Lang [mailto:jplang@calient.net]
> > > > > > Sent: Friday, May 25, 2001 5:30 PM
> > > > > > To: 'Fong Liaw'; 'suresh Katukam'; v.sharma@ieee.org
> > > > > > Cc: 'Jennifer Yates'; mpls@UU.NET; ccamp@ops.ietf.org
> > > > > > Subject: RE: Last call - RSVP problems
> > > > > > 
> > > > > > 
> > > > > > Fong,
> > > > > > 
> > > > > > 
> > > > > > <snip>
> > > > > > > Same as (2): for any proposal that removes the refresh
> > > > > > > mechanism, it is going to be difficult to prove that all
> > > > > > > cases are covered.
> > > > > > > 
> > > > > > > Instead, we (will) recommend the following in the OIF UNI
> > > > > > > document:
> > > > > > >    
> > > > > > >    Use RSVP Hello to detect control channel failure.
> > > > > > Why wouldn't you use the LMP Hello to detect control channel
> > > > > > failure?  This is exactly what it is designed for.  From
> > > > > > draft-ietf-mpls-lsp-tunnel-08.txt,
> > > > > > 
> > > > > >   "This (RSVP Hello) mechanism is intended to be used when
> > > > > >    notification of link layer failures is not available and
> > > > > >    unnumbered links are not used, or when the failure detection
> > > > > >    mechanisms provided by the link layer are not sufficient for
> > > > > >    timely node failure detection."
> > > > > >  
> > > > > > >    If a control channel failure is detected, LSP states
> > > > > > >    are maintained as if a node continues to receive
> > > > > > >    RSVP refresh messages from the failed control channel.
> > > > > > >    The recommended Hello timer will be in the seconds range,
> > > > > > >    instead of the ms range specified in the RSVP-TE draft.
> > > > > > > 
> > > > > > >    If a control channel fails permanently, manual intervention
> > > > > > >    may be required. This is to be studied.
> > > > > > > 
> > > > > > > p.s The text is currently being drafted as we type.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
>