
Re: RSVP Restart (was Re: update GMPLS signaling documents)



Gopal,

> Thanks for the clarification.  

You are welcome!

> My comments are inline:

My comments are inline:

> Yakov Rekhter wrote:
> 
> >Gopal,
> >
> >>>>I have some questions/concerns on the restart mechanism as
> >>>>specified in draft-ietf-mpls-generalized-rsvp-te-04.txt.  I
> >>>>have listed them below.
> >>>>
> >>>>If a node that is terminating hierarchical LSPs restarts,
> >>>>there is an ordering issue during resynchronization, since
> >>>>the LSPs would depend on other FA-LSPs the interface IDs for
> >>>>which would have been generated dynamically. So the
> >>>>mechanism as described will not work, unless this
> >>>>information is preserved across restarts - in which case we
> >>>>might as well preserve other information as well, and avoid
> >>>>resynchronization.
> >>>>
> >>>Since transit node of the FA-LSP does not see the RSVP
> >>>messages that use the FA, we only need to consider
> >>>the ingress and egress node. If the egress node restarts,
> >>>the worst case is that its upstream node sends an ERO
> >>>consisting of an unrecognized interface address. Since the
> >>>Path message will carry RECOVERY_LABEL (replacing
> >>>SUGGESTED_LABEL), the egress node knows that it might
> >>>not have complete information yet, and can hold on to
> >>>the Path message until the FA-LSP is established.
> >>>All the information is there, so implementation
> >>>can be completely within the egress node. It does work,
> >>>just not specified enough.
> >>>
> >>>Ingress node will have to know the dependency, so not
> >>>a problem there.
> >>>
> >>Are you suggesting that in case of an ERO with an
> >>unrecognized interface id, if the Path message also
> >>carries a RECOVERY LABEL, the egress node can hold on
> >>to it until later?  Note that there may not be a
> >>RECOVERY LABEL - since this Path message is for a
> >>hierarchical LSP, and the previous node may not be
> >>directly connected in the control plane.  Here is an
> >>example - There is an LSP L1 from C to F, and on top of
> >>it there is an LSP L2 from A to G.
> >>
> >> L1:              C --- D --- E --- F
> >> L2:  A --- B --- C --------------- F --- G
> >>
> >>When F restarts, C may not even know, unless HELLO
> >>protocol is run between C and F.  Note that the
> >>HELLO mechanism is intended for immediate neighbors.
> >>
> >>And even if the HELLO mechanism is extended, there
> >>is the issue of interface id assigned by F.  F needs
> >>to assign the same interface id upon restart so as
> >>not to bring down the hierarchical LSPs set up on top
> >>of L1 in the reverse direction (say an LSP from G to
> >>B set up on top of L1.)  Where would F get this
> >>information ?
> >>
> >
> >Couple of points:
> >
> >(1) First of all, you do run RSVP Hello between C and F.
> 
> O.K.
> 
> >(2) Since L1 is advertised as an FA into OSPF/ISIS, F should
> >be able to recover the Interface ID it assigns to L1 from
> >a combination of (a) the OSPF/ISIS link state database that
> >F would recover, and (b) the Forward Interface ID (the one
> >assigned by C).
> >
> True. One could either use IGP restart mechanism to relearn
> this (I have other concerns on RSVP restart being dependent on
> IGP restart, but that is for later) or preserve this across restarts.
> In either case,  you agree that RSVP restart depends on some
> mechanism outside RSVP to help it along?  So RSVP needs a new
> interface to get this mapping upon restarts, right? This is not
> clear in the draft.

Couple of points:

1. As Dimitri Papadimitriou mentioned in his other e-mail to this
list, an Internet Draft "is not a textbook on how to use GMPLS-SIG".

2. What I mentioned above should be abundantly obvious to the
informed reader of RSVP and LSP Hierarchy specs.

> >>>>Also, let us consider a node that can preserve state across
> >>>>restarts, and hence does not need its state to be synced by
> >>>>its peer.  How will it advertise this in the RESTART_CAP
> >>>>object?
> >>>>
> >>>We still need to resynchronize the state since the other
> >>>side might have connections that are in progress. The
> >>>
> >>Let us say the node can resynchronize its neighbor if
> >>the neighbor restarts and requests state recovery. But
> >>the issue is how a node can advertise that it does not
> >>need recovery since all its state was preserved?
> >>
> >
> >By treating it the same way the spec handles a
> >control channel fault.
> >
> 
> True, but this does not help.  

Help with what ?

> See below.
> 
> >>>reason for resynchronization is to make sure both sides
> >>>have done the same to the in-progress connections, not
> >>>just to synchronize the established states.
> >>>
> >>>The value of the RESTART_CAP will be set to non-zero values
> >>>
> >>RECOVERY LABEL does not come into picture unless the node
> >>that is upstream to the restarting node has already received
> >>a Resv.
> >>
> >
> >Wrong. Quoting 9.5.3:
> >
> >   Upon detecting a restart with a neighbor that supports state
> >   recovery, a node SHOULD refresh all Path state shared with that
> >   neighbor.
> >
> >So, as you can hopefully see from the above, the upstream node doesn't 
> >wait until it receives a Resv.
> >
> True - if you had read further, you would have noticed that
> I have said the same thing that you have quoted :) -
> but there will be no RECOVERY_LABEL unless a Resv has
> been received, right?
>
> It is possible that the downstream node, after programming
> its forwarding path, restarted before sending the Resv.
> This would result in the upstream node refreshing the Path
> message without the Recovery Label, and the downstream
> node eventually allocating a different label. 

Correct.

> So essentially it is a new LSP setup.

It is a setup for a connection that is *in-progress*.
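To make the distinction concrete, here is a rough sketch of what the upstream node does when it refreshes Path state toward a restarted neighbor. It is only an illustration, not text from the draft: the `PathState` class, the `build_refresh` function, and the dictionary message layout are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PathState:
    """Hypothetical per-LSP state kept by the upstream node."""
    lsp_id: str
    # Label learned from the downstream node's Resv; None while the
    # connection is still in progress (no Resv received yet).
    downstream_label: Optional[int] = None


def build_refresh(state: PathState) -> dict:
    """Build the Path refresh sent to a neighbor that has restarted."""
    msg = {"type": "Path", "lsp_id": state.lsp_id}
    if state.downstream_label is not None:
        # Established LSP: tell the restarted node which label to recover.
        msg["RECOVERY_LABEL"] = state.downstream_label
    # In-progress LSP: no label is known, so the Path is refreshed
    # without a Recovery Label and the downstream node allocates one.
    return msg


established = build_refresh(PathState("lsp-1", downstream_label=42))
in_progress = build_refresh(PathState("lsp-2"))
```

In both cases the Path state is refreshed without waiting for a Resv; the only difference is whether a Recovery Label can be included.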

> >>So it seems that the procedure is to resynchronize
> >>established state? If Resv has not been received yet, there
> >>will be a refresh of Path message, and the restarting node
> >>will consider it as a new request, etc.  So can you elaborate
> >>on "to make sure both sides have done the same to the in
> >>progress connections" ?
> >>
> >>>>If a node supports PSC as well as TDM or LSC interfaces, it
> >>>>might want to advertise different set of parameters in the
> >>>>RESTART_CAP object for data LSPs as opposed to SONET/WDM
> >>>>LSPs which form bearer channels in transport networks.
> >>>>Currently this is not possible.
> >>>>
> >>>Can you give us explicit examples as to why and what do
> >>>you gain by giving different values for PSC, TDM ?
> >>>
> >>In case of PSC devices, it may be OK to remove state that
> >>is not resynchronized at the end of the recovery period,
> >>and the recovery period advertised might reflect that.
> >>But for LSPs in transport networks, one might want to
> >>have a different recovery period to avoid any LSP from
> >>going down because of recovery timer expiry.
> >>
> >
> >There is no requirement for a node to advertise exactly the
> >same Restart_Cap on all the interfaces. So, on PSC interfaces
> >the node could advertise that it will remove the state that
> >isn't synchronized at the end of the recovery period, while
> >on the TDM interface precisely the same node could advertise
> >that the LSPs would be kept even after the recovery time expires.
>
> But to set up LSPs over TDM/LSC interfaces, the PSC interface
> is going to be used for signaling - since control and data
> planes are decoupled!  So, how will this help?

Help with what ? You asserted that it "is not possible" for
"a node that supports PSC as well as TDM and LSC interfaces..
to advertise different set of parameters in the RESTART_CAP
object for data LSPs as opposed to SONET/WDM LSPs."

I pointed out to you that your assertion is incorrect, as
there is no requirement for a node to advertise exactly the
same Restart_Cap on all of its interfaces.
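As a rough sketch of per-interface advertisement (an illustration only, not something the draft spells out): the `RestartCap` class, its two millisecond fields, the interface names, and the timer values below are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RestartCap:
    """Hypothetical model of the values advertised in Restart_Cap."""
    restart_time_ms: int
    recovery_time_ms: int


# Illustrative values only: the same node advertises a short recovery
# period on a PSC interface and a much longer one on a TDM interface.
restart_cap_by_interface = {
    "psc-0": RestartCap(restart_time_ms=30_000, recovery_time_ms=60_000),
    "tdm-0": RestartCap(restart_time_ms=30_000, recovery_time_ms=600_000),
}


def restart_cap_for(interface: str) -> RestartCap:
    """Return the Restart_Cap values to advertise on the given interface."""
    return restart_cap_by_interface[interface]
```

The point is only that the advertised values are looked up per interface rather than held once per node.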

> >>>>According to 9.5.1. (procedures for restarting LSR): "When
> >>>>sending the corresponding outgoing Path message the node
> >>>>SHOULD include a SUGGESTED_LABEL object with a label value
> >>>>^^^^^^
> >>>>matching the outgoing label from the now restored forwarding
> >>>>entry."
> >>>>
> >>>>This has a conflict with 9.5.2.  Consider the case where
> >>>>adjacent nodes B and C restart, and B has another adjacent
> >>>>node A, and C has another adjacent node D.  B and C will get
> >>>>resynced by A and D, and during this process, they will
> >>>>resync. each other. While resyncing each other, they act as
> >>>>neighbors of a restarting LSR, and hence according to 9.5.2,
> >>>>MUST include the SUGGESTED_LABEL.
> >>>>
> >>>>Also according to 9.5.2: "During the recovery period, new
> >>>>Path state being advertised to the restarted neighbor SHOULD
> >>>>not include the SUGGESTED_LABEL object in the corresponding
> >>>>outgoing Path message.  This will prevent the restarting
> >>>>node from erroneously reusing a saved forwarding entry."
> >>>>
> >>>>I guess this would mean that if suggested labels are used
> >>>>during new LSP setup (as they are likely to be while
> >>>>provisioning lightpaths - to reduce latency), then new LSP
> >>>>setup will not be allowed during resyncing?
> >>>>
> >>>The use of RECOVERY_LABEL addresses all the above questions.
> >>>
> >>The first problem seems to be there still - consider two
> >>adjacent nodes restarting.  They each act both as the restarting
> >>node as well as the neighbor to the restarting node. So, once
> >>they learn the state from the upstream neighbor, do they use
> >>suggested label or the recovery label when they send the path
> >>message to the just restarted downstream neighbor?
> >>
> >
> >The recovery label.
> >
> >The following should be added to the existing text from the document:
> >
> >   In the special case where a restarting node also has a restarting
> >   downstream neighbor, a Recovery_Label object should be used instead
> >   of a Suggested_Label object.
> >
> Since a restarting node may not be able to detect that a
> downstream neighbor is restarting, I suggest it always use
> a Recovery_Label object instead of a Suggested_Label object.

Please see my reply to Yangguang on this topic.

Yakov.