
Re: RSVP Restart (was Re: update GMPLS signaling documents)



Gopal,

> >>I have some questions/concerns on the restart mechanism as
> >>specified in draft-ietf-mpls-generalized-rsvp-te-04.txt.  I
> >>have listed them below.
> >>
> >>If a node that is terminating hierarchical LSPs restarts,
> >>there is an ordering issue during resynchronization, since
> >>the LSPs would depend on other FA-LSPs the interface IDs for
> >>which would have been generated dynamically. So the
> >>mechanism as described will not work, unless this
> >>information is preserved across restarts - in which case we
> >>might as well preserve other information as well, and avoid
> >>resynchronization.
> >>
> >
> >Since transit node of the FA-LSP does not see the RSVP
> >messages that use the FA, we only need to consider
> >the ingress and egress node. If the egress node restarts,
> >the worst case is that its upstream node sends an ERO
> >consisting of an unrecognized interface address. Since the
> >Path message will carry RECOVERY_LABEL (replacing
> >SUGGESTED_LABEL), the egress node knows that it might
> >not have complete information yet, and can hold on to
> >the Path message until the FA-LSP is established.
> >All the information is there, so implementation
> >can be completely within the egress node. It does work,
> >just not specified enough.
> >
> >Ingress node will have to know the dependency, so not
> >a problem there.
> >
> Are you suggesting that in case of an ERO with an
> unrecognized interface id, if the Path message also
> carries a RECOVERY LABEL, the egress node can hold on
> to it until later?  Note that there may not be a
> RECOVERY LABEL - since this Path message is for a
> hierarchical LSP, and the previous node may not be
> directly connected in the control plane.  Here is an
> example - There is an LSP L1 from C to F, and on top of
> it there is an LSP L2 from A to G.
>
>  L1:              C --- D --- E --- F
>  L2:  A --- B --- C --------------- F --- G
>
> When F restarts, C may not even know, unless HELLO
> protocol is run between C and F.  Note that the
> HELLO mechanism is intended for immediate neighbors.
>
> And even if the HELLO mechanism is extended, there
> is the issue of interface id assigned by F.  F needs
> to assign the same interface id upon restart so as
> not to bring down the hierarchical LSPs set up on top
> of L1 in the reverse direction (say an LSP from G to
> B set up on top of L1.)  Where would F get this
> information ?

A couple of points:

(1) First of all, you do run RSVP Hello between C and F.

(2) Since L1 is advertised as an FA into OSPF/ISIS, F should
be able to recover the Interface ID it assigns to L1 from
a combination of (a) the OSPF/ISIS link state database that
F would recover, and (b) the Forward Interface ID (the one
assigned by C).
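
To illustrate point (2), here is a minimal sketch of the lookup, under
stated assumptions. It assumes that the recovered link state database
holds, for each FA advertised by the upstream node, both the local and
the remote interface ID of that link. The names FaLinkAdvert and
recover_interface_id, and the ID values, are hypothetical and come from
neither the draft nor any implementation.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class FaLinkAdvert:
    """One FA link as seen in the recovered link state database (assumed shape)."""
    advertising_router: str  # node that originated the advertisement, e.g. "C"
    local_if_id: int         # interface ID assigned by the advertising router
    remote_if_id: int        # interface ID assigned by the far end, e.g. F


def recover_interface_id(
    lsdb: Iterable[FaLinkAdvert],
    upstream_router: str,
    forward_if_id: int,
) -> Optional[int]:
    """Return the ID the restarted node had assigned to the FA, if it can be found.

    The match key is the upstream router plus the Forward Interface ID
    (the one assigned by the upstream node).
    """
    for advert in lsdb:
        if (advert.advertising_router == upstream_router
                and advert.local_if_id == forward_if_id):
            return advert.remote_if_id
    return None  # not recoverable from the database alone


# Made-up example: C advertises L1 with its own ID 17 and F's ID 42.
lsdb = [FaLinkAdvert("C", 17, 42), FaLinkAdvert("C", 18, 43)]
recovered = recover_interface_id(lsdb, "C", 17)
```

If the lookup returns nothing, the sketch leaves the decision to the
caller; what F should do in that case is not covered here.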

> >>Also, let us consider a node that can preserve state across
> >>restarts, and hence does not need its state to be synced by
> >>its peer.  How will it advertise this in the RESTART_CAP
> >>object?
> >>
> >
> >We still need to resynchronize the state since the other
> >side might have connections that are in progress. The
> >
> Let us say the node can resynchronize its neighbor if
> the neighbor restarts and requests state recovery. But
> the issue is how a node can advertise that it does not
> need recovery since all its state was preserved?

By treating it the same way the spec handles a
control channel fault.

> >reason for resynchronization is to make sure both side
> >have done the same to the in-progress connections, not
> >just to synchronize the established states.
> >
> >The value of the RESTART_CAP will set to non-zero values
> >
>
> RECOVERY LABEL does not come into picture unless the node
> that is upstream to the restarting node has already received
> a Resv.

Wrong. Quoting 9.5.3:

   Upon detecting a restart with a neighbor that supports state
   recovery, a node SHOULD refresh all Path state shared with that
   neighbor.

So, as you can hopefully see from the above, the upstream node doesn't 
wait until it receives a Resv.
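
As a rough sketch of that rule (not text from the draft), the upstream
node selects the Path state to refresh purely by which neighbor it is
shared with. The PathState record and the paths_to_refresh helper are
hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class PathState:
    """Minimal stand-in for one Path state block held by the upstream node."""
    lsp_id: str
    next_hop: str        # downstream neighbor this Path state is shared with
    resv_received: bool  # whether a Resv has come back yet


def paths_to_refresh(states: Iterable[PathState], restarted_neighbor: str) -> List[str]:
    """Select all Path state shared with the restarted neighbor.

    The resv_received flag is deliberately not consulted: in-progress
    LSPs (no Resv yet) are refreshed along with established ones.
    """
    return [s.lsp_id for s in states if s.next_hop == restarted_neighbor]


states = [
    PathState("lsp-1", "F", resv_received=True),
    PathState("lsp-2", "F", resv_received=False),  # still in progress
    PathState("lsp-3", "D", resv_received=True),   # different neighbor
]
selected = paths_to_refresh(states, "F")
```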

> So it seems that the procedure is to resynchronize
> established state? If Resv has not been received yet, there
> will be a refresh of Path message, and the restarting node
> will consider it as a new request, etc.  So can you elaborate
> on "to make sure both side have done the same to the in
> progress connections" ?
>
> >>If a node supports PSC as well as TDM or LSC interfaces, it
> >>might want to advertise different set of parameters in the
> >>RESTART_CAP object for data LSPs as opposed to SONET/WDM
> >>LSPs which form bearer channels in transport networks.
> >>Currently this is not possible.
> >>
> >
> >Can you give us explicit examples as to why and what do
> >you gain by giving different values for PSC, TDM ?
> >
> In case of PSC devices, it may be OK to remove state that
> is not resynchronized at the end of the recovery period,
> and the recovery period advertised might reflect that.
> But for LSPs in transport networks, one might want to
> have a different recovery period to avoid any LSP from
> going down because of recovery timer expiry.

There is no requirement for a node to advertise exactly the
same Restart_Cap on all the interfaces. So, on PSC interfaces
the node could advertise that it will remove the state that
isn't synchronized at the end of the recovery period, while
on the TDM interface precisely the same node could advertise
that the LSPs would be kept even after the recovery time expires.
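
A minimal sketch of such a per-interface setup follows. It assumes a
Restart_Cap carrying a restart time and a recovery time, and it models
the keep-or-remove behavior as a local policy flag. That flag, the
interface names, and the timer values are all hypothetical; the sketch
does not show how the behavior would be encoded in the object itself.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class RestartCap:
    """Values advertised to the neighbor on one interface (assumed fields)."""
    restart_time_ms: int
    recovery_time_ms: int


@dataclass(frozen=True)
class InterfacePolicy:
    switching_type: str       # e.g. "PSC" or "TDM"
    restart_cap: RestartCap
    keep_unsynced_lsps: bool  # hypothetical local policy, not a protocol field


# Made-up values: the same node advertises different parameters per interface.
POLICIES: Dict[str, InterfacePolicy] = {
    "psc-0": InterfacePolicy("PSC", RestartCap(30_000, 60_000), False),
    "tdm-0": InterfacePolicy("TDM", RestartCap(30_000, 600_000), True),
}


def restart_cap_for(interface: str) -> RestartCap:
    """Pick the Restart_Cap to advertise on the given interface."""
    return POLICIES[interface].restart_cap


def lsps_to_remove_at_expiry(interface: str, unsynced_lsps: Iterable[str]) -> List[str]:
    """LSPs to tear down when the recovery period ends on this interface."""
    if POLICIES[interface].keep_unsynced_lsps:
        return []
    return list(unsynced_lsps)
```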

> >>According to 9.5.1. (procedures for restarting LSR): "When
> >>sending the corresponding outgoing Path message the node
> >>SHOULD include a SUGGESTED_LABEL object with a label value
> >>^^^^^^
> >>matching the outgoing label from the now restored forwarding
> >>entry."
> >>
> >>This has a conflict with 9.5.2.  Consider the case where
> >>adjacent nodes B and C restart, and B has another adjacent
> >>node A, and C has another adjacent node D.  B and C will get
> >>resynced by A and D, and during this process, they will
> >>resync. each other. While resyncing each other, they act as
> >>neighbors of a restarting LSR, and hence according to 9.5.2,
> >>MUST include the SUGGESTED_LABEL.
> >>
> >>Also according to 9.5.2: "During the recovery period, new
> >>Path state being advertised to the restarted neighbor SHOULD
> >>not include the SUGGESTED_LABEL object in the corresponding
> >>outgoing Path message.  This will prevent the restarting
> >>node from erroneously reusing a saved forwarding entry."
> >>
> >>I guess this would mean that if suggested labels are used
> >>during new LSP setup (as they are likely to be while
> >>provisioning lightpaths - to reduce latency), then new LSP
> >>setup will not be allowed during resyncing?
> >>
> >
> >The use of RECOVERY_LABEL addresses all the above questions.
> >
> The first problem seems to be there still - consider two
> adjacent nodes restarting.  They both act both as the restarting
> node as well as the neighbor to the restarting node. So, once
> they learn the state from the upstream neighbor, do they use
> suggested label or the recovery label when they send the path
> message to the just restarted downstream neighbor?

The recovery label.

The following should be added to the existing text from the document:

   In the special case where a restarting node also has a restarting
   downstream neighbor, a Recovery_Label object should be used instead
   of a Suggested_Label object.
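
The choice can be sketched as a small decision function. This is only
an illustration of the text above together with the 9.5.1 rule quoted
earlier; the function name and its boolean inputs are hypothetical, and
it covers only the restarting node's outgoing Path message.

```python
from typing import Optional


def outgoing_label_object(
    restored_forwarding_entry: bool,
    downstream_is_restarting: bool,
) -> Optional[str]:
    """Which label object a restarting node puts in its outgoing Path message.

    Returns None when no forwarding entry was restored, since the rule
    sketched here then has nothing to say.
    """
    if not restored_forwarding_entry:
        return None
    if downstream_is_restarting:
        # Special case: the downstream neighbor is restarting as well.
        return "RECOVERY_LABEL"
    # Ordinary case from 9.5.1: label matches the restored forwarding entry.
    return "SUGGESTED_LABEL"


# Nodes B and C both restart; B sends the Path message toward C.
choice_b_to_c = outgoing_label_object(True, True)
```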

Yakov.