Hi Zafar, CCAMP,
During discussions with several
colleagues within the CCAMP WG, it has become clear that it would be useful to
clarify some of the fundamental differences between restoration in packet
networks and that in transport networks.
These differences,
together with the time criticality of restoration at the transport layer,
call for the development of techniques for time-bounded notification. It would
then be useful to discuss the solutions proposed in
draft-rabbat-fault-notification-protocol-03 for such
notification.
We are in the process of preparing
a contribution on this subject, but thought it would be useful to highlight a
few key points on the mailing list, so that we can elicit feedback and
comments from the WG.
In normal packet networks (MPLS
networks) one can pre-signal *and* pre-configure a backup LSP for a working
LSP. This is because selecting a label at a node for a backup LSP is
sufficient to be able to switch traffic for that LSP when that traffic
arrives. If resources are required for the backup LSP (buffers and bandwidth),
they too can be reserved in advance (during the LSP signaling phase), but can
still be used by low-priority or extra-traffic LSPs as long as there is no
failure on the working LSP.
This is true even for shared mesh
restoration in MPLS networks. In that case, multiple labels would be assigned,
one for each of the backup LSPs (corresponding to link and/or node disjoint
working LSPs) transiting a node on the shared backup path, but only one set of
resources (buffers, bandwidth) would be reserved (if such resource reservation
is needed).
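
As a rough illustration of the point above, the per-node state for shared mesh
restoration in MPLS might look like the sketch below. The class and method
names are hypothetical, and the sketch assumes that the working LSPs are
disjoint, so that at most one of the backups is active after a single failure
and the shared reservation can be sized to the largest backup.

```python
from dataclasses import dataclass, field


@dataclass
class SharedBackupNode:
    """Hypothetical per-node state for shared mesh restoration in MPLS."""

    labels: dict = field(default_factory=dict)  # backup LSP id -> label
    reserved_bw: float = 0.0                    # one shared reservation
    _next_label: int = 16                       # MPLS labels 0-15 are reserved

    def presignal_backup(self, lsp_id: str, bandwidth: float) -> int:
        """Assign a distinct label per backup LSP, but share the resources."""
        label = self._next_label
        self._next_label += 1
        self.labels[lsp_id] = label
        # Assumption: only one backup is active at a time, so the shared
        # reservation only needs to cover the largest single backup.
        self.reserved_bw = max(self.reserved_bw, bandwidth)
        return label


node = SharedBackupNode()
label_1 = node.presignal_backup("backup-1", 100.0)
label_2 = node.presignal_backup("backup-2", 40.0)
print(label_1, label_2, node.reserved_bw)
```

Two labels are installed, so traffic on either backup LSP can be switched as
soon as it arrives, while only one set of resources is held.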
In transport networks, however,
one can pre-signal but not pre-configure a backup LSP (unless one is doing
only 1+1 protection). This is because, in transport networks, if an LSP is
established (that is, it is cross-connected) then the full bandwidth of the
LSP is automatically *consumed*, irrespective of whether traffic actually
flows on this LSP.
For this reason, to implement
shared restoration schemes in transport networks (and allow extra-traffic) a
backup LSP cannot be cross-connected until *after* the specific failure for
which this backup LSP was pre-signaled has occurred.
Now, if signaling-based
notification is used in transport networks, an *additional phase of signaling*
is required along the backup path to enable nodes along that path to
reconfigure themselves (this is well described in the functional specification
document of the P&R Design Team). This
lengthens the time to recover from the failure. Depending on the layer at
which recovery is being performed this may or may not be
acceptable.
In the specific case of transport
networks, restoration is typically a time-critical activity, so this
round-trip signaling delay could be unacceptable when time-bounded
notification and recovery are desired.
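
To make the cost of the extra signaling phase concrete, here is a minimal
back-of-the-envelope model. It is only a sketch: the function name, the
sequential hop-by-hop behavior, and all delay values are assumptions chosen
for illustration, not measurements or anything specified by the drafts.

```python
def signaled_recovery_time(notify_ms, hop_delays_ms, reconfig_ms):
    """Estimate recovery time when the backup path must be signaled (sketch).

    Assumed model: the repair point learns of the failure after notify_ms,
    sends an activation request down the backup path, each downstream node
    cross-connects before forwarding the request, and an acknowledgment
    travels back along the same hops.
    """
    forward = sum(hop_delays_ms) + reconfig_ms * len(hop_delays_ms)
    backward = sum(hop_delays_ms)
    return notify_ms + forward + backward


# Hypothetical numbers: 5 ms notification, three 2 ms hops, 10 ms per
# cross-connect reconfiguration.
print(signaled_recovery_time(5, [2, 2, 2], 10))
```

In this model the reconfiguration times add up hop by hop, which is why the
extra phase grows with the length of the backup path.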
In addition, signaling individual
LSPs or individual LSP bundles may create buffering problems that make
signaling time unbounded.
If instead, the information about
a failure is flooded to all the network nodes, and the backup paths are
selected intelligently (as described in
draft-rabbat-fault-notification-protocol-03.txt), this additional signaling
hand-shake delay can be eliminated. This is because by flooding the
information about a fault on a working LSP, one can inform, in parallel, all
the nodes lying along the path of the backup LSP. Thus, the repair point(s),
upon learning of the fault, hold off activating the backup LSP(s) for long
enough that all nodes along the corresponding backup path(s) will have
reconfigured themselves.
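
The hold-off idea can be sketched as follows. This is only an illustration
under stated assumptions, not the mechanism defined in the draft: the topology,
link delays, reconfiguration time, and function names are all hypothetical, and
flooding is modeled as the notification reaching each node over the
lowest-delay path from the node that detects the fault.

```python
import heapq


def flood_arrival_times(graph, source):
    """Earliest arrival time of a flooded notification at each node."""
    arrival = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        t, node = heapq.heappop(heap)
        if t > arrival.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, delay in graph.get(node, {}).items():
            candidate = t + delay
            if candidate < arrival.get(neighbor, float("inf")):
                arrival[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return arrival


def hold_off(arrival, backup_path, reconfig_ms):
    """Time the repair point (first node on the path) waits before switching.

    Every node on the backup path starts reconfiguring when the flooded
    notification reaches it; the repair point waits until the slowest is done.
    """
    ready = max(arrival[n] + reconfig_ms for n in backup_path)
    return max(0.0, ready - arrival[backup_path[0]])


# Hypothetical topology (link delays in ms); D detects the fault.
graph = {
    "D": {"A": 2.0, "X": 1.0},
    "A": {"D": 2.0, "X": 3.0},
    "X": {"D": 1.0, "A": 3.0, "Y": 2.0},
    "Y": {"X": 2.0, "Z": 2.0},
    "Z": {"Y": 2.0},
}
arrival = flood_arrival_times(graph, "D")
print(hold_off(arrival, ["A", "X", "Y", "Z"], reconfig_ms=10.0))
```

Because the nodes reconfigure in parallel, the hold-off here is governed by the
slowest node on the backup path rather than by the sum over all hops.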
We would also like to get feedback
on a suitable protocol that could implement time-critical flooding
notification.
Comments, thoughts and questions
are welcome!
--
Richard Rabbat, Ph.D.
Member of Research Staff, Fujitsu Labs of America
1240 E Arques Ave, MS 345, Sunnyvale, CA 94085
Phone: 408-530-4537. Fax: 408-530-4581. Cell: 650-714-7618