
Expedited Flooding and Bundled Notification Was [Re: draft-rabbat-fault-notification-protocol-04.txt]



Hi CCAMPers,
 
We have had some very interesting discussions offline since Seoul about the
issue of fault notification. Several valuable points were made in these
exchanges, and I'd like to move them to the ML, since the topic may be a
source of confusion for the larger community as well.
 
Some people mentioned that bundling of notification messages solves the
scalability problems of notification, but there are two facets to this
issue, which I explain below. I hope this discussion will clear up the
issues.

1. The notification process used in the P&R documents uses a Notify IP
packet. This is expected to be fast since it goes over the forwarding plane
to the destination IP address.  The advantage is the forwarding speed, but
the disadvantage is that there is only one destination IP address. This
means that if a node were to localize the fault and send a "bundled
notify", it would need one Notify message for each ingress node that
sources some of the failed LSPs.  Therefore, one would need as many Notify
messages as there are ingress nodes of the failed LSPs. In a given network,
this number is bounded by the number of nodes that can act as ingress; in
the case of mesh networks, its theoretical limit is the number of nodes in
the network.
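
To make the counting argument concrete, here is a minimal sketch in Python.
It only illustrates the grouping described above; the function name, the
LSP identifiers and the node names are made up for the example and do not
come from the P&R documents.

```python
from collections import defaultdict

def bundle_by_ingress(failed_lsps):
    """Group failed LSPs by ingress node (hypothetical helper).

    failed_lsps is a list of (lsp_id, ingress_node) pairs. Because a Notify
    packet has a single destination address, each ingress node needs its own
    bundled Notify, however many of its LSPs failed.
    """
    bundles = defaultdict(list)
    for lsp_id, ingress in failed_lsps:
        bundles[ingress].append(lsp_id)
    return dict(bundles)

# Four failed LSPs sourced by three different ingress nodes (made-up data).
failed = [("lsp1", "A"), ("lsp2", "A"), ("lsp3", "B"), ("lsp4", "C")]
bundles = bundle_by_ingress(failed)
notify_count = len(bundles)  # one bundled Notify per distinct ingress node
```

Bundling reduces the count from the number of failed LSPs to the number of
distinct ingress nodes, but no further.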

Note that the FNP method discussed in our drafts is very fast as well since
the forwarding of the packet from line cards to other line cards would occur
before any intervention of the control plane. The control plane processing
occurs -in parallel- with the propagation of the notification information
through the network.

2. The Notify message only delivers the failure news to the ingress nodes.
Each ingress node has to start a signaling process to activate the
protection LSP. In this case, when protection LSPs are not routed on the
same path, each needs its own set of signaling messages. Of course, this can
be alleviated somewhat through the use of hierarchies, but only to a certain
extent. Again, the advantage of mesh networks would be lost if one were to
force the route calculation for the protection path to use the same segments
or paths in order for the activation not to cause signaling storms.

Signaling storms are not supposed to be a problem when setting up paths,
because of the time lags between different path setups (before activation);
but they may occur if a large number of LSPs need to be activated within the
same short time window, as in the case of a fault.

The flooding approach outlined in draft-rabbat-expedited-flooding-01.txt and
the detailed notification method specified in
draft-rabbat-fault-notification-protocol make this problem moot because the
number of messages is bounded in the case of a fault (one message
transmitted per link). So notification using FNP would guarantee that a
fault notification does not lead to any signaling storms.

I hope this clears up any misunderstanding. Feedback/observations are
welcome as always.

Richard.