Re: draft-rabbat-fault-notification-protocol-04.txt

To: ccamp@ops.ietf.org
Subject: Re: draft-rabbat-fault-notification-protocol-04.txt
From: George Newsome <gnewsome@ieee.org>
Date: Tue, 24 Feb 2004 20:40:50 -0500
User-agent: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)

All,

My attention was drawn to
draft-rabbat-fault-notification-protocol-04.txt, which provokes the
following comments.

1) There seems to be some notion that the time taken to restore is a
crucial element of high availability, yet overall availability is
controlled by the failure rate of unprotected elements and by mean time
to repair, rather than by switching time. (A 1 second switch is less
than 1/10000 of the generally accepted MTTR of 4 hrs.)

2) This draft seems to address the relatively simple problem of setting
up the restoration path. It seems to completely ignore the much harder
problem of allocating resources to the shared restoration path, and of
actually locating the fault in an optical network to a single span in a
time that is useful to restoration. It makes no mention of the
inaccuracies in network planning databases, which make one wonder
whether precomputation of restoration paths will actually lead to faster
restoration times. Finally, it seems to presuppose that a network
operator would make such a facilities database available to route
computation at all. The suggestion in sect 6.2 that the physical length
of the fibers be available for route computation is very unlikely to be
met in any network I have ever worked on.

3) One must wonder whether a flooding approach is actually best anyway.
The assumption seems to be that a flooding protocol PDU can be forced
onto the front of the send queue, thereby incurring minimum delay. An
additional assumption seems to be that there is only one fault in the
network, and all bets are off if that is not true. There seem to be
problems with both these assumptions. It seems to me that there are no
mechanisms for truncating the PDU that is already being sent, so there
is a finite chance that a significant extra delay is incurred. Perhaps
more serious is the assumption that all bets are off if there are
multiple faults in the network. In general, multiple faults are those
that lead to service outage. Two faults that do not interact, in that
they do not contend for the same network resources, will be coupled by
the flooding. In addition, unsuppressed restoration requests, which
occur when the fault cannot be rapidly located to a single span, will
also generate restoration messages. It also seems to me that routing
changes may well start to be flooded on the same time scale as
restoration activity is taking place. There is no mention of possible
interactions with this.

4) Assuming that this problem is worth solving, and that a flooding
protocol is the best solution, is it a good idea to generate yet another
protocol that floods, and is LMP the vehicle of choice to embed yet
another protocol? It seems to me that restoration has a strong
interaction with routing change announcement, so it makes more sense to
use those mechanisms rather than invent new ones.

5) Until the effect of network database inaccuracies on the
effectiveness of precomputed restoration is better understood, the
problem of allocating resources in shared mesh networks is solved, and
it is certain that all faults will be located to the correct span in a
time useful to restoration, it seems premature to propose a solution to
the final piece of the problem.
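
To put rough numbers on point 1, here is a minimal sketch of the
arithmetic. Only the 4 hr MTTR and the 1 second switch come from the
text above. The failure rates, the 50 ms comparison switch time, and the
function names are hypothetical, chosen purely for illustration, and the
model assumes failures never overlap and protection switching always
succeeds.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
MTTR_S = 4 * 3600  # the 4 hr MTTR quoted in point 1, in seconds


def annual_downtime_s(protected_failures, switch_time_s,
                      unprotected_failures, mttr_s=MTTR_S):
    """Expected outage seconds per year.

    A failure on a protected element costs one switch time; a failure on
    an unprotected element costs a full repair time. Failure counts are
    per year and are assumed not to overlap.
    """
    return (protected_failures * switch_time_s
            + unprotected_failures * mttr_s)


def availability(downtime_s):
    """Fraction of the year the service is up."""
    return 1.0 - downtime_s / SECONDS_PER_YEAR


# Hypothetical rates: 2 protected failures and 0.1 unprotected failures
# per year.
slow = annual_downtime_s(2.0, 1.0, 0.1)    # 1 second switch
fast = annual_downtime_s(2.0, 0.05, 0.1)   # hypothetical 50 ms switch

# Ratio of a 1 second switch to the MTTR: 1/14400.
switch_to_mttr = 1.0 / MTTR_S

if __name__ == "__main__":
    print(f"downtime with 1 s switch:   {slow:.1f} s/yr")
    print(f"downtime with 50 ms switch: {fast:.1f} s/yr")
    print(f"availability gain: {availability(fast) - availability(slow):.2e}")
    print(f"switch/MTTR ratio: {switch_to_mttr:.2e}")
```

Under these assumed rates, nearly all of the annual downtime comes from
the unprotected repair term, so making the switch twenty times faster
changes availability by well under one part in a million.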

Regards

George

Follow-Ups:
- RE: draft-rabbat-fault-notification-protocol-04.txt
  - From: "Richard Rabbat" <rabbat@fla.fujitsu.com>
- RE: draft-rabbat-fault-notification-protocol-04.txt
  - From: "Vishal Sharma" <v.sharma@ieee.org>