Hi all,
we are working on an MPLS testbed supporting end-to-end protection/restoration mechanisms, and we have faced the problem of link-failure notification. We share the scalability concerns about RSVP-like solutions reported in Rabbat's mail. We are in favour of OSPF flooding-based mechanisms for link-failure notification using Opaque LSAs because:
- They are applicable to both MPLS-TE and GMPLS.
- OSPF has proven stability and robustness compared with the LMP solution. OSPF flooding in general is mature, whereas LMP has to be extended and has to support some OSPF capabilities we already have.
- They reduce the probability of routing failure with respect to the use of RSVP (see below).

draft-katz-yeung-ospf-traffic-09 states that, in a TE scenario, the edge nodes can run a module that searches for constrained routes based on Opaque TE information. We call a "routing failure" the computation by an edge node X of a path that includes a failed link F. Clearly, routing failures are a consequence of X not being notified of the failure of F. With RSVP failure notification, this can occur:
- in case of a single fault, when no LSP originated from X crosses F;
- in case of dual faults (i.e., two links L1 and L2 fail almost simultaneously), if there is an LSP from X crossing both L1 and L2:

X---link-----node------link(L1)------node-------link(L2)-----egress_node

  the failure of L1 can hide the notification of the L2 failure.
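
The two cases above can be sketched in a few lines of Python. This is only an illustration under a simplified model that we assume here: the node just upstream of a failed link notifies the ingress hop by hop along the LSP, so the message is lost if another failed link lies in between. The names `link`, `rsvp_known_failures` and `flooding_known_failures`, the nodes N1, N2, E and the second edge node Y are hypothetical and not taken from any draft.

```python
def link(a, b):
    """Undirected link between two nodes."""
    return frozenset((a, b))


def rsvp_known_failures(lsps, failed):
    """Map each ingress node to the failed links it learns about via RSVP.

    Assumed model: walking each LSP from the ingress, the first failed
    link is reported; failures further downstream stay hidden behind it.
    """
    known = {path[0]: set() for path in lsps}
    for path in lsps:
        for a, b in zip(path, path[1:]):
            if link(a, b) in failed:
                known[path[0]].add(link(a, b))
                break  # Notifications from downstream cannot get through.
    return known


def flooding_known_failures(edge_nodes, failed):
    """With flooding, every edge node learns every failure
    (assuming the network stays connected)."""
    return {node: set(failed) for node in edge_nodes}


# Topology of the figure: X -- N1 --(L1)-- N2 --(L2)-- E, plus an edge node Y
# whose only LSP goes Y -> N2 -> E and therefore never crosses L1.
L1, L2 = link("N1", "N2"), link("N2", "E")
lsps = [["X", "N1", "N2", "E"], ["Y", "N2", "E"]]

single = rsvp_known_failures(lsps, {L1})       # Only L1 fails.
dual = rsvp_known_failures(lsps, {L1, L2})     # L1 and L2 fail together.
flooded = flooding_known_failures(["X", "Y"], {L1, L2})
```

In the single-fault run Y learns nothing about L1, so Y may still compute a path through it. In the dual-fault run X learns about L1 but not about L2. With flooding, both X and Y know both failures.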

It can be seen that with OSPF flooding there is virtually no potential for routing failure, as ALL the edge nodes in the network are notified of any failure. With RSVP, instead, only some nodes will be aware of the failure, so the probability of routing failure increases after a link failure.

We agree that a major drawback of the OSPF-flooding solution is the need to revisit the timers, as pointed out in Rabbat's mail "Flooding using LMP extensions". In fact, we cannot wait up to MinLSInterval seconds to notify a failure, so something has to be modified. Among the possible solutions:
1) Introduce a new timer in OSPF for a new sub-TLV used to carry the broken-link information. The current MinLSInterval timer should not apply to this new field.
2) Force the flooding when a link-failure signal arrives, and reset the timer.

We think that solution 2) has more advantages.

The current behavior of the OSPF protocol (considering Opaque extensions) is:

   <----------------MinLSInterval------------->
   B1               B2     B3         FAIL.
   |                |      |          |
---+----------------+------+----------+-------+------------> time
   |                                          |
   |                                          |
   FL1                                        FL2

WHERE:
- FL1 is a flooding of B1 information.
- FL2 is a flooding of B3 + FAIL. information.
- FAIL. could be a signal coming from a failure detection mechanism (i.e., from a lower layer).
- B1, B2, B3, FAIL. are external OSPF Opaque inputs. Note that B1, B2, B3 could be bandwidth updates (link TLV updates).

We thought of solving this as follows:

   <-------------MinLSInterval--------->
                               <-------------MinLSInterval------>
   B1          B2     B3       FAIL.
   |           |      |        |
---+-----------+------+--------+------------------------------------------> time
   |                           |
   |                           |
   FL1                         FL2

In this case we force the flooding (FL2) when a link-failure signal (FAIL.) arrives, and we reset the MinLSInterval timer so that it restarts from the failure event.

To make this solution more robust, and to avoid continuous flooding of failure notifications in case of interface flapping, we have to consider a timer in the module (external to OSPF) that detects link failures and triggers the flooding of the FL2 Opaque LSA. Some external module is needed in any case to trigger the OSPF flooding of the failure notification, as we cannot rely on the HELLO process because of its long detection delay. A possibility is to let LMP detect the failure and trigger the OSPF flooding. In this approach, the timer to avoid interface flapping should be included in the LMP trigger.
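
The forced flooding with timer reset, together with the flap timer in the trigger, can be sketched as a toy Python model. It is an illustration, not an OSPF implementation: the class and method names (`OpaqueLsaOriginator`, `FailureTrigger`, `force_flood`, `link_failed`) and the 10-second hold-down value are hypothetical, and time is passed in explicitly instead of using real timers. The 5-second default is the MinLSInterval value of the OSPFv2 specification.

```python
class OpaqueLsaOriginator:
    """Toy model of Opaque LSA origination rate-limited by MinLSInterval."""

    def __init__(self, min_ls_interval=5.0):
        self.min_ls_interval = min_ls_interval
        self.last_flood = None  # Time of the last flooding, None if never.
        self.pending = {}       # Latest value per field, not yet flooded.
        self.floods = []        # Log of (time, contents) for inspection.

    def _flood(self, now):
        self.floods.append((now, dict(self.pending)))
        self.pending.clear()
        self.last_flood = now   # MinLSInterval restarts from this instant.

    def _timer_expired(self, now):
        return (self.last_flood is None
                or now - self.last_flood >= self.min_ls_interval)

    def update(self, now, field, value):
        """Ordinary input (e.g. bandwidth): flood only if the timer allows."""
        self.pending[field] = value
        if self._timer_expired(now):
            self._flood(now)

    def tick(self, now):
        """Flush queued updates once MinLSInterval has elapsed."""
        if self.pending and self._timer_expired(now):
            self._flood(now)

    def force_flood(self, now, failed_link):
        """Failure input: flood at once and reset the timer."""
        self.pending["failure"] = failed_link
        self._flood(now)


class FailureTrigger:
    """Stand-in for the external (e.g. LMP) trigger with a flap hold-down."""

    def __init__(self, originator, hold_down=10.0):
        self.originator = originator
        self.hold_down = hold_down
        self.last_signal = {}   # Link -> time of the last forwarded failure.

    def link_failed(self, now, failed_link):
        last = self.last_signal.get(failed_link)
        if last is not None and now - last < self.hold_down:
            return False        # Flapping: suppress the repeated signal.
        self.last_signal[failed_link] = now
        self.originator.force_flood(now, failed_link)
        return True


ospf = OpaqueLsaOriginator(min_ls_interval=5.0)
lmp = FailureTrigger(ospf, hold_down=10.0)

ospf.update(0.0, "bandwidth", "B1")  # Timer idle: flooded at once (FL1).
ospf.update(2.0, "bandwidth", "B2")  # Inside MinLSInterval: queued.
ospf.update(3.0, "bandwidth", "B3")  # Replaces B2 in the queue.
lmp.link_failed(4.0, "F")            # Forced flooding of B3 + FAIL. (FL2).
lmp.link_failed(6.0, "F")            # Flap inside the hold-down: suppressed.
```

After this run `ospf.floods` holds two entries: B1 at time 0 (FL1) and B3 plus the failure of F at time 4 (FL2). A bandwidth update arriving at time 7 would be queued until time 9, because MinLSInterval now counts from the failure event.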

This solution is conservative in that it only requires LMP extensions (a timer in the trigger) and just a minor modification to the OSPF process (i.e., accepting a trigger that forces the flooding and resets MinLSInterval).

Thanks in advance for your kind observations.

Regards,
Roberto Albanese, Nicola Caione
University of Rome - La Sapienza, Italy