RE: Flooding using LMP extensions
My two cents:
Using the ITU-T glossary, I think the LRM (Link Resource Manager) will detect the link fault
and then notify the RC (Routing Control) to update the RDB.
rick
-----Original Message-----
From: owner-ccamp@ops.ietf.org [mailto:owner-ccamp@ops.ietf.org]On Behalf Of Vishal Sharma
Sent: Thursday, June 19, 2003 1:36 PM
To: Roberto Albanese; ccamp@ops.ietf.org
Subject: RE: Flooding using LMP extensions
Hi Roberto and Nicola,
Thanks very much for your observations and for your detailed
analysis of the issues.
I certainly share your view that a flooding-based solution to fault
notification can present several advantages, and is one that we should
consider as a candidate (together with the signaling-based solutions
proposed so far).
1) In that regard, it would be nice to get your inputs on draft-rabbat:
http://www.ietf.org/internet-drafts/draft-rabbat-fault-notification-protocol-02.txt
since the mechanisms proposed there are technology, protocol, and topology
agnostic.
Do you, for instance, agree with the basic proposal in there? Do you
have any feedback, additions, or suggestions, based on your experience of
building your MPLS testbed?
2) On that note, I was wondering if you'd already made the modifications
to OSPF that you discussed in your email, and if you have any
performance results you might be able to share with the WG.
(Richard in one of his earlier emails, I believe, mentioned that he
has some prototyping results on LMP performance.)
3) Finally, here are some points that I'd like to discuss related to
the issues highlighted in your email.
i) I wanted to add that not only could flooding minimize routing failure
probability (as you've observed), it could also speed up recovery
by notifying (in parallel)
the nodes on multiple recovery paths affected by a given fault. This can
prove very advantageous when a link carries a large number of
transport circuits (e.g. lambdas), and when the LSPs do not share the
same s-d pairs.
ii) In regard to the argument for using OSPF to obtain a solution that works
both for MPLS and GMPLS, my thought is that this is not critical, since
the recovery methods (e.g. fast reroute) at the MPLS layer are already
well-defined and are not common with the recovery methods used at the
transport layer. Thus, there is less of a need to have a solution that
works across layers of the network using MPLS and GMPLS, respectively.
On the contrary, the "cost" of trying to incorporate the fault notification
functionality in a routing protocol, like OSPF, can be heavy (as I
explain below).
iii) This is because one has to develop solutions for multiple routing protocols
(as not everyone will use OSPF). More importantly, a service provider not
normally running a routing protocol at the transport layer (but using LMP
and GMPLS signaling), would be forced to run a routing protocol just for
efficient fault notification! That seems like an undue burden.
Furthermore, unlike LMP, OSPF requires more involved processing, making it
difficult to ensure that the failure-related messages are always processed
and dispatched _before_ other OSPF messages.
Also, irrespective of whether a provider runs routing, it is likely that
a GMPLS control plane at the transport layer will at least run LMP, thus
making it a natural choice to extend for flooding-based fault notification.
Of course, I welcome service provider inputs on this issue, and they
can correct me if I'm wrong. :-)
iv) And finally, it appears that you are inserting the timer functionality and
initial detection functionality in LMP anyway, so in the light of (iii) above,
wouldn't it be easier just to extend LMP? (With regard to the argument of OSPF
flooding being mature, perhaps we can learn from the lessons there, and not
repeat the same mistakes when operating LMP. :-))
Best regards,
-Vishal
-----Original Message-----
From: owner-ccamp@ops.ietf.org [mailto:owner-ccamp@ops.ietf.org]On Behalf Of
Roberto Albanese
Sent: Tuesday, June 17, 2003 10:50 AM
To: ccamp@ops.ietf.org
Subject: Re: Flooding using LMP extensions
Hi all,
We are working on an MPLS testbed supporting end-to-end
Protection-Restoration mechanisms and we faced the problem of
link-failure notification.
We share the concerns about the scalability of RSVP-like
solutions reported in Rabbat's mail.
We are in favour of OSPF flooding-based mechanisms for link-failure
notification using Opaque LSAs because:
- They are applicable to both MPLS-TE and GMPLS.
- OSPF has proven stability and robustness compared with the LMP
solution.
OSPF flooding in general is mature, whereas LMP has to be extended
and has to support some OSPF capabilities we already have.
- They reduce the routing failure probability with respect to the use of RSVP (see
below).
In draft-katz-yeung-ospf-traffic-09 it is written that in a TE scenario
we can have a module in the edge nodes that searches for constrained routes
based on Opaque TE info.
We call a "routing failure" the computation by edge node X of a path
which includes a failed link F.
Clearly, routing failures are a consequence of lack of notification,
to X, of the failure of F.
With RSVP failure notification, this can occur:
- in case of a single fault, when there are no LSPs originating from X
that cross F;
- in case of dual faults (i.e., two links L1 and L2 fail almost
simultaneously),
if from X there is an LSP crossing both L1 and L2:
X---link-----node------link(L1)------node-------link(L2)-----egress_node
the failure of L1 can hide the notification of the L2 failure.
It can be seen that with OSPF flooding there is virtually no potential
routing failure at all, as ALL the edge nodes are notified of any failure.
So if we have a flooding-based notification, all the edge nodes in a network
will be aware of the failure. With RSVP, instead, only some nodes will be
aware of the failure, so the routing failure probability increases
after a link failure.
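The comparison above can be illustrated with a small sketch. It is a toy model, not taken from the email: the function names are hypothetical, each LSP is reduced to an ingress node plus an ordered list of links, and RSVP-style notification is assumed to report only the first failed link met when walking the LSP from its ingress (the masking effect described for L1 and L2).

```python
def rsvp_known_failures(lsps, failed):
    """Failed links each ingress learns about via per-LSP notification.

    Assumption: only the failed link nearest the ingress is reported;
    failures further downstream on the same LSP are masked.
    """
    known = {}
    for ingress, path in lsps:
        known.setdefault(ingress, set())
        for link in path:
            if link in failed:
                known[ingress].add(link)
                break  # downstream failures are hidden behind this one
    return known


def flooding_known_failures(edge_nodes, failed):
    """With flooding, every edge node learns about every failed link."""
    return {node: set(failed) for node in edge_nodes}


def routing_failure_possible(known, node, failed):
    """True if `node` could still compute a path over a failed link."""
    return bool(set(failed) - known.get(node, set()))


# X has one LSP crossing both L1 and L2; Y has an LSP crossing neither.
lsps = [("X", ["a", "L1", "L2"]), ("Y", ["b", "c"])]
failed = {"L1", "L2"}

rsvp = rsvp_known_failures(lsps, failed)
flood = flooding_known_failures(["X", "Y"], failed)
```

In this model, X learns only about L1 and Y learns nothing under RSVP-style notification, so both could still route over a failed link, while under flooding neither can.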
We agree that a major drawback of the OSPF-flooding solution is the need to
revisit the timers, as pointed out in Rabbat's mail "Flooding using LMP
extensions".
In fact, we can't wait for up to MinLSInterval seconds to notify a
failure...
So we have to modify something.
Among the possible solutions:
1) Introduction of a new timer in OSPF for a new sub-TLV
used to carry the info of broken link.
The current timer of MinLSInterval should not consider
this new field.
2) Force the flooding when a link failure signal arrives
and reset the timer.
We think that solution 2) has more advantages.
The current behavior of the OSPF protocol is (considering Opaque
extensions):
   <----------------MinLSInterval------------->
B1 |             B2 |   B3 |    FAIL. |
   |                |      |          |
---+----------------+------+----------+-------+------------> time
   |                                          |
   |                                          |
  FL1                                        FL2
WHERE:
- FL1 is a flooding of B1 information.
- FL2 is a flooding of B3 + FAIL. information.
- FAIL. could be a signal coming from a failure detection mechanism
(e.g. from a lower layer).
- B1, B2, B3, FAIL. are external OSPF Opaque inputs.
Note that B1, B2, B3 could be Bandwidth updates (link TLVs updates).
We thought of solving it in this way:
   <-------------MinLSInterval--------->
                               <-------------MinLSInterval------>
B1 |        B2 |   B3 |   FAIL.|
   |           |      |        |
---+-----------+------+--------+------------------------------------> time
   |                           |
   |                           |
  FL1                         FL2
In this case we force the flooding (FL2) when a link-failure signal
(FAIL.) arrives, and we reset the MinLSInterval timer so that
it restarts from the failure event.
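A minimal sketch of the forced-flood behaviour described above, under stated assumptions: the class and method names are hypothetical, time is passed in explicitly, and pending updates are kept in a dict so that a later bandwidth update (B3) replaces an earlier one (B2). The 5-second default is the MinLSInterval value from RFC 2328. This is an illustration of the idea, not a modification of any real OSPF implementation.

```python
class LsaOriginator:
    """Toy model of MinLSInterval pacing with a forced-flood option."""

    def __init__(self, min_ls_interval=5.0):
        self.min_ls_interval = min_ls_interval
        self.last_flood = None   # time of the most recent flood
        self.pending = {}        # latest not-yet-flooded value per key
        self.floods = []         # record of (time, contents) for inspection

    def _interval_elapsed(self, now):
        return (self.last_flood is None
                or now - self.last_flood >= self.min_ls_interval)

    def _flood(self, now):
        self.floods.append((now, dict(self.pending)))
        self.pending.clear()
        self.last_flood = now    # MinLSInterval restarts from this flood

    def update(self, now, key, value, force=False):
        """Queue an update; flood at once if forced or if pacing allows."""
        self.pending[key] = value
        if force or self._interval_elapsed(now):
            self._flood(now)

    def tick(self, now):
        """Periodic call: flood queued updates once the interval expires."""
        if self.pending and self._interval_elapsed(now):
            self._flood(now)


orig = LsaOriginator(min_ls_interval=5.0)
orig.update(0, "bw", "B1")                   # FL1: floods immediately
orig.update(1, "bw", "B2")                   # held back by MinLSInterval
orig.update(2, "bw", "B3")                   # replaces B2 in the queue
orig.update(3, "fail", "FAIL", force=True)   # FL2: forced, timer restarts
```

With `force=False` on the last call, the failure would instead wait in the queue until a `tick` at time 5 or later, which is the current behaviour shown in the first diagram.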
To make this solution robust, and to avoid continuous flooding of
failure notifications in case of interface flapping, we have to consider a
timer in the module (external to OSPF) that detects the link failures and
triggers the flooding of the FL2 Opaque LSA.
Note that some external module is needed to trigger the OSPF flooding
of failure notifications, as we cannot rely on the HELLO process because of its
long detection delay.
One possibility is to let LMP detect the failure and trigger the OSPF
flooding.
In this approach, the timer to avoid interface flapping should be included
in the LMP trigger.
This solution is conservative in that it only requires LMP extensions
(timer in the trigger) and just a minor modification to the OSPF process
(i.e., accept a force-to-send and MinLSInterval-reset trigger).
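The flap-damping timer in the trigger could look roughly like the sketch below. Everything here is an assumption for illustration: the class name, the per-link hold-down policy, and the 10-second value are not from the email, and `on_failure` stands in for whatever force-to-send interface the OSPF process would expose.

```python
class FailureTrigger:
    """Toy hold-down filter between failure detection and OSPF flooding."""

    def __init__(self, hold_down, on_failure):
        self.hold_down = hold_down    # seconds to suppress repeats per link
        self.on_failure = on_failure  # callback that forces the flood
        self.last_fired = {}          # link -> time the trigger last fired

    def link_down(self, now, link):
        """Report a detected failure; returns True if a flood was triggered."""
        last = self.last_fired.get(link)
        if last is not None and now - last < self.hold_down:
            return False              # flapping link: suppress this report
        self.last_fired[link] = now
        self.on_failure(now, link)
        return True


fired = []
trigger = FailureTrigger(hold_down=10,
                         on_failure=lambda now, link: fired.append((now, link)))
results = [
    trigger.link_down(0, "L1"),   # first failure on L1: triggers a flood
    trigger.link_down(2, "L1"),   # L1 flaps within the hold-down: suppressed
    trigger.link_down(3, "L2"),   # different link: triggers a flood
    trigger.link_down(12, "L1"),  # hold-down on L1 expired: triggers again
]
```

Keeping the hold-down per link means a flapping interface cannot delay the notification of an unrelated failure.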
Thanks in advance for your kind observations.
Regards
Roberto Albanese, Nicola Caione
University of Rome - La Sapienza, Italy