Re: draft-bonica-tunneltrace-02
neil.2.harrison@bt.com wrote:
>>>- what is the impact on customers?
>>>
>>When there is a problem, without a good application, it may take hours
>>for the operators to figure out where the problem is.... I am pretty
>>sure BT had experienced this in the past. :-)
>>
> NH=> OK thanks....I was wondering if there was some new problem that had
> arisen, and that's why yourself/Loa were saying this is urgent (hence my
> question). But yes connectivity problems happen...and not just in the past
> either. Big push coming down from above here for increased robustness and
> reduced opex...one aspect of which is (i) auto detection of defect (ii) auto
> handling of defect.....then we need tools to diagnose/clear.
>
Neil,
Here is what I think about auto-detection and auto-handling:
1. The operators should have the option to turn on some "ping-like"
function for suspect LSPs at the ingress LSRs. These LSPs will be
queried periodically. It's important to make sure that this function
won't introduce too much overhead.
2. If an LSP is in trouble, it would be nice to have the "traceroute"
function kick in automatically, so that the trouble spot can be
located and reported.
3. Once a problem is identified, there are two ways to handle it:
(1) Page the operators ASAP and have them fix the problem.
(2) Fix the problem automatically. For example, the ingress LSR can
repair the LSP by setting up a new one that avoids the troubled nodes.
From the development point of view, we can do both, but the operators
can tell us what they prefer.
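To make the three steps above a bit more concrete, here is a rough Python
sketch of the control flow. Everything in it is an assumption for
illustration: LspMonitor, ping, trace, and auto_repair are hypothetical
names, not taken from the draft or from any implementation, and the two
callables stand in for whatever LSP ping/traceroute mechanism is used.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class LspMonitor:
    """Hypothetical ingress-LSR monitor: ping, then trace, then handle."""

    # Assumed probe: returns True if the LSP answers the ping-like query.
    ping: Callable[[str], bool]
    # Assumed trace: returns the first failing hop, or None if not located.
    trace: Callable[[str], Optional[str]]
    # Operator preference: False = page operators, True = repair automatically.
    auto_repair: bool = False
    # Actions taken, recorded as text so the sketch stays self-contained.
    actions: List[str] = field(default_factory=list)

    def check(self, lsp: str) -> str:
        """Run one periodic check of a single LSP and return the outcome."""
        # Step 1: cheap ping-like query; a healthy LSP costs one probe.
        if self.ping(lsp):
            return "ok"
        # Step 2: the traceroute function kicks in only after a failed ping.
        hop = self.trace(lsp)
        # Step 3 (2): repair automatically if enabled and the hop is known.
        if self.auto_repair and hop is not None:
            self.actions.append(f"reroute {lsp} avoiding {hop}")
            return "rerouted"
        # Step 3 (1): otherwise page the operators with what was found.
        self.actions.append(f"page operators: {lsp} failed at {hop}")
        return "paged"
```

A scheduler would call check() for each monitored LSP at whatever
interval keeps the overhead acceptable; the auto_repair flag is where
the operators' preference between (1) and (2) would plug in.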
Regards,
- Ping