
Re: Looking for comments on draft for RSVP Graceful Restart extensions (draft-rahman-ccamp-rsvp-restart-extensions-00.txt)



Reshad Rahman wrote:

The main comment/question was whether we could reuse the RRO object instead of defining a new object to recover the contents of an ERO expansion done prior to control plane restart. Here are two potential issues with reusing RRO:
- The RRO would contain the full list of nodes whereas the ERO expansion may have been partial. In that situation the downstream node would detect a change in the incoming ERO and may reject the message (the expected behaviour on incoming ERO change seems to be unspecified).

I hope it doesn't do this.


RSVP is, by nature, a soft-state protocol. Implementations should expect stuff to change all the time. When an ERO changes, a node shouldn't reject the message. It should use the new ERO to determine the new next-hop.

If the next hop doesn't change, then the node should leave the LSP in place and immediately generate a Path refresh, so that downstream neighbors get the new ERO as soon as possible.

If the next hop changes, then it should be treated identically to what would happen in response to a route change with a loose ERO-hop or no ERO.
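
To make the ERO-change handling above concrete, here is a minimal sketch in Python. It is only an illustration of the decision described in this message, not text from any specification or implementation. The names PathState, handle_incoming_ero and resolve_next_hop, and the returned action strings, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PathState:
    """Hypothetical per-LSP Path state kept by a transit node."""
    ero: List[str]   # ERO hops as last received
    next_hop: str    # next hop currently in use for the LSP


def handle_incoming_ero(
    state: PathState,
    new_ero: List[str],
    resolve_next_hop: Callable[[List[str]], str],
) -> str:
    """Decide what to do when a Path message arrives with a given ERO.

    resolve_next_hop is an assumed helper that maps an ERO to the
    next hop this node would pick for it.
    """
    if new_ero == state.ero:
        # Nothing changed; normal soft-state refresh handling applies.
        return "no_change"

    # The ERO changed: accept it rather than rejecting the message.
    new_next_hop = resolve_next_hop(new_ero)
    state.ero = list(new_ero)

    if new_next_hop == state.next_hop:
        # Same next hop: leave the LSP in place and send a Path refresh
        # right away so downstream neighbors learn the new ERO.
        return "refresh_now"

    # Different next hop: handle it like a route change under a loose
    # ERO hop or no ERO at all.
    state.next_hop = new_next_hop
    return "reroute"
```

For example, with a resolver that simply takes the first ERO hop, going from ["B", "C"] to ["B", "D"] yields "refresh_now", while going to ["E", "D"] yields "reroute".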

BTW, I noted that your draft allows an ingress node to recover more quickly. This has been a hole in the GR procedure. An ingress node that is computing an ERO can't re-compute that ERO until routing reconverges, and when it does so, there is no guarantee that it will compute the same ERO as before the failure. It could store the ERO in non-volatile storage, but that can be problematic if there are thousands of LSPs originating.

Using the recovery-ERO object solves this. The ingress node can then send out a Path (using the preserved forwarding state to know what the next-hop is) using an empty (or near-empty) recovery ERO. The next-hop can then send back an immediate Resv containing an appropriate recovery-ERO, which the ingress node can use while waiting for routing to reconverge. (Once routing reconverges and recovery completes, of course, it will want to compute its own ERO and possibly do a make-before-break to the new path if it ends up being better than the recovered path.)

- RRO uses Class-Number of form 0bbbbbbb, so if the downstream node doesn't support RRO, the whole message is rejected.

If RRO isn't supported, then the ingress node will know about it, since the LSP won't come up in the first place.


If this happens, then the upstream node will know that it can't use this method of ERO recovery. Functionally, this is really no different from a node not supporting the recovery-ERO class.

Note that other RSVP extensions (like Fast Reroute) also require RRO support as a prerequisite.
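
For reference, the Class-Number behaviour under discussion can be sketched as follows. This is a small Python illustration of the RFC 2205 rule for object classes a node does not recognize, keyed on the high-order bits of the Class-Num. The function name and the returned strings are hypothetical; the value 21 is the RECORD_ROUTE Class-Num from RFC 3209.

```python
RRO_CLASS_NUM = 21  # RECORD_ROUTE Class-Num (RFC 3209)


def unknown_class_action(class_num: int) -> str:
    """Return how a node treats an object whose class it does not know."""
    if not 0 <= class_num <= 255:
        raise ValueError("Class-Num is a single octet")

    if (class_num & 0x80) == 0:
        # 0bbbbbbb: reject the whole message with an
        # "Unknown Object Class" error.
        return "reject_message"

    if (class_num & 0x40) == 0:
        # 10bbbbbb: ignore the object, do not forward it, send no error.
        return "ignore_silently"

    # 11bbbbbb: ignore the object but forward it unmodified.
    return "ignore_and_forward"
```

With this, unknown_class_action(RRO_CLASS_NUM) gives "reject_message", which is why a node without RRO support prevents the LSP from coming up at all.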

-- David