Last call review of draft-ietf-ccamp-gmpls-recovery-e2e-signaling-02.txt
Hi,
I've got a couple of comments in what seems like
quite a long draft.
Adrian
Abstract:
This document describes protocol specific procedures and extensions
for Generalized Multi-Protocol Label Switching (GMPLS) Resource
ReserVation Protocol - Traffic Engineering (RSVP-TE) signaling to
support end-to-end Label Switched Path (LSP) recovery that denotes
protection and restoration.
Not sure what is meant by "denotes" in this context.
Perhaps "is used to provide"?
===
Section 2
OLD
In addition, the reader is assumed
to be familiar with the
terminology used in [RFC3945],
[RFC3471], [RFC3473] and referenced
as well as [TERM] and
[FUNCT].
NEW
In addition, the reader is assumed
to be familiar with the
terminology used in [RFC3945],
[RFC3471], [RFC3473],
as well as in [TERM] and [FUNCT].
===
Section 2
Checklog List from revision
v01.txt:
Please either remove this text or mark it so that
the RFC editor will remove it.
===
Section 3
Although the second paragraph defines "end-to-end protection", I would
like to see this pulled out into its own paragraph for emphasis, and
also a little more clarity added to the definition. For example, it
would appear that your four types of e2e protection all have the
protecting and protected LSPs disjoint in some way. This makes it
appear that this is a property of e2e protection in general, and you
should state this if it is true.
===
Section 3 para 5
In 1:N (N =< 1) protection with
extra-traffic,
Hopefully you mean N >= 1
===
Section 3 para 5
I note that you do not distinguish between 1:N
and 1:N-with-extra-traffic.
If there is a reason for this perhaps you could
add a note to the text.
===
Section 3
OLD
working one. Here, the recovery
resources for the protecting LSP are
pre-reserved and explicit
action is required to activate (i.e.
NEW
working one. Here, the recovery
resources for the protecting LSP are
pre-reserved but explicit
action is required to activate
(i.e.
^^^
===
Section 3
OLD
requirements by allowing multiple protecting LSPs to share
common
link and node resources. The recovery resources are
pre-reserved and
NEW
requirements by allowing multiple protecting LSPs to share
common
link and node resources. The recovery resources are
pre-reserved
but
^^^
===
Section 3
Note that in both
cases, any lower priority LSP that would use the pre-reserved
resources for the protecting LSP(s) MUST be preempted during
the
activation of the protecting LSP.
This sentence comes out of the blue. The whole of the paragraph up to
then has not even mentioned extra traffic. I suggest you insert a
paragraph break and a sentence explaining how the pre-reserved
resources may be used to support an extra-traffic LSP...
Also, delete "would".
===
Section 3
Full LSP re-routing (or restoration)
switches normal traffic to an
alternate LSP that is fully
established only after working LSP
failure occurs.
This text does not read well in English. In fact,
the same is true of pre-planned. What you mean is, I think, as
follows...
Full LSP re-routing (or restoration)
switches normal traffic to an
alternate LSP that is not even
partially established until after
the working LSP failure occurs.
===
Section 3
Note that crankback signaling (see
[CRANK]) and LSP segment recovery
are further detailed in
dedicated companion documents. Also, there
Need to add a citation for LSP
segment recovery.
===
Section 4.2
OLD
The recovery attributes includes all the parameters that determine
NEW
The recovery attributes include all the parameters that determine
===
Section 4.2.1
- S (Secondary) bit: enables
distinction between primary and
secondary
LSPs. A primary LSP is a fully established LSP for
which the resource allocation has been committed at
the data plane
(i.e. full cross-connection
has been performed). Both working and
protecting LSPs can be primary LSPs. A secondary LSP is an LSP
I am uneasy about this definition of "primary". In [TERM] the only
mention of "primary" is in section 2...
Recovery typically involves the
activation of a recovery (or
alternate) LSP when a failure is
encountered in the working (or
primary) LSP.
This implies that "primary" is a synonym for
"working".
Further, RFC3471 has a subtly different meaning
for "secondary" in section 7.
Protection Information also
indicates if the LSP is a primary or
secondary LSP. A
secondary LSP is a backup to a primary LSP. The
resources
of a secondary LSP are not used until the primary LSP
fails. The resources allocated
for a secondary LSP MAY be used by
other LSPs until the primary
LSP fails over to the secondary LSP. At
that point, any
LSP that is using the resources for the secondary LSP
MUST be
preempted.
Can we please not modify the interpretation of the
S-bit.
If you need to flag a new piece of information
(to distinguish between resource allocated and not) then please introduce a new
flag.
Note that the P-bit appears to
be slightly orthogonal because the text seems to describe the
*current* role of the LSP. (The S-bit in RFC3471 describes the role at the time
the LSP is set up, I think).
===
Section 4.3
OLD
When used for the working LSP
signaling, the Association ID of the
NEW
When used for signaling the working
LSP, the Association ID of the
===
Section 4.3
OLD
When used for the protecting LSP
signaling, this field identifies
NEW
When used for signaling the
protecting LSP, this field identifies
===
Section 5
When a failure occurs (say at node
B) and is detected at end-node D,
the receiver at D selects the
normal traffic from the other LSP.
From this perspective, 1+1
unidirectional protection can be seen as
an uncoordinated
protection switching mechanism acting independently
at both
end-points. Also, for the protected LSP under failure
condition, the Path_State_Removed Flag of the ERROR_SPEC object (see
[RFC3473]) SHOULD NOT be set upon PathErr message generation.
So, what you are saying is that in 1+1 protection the network may
*never* know that the error is so bad that the LSP is dead, but MUST
leave that choice to the ingress. While this is the operational
practice in many transport networks, I don't see why you make this as
strong as a SHOULD NOT.
===
Section 5
Note: one should assume that both
paths are SRLG disjoint otherwise,
a failure would impact both
working and protecting LSPs.
What is this supposed to tell the reader? That
he should make the assumption or that he should ensure SRLG diversity?
;-)
Actually, I think you want to say that the quality of 1+1 protection
may vary, allowing link-diverse, node-diverse or SRLG-diverse 1+1
protection.
(Ditto sections 6 and 7.)
===
Section 5.1
Since both LSPs belong to the same
session, the SESSION object MUST
be the same for both LSPs.
An indisputable conclusion drawn from an unproven premise.
Why must both LSPs belong to the same session? A one line explanation
would start the section off nicely.
===
Section 5.1
A new PROTECTION object is included
in the Path message. This object
What is the implication of "new"? I guess
you mean the new type defined in this draft.
===
Section 5.1
A new PROTECTION object is included
in the Path message. This object
carries the desired end-to-end
LSP Protection Type (in this case,
"1+1 Unidirectional"). This
LSP Protection Type value is applicable
to both uni- and
bi-directional LSPs.
This is unclear. In section 14.1 you have
0x08 1+1 Unidirectional Protection
0x10 1+1 Bi-directional Protection
===
Section 5.1
Your description of the use of the P-bit for 1+1 protection isn't
clear. You mean to say that the P-bit indicates which LSP the ingress
would *prefer* to be the protecting LSP if all other things are equal,
but your text (and the description of the P-bit in sections 4.2.1 and
14) doesn't make this clear.
===
Section 6.2
directions. This is done using the
Notify message with a new Error
Code indicating "Working LSP Failure (Switchover Request)".
The
I don't see this in the IANA section, and I wonder if you also mean
Error Value?
===
Section 6.2
directions. This is done using the
Notify message with a new Error
Code indicating "Working LSP
Failure (Switchover Request)". The
Notify Ack message MUST be
sent to confirm the reception of the
Notify message (see
[RFC3473], Section 4.3).
I see no definition of a "Notify Ack message" in
RFC3473 (in any section).
I am worried that you are confusing the Ack
message with a new procedure requiring a handshake of Notify
messages.
===
Section 6.2
1. If an end-node (A or D) detects the failure of the working
LSP (or a degradation of signal quality over the working
LSP) or receives a Notify message including its SESSION
object within the <upstream/downstream session list> (see
[RFC3473]), it MUST begin receiving on the protecting LSP
Note that the sender descriptor or flow
descriptor is also present in the Notify and this will considerably help resolve
ambiguities and race conditions since it identifies the LSP.
===
Section 6.2
1. If an end-node (A or D) detects the failure of the working
LSP (or a degradation of signal quality over the working
LSP) or receives a Notify message including its SESSION
object within the <upstream/downstream session list> (see
[RFC3473]), it MUST begin receiving on the protecting LSP
I don't think the receipt
of a Notify message is sufficient, per se. I think the error code and value need
to indicate a problem with the LSP.
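A minimal Python sketch of the check suggested here, under stated assumptions: the class, the field names and the SWITCHOVER_REQUEST placeholder are hypothetical, since the draft has not assigned a value for the new error code.

```python
from dataclasses import dataclass, field

# Hypothetical placeholder: no numeric value is assigned in the draft for
# "Working LSP Failure (Switchover Request)".
SWITCHOVER_REQUEST = "working-lsp-failure-switchover-request"


@dataclass(frozen=True)
class Notify:
    """Illustrative model of the parts of a Notify message that matter here."""
    error: str  # stands in for the error code/value pair
    sessions: frozenset = field(default_factory=frozenset)


def should_switch_to_protecting(msg: Notify, local_session: str) -> bool:
    """Switch only if the Notify lists our session AND reports the failure."""
    return local_session in msg.sessions and msg.error == SWITCHOVER_REQUEST
```

With this shape, a Notify that lists the session but carries some unrelated error would not trigger a switchover.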
===
Section 6.2
1. If an end-node (A or D) detects the failure of the working
LSP (or a degradation of signal quality over the working
LSP) or receives a Notify message including its SESSION
object within the <upstream/downstream session list> (see
[RFC3473]), it MUST begin receiving on the protecting LSP
and send a Notify message reliably to the other end-node (D
or A, respectively).
"...send a Notify message reliably" will
certainly be misunderstood.
You presumably mean "...send a Notify message
including the Message_ID object".
===
Section 6.2
2. Upon receipt of the switchover message, the end-node
(D or A, respectively) MUST begin receiving from the
protection LSP and send a (Notify) Ack message to the other
end-node (A or D, respectively) using reliable message
delivery (see [RFC2961]).
While this clarifies the use of Ack rather than Notify Ack (not
sure why you need to include "(Notify)") it is now confused about the delivery
of the Ack message. How do we achieve reliable delivery of an Ack
message?!
===
Section 7
Although the resources for the
protecting LSP are pre-allocated,
preemptable traffic may be
carried end-to-end using this LSP (i.e.
the protecting LSP is
capable of carrying extra-traffic) with the
caveat that this
traffic will be preempted if the working LSP fails.
Do you mean that the extra traffic is carried
"using this LSP" or "using some or all of the resources assigned to this
LSP"?
===
Section 7
Also, if extra-traffic is carried
over the protecting LSP, the
corresponding end-nodes may be
notified of the failure in order to
complete the
switchover.
I think this is "end-nodes may need to be
notified"
===
Section 7.2
To co-ordinate the switchover
between end-points, an end-to-end
switchover request is needed
such that the affected LSP(s) are moved
to the protecting
LSP.
In what way may there be more than one affected
LSP moved to a single protecting LSP?
===
Section 7.2
This operation may be done using a
Notify message exchange with a
new Error Code indicating
"(Working) LSP Failure (Switchover
Request)". The Notify Ack
message MUST be sent to confirm the
reception of the Notify
message.
All of the same comments as for section 6.2.
Also:
- Why do you say "may be done"?
- Is this the same error code as in 6.2? (the
text is slightly different)
===
Section 7.3
OLD
provisioned protecting LSP is
resource-disjoint LSP from the N
NEW
provisioned protecting LSP
is resource-disjoint from the N
===
Section 7.3
Can you highlight that the N working LSPs are all between the same
pair of end points?
===
Section 8
OLD
this does not mean that the
corresponding resources can not used by
NEW
this does not mean that the
corresponding resources can not be used by
===
Section 8
To make bandwidth pre-reserved for a
protecting (but not activated)
LSP, available for extra traffic
this bandwidth could be included in
the advertised Unreserved
Bandwidth at priority lower (means
numerically higher) than the
Setup Priority of the protecting LSP.
This feels like it should be the Holding Priority. The Setup Priority
only matters for how the LSP could displace pre-existing LSPs when it
is set up; once it is in place, the Holding Priority governs whether
other LSPs can displace it.
===
Section 8.3
OLD
From [GMPLS-ARCH], the secondary LSP
is setup with resource pre-
NEW
From [RFC3945], the secondary LSP is
setup with resource pre-
===
Section 9
OLD
plane) a specific protecting LSP
instantiated during the (pre-
)provisioning phase. This requires
restoration signaling along the
NEW
plane) a specific protecting
LSP instantiated during the (pre-)
provisioning phase. This
requires restoration signaling along the
===
Section 9
resource sharing), the LSPs must
have the same Session Ids, but the
Session Id includes the
target (egress) IP address. These addresses
2xs/Id/ID/
Suggest a search for "id"
===
Section 9.3
OLD
From [GMPLS-ARCH], the secondary LSP
is setup with resource pre-
NEW
From [RFC3945], the secondary LSP is
setup with resource pre-
===
Section 10
OLD
activated. Additional condition
raises from mis-connection avoidance
NEW
activated. An additional condition
arises from mis-connection avoidance
===
Section 10
OLD
Note that step 1 may cause alarms to
be raised for the pre-empted
LSP. If alarm suppression is
desired the pre-empting node MAY expand
before applying step 1
act as follows.
NEW
Note that step 1 may cause alarms to
be raised for the pre-empted
LSP. If alarm suppression is
desired the pre-empting node MAY insert
the following steps before step
1.
===
Section 10
At the downstream node (with respect
to the pre-empting LSP) the
processing is RECOMMENDED to be as
follows:
1. Receive PathTear (and/or PathErr) message for the pre-empted
LSP(s).
2a.Release the resources associated with the LSP on the interface
to the pre-empting LSP, remove any cross-connection and release
all other resources associated with the pre-empted LSP.
2b.Forward the PathTear (and/or PathErr) message per [RFC 3473].
C. Receive the Path message for the pre-empting LSP and process as
normal, forwarding it to the downstream node.
D. Receive the Resv for the pre-empting LSP and process as normal,
forwarding it to the upstream node.
Cool numbering scheme :-)
Any chance of settling on something more
conventional?
===
Section 11.2
Note: when the end-to-end LSP
Protection Type is set to
"Unprotected", both S and P bit MUST
be set to 0 and the LSP SHOULD
NOT be re-routed at the head-end
node after failure occurrence. The
Association_ID value MUST be
set to the LSP_ID value of the signaled
LSP.
Please explain the difference between an attempt to "re-route" and an
attempt to "re-establish". Presumably it could involve:
- a time difference
- the use of make-before-break for failed
LSPs.
- the use of the ASSOCIATION object.
I would like to make sure that you are not
applying "SHOULD NOT" to LSP re-establishment.
===
Section 12
OLD
allocated to the LSP that was
originally routed over it even after a
NEW
allocated to the LSP that was
originally routed over them even after a
===
Section 12
- then, apply the reverse 1-phase
APS switchover request/response
(or 2-phase
APS) described in Section 6.2 (or Section 7.2,
This is the first
mention of APS
===
Section 13
I think this section is going to give us grief
during IESG review :-(
Why do we need to tie this so closely with NMS etc.? And why describe
it as external?
Can't we simply describe the function
by:
- dropping the first para
- in C, D and E drop
"externally"
- in D and E replace "manual" with
"requested"
===
Section 13 TWICE
OLD
Recovery signaling operation is
initiated externally that switches
NEW
Recovery signaling is initiated
externally that switches
===
Section 13 (A and B)
is set to either 0x04, or 0x08 or
0x10.
I would prefer you to use the meanings rather than the
values.
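As an illustration of using meanings rather than values, here is a minimal Python sketch. The enum and function names are hypothetical, and only the two values quoted from section 14.1 earlier in this review are listed; the meaning of 0x04 is not quoted in this message, so it is deliberately left out.

```python
from enum import IntEnum


class LspProtectionType(IntEnum):
    """Hypothetical names for the two values quoted from section 14.1."""
    ONE_PLUS_ONE_UNIDIRECTIONAL = 0x08
    ONE_PLUS_ONE_BIDIRECTIONAL = 0x10


def describe(value: int) -> str:
    """Return the meaning of a value, or the raw hex if it is not listed."""
    try:
        return LspProtectionType(value).name
    except ValueError:
        return f"0x{value:02x}"
```

Text written against the names stays readable even if the assigned values change before publication.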
===
Section 13 (D and E)
This, unless a fault condition
exists on
? "This is allowed"? "This is possible"? "This is
successful"?
===
Section 14
OLD
use so that the object can be
included in the Notify message to act
a switchover request for
1+1 bi-directional and 1:1 protection.
NEW
use so that the object can be
included in the Notify message to act
as a switchover request
for 1+1 bi-directional and 1:1 protection.
===
Section 14.1
I believe we have had this discussion
before.
We don't introduce reserved fields for future
extensibility. We only do it for padding.
If you are certain that we need to extend in the
future then please use sub-objects or TLVs.
This means that you can:
a. Remove the last four bytes of the Protection
object.
b. Retain the C-Type from RFC3473
===
Section 15
This object MUST be
present in the Path message (for the pre-provisioning of the
secondary protecting LSP) if and only if the LSP Protection Type
value is set to "0x02".
"MUST if and only if" is not really in
RFC2119.
Can we have two statements? One with "MUST" and one
with "MUST NOT".
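A minimal Python sketch of how the two statements would read as separate checks. The constant and function names are hypothetical; the value 0x02 is the one quoted from the draft text above. This only illustrates the split wording. Whether the MUST is justified at all is a separate question, raised in the next comment.

```python
# Value quoted from the draft text; the constant name is hypothetical.
SECONDARY_PROTECTING = 0x02


def check_ppro(protection_type: int, has_ppro: bool) -> list:
    """Return the violations found under the two-statement reading."""
    problems = []
    # Statement 1 (MUST): the object is present when the type is 0x02.
    if protection_type == SECONDARY_PROTECTING and not has_ppro:
        problems.append("PRIMARY_PATH_ROUTE object MUST be present")
    # Statement 2 (MUST NOT): the object is absent for any other type.
    if protection_type != SECONDARY_PROTECTING and has_ppro:
        problems.append("PRIMARY_PATH_ROUTE object MUST NOT be present")
    return problems
```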
===
Section 15
In the case where my protecting LSP protects only
one working LSP and where the full path of the protecting LSP is known by the
ingress (strict and explicit) and there is no resource sharing between the
protected and protecting LSP, I can't see why I must include a
PPRO.
In other words, PPRO is an enabler of function
(as stated in section 15.4 "The PPRO enables of sharing recovery resources
between a given secondary protecting LSP and one or more secondary
protecting LSPs if their corresponding primary working LSPs have
mutually (link/node/SRLG)disjoint paths."), but that does not make its
presence mandatory.
===
Section 15.1
The contents of a PRIMARY_PATH_ROUTE
object are a series of
variable-length data items called
subobjects. The subobjects are
identical to those that can
constitute an EXPLICIT/RECORD ROUTE
object as defined in
[RFC3209], [RFC3473] and [RFC3477].
This seems in contradiction with section
15.3
===
Section 15.4
OLD
The PPRO enables of sharing recovery
resources between a given
NEW
The PPRO enables sharing of
recovery resources between a given
===
Section 16
The ASSOCIATION object is used to
associate LSPs with each other. In
the context of end-to-end
LSP recovery, the association MUST only
identify LSPs that
support the same Tunnel ID.
Hmmm. Presumably the same source and destination are relatively
important too.
===
Section 16
The ASSOCIATION object is used to
associate LSPs with each other.
You already said
this.
===
Section 16.1
Association ID: 16
bits
A
value that when combined with Association Type and
Association Source uniquely
identifies an association.
It would be helpful to state who assigns
this value.
===
Section 16.1
Association
Source: 4 or 16 bytes
The IP address of the node that
originated the association.
"The IP address"?
Question. Are two associations with the same
Association ID equivalent if the Association Source addresses are different but
identify the same node?
Answer (it transpires) is "no".
You need to make this much clearer
here.
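A minimal Python sketch of why the answer is "no", assuming an association is identified purely by the (Association Type, Association ID, Association Source) triple as the quoted text says. The class name, the type value 1, the ID 42 and both addresses are hypothetical.

```python
import ipaddress
from typing import NamedTuple, Union


class AssociationKey(NamedTuple):
    """Hypothetical model of the triple that identifies an association."""
    association_type: int
    association_id: int
    source: Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


# Two made-up addresses assumed to belong to the same node.
loopback = ipaddress.ip_address("192.0.2.1")
interface = ipaddress.ip_address("192.0.2.129")

a = AssociationKey(association_type=1, association_id=42, source=loopback)
b = AssociationKey(association_type=1, association_id=42, source=interface)
c = AssociationKey(association_type=1, association_id=42, source=loopback)

# Comparison is on the literal address, not on the node that owns it.
same_node_different_address = (a == b)
same_address = (a == c)
```

Under this reading a and b are different associations even though both sources identify one node, so the text would need to say which address a node is expected to use consistently.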
===
Section 17
Isn't Notify modified as well?
And I thought Resv was, but I may have been
sleeping.
===
Section 18
This is a bit poor.
If you don't modify the "external commands"
section, you'll certainly have to discuss security for them. After all, a forced
failover can be pretty disruptive.
But I think you need to discuss misconnection
here. In particular when there is mesh protection going on.
===
Section 19
The IANA section needs some gardening to make it
really easy for IANA to implement.
- Break it up into clearer
subsections.
- Make sure you have included all of the
information needed in the registry
- Point back at the defining sections of the
draft
- Only have suggested values in one place in the
document
- Be consistent in using TBD or TBA in the
document
===
Section 19
Should the IANA section also cover the bits in
the ADMIN STATUS object?
===
Section 21
Missing references [CRANK], [RFC2205]. Suspect
you need to check them all.
Will need to add a reference for LSP segment
recovery.
===
Section 21.1
This seems a very long list of normative
references. I hope you can split this so that most of the references are
informational.
===
Section 22
You might change this to "Editors'
Addresses"