New: draft-harrison-mpls-oam-00.txt
Hi,
Please post the attached Internet draft on the IETF Internet draft directory.
Thanks,
Shahram Davari
Systems Engineer
Product Research Group
PMC-Sierra, Inc. (Ottawa)
Phone: (613) 271-4018
Fax: (613) 271-7007
Internet Draft
Document: draft-harrison-mpls-oam-00.txt
Expires: August 2001

Neil Harrison
Peter Willis
British Telecom

Shahram Davari
PMC-Sierra

Ben Mack-Crane
Tellabs

Hiroshi Ohta
NTT

February 2001
OAM Functionality for MPLS Networks
Status of this Memo
This document is an Internet-Draft and is in full conformance
with all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Copyright Notice
Copyright(C) The Internet Society (2001). All Rights Reserved.
Abstract
This Internet draft provides requirements and mechanisms for OAM
(Operation and Maintenance) for the user-plane in MPLS networks. A
connectivity verification "CV" OAM packet is defined, which is
transmitted periodically from LSP source to LSP sink. The CV flow
can be used to detect defects related to misrouting of LSPs as well
as link and nodal failures and, if required, to trigger protection
switching to the protection path.
Harrison et.al Expires August 2001 Page 1
OAM Functionality for MPLS Networks February 2001
A forward defect indicator "FDI" and a backward defect indicator
"BDI" are defined, which carry the defect type and location to the
near end and far end respectively. At every LSP terminating node,
the FDI is mapped from the server layer to the client layer. By
doing so, FDI can suppress the alarm storm and let the appropriate
layer take control of protection switching. BDI is used by the LSP
source to start or stop QoS aggregation, depending on whether the
LSP is in the available or unavailable state. The criteria for
entering and exiting the available and unavailable states are also
defined in this document.
Table of Contents
1. Introduction
2. Definitions
3. Symbols and Abbreviations
4. Requirements for MPLS OAM
5. Principles of OAM Function
5.1 Client/Server Recursion-Layering
5.2 OAM Functionality and Layer Independence
5.3 Defects
5.4 Availability
5.5 Decoupling of User behavior from Connectivity Assessment
5.6 Forward and Backward Defect Indicators
5.7 Connectivity Verification
5.8 Customers Should not be Used as Defect Detectors
5.9 The Reliability of OAM Functionality Under Fault Conditions
6. Mechanisms of MPLS OAM
6.1 Special MPLS Label Values
6.2 Handling of Errored OAM Packets
6.3 Label Stack Overhead Encoding Rules for OAM Packets
6.3.1 For CV OAM Packets
6.3.2 For P OAM Packets
6.3.3 For FDI and BDI OAM Packets
6.3.4 MPLS OAM Function Types for the OAM Alert Label
6.4 MPLS OAM Packets
6.4.1 Connectivity Verification (CV) Packets
6.4.2 Performance “P” Packets
6.4.3 Forward defect Indicator “FDI” packets
6.4.4 Backward Defect Indicator “BDI”
6.5 Defect Types and their Entry/Exit Criteria
6.5.1 Defect Type Codepoints
6.5.2 dLOCV Entry Criteria
6.5.3 DTTSI Entry Criteria
6.5.4 dLoop Entry Criteria
6.5.5 dLOCV, dTTSI and dLoop exit criteria
6.6 Available and unavailable state processing
6.6.1 Short Break definition
6.6.2 Available/Unavailable State Definition
6.6.3 Near-end and Far-end Measurements of Availability
6.6.4 Near-End State Processing Flow-chart
6.6.5 Far-End State Processing Flow-chart
6.6.6 A pictorial view of near-end and far-end state processing
7. Security Considerations
8. References
9. Author's Addresses
Conventions used in this document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
this document are to be interpreted as described in RFC-2119 [1].
1. Introduction
This Internet draft provides requirements and mechanisms for OAM
(Operation and Maintenance) for the user-plane in MPLS networks. It
is recognized that OAM functionality is important in public networks
for ease of network operation, for verifying network performance and
for reducing operational costs. OAM functionality is especially
important for networks that are required to deliver (and hence be
measurable against) QoS (Quality of Service) and availability
performance parameters/objectives.

A connectivity verification "CV" OAM packet is defined in this
document, which is transmitted periodically from LSP source to LSP
sink. The CV flow can be used to detect defects related to
misrouting of LSPs as well as link and nodal failures and, if
required, to trigger protection switching to the protection path. A
forward defect indicator "FDI" and a backward defect indicator "BDI"
are defined, which carry the defect type and location to the near
end and far end respectively. At every LSP terminating node, the FDI
is mapped from the server layer to the client layer. By doing so,
FDI can suppress the alarm storm and let the appropriate layer take
control of protection switching. BDI is used by the LSP source to
start or stop QoS aggregation, depending on whether the LSP is in
the available or unavailable state. The criteria for entering and
exiting the available and unavailable states are also defined in
this document.
The OAM functionality defined herein is limited to point-to-point
LSP tunnels. OAM functionality for multipoint-to-point and
point-to-multipoint LSP tunnels is for further study (FFS).
2. Definitions
This document introduces some new terminology, which is required to
discuss the functional network components associated with OAM.
Functional Architecture   Meaning
Term
-----------------------   ------------------------------------
Client/server             A term referring to the transparent
(relationship between     transport of a client (i.e. higher)
layer networks)           layer link connection by a server
                          (i.e. lower) layer network trail.

Link connection           A partition of a layer N trail that
                          exists between two logically
                          adjacent switching points within the
                          layer N network.

LSP Tunnel                An LSP Tunnel is an LSP with
                          well-defined source (ingress point)
                          and sink (egress point).

Subnetwork                A subnetwork is a contiguous
                          topological region of a network
                          delimited by its set of peripheral
                          access points, and is characterized
                          by the possible routing across the
                          subnetwork between those access
                          points. A network is the largest
                          subnetwork and a node is the
                          smallest subnetwork (at least in
                          practical physical terms, though
                          there are smaller sub-networks
                          within nodes).

Trail                     A generic transport entity at layer
                          N which is composed of a client
                          payload (which can be a packet from
                          a client at higher layer N-1) with
                          specific overhead added at layer N
                          to ensure the forwarding integrity
                          of the server transport entity at
                          layer N.

Trail termination point   A source or sink point of a trail at
                          layer N, at which the trail overhead
                          is added or removed respectively. A
                          trail termination point must have a
                          unique means of identification
                          within the layer network.
3. Symbols and Abbreviations
This list is not exhaustive of all the abbreviations used in this
draft. In particular, those in common usage within the MPLS
community (like 'MPLS' itself) have been excluded.
Abbreviation      Meaning
---------------   ------------------------------------
AIS               Alarm Indication Signal
BDI               Backward Defect Indication
CV Packet         Connectivity Verification Packet
FDI               Forward Defect Indication
FFS               For Further Study
OAM               Operations and Maintenance
P Packets         Performance Packets
QoS               Quality of Service
SLA               Service Level Agreement
TTSI              Trail Termination Source Identifier
4. Requirements for MPLS OAM
MPLS layer OAM functionality is not a substitute for physical or
server layer OAM (e.g., SDH/SONET) or client layer OAM (e.g., IP).
MPLS LSPs create layer networks in their own right, and will have
defects that are only relevant to the MPLS LSP layer networks.
OAM functionality is useful because:
1) It allows the Operator to verify whether Quality of Service
guarantees given in SLAs (Service Level Agreements) are in fact
being met by the connection.
2) It allows the Operator to reduce the network's operating costs
by allowing more efficient detection and handling of defects.
Long-term statistics show that the costs of operating a public
network are higher than the initial installation costs.
3) It gives support for improved accounting/billing procedures.
4) It helps provide security for customer traffic by the detection
of traffic mis-connections (which may otherwise be
undetectable).
The following functions are required:
1) Connectivity Verification of LSPs to confirm that defects do
not exist on the target LSPs.
2) Fast and efficient defect detection, notification and
localization.
3) Measurement of availability performance.
The need for additional functions is for further study. In
particular, the need for in-service measurement of LSP QoS
performance (measurement of packet losses, spurious packets, errored
packets, delay and delay variation) is for further study. Note that
an LSP needs to be in the available state for QoS assessment to be
valid.
Defects include the following cases:
1) Simple loss of LSP connectivity (due to a server layer failure
or a failure within the MPLS layer network);
2) Swapped LSP trails;
3) Unintended LSP mismerging (of 2 or more LSP trails);
4) Unintended replication of LSP packets (of the same LSP trail,
for example due to routing loops).
5. Principles of OAM Function
The following principles can, for the most part, be applied to any
layer network, i.e. not just MPLS. This document defines specific
embodiments of these principles, as functional OAM entities, for
MPLS layer networks. Although it is recommended that all the OAM
functional entities are deployed network-wide, operators are free to
choose whether to apply all or only some of these OAM functional
entities (e.g. CV flows but not P flows), and whether deployment is
network-wide or limited in scope to LSPs of certain types, e.g. only
important LSPs such as those supporting VPNs. Where OAM functional
entity deployment or scope is limited, operators should be aware
that there could be deficiencies in their ability to detect/handle
certain defect cases.
5.1 Client/Server Recursion-Layering
A very important functional architecture feature of layer networks
is client/server recursion (also known as layering). That is, a
client layer link connection (ie a partition of a longer client
layer trail between two logically adjacent client layer nodes) is
created by a server layer trail. This is the basis of client layer
topology construction. This recursion principle extends between
various client/server layer relationships and ultimately 'to the
duct'. Note also that a single server layer trail entity can support
multiple client layer link connections.
The key points to note here are:
(1) The client and server layer trail termination points will
generally not be congruent. And since the trail termination
points are associated with the addressable access points of a
layer network, it follows that the addressing of the two layers
will also generally not be congruent.
(2) The 'duct' (or more precisely the environment of physical
occupancy and connectivity) is the lowest layer network. The
degree of connectivity in this layer effectively defines the
degree of independent connectivity in all client layers. This
could be put another way, by saying that the availability
performance of any client layer network design is determined by
(and inherited from) the physical infrastructure. This means
that if one cannot state which link connections have a common
lower server layer trail, then one cannot say anything with
certainty about the resilience design of a client layer
network.
5.2 OAM Functionality and Layer Independence
The OAM functionality of a layer network must not be dependent on
any specific server or client layer technology. This is critical to
ensure that layer networks can evolve (or new/old layer networks be
added/removed) without impacting other layer networks.
The control-plane of a given layer network must also have its own
OAM.
[Note - Control-plane OAM is outside the scope of this draft.]
5.3 Defects
All the major defect conditions must be identified with in-service
measurable entry and exit criteria, and all consequent actions must
be specified. The entry and exit criteria of the various defects
should be temporally harmonized as far as possible to simplify trail
defect-state processing. Attention should be paid to relating the
defect entry/exit criteria to 'short-breaks', which many operators
accept as periods of gross signal disturbance lasting 3-9 s from
which the network may self-recover. An event lasting 10 s or more
reaches the normally accepted threshold for entering the unavailable
state (see also Section 5.4).
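
The duration thresholds mentioned above can be illustrated with a
small Python sketch. It is only an illustration under stated
assumptions: the function name and the "transient" label are
hypothetical, durations between 9 s and 10 s are treated as part of
the short-break range, and the normative entry/exit criteria are the
ones given in Sections 6.5 and 6.6, not this code.

```python
def classify_disturbance(duration_s: float) -> str:
    """Classify a period of gross signal disturbance by its duration in seconds."""
    if duration_s < 0:
        raise ValueError("duration must be non-negative")
    if duration_s >= 10:
        return "unavailable"  # accepted threshold for entering the unavailable state
    if duration_s >= 3:
        return "short-break"  # the network may still self-recover
    return "transient"        # hypothetical label for anything shorter than 3 s


print([classify_disturbance(d) for d in (0.5, 3, 9.9, 10)])
```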
5.4 Availability
The most important performance metric of a trail (or a subnetwork
partition thereof) is availability. This means that the entry and
exit criteria for the available state must be defined. It is also
important to understand how unavailable/available state transitions
relate to the stopping/starting of the aggregation of available
state QoS metrics. For pragmatic reasons, aggregation may be stopped
at an earlier point to preserve the integrity of the available state
metrics, e.g. after 3 s, which marks the onset of (at least) a
short-break and which, from operational experience, is a good
practical rule-of-thumb for the point beyond which a network is
unlikely to self-recover.
5.5 Decoupling of User behavior from Connectivity Assessment
User traffic behavior must not be a factor in connectivity status
assessment. In practical terms, this means decoupling user traffic
behavior from all defects and (the dependent) available state
entry/exit criteria.
5.6 Forward and Backward Defect Indicators
The node in the layer network that first detects a defect (sourced
from within that layer) should apply a well-known 'Forward Defect
Indication' (FDI) signal in the downstream direction. In the
majority of current transport network technologies such a signal has
been termed AIS (Alarm Indication Signal). At the trail termination
point where the appropriate FDI signal is generated:
(1) There should be a complementary Backward Defect Indication
(BDI) signal (which is removed at the upstream trail
termination point) and
(2) There must be a mapping of the FDI signal from the server layer
to the appropriate FDI signal of the client layer(s) as part of
the server->client adaptation process.
The primary purpose of the FDI signal is to suppress client layer
alarms (which would otherwise create an 'alarm storm' in places
which could be geographically and organizationally far removed from
the originating defect source location).
Three secondary purposes of FDI (and in some cases BDI) are:
(1) To allow correct processing of available state performance
metrics.
(2) To inform applications that the connection is no longer
functioning correctly so that they can take appropriate action,
e.g. invoke a 're-connect' action or, in the case of voice,
mute the speech path.
(3) To inform client layer trails (e.g. nested LSPs in the case of
MPLS) that a defect has occurred in a lower server layer trail,
and hence to provide some indication that protection-switching
in the affected client layer trails could be postponed to give
the server layer trail an opportunity to effect protection
switching.
FDI/BDI signals should also provide information on the defect
location and type. Such information is very useful to the lead
operator in a co-operating domain scenario, and can also
differentiate between failures that are internal and failures that
are external to public and private domains.
Note that, if being used, the BDI signal must be generated (in the
backward direction) in response to detecting a defect at a trail
sink termination point (in the forward direction) and not from some
intermediate point, such as where the defect might be actually
located. The reasons for this are that:
(1) In the case of bi-directional trails and unidirectional
defects, each trail direction might not be congruently routed.
(2) In the case of unidirectional trails the BDI signal may be
provided out-of-band, e.g. via a control-plane or
management-plane mechanism. [Note: The exact means for
providing the BDI functionality in this case is FFS.]
The above requirements mean that the FDI/BDI architecture is valid
for all routing cases.
5.7 Connectivity Verification
An essential characteristic of the trails in a layer network is that
their trail termination points must have a unique identifier (at
least within that layer network). However, on link connections
between nodes within the layer network, relative identifiers are
commonly used for traffic forwarding. These relative identifiers
only have to be unique per interface, e.g. the VPI/VCI of ATM, the
DLCI of FR, the ‘label’ of MPLS.
When relative identifiers are used for traffic forwarding there is a
possibility of trail misconnectivity due to defects. These cover a
variety of connectivity failure modes, including:
1) Simple loss of continuity (due to a server layer failure or a
failure within the layer network considered);
2) Swapped connections;
3) Unintended mismerging (of 2 or more trails);
4) Unintended replication (of the same trail due, for example, to
routing loops).
Although some of these defects may be rare in practice, unless
detected/corrected their consequences can be very severe for an
operator; ranging from simple availability/QoS SLA violations
through to more serious security, censorship and mis-billing
implications.
It is therefore required that a unique trail source identifier be
periodically transmitted from the trail source to the trail sink to
detect these types of defect.
5.8 Customers Should not be Used as Defect Detectors
The OAM tools provided should ensure (as far as reasonably
practicable) that customers do not have to act as failure
detectors for the operator.
5.9 The Reliability of OAM Functionality Under Fault Conditions
Under fault conditions a layer network cannot, by definition, be
expected to behave in a predictable manner. Therefore care should be
exercised when specifying and using OAM functions that require a
layer network to function in a reliable and predictable manner for
fault diagnosis.
6. Mechanisms of MPLS OAM
6.1 Special MPLS Label Values
The label structure defined in [1] indicates a single label field of
20 bits. Label field values 0-3 have already been reserved for
special functions. A special label, the 'OAM Alert Label', is
defined as follows:
Table 1: OAM Alert Label

Label value
(Decimal)      Meaning
------------   ------------------------------------------
4              OAM Alert Label. This indicates that the
               first octet following the OAM Alert Label
               in the OAM payload (i.e. octet 5) is an OAM
               Function Type field whose value defines
               the type of defect handling OAM function
               (i.e. CV, P, FDI or BDI), which follows in
               the payload area.

[Note: this value is yet to be officially assigned by IANA]
All OAM packets must have a minimum payload length of 40 octets to
facilitate ease of processing. This is achieved by padding with all
0s when necessary. All padding bits are reserved for future operator
defined usage.
6.2 Handling of Errored OAM Packets
Each OAM packet uses a BIP16 (in the last two octets of the OAM
payload area) to detect errors. The BIP16 is computed over all the
fields of the OAM payload, including the initial octet (which
specifies the Function Type) and the BIP16 bit positions themselves
(which are all pre-set to zero for the initial calculation).
BIP16 processing must be performed on every OAM packet before its
payload can be reliably passed on for further processing. Any OAM
packet that shows a BIP16 violation upon reception processing should
be discarded.
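
As an illustration of the processing described in this section, the
following Python sketch computes and checks a BIP16 over a 40-octet
OAM payload. It rests on assumptions that this draft does not spell
out: the BIP16 is taken to be even bit-interleaved parity (the XOR of
all 16-bit words), words are aligned from the start of the payload,
and the result is stored in network byte order. The function names
are hypothetical.

```python
OAM_PAYLOAD_MIN = 40  # minimum OAM payload length in octets (Section 6.1)


def bip16(data: bytes) -> int:
    """XOR all 16-bit words together (assumed even bit-interleaved parity)."""
    if len(data) % 2:
        raise ValueError("payload must be a whole number of 16-bit words")
    parity = 0
    for i in range(0, len(data), 2):
        parity ^= int.from_bytes(data[i:i + 2], "big")
    return parity


def seal(body: bytes) -> bytes:
    """Zero-pad the body to 38 octets and append the BIP16 (40 octets total)."""
    padded = body.ljust(OAM_PAYLOAD_MIN - 2, b"\x00")
    # The BIP16 positions are pre-set to zero for the calculation.
    return padded + bip16(padded + b"\x00\x00").to_bytes(2, "big")


def is_valid(payload: bytes) -> bool:
    """Recompute the BIP16 with its field zeroed and compare with the received value."""
    if len(payload) < OAM_PAYLOAD_MIN or len(payload) % 2:
        return False
    received = int.from_bytes(payload[-2:], "big")
    return bip16(payload[:-2] + b"\x00\x00") == received


payload = seal(b"\x01")                          # Function Type 01 only
corrupted = bytes([payload[0] ^ 0x80]) + payload[1:]
print(is_valid(payload), is_valid(corrupted))    # a receiver would discard the second
```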
In the case of the CV packet flow, persistent BIP16 violations will
cause a Loss of Connectivity Verification; this defect is defined
later, but for now we can note that it would occur after nominally
3s. This behavior is consistent with the nature of the defect.
However, it is recommended that at a local equipment level some
notification is given to the Network Management System to indicate
that BIP16 discards are occurring.
In the case of the other OAM packet types, i.e. the FDI, BDI and P
packets (these are defined later), it is again recommended that at a
local equipment level some indication is given to the Network
Management System that BIP16 discards are occurring. The threshold
to be used for recording/reporting such BIP16 discard activity for
these OAM packets should be programmable, and is outside the scope
of this document.
6.3 Label Stack Overhead Encoding Rules for OAM Packets
6.3.1 For CV OAM Packets
CV OAM packets are differentiated from normal user-plane traffic by
an increase of one in the label stack depth at a given LSP level at
which they are inserted. Therefore, they maintain this label stack
difference of one (from normal user-plane traffic) as they traverse
any lower layer server LSPs.
The OAM Alert Labeled header is added before (i.e. below) the normal
user-plane forwarding labeled header at the LSP trail source point.
The S bit is set only in the OAM Alert Label.
The CV OAM packet can be used on both E-LSPs and L-LSPs. However,
the coding of the EXP field is different in the two cases.
In the case of L-LSPs, the EXP field should be set to all 0s in both
the OAM Alert Labeled header and the preceding normal user-plane
forwarding header. This is to ensure the CV OAM packets have a Per
Hop Behavior (PHB) that gives the lowest drop probability [2].
In the case of E-LSPs, the EXP field should be set to all 0s in the
OAM Alert Labeled header and to whatever is the 'minimum
loss-probability PHB' in the preceding normal user-plane forwarding
header for that E-LSP. This is again to ensure the CV OAM packets
have a PHB that gives the lowest drop probability [2].
The TTL field should be set to 1 in the OAM Alert Labeled header.
The reasons for this are:
· CV OAM packets should never travel beyond the LSP trail
termination sink point at the LSP level they were originally
generated (noting that they are not examined by intermediate
label-swapping LSRs, and are only observed at LSP sink points),
and
· The TTL of the immediately prior normal user-plane forwarding
header is used to mitigate damage from looping packets.
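
The encoding rules of this section can be sketched in Python as
follows. The sketch assumes the usual 32-bit MPLS label stack entry
layout (20-bit label, 3-bit EXP, 1-bit S, 8-bit TTL) and uses label
value 4 from Table 1, which is not yet assigned by IANA. The function
names and the example forwarding label are hypothetical.

```python
import struct

OAM_ALERT_LABEL = 4  # value proposed in Table 1; not yet assigned by IANA


def label_entry(label: int, exp: int, s: int, ttl: int) -> bytes:
    """Pack one 32-bit label stack entry: label(20) | EXP(3) | S(1) | TTL(8)."""
    if not (0 <= label < 1 << 20 and 0 <= exp < 8 and s in (0, 1) and 0 <= ttl < 256):
        raise ValueError("field out of range")
    return struct.pack("!I", (label << 12) | (exp << 9) | (s << 8) | ttl)


def cv_label_stack(fwd_label: int, fwd_exp: int, fwd_ttl: int) -> bytes:
    """Forwarding header first (top of stack), OAM Alert header at the bottom."""
    forwarding = label_entry(fwd_label, fwd_exp, 0, fwd_ttl)
    oam_alert = label_entry(OAM_ALERT_LABEL, 0, 1, 1)  # EXP=0, S=1, TTL=1
    return forwarding + oam_alert


# L-LSP case: EXP is all 0s in both headers.
print(cv_label_stack(fwd_label=1000, fwd_exp=0, fwd_ttl=255).hex())
```

For an E-LSP, the caller would pass as fwd_exp whichever EXP value
that E-LSP maps to its minimum loss-probability PHB.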
6.3.2 For P OAM Packets
The label stack overhead encoding rules of performance P OAM packets
are FFS.
6.3.3 For FDI and BDI OAM Packets
FDI and BDI OAM packets are invoked, on a nominal 1 per second
basis, when defects are detected. The FDI packet traces forward and
upward through any nested LSP stack. The BDI packet is sent
backwards towards its peer-level LSP trail termination sink point in
the reverse direction (assuming a bi-directional in-band LSP exists)
for each LSP at and above the level of the defect.
The OAM Alert labeled header is inserted before (ie below) a normal
user-plane forwarding labeled header, and a label stack of 2 is only
ever required for either the FDI or BDI packet at their origin.
Note that in the case of FDI, it is assumed that the server->client
LSP adaptation mappings that were in existence prior to the failure
are recursively used to ensure correct FDI forwarding. It is
therefore important that the LSP sink point remembers any
server->client LSP label mappings that were in existence prior to
the failure. Although the exact means for achieving this are outside
the scope of this document, some examples of how these
server->client layer label mappings could be configured are as
follows:
· Manually, e.g. via the NMS;
· Automatically on LSP set-up via extensions to LDP/RSVP
signaling;
· By an automatic 'learning process', i.e. if, during the
establishment of the client LSPs, the signaling is tunneled
through the server layer, then the server trail terminating node
could keep the information about the established LSPs in memory
as they occur.
When server->client layer LSP relationships are changed (e.g. an
existing client layer LSP is removed or a new client LSP is added),
it is important that the server->client label mappings are also
updated to reflect the new relationships.
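
The recursive use of stored server->client mappings can be sketched
as below. This is a simplified, network-wide view for illustration
only: in practice each mapping is held at the relevant LSP sink
point, and the dictionary layout, function name and LSP names here
are all hypothetical.

```python
from typing import Dict, List


def fdi_targets(failed_lsp: str, clients_of: Dict[str, List[str]]) -> List[str]:
    """Return every client LSP that should receive an FDI, lowest level first."""
    targets: List[str] = []
    pending = list(clients_of.get(failed_lsp, []))
    seen = {failed_lsp}
    while pending:
        lsp = pending.pop(0)
        if lsp in seen:
            continue  # guard against a misconfigured, cyclic mapping table
        seen.add(lsp)
        targets.append(lsp)
        pending.extend(clients_of.get(lsp, []))  # recurse into nested clients
    return targets


# Mappings remembered from before the failure (hypothetical names).
clients_of = {
    "server-1": ["client-A", "client-B"],
    "client-A": ["nested-A1"],
}
print(fdi_targets("server-1", clients_of))
```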
The S bit is set only in the OAM Alert Labeled header. The FDI OAM
packet is recursively mapped upwards, through a client/server
adaptation process at LSP trail termination sink points, into any
further affected higher client layer LSPs. When this arrives at the
top LSP it needs to be mapped into an equivalent FDI for whatever
client layer is then being carried. In the case of IP (or indeed
any other client layer), this is outside the scope of this document.
Note that higher level LSPs will also see failures (as a result of
corruption of their own CV flow) but they will also see an incoming
FDI OAM packet flow from the lowest level LSP where the failure
originates. This dynamic behavior allows for correct identification
of the true source of the defect and is explained in more detail
later. But for now it is sufficient to note that the incoming FDI
is needed to:
· Suppress unnecessary alarms in the affected higher layer LSPs.
· Give an indication to affected higher-level LSPs that they may
need to hold off protection switching as the defect is at a
lower level LSP.
· Allow the appropriate BDI coding at the affected higher
layer.
It is assumed that when a BDI OAM packet is returned in-band it
follows a bi-directional LSP and, like the CV and P OAM packets,
that it should never travel beyond the LSP trail termination sink
point (of the return LSP).
The coding of the EXP field associated with the OAM Alert Labeled
header and the preceding normal user-plane forwarding labeled header
at the LSP level at which the FDI or BDI is inserted is the same as
that previously described for the CV OAM packet.
The TTL field should be set to 1 in the OAM Alert Labeled packet
header. The reasons for this are:
· The FDI OAM packet is recursively regenerated at each LSP trail
termination sink point into all affected client layer LSPs (if
any); so the TTL field is recursively regenerated with a value
of 1;
· The BDI OAM packet should never travel beyond the LSP trail
termination sink point of the return LSP at the LSP level that
it was originally generated;
· The TTL of the immediately prior normal user-plane forwarding
header is used to mitigate damage from looping packets.
6.3.4 MPLS OAM Function Types for the OAM Alert Label
The first octet of the OAM packet payload specifies the OAM Function
Type as follows:
Table 2: OAM Function Types

OAM Function Type   First octet of OAM packet payload
codepoint (Hex)     Function Type / Purpose
-----------------   ----------------------------------------
00                  Reserved

01                  CV (Connectivity Verification). Used
                    to detect/diagnose all types of LSP
                    connectivity defect (sourced either
                    from below or within the MPLS
                    network). This will be the main
                    in-service OAM defect detection tool.

02                  P (Performance). Used to measure
                    user-plane loss of packets and their
                    aggregate octets.

03                  FDI (Forward Defect Indicator). This
                    is generated by an MPLS node detecting
                    any defect (defined later) and
                    inserted into affected client layers.
                    Its primary purpose is to suppress
                    alarms being raised within affected
                    higher level client LSPs and (in turn)
                    their client layers. It includes
                    fields to indicate the nature of the
                    defect and its location.

04                  BDI (Backward Defect Indicator). This
                    is generated at a return LSP trail
                    termination source point in response
                    to a defect being detected at an LSP
                    trail termination sink point in the
                    other direction. The defect type and
                    location codepoints of the
                    complementary FDI are mapped into
                    similar fields of the BDI. The BDI
                    may be realized either in the
                    user-plane if bi-directional LSPs are
                    being used (the case considered in
                    this document) or out-of-band (e.g.
                    via a management-plane function) in
                    the case of uni-directional LSPs. The
                    latter scenario is outside the scope
                    of this document.
All other OAM Function Type codepoints are reserved for possible
future standardization.
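
A receiver could map the first payload octet onto the codepoints of
Table 2 along the following lines. This is a minimal Python sketch;
the class and function names are hypothetical, and it assumes the
payload has already passed the BIP16 check of Section 6.2.

```python
from enum import IntEnum


class OamFunctionType(IntEnum):
    """Codepoints from Table 2; 00 and all other values are reserved."""
    CV = 0x01
    P = 0x02
    FDI = 0x03
    BDI = 0x04


def function_type(payload: bytes) -> OamFunctionType:
    """Return the OAM Function Type carried in the first payload octet."""
    if not payload:
        raise ValueError("empty OAM payload")
    try:
        return OamFunctionType(payload[0])
    except ValueError:
        raise ValueError(f"reserved OAM Function Type 0x{payload[0]:02x}") from None


print(function_type(b"\x03" + bytes(39)).name)
```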
6.4 MPLS OAM Packets
6.4.1 Connectivity Verification (CV) Packets
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Func Type (1) |                  (must be 0)                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                                                               +
|                       Ingress Router ID                       |
+                                                               +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            LSP ID                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
\\                    Reserved (0) 14 bytes                    \\
|                                                               |
+                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               |            BIP 16             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 1: CV Payload Structure
The intention is that the CV OAM packet is transmitted from the LSP
trail termination source point at a nominal rate of 1 CV per second.
It is important that the rate of CV OAM packet generation is
constant so that simple and deterministic defect processing can be
carried out at the LSP trail termination sink point.
CV OAM packets within a given LSP are not synchronous to any other
CV OAM packets in any other LSP (this includes all nested LSPs, and
CV OAM packets from the remote end of an LSP at level N but in the
other direction when bi-directional LSPs at level N are being used).
The structure of the LSP Trail Termination Source Identifier (TTSI)
is defined by using a 16 octet Router ID IPv6 address plus a 4 octet
LSP Tunnel ID [3]. Note that the first 2 octets of the LSP Tunnel
ID are currently padded with all 0s to allow for any future increase
in the Tunnel ID field.
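As an illustration of the layout in Figure 1, the following minimal Python sketch packs a CV payload. It rests on assumptions that are not stated in this section: the total payload length of 40 octets is simply the sum of the field sizes shown in the figure, the Tunnel ID occupies the last 2 octets of the LSP ID, and BIP 16 is taken to be the XOR of successive 16-bit words of the preceding octets. The names `bip16` and `build_cv_payload` are hypothetical.

```python
import struct

CV_FUNCTION_TYPE = 0x01  # "Func Type (1)" in Figure 1


def bip16(data: bytes) -> int:
    """XOR of successive 16-bit words (assumed BIP-16 definition)."""
    parity = 0
    for (word,) in struct.iter_unpack("!H", data):
        parity ^= word
    return parity


def build_cv_payload(router_id: bytes, tunnel_id: int) -> bytes:
    """Pack a CV payload: 4 + 16 + 4 + 14 + 2 = 40 octets (assumed total)."""
    if len(router_id) != 16:
        raise ValueError("Router ID must be 16 octets")
    if not 0 <= tunnel_id <= 0xFFFF:
        raise ValueError("Tunnel ID must fit in 2 octets")
    body = (
        struct.pack("!B3x", CV_FUNCTION_TYPE)   # function type, 3 zero octets
        + router_id                             # Ingress Router ID
        + struct.pack("!HH", 0, tunnel_id)      # LSP ID: 2 zero octets + Tunnel ID
        + bytes(14)                             # reserved, all zeros
    )
    return body + struct.pack("!H", bip16(body))


router_id = bytes.fromhex("20010db8000000000000000000000001")
payload = build_cv_payload(router_id, 0x1234)
```

With this parity definition, running `bip16` over the complete payload (BIP 16 field included) yields zero, which gives a sink a simple integrity check.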
For nodes that do not support IPv6 addressing, an IPv4 address can
be used for the Router ID using the format described in RFC1884 [4].
That is:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                                                               +
|                              (0)                              |
+                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               |             (FF)              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         IPv4 Address                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 2: IPv6 Compatible IPv4 Address
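The Router ID encoding of Figure 2 could be produced as in the sketch below. It assumes that the 16-bit field marked "(FF)" is all ones, i.e. the octets FF FF, which corresponds to the IPv4-mapped form in RFC 1884 [4]. The function name `router_id_octets` is hypothetical.

```python
import ipaddress


def router_id_octets(address: str) -> bytes:
    """Return the 16-octet Router ID for an IPv6 or IPv4 address string."""
    ip = ipaddress.ip_address(address)
    if ip.version == 6:
        return ip.packed
    # Assumed layout from Figure 2: 80 zero bits, 16 one bits, IPv4 address.
    return bytes(10) + b"\xff\xff" + ip.packed


v4_router_id = router_id_octets("192.0.2.1")
v6_router_id = router_id_octets("2001:db8::1")
```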
On LSP establishment the LSP trail termination sink point should be
configured with the expected TTSI (Ingress router ID + LSP ID).
Ideally this should be done automatically via LSP signaling at LSP
set-up time (e.g. via a CR-LDP or RSVP control-plane mechanism), but
it could also be configured manually. The mechanism for achieving
this configuration is outside the scope of this document.
6.4.2 Performance “P” Packets
The structure of the P OAM packet is FFS.
6.4.3 Forward Defect Indicator “FDI” Packets
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Func Type (3) |  (must be 0)  |          Defect Type          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Defect Location                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
\\                    Reserved (0) 30 bytes                    \\
|                                                               |
+                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               |            BIP 16             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 3: FDI Payload Structure
The FDI is sent downstream from the first node detecting the defect.
In the case of MPLS server layer failures (i.e. in a lower layer
technology such as SDH) this would be the first MPLS node downstream
of the server layer failure (as a consequence of the appropriate
client/server adaptation of the server FDI signal). In the case of
MPLS layer failures (i.e. failures within the MPLS fabric) this
would be the first LSP trail termination sink point at the same LSP
level as the failure.
The primary function of the FDI is to stop downstream client layer
alarm storms and hence correctly focus the attention of Operational
personnel. However, FDI can also have an important role in:
· Facilitating correctly targeted nested LSP protection schemes,
i.e. one would want a lower level (server) LSP to protection
switch before a higher level (client) LSP if the fault was
sourced from within the lower level LSP, and
· Identifying availability/short-break events and hence suspending
up-state QoS metric aggregation.
The format of the Defect Location field and its handling at
inter-domain NNI boundaries is FFS.
The Defect Type field is set at 2 octets here. This is currently
considered sufficient, but it should be confirmed once all the
Defect Types have been identified and fully specified. A candidate
set of Defect Types and their codepoints is given later.
The handling of the Defect Type field at inter-domain NNI boundaries
is FFS. However, 2 octets have been reserved for this function.
When an FDI is to be passed from a server layer LSP to its client
layer LSP(s) (i.e. at the client/server adaptation function following
the server layer LSP trail termination sink point), the Defect
Location and Defect Type fields should be copied from the server
layer LSP FDI into the client layer LSP(s) FDI.
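The copying of the Defect Type and Defect Location fields from a server layer FDI into a client layer FDI can be sketched as follows. This is a minimal illustration under stated assumptions: the Defect Location is treated as an opaque 32-bit value (its format is FFS), the payload is 40 octets as summed from Figure 3, and BIP 16 is assumed to be the XOR of successive 16-bit words. All function names are hypothetical.

```python
import struct
from typing import Tuple

FDI_FUNCTION_TYPE = 0x03  # "Func Type (3)" in Figure 3
# Function type (1 octet), zero octet, Defect Type (2), Defect Location (4)
_HEADER = struct.Struct("!BxHI")


def bip16(data: bytes) -> int:
    """XOR of successive 16-bit words (assumed BIP-16 definition)."""
    parity = 0
    for (word,) in struct.iter_unpack("!H", data):
        parity ^= word
    return parity


def build_fdi(defect_type: int, defect_location: int) -> bytes:
    """Pack an FDI payload: 8 header + 30 reserved + 2 BIP 16 octets."""
    body = _HEADER.pack(FDI_FUNCTION_TYPE, defect_type, defect_location) + bytes(30)
    return body + struct.pack("!H", bip16(body))


def parse_fdi(payload: bytes) -> Tuple[int, int]:
    """Return (defect_type, defect_location) from an FDI payload."""
    function_type, defect_type, defect_location = _HEADER.unpack_from(payload)
    if function_type != FDI_FUNCTION_TYPE:
        raise ValueError("not an FDI payload")
    return defect_type, defect_location


def client_fdi_from_server_fdi(server_fdi: bytes) -> bytes:
    """Copy DT and DL from the server layer FDI into a new client layer FDI."""
    defect_type, defect_location = parse_fdi(server_fdi)
    return build_fdi(defect_type, defect_location)
```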
The mapping of MPLS layer sourced FDI from the highest-level LSP
into its client layer (e.g. IP) is outside the scope of this
document.
6.4.4 Backward Defect Indicator “BDI” Packets
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Func Type (4) |  (must be 0)  |          Defect Type          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Defect Location                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
\\                    Reserved (0) 30 bytes                    \\
|                                                               |
+                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               |            BIP 16             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 4: BDI Payload Structure
For the case of bi-directional LSPs, the BDI is sent from the LSP
trail source point of the return LSP as a mirror of the appropriate
(see Note) FDI at the LSP trail sink point of the other direction.
The Defect Location and Defect Type fields are a direct mapping of
those set in the appropriate (see Note) FDI and have formats
identical to those described previously for the FDI OAM packet.
Note - The word 'appropriate' here signifies that any incoming FDI
(i.e. from a lower layer) takes precedence over any FDI that would
have been generated at the layer being considered due to detecting
defects at this layer (where these defects are only consequential as
a result of a lower layer defect).
The BDI does not propagate beyond its return LSP trail termination
sink point, and it is discarded at that point after any processing
based on its observation is carried out, e.g. for single-ended
short-break and/or availability measurements.
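The precedence rule in the Note above, and the mirroring of the chosen FDI into the BDI, might be expressed as in this small sketch. The type `DefectFields` and the functions `appropriate_fdi` and `bdi_for` are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple, Optional, Tuple

BDI_FUNCTION_TYPE = 0x04  # "Func Type (4)" in Figure 4


class DefectFields(NamedTuple):
    defect_type: int
    defect_location: int


def appropriate_fdi(
    incoming: Optional[DefectFields], local: Optional[DefectFields]
) -> Optional[DefectFields]:
    """An FDI arriving from a lower layer takes precedence over a local one."""
    return incoming if incoming is not None else local


def bdi_for(
    incoming: Optional[DefectFields], local: Optional[DefectFields]
) -> Optional[Tuple[int, int, int]]:
    """Return (function type, DT, DL) for the BDI, or None if no defect."""
    fdi = appropriate_fdi(incoming, local)
    if fdi is None:
        return None
    return (BDI_FUNCTION_TYPE, fdi.defect_type, fdi.defect_location)
```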
6.5 Defect Types and their Entry/Exit Criteria
6.5.1 Defect Type Codepoints
The following coding structure is proposed for the various defect
types so far identified:
Table 3: Defect Types
The DT code is carried in FDI/BDI OAM packets (Hex). The first octet
indicates the layer and the second octet indicates the defect.
Defect    DT code
Type      (Hex)    Meaning
--------  -------  ------------------------------------------------
dServer   01 01    Any server layer defect arising below the MPLS
                   layer network. It is not suggested that these
                   are individually identified and defined for each
                   type of server layer, since this function is
                   only appropriate to the server layer itself.
                   Hence, we only need an indication that it is the
                   server layer and not the MPLS layer.
dLOCV     02 01    Simple Loss of Connectivity Verification due to
                   missing CV OAM packets with expected TTSI. Note
                   that if the cause of dLOCV is the server layer
                   (i.e. there is also an incoming FDI signal from
                   the server layer) then the DT codepoint 01 01_H
                   is used. The dLOCV codepoint 02 01_H is used
                   only for MPLS layer simple connectivity
                   failures.
dTTSI     02 02    Trail Termination Source Identifier Mismatch due
                   to an unexpected TTSI observed in the incoming
                   CV OAM packets. This detects swapped connections
                   and unintended mismerging failures, which can be
                   differentiated by noting whether an expected
                   TTSI is also missing or present respectively.
                   Note that in the case of the former (i.e.
                   swapped connections), the dTTSI defect condition
                   takes priority over the dLOCV defect condition,
                   which is also present.
dLoop     02 03    This detects an unintended replication Looping
                   defect from observation of an increased rate of
                   expected CV OAM packets above the nominal 1/sec.
                   (Note this defect is added for completeness, but
                   it is expected to be rare.)
dUnknown  02 FF    Unknown defect detected in the MPLS layer. This
                   is expected to be used for MPLS nodal failures,
                   which are detected within the node (probably by
                   proprietary means) and affect user-plane
                   traffic.
None      00 00    Reserved
None      FF FF    Reserved
There are 3 MPLS layer user-plane defects, ie dLOCV, dTTSI and
dLoop, which we now define in more detail.
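For reference, the codepoints of Table 3 can be captured in a small enumeration. The sketch below is illustrative only; the member names and the `layer` and `defect` helper properties are hypothetical, while the values are those given in the table.

```python
from enum import IntEnum


class DefectType(IntEnum):
    """DT codepoints from Table 3 (first octet: layer, second octet: defect)."""

    D_SERVER = 0x0101
    D_LOCV = 0x0201
    D_TTSI = 0x0202
    D_LOOP = 0x0203
    D_UNKNOWN = 0x02FF

    @property
    def layer(self) -> int:
        """First octet: 01 for the server layer, 02 for the MPLS layer."""
        return self >> 8

    @property
    def defect(self) -> int:
        """Second octet: the defect within that layer."""
        return self & 0xFF


wire_octets = DefectType.D_TTSI.to_bytes(2, "big")  # as carried in FDI/BDI
```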
6.5.2 dLOCV Entry Criteria
Entry to the dLOCV condition, and hence entry to the LSP Trail Sink
Near-End Defect State, occurs when there are no expected CV OAM
packets observed in any period of 3 consecutive seconds.
In terms of consequent actions:
· If there is an incoming FDI signal from a server layer below
the MPLS network, then this is mapped to the DT codepoint 01
01_H in the FDI OAM packets sent forwards and the BDI OAM
packets sent backwards. The local DL codepoint is also
inserted in these FDI and BDI OAM packets. There are no alarms
associated with the MPLS layer itself but only the server
layer, which sourced the FDI signal.
Else:
· If there is an incoming FDI signal from a lower level LSP
within the MPLS network, then that FDI signal's DL/DT
codepoints are mapped into the FDI sent to any further client
layers (i.e. suppresses generation of FDI DL/DT codepoints from
this point) and the BDI OAM packet sent backwards. There are
no alarms generated regarding this LSP (the alarm will be
associated with the lowest layer LSP within which the defect
originated).
Else:
· If there is no FDI signal incoming from the server layer or a
lower level LSP AND there are no CV OAM packets observed with
an unexpected TTSI which give rise to the dTTSI defect, then
the DT codepoint 02 01_H is inserted in the FDI OAM packets
sent downstream and the BDI OAM packets sent upstream. The
local DL codepoint is also inserted in these FDI and BDI OAM
packets. A local alarm is raised relevant to this defect
condition.
Note:
(i) Since OAM packet flows are not synchronized in LSPs at
different hierarchical levels (ie when LSPs are nested),
there is a possibility that a client layer LSP detects a
defect before its server layer LSP. This error could be
up to 1s due to CV packet arrival time differences plus
some additional uncertainty due to network delay
effects. This could result in an error of judgment as to
the type of defect that is present and hence which
consequent actions are appropriate; especially whether
the raising of a local alarm is appropriate and the
correct setting of the DL and DT codepoints in FDI/BDI
OAM packets. To mitigate this effect, it is recommended
that the raising of an alarm is deferred for at least 2
seconds after a defect state is detected (the exact
value is FFS). This will also allow the network to
settle into a stable state as regards defect detection
behavior.
(ii) The starting/stopping of aggregation of any LSP user-
plane packet/octet loss metrics (e.g. if using the P OAM
packet say) is dependent on whether the LSP is in the
available or unavailable state.
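The three-way precedence among the dLOCV consequent actions described in this section can be sketched as a single decision function. This is a simplified illustration: the alarm deferral recommended in Note (i) is not modelled, and the names `IncomingFdi`, `Actions` and `dlocv_actions` are hypothetical.

```python
from typing import NamedTuple, Optional

D_SERVER = 0x0101  # DT codepoint 01 01_H
D_LOCV = 0x0201    # DT codepoint 02 01_H


class IncomingFdi(NamedTuple):
    defect_type: int
    defect_location: int


class Actions(NamedTuple):
    defect_type: int       # DT to insert in outgoing FDI and BDI
    defect_location: int   # DL to insert in outgoing FDI and BDI
    raise_alarm: bool      # whether a local alarm is raised at this LSP


def dlocv_actions(
    local_dl: int,
    server_fdi: bool,
    lower_lsp_fdi: Optional[IncomingFdi],
    dttsi_present: bool,
) -> Optional[Actions]:
    """Select DT/DL and alarm behaviour once dLOCV has been entered."""
    if server_fdi:
        # Defect sourced below the MPLS network: no MPLS layer alarm.
        return Actions(D_SERVER, local_dl, raise_alarm=False)
    if lower_lsp_fdi is not None:
        # Defect sourced in a lower level LSP: pass its DL/DT through.
        return Actions(lower_lsp_fdi.defect_type,
                       lower_lsp_fdi.defect_location, raise_alarm=False)
    if not dttsi_present:
        # Genuine MPLS layer loss of connectivity at this LSP.
        return Actions(D_LOCV, local_dl, raise_alarm=True)
    return None  # dTTSI overrides dLOCV; its own actions apply instead
```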
6.5.3 dTTSI Entry Criteria
Entry to the dTTSI condition, and hence entry to the LSP Trail Sink
Near-End Defect State, occurs when there are >= 2 CV OAM packets
observed in any period of 3 consecutive seconds each with an
unexpected TTSI. Any expected CV OAM packets or any incoming FDI
signals (from either the server layer or a lower level LSP) are
ignored, and it should be noted that the dTTSI defect overrides the
dLOCV defect if both are present (as would be the case, for example,
with swapped LSPs). The DT codepoint 02 02_H is inserted in the FDI
OAM packets sent forwards and the BDI OAM packets sent backwards.
The local DL codepoint is also inserted in these FDI and BDI OAM
packets. A local alarm is raised relevant to this defect condition
and the unexpected TTSI captured locally (this may also be
optionally sent to the NMS as an exception report say). The
downstream traffic must also be suppressed.
Note:
(i) Since OAM packet flows are not synchronized in LSPs at
different hierarchical levels (ie when LSPs are nested), there
is a possibility that a client layer LSP detects a defect
before its server layer LSP. This error could be up to 1s due
to CV packet arrival time differences plus some additional
uncertainty due to network delay effects. This could result in
an error of judgment as to the type of defect that is present
and hence which consequent actions are appropriate; especially
whether the raising of a local alarm is appropriate and the
correct setting of the DL and DT codepoints in FDI/BDI OAM
packets. To mitigate this effect, it is recommended that the
raising of an alarm is deferred for at least 2 seconds after a
defect state is detected (the exact value is FFS). This will
also allow the network to settle into a stable state as regards
defect detection behavior.
(ii) The starting/stopping of aggregation of any LSP user-plane
packet/octet loss metrics (e.g. if using the P OAM packet say)
is dependent on whether the LSP is in the available or
unavailable state.
6.5.4 dLoop Entry Criteria
Entry to the dLoop condition, and hence entry to the LSP Trail Sink
Near-End Defect State, occurs when there are >= 5 CV OAM packets
observed in any period of 3 consecutive seconds each with an
expected TTSI. The DT codepoint 02 03_H is inserted in the FDI OAM
packets sent forwards and the BDI OAM packets sent backwards. The
local DL codepoint is also inserted in these FDI and BDI OAM
packets. A local alarm is raised relevant to this defect condition.
Note:
(i) Since OAM packet flows are not synchronized in LSPs at
different hierarchical levels (ie when LSPs are nested), there
is a possibility that a client layer LSP detects a defect
before its server layer LSP. This error could be up to 1s due
to CV packet arrival time differences plus some additional
uncertainty due to network delay effects. This could result in
an error of judgment as to the type of defect that is present
and hence which consequent actions are appropriate; especially
whether the raising of a local alarm is appropriate and the
correct setting of the DL and DT codepoints in FDI/BDI OAM
packets. To mitigate this effect, it is recommended that the
raising of an alarm is deferred for at least 2 seconds after a
defect state is detected (the exact value is FFS). This will
also allow the network to settle into a stable state as regards
defect detection behavior.
(ii) The starting/stopping of aggregation of any LSP user-plane
packet/octet loss metrics (e.g. if using the P OAM packet say)
is dependent on whether the LSP is in the available or
unavailable state.
6.5.5 dLOCV, dTTSI and dLoop exit criteria
Exit of the dLOCV, dTTSI or dLoop condition, and hence exit of the
LSP Trail Sink Near-End Defect State, occurs when, in any period of
3 consecutive seconds, there are:
· >= 2 but <= 4 CV OAM packets observed each with an expected
TTSI, AND
· No CV OAM packets observed with an unexpected TTSI.
Note that the numbers of CV OAM packets observed each with an
expected TTSI are suggested values. Whether these numbers are
appropriate requires further study.
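The entry and exit thresholds of Sections 6.5.2 to 6.5.5 can be summarized as a classification of the packet counts seen in one 3-second window. The sketch below is an interpretation, not a normative algorithm: the text states that dTTSI overrides dLOCV, but the ordering of dTTSI relative to dLoop is an assumption made here, and the function name and the result labels are hypothetical.

```python
def classify_window(expected: int, unexpected: int) -> str:
    """Classify CV packet counts observed in a 3 s window.

    expected:   CV OAM packets carrying the expected TTSI
    unexpected: CV OAM packets carrying any other TTSI
    """
    if unexpected >= 2:
        return "dTTSI"      # overrides dLOCV (ordering vs. dLoop assumed)
    if expected >= 5:
        return "dLoop"
    if expected == 0:
        return "dLOCV"
    if 2 <= expected <= 4 and unexpected == 0:
        return "clear"      # exit criteria met
    return "no-change"      # neither entry nor exit criteria met
```

Note that some windows, for example a single expected packet or a single unexpected packet alongside a normal count, satisfy neither the entry nor the exit criteria, so the current state would simply persist.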
All the consequent actions invoked when entering the LSP Trail Sink
Near-End Defect State (i.e. sending of FDI and BDI OAM packets, the
raising of local alarms and the suppression of traffic in the dTTSI
case only) are stopped when we exit the LSP Trail Sink Near-End
Defect State.
Note – The starting/stopping of aggregation of any LSP user-plane
packet/octet loss metrics (e.g. if using the P OAM packet say) is
dependent on whether the LSP is in the available or unavailable
state.
6.6 Available and unavailable state processing
The main purpose of defining harmonized defect entry/exit criteria
as noted above is in order to significantly simplify:
· Near-end/far-end LSP Trail Sink Defect State processing;
· Near-end/far-end LSP Available State processing (which will
shortly be discussed);
· The decision point at which any LSP user-plane traffic QoS
metrics (if being collected) are stopped/started with respect
to aggregation into long-term registers.
In all sections where the evaluation of events is described, the
measurement technique is based on a sliding-window with a 1 second
granularity of advance. Note that the datum for the commencement of
the sliding window is an arbitrary point in time decided by each
node independently and is not synchronized to OAM packet arrival
events on any LSPs. This is deemed acceptable to allow simpler
nodal processing.
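One possible realization of such a sliding window, advancing in 1-second steps from a locally chosen datum, is sketched below. The class name `SlidingWindowCounter` and its methods are hypothetical; a node would call `record_packet` on each matching CV OAM packet and `tick` once per second from its own clock.

```python
from collections import deque
from typing import Optional


class SlidingWindowCounter:
    """Counts packets over the last N whole seconds, advancing 1 s at a time."""

    def __init__(self, window_seconds: int = 3) -> None:
        # One slot per second; the rightmost slot is the second in progress.
        self._slots = deque([0], maxlen=window_seconds)

    def record_packet(self) -> None:
        self._slots[-1] += 1

    def tick(self) -> Optional[int]:
        """Close the current 1 s slot.

        Returns the total over the window, or None until a full window
        has elapsed since the (arbitrary) local start datum.
        """
        full = len(self._slots) == self._slots.maxlen
        total = sum(self._slots) if full else None
        self._slots.append(0)  # oldest slot drops out once the deque is full
        return total
```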
It should be noted that this document uses the traditional
functional dependency relationship between QoS and availability.
That is:
· QoS is a unidirectional metric, ie if QoS metrics are being
measured then each direction is measured independently.
· Availability is a bi-directional metric in the case of bi-
directional LSPs, in the sense that if any direction enters the
unavailable state (defined later) then both directions are
deemed to be unavailable. In the case of unidirectional LSPs,
then availability can only have unidirectional significance.
· QoS measurements must be suspended (as regards aggregation into
long-term available state registers) if an LSP enters the
unavailable state; noting that this means the QoS measurements
of both directions from the definition of the availability
metric above in the case of bi-directional LSPs.
However, it should also be noted that (for both pragmatic reasons
and to preserve their statistical significance) QoS metric
aggregation is actually suspended after detecting a short-break
event.
6.6.1 Short Break definition
We first define a short-break event. This is defined as a period
where the entry and exit to any of the previously defined defect
conditions both occur within 9s, ie the LSP Trail Sink Near-End
Defect State lasts for <= 9s. The start of the short-break occurs
at the beginning of the defect entry criteria and the end of the
short-break occurs at the beginning of the defect exit criteria.
Clearly this has a minimum period of 3s. Short-breaks are only
defined to exist when the LSP is in the Available State.
Note – Short-breaks are more common than many people realize (in one
operator's network a study of SES (Severely Errored Second) events
showed that about 50% of these would have been classified as short-
breaks). They can cause severe disruption to some applications and
are therefore an important performance metric (perhaps second in
importance after availability). Since they exist at the physical
layers they will exist (by inheritance) in client layers, such as
MPLS and IP. An important property of the short-break, which we
will exploit, is that it yields a pragmatic harmonized threshold for
defect evaluation (across all defect types as noted previously) and
the stopping/starting of QoS metric aggregation into long-term up-
state performance registers.
6.6.2 Available/Unavailable State Definition
If the LSP Trail Sink Near-End Defect State exceeds 10 consecutive
seconds in duration then the LSP enters the Unavailable State. The
start point of the Unavailable State is deemed to be at the
beginning of these 10 consecutive seconds. We therefore no longer
have a short-break (and the event should not be registered as such).
An LSP re-enters the Available State after it has first exited the
LSP Trail Sink Near-End Defect State and there has been an aggregate
period of 10 consecutive seconds in which there have been:
· >=9 and <= 11 CV OAM packets each with an expected TTSI, AND
· No CV OAM packets with an unexpected TTSI.
Note that the numbers of CV OAM packets observed each with an
expected TTSI are suggested numbers. It must be further studied if
these numbers are appropriate.
The start point of the Available State is deemed to be at the
beginning of these 10 consecutive seconds.
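The short-break and availability thresholds of Sections 6.6.1 and 6.6.2 reduce to two simple checks, sketched here. The sketch assumes whole-second durations, so that a defect state of 9 s or less is a short-break and one that reaches 10 s leads to the Unavailable State (as in the flow-chart description that follows). Function names are hypothetical.

```python
def classify_defect_period(duration_s: int) -> str:
    """Classify a Near-End Defect State period that began in available time."""
    return "short-break" if duration_s <= 9 else "unavailable"


def available_state_regained(expected_10s: int, unexpected_10s: int) -> bool:
    """Check the suggested 10 s criterion for re-entering the Available State.

    expected_10s:   CV OAM packets with the expected TTSI in the last 10 s
    unexpected_10s: CV OAM packets with an unexpected TTSI in the last 10 s
    """
    return 9 <= expected_10s <= 11 and unexpected_10s == 0
```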
6.6.3 Near-end and Far-end Measurements of Availability
All of the above discussion is strictly only relevant to the near-
end processing when the LSP trail termination sink point is in the
LSP Trail Sink Near-End Defect State as discussed previously. We
can also measure the far-end availability behavior (useful when only
a single end is accessible for measurement) by using the BDI signal
(when bi-directional LSPs are being used) since this is a reflected
upstream mirror of the duration over which FDI is sent downstream.
We therefore define the LSP Trail Sink Far-End Defect State to be
the period over which BDI OAM packets are observed subject to the
following entry and exit criteria:
· Entry of the LSP Trail Sink Far-End Defect State occurs on the
first BDI OAM packet observed.
· Exit of the LSP Trail Sink Far-End Defect State occurs after a
period of 3 consecutive seconds in which no BDI OAM packets
have been received.
Note that this 3s processing delay on exit is to cater for cases in
which perhaps a single BDI is lost (say due to congestion or
errors). Its effect must be catered for in the far-end processing
state machine as discussed later.
Since we have fixed the temporal duration of the far-end state to be
directly related to the near-end state (albeit with a +3s exit
checking period) we can therefore measure both short-breaks and
unavailability of both directions from a single end (on the
assumption that bi-directional LSPs are being used).
6.6.4 Near-End State Processing Flow-chart
The following figure summarizes many of the key points regarding the
near-end state-processing algorithm for a given LSP.
Figure 5: LSP Near-End State Processing Flow Chart
1. Assume we start in the available state in the box marked
‘Start’. All timers (shown later) can conceptually be assumed
reset at this point. If there are any QoS metrics being
collected (e.g. packet/octet loss measurements from the P OAM
packet) then this is assumed to be active at this time.
2. The first decision box is ‘dLOCV, dTTSI or dLoop?’. These
defects were defined previously. If none of these defects are
present we keep checking for this condition and stay in the
available state. However, if one of these defects is present
we enter the Trail Sink Near-End Defect State.
3. The consequent actions now required depend on the nature of the
defect observed, and whether there is any incoming FDI from a
lower layer, and should follow the rules given previously. But
note that any QoS metrics, which are being collected, are
suppressed from aggregation into the long-term registers
against available time. The registers are effectively
backdated 3s to allow for the defect detection time (at this
stage we cannot judge whether the event will be a Short-Break,
and hence the LSP remains in the Available State, or whether
the LSP will enter the Unavailable State).
4. We now start timer T1. This timer is used to determine the
duration of the Trail Sink Near-End Defect State, and if this
persists for a sufficient time (ie a further 10s) then this
timer is used to branch the flow-chart into the Unavailable
State processing region.
5. Below (timer) T1, we loop round the decision boxes ‘T1<10s?’
and ‘End dLOCV, dTTSI or dLoop?’. We can exit this loop if the
defect state ends (in accordance with criteria given
previously) before T1 reaches 10s. Since we are still in the
available state, we restart any QoS metric aggregation into the
long-term registers (noting the last 3s must be accounted for),
we stop FDI/BDI OAM packet generation and capture the short-
break event in the local registers. Additionally, if the event
was due to a dTTSI, then we should also capture the TTSI of the
offending LSP and cease the suppression of traffic. The
timestamp of the event should be related to the onset of the
defect, which caused it. If however T1 reaches 10s we enter the
Unavailable State. Note that it is not possible to enter the
Unavailable State unless the Trail Sink Near-End Defect State
has persisted for at least 10s in the Available State.
6. We now record a date/time-stamped Unavailable State entry event
in the local registers together with information on the nature
of the defect, which caused it. Note that the date/timestamp
must be backdated 13s. Optionally, we may also send an
exception report to the NMS with the Unavailable State entry
date/timestamp noted above, together with any other relevant
information about the defect which caused it, e.g. in the case
of dTTSI this should include the TTSI of the offending LSP. We
now stop timer T1 and start timer T2, whose purpose is to
record the duration of the Unavailable State. Note that when we
enter the Unavailable State we also remain in the Trail Sink
Near-End Defect State.
7. We now run round a decision box ‘End dLOCV, dTTSI or dLoop?’,
which is just below the point where we started timer T2, which
checks for the end of the defect state. When the defect ends
(in accordance with the criteria given previously) we stop
FDI/BDI OAM packet generation and exit the Trail Sink Near-End
Defect State. Any QoS metric aggregation is still inhibited.
8. We now run round the decision loop comprised of the two boxes
‘>=9 but <= 11 expected CV OAM packets in last 10s AND no
unexpected CV OAM packets?’ and ‘dLOCV, dTTSI or dLoop?’. If a
further defect occurs before we meet the exit criteria of the
former decision box, we re-enter the Trail Sink Near-End Defect
State and hence restart the generation of FDI/BDI OAM packets
(with DL/DT codepoints and other consequent actions relevant to
the specific defect observed). Any QoS metric aggregation
continues to be inhibited. In this case we are back at point 7
above in the state processing and recommence checking for the
end of the defect. Note that timer T2 continues to run.
9. To get out of the Unavailable State we must first have exited
the Trail Sink Near-End Defect State as noted in 7 above, and
then met the criteria of the decision box ‘>=9 but <= 11
expected CV OAM packets in last 10s AND no unexpected CV OAM
packets?’ as noted in 8 above. Note that the ‘last 10s’
referred to here includes the 3s interval required to check for
the end of the Trail Sink Near-End Defect State as noted above
in item 7.
10. We now stop timer T2 and record the duration of the
unavailability event in the local registers. We recommence any
QoS metric aggregation into the local registers and cease all
consequent actions associated with the Unavailable State. Note
that T2 will record an Unavailable State duration that is 3s
less than the true duration of the unavailability event. Note also
that the last 10s belong to the Available State and so any QoS
metric aggregation will need to take these 10s into account.
Optionally, we may also send an exception report to the NMS
with the Unavailable State exit date/timestamp suitably
corrected as noted above.
11. This now takes us back to our starting point in the Available
State.
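The near-end steps above can be condensed into a small state machine driven once per second. This is a simplified sketch rather than a transcription of Figure 5: the defect and recovery decisions are supplied by the caller as booleans, QoS register handling and FDI/BDI generation are omitted, and the short-break timestamp is assumed to be backdated by the 3 s detection time. All names are hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    AVAILABLE = auto()
    DEFECT = auto()              # Near-End Defect State, still Available (T1)
    UNAVAILABLE_DEFECT = auto()  # Unavailable, defect still present (T2)
    UNAVAILABLE_CLEAR = auto()   # Unavailable, defect ended, 10 s check running


class NearEndProcessor:
    def __init__(self) -> None:
        self.state = State.AVAILABLE
        self.detected_at = 0  # time at which the current defect was declared
        self.t1 = 0
        self.t2 = 0
        self.events = []      # (event name, timestamp or duration) tuples

    def tick(self, now: int, defect: bool, recovered: bool = False) -> None:
        """Advance by 1 s. 'defect' reflects the 3 s entry/exit criteria;
        'recovered' reflects the 10 s Available State criterion."""
        if self.state is State.AVAILABLE:
            if defect:
                self.state, self.detected_at, self.t1 = State.DEFECT, now, 0
        elif self.state is State.DEFECT:
            self.t1 += 1
            if not defect:
                # Short-break, stamped at the assumed onset of the defect.
                self.events.append(("short-break", self.detected_at - 3))
                self.state = State.AVAILABLE
            elif self.t1 >= 10:
                # Unavailable State entry, backdated 13 s (step 6).
                self.events.append(("unavailable-entry", now - 13))
                self.state, self.t2 = State.UNAVAILABLE_DEFECT, 0
        elif self.state is State.UNAVAILABLE_DEFECT:
            self.t2 += 1
            if not defect:
                self.state = State.UNAVAILABLE_CLEAR
        else:
            self.t2 += 1
            if defect:
                self.state = State.UNAVAILABLE_DEFECT
            elif recovered:
                # T2 reads 3 s less than the true duration (step 10).
                self.events.append(("unavailable-duration", self.t2 + 3))
                self.state = State.AVAILABLE
```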
6.6.5 Far-End State Processing Flow-chart
The following figure summarizes many of the key points regarding the
far-end state-processing algorithm for a given LSP.
Figure 6: LSP Far-End State Processing Flow Chart
1. Assume we start in the available state at the box marked
‘Start’. All timers shown later in the flow chart can
conceptually be assumed to be reset at this point. If there is
any backward QoS aggregation activated on the return direction
LSP then this will be via a separate P OAM packet flow on the
return LSP.
2. The first decision box is ‘BDI OAM packet?’. If the answer is
'No', then we keep looping this check condition and stay in the
Available State. If the answer is 'Yes', then this implies
that the near-end processing at the other end of the (outgoing)
LSP has entered the Trail Sink Near-End Defect State. Note
that this also implies that the defect has already existed for
3s at the other end of this LSP.
3. We then enter the Trail Sink Far-End Defect State and inhibit
any backward QoS metric aggregation. The QoS registers will
need to be corrected for the previous 3s, which should not be
aggregated into the long-term Available State counts.
4. We now start timer T3, and run round the loop composed of the
decision boxes ‘T3 <13s?’ and ‘3s BDI-Free?’. T3 is used to
check the duration of the Trail Sink Far-End Defect State. If
T3 does not reach 13s and we get 3s which are BDI-Free, then
we re-start any backward packet level metric aggregation. Note
that the last 6s must be accounted for in any backward QoS
metric aggregation registers. This arises since it takes the
near-end processing 3s to declare the end of the defect at the
other end of the (outgoing) LSP, and a further 3s to declare
the end of the Trail Sink Far-End Defect State at this end of
the (return) LSP, and all this time should count towards the
Available State at this end of the LSP to ensure correct QoS
metric aggregation. A Short-Break date/time-stamped event
should also be recorded in the local registers together with the
DL/DT information of the defect as given in the BDI OAM packet.
This Short-Break event must be date/time-stamped relative to 3s
before the time at which the first BDI OAM packet was observed.
This now takes us back to the initial start position. If
however T3 reaches 13s we enter the far-end Unavailable State.
Note that it is not possible to enter the Unavailable State
unless the Trail Sink Far-End Defect State has effectively
persisted for at least 13s in available time (which means that at
the other end of the (outgoing) LSP the Trail Sink Near-End Defect
State has persisted for at least 10s).
5. Optionally, we may now send a date/time-stamped unavailability
entry exception report to the NMS, which includes the relevant
BDI OAM packet DL/DT information. Note that the date/timestamp
of any such exception report should be backdated by 16s (ie 3s
prior to the first BDI OAM packet being observed for this
event) to align the far-end processing with that of the near-
end processing at the other end. We now stop timer T3 and
start a timer T4, whose purpose is to record the duration of
this unavailability event. Note that when we enter the
Unavailable State we also remain in the Trail Sink Far-End
Defect State.
6. We now run round a loop that checks for 3s which are BDI-Free.
This is used to take us out of the Trail Sink Far-End Defect
State. Note that this is not strictly necessary, and this
check condition could have been omitted and we could just have
shown the following one which checks for a continuous (ie
overall) 10s of BDI-Free behavior. However, it has been shown
like this to harmonize the ‘look’ of the near-end and far-end
Trail Sink Defect State processing.
7. If we get 3s which are BDI-Free then we exit the Trail Sink
Far-End Defect State and run a loop which checks if we have had
an overall continuous period of 10s which are BDI-Free. If any
further BDI OAM packets appear within this overall 10s checking
period then we re-enter the Trail Sink Far-End Defect State and
need to repeat the process from step 6 above. If, however, no
further BDI OAM packets appear within the 10s checking period
we exit the far-end Unavailable State.
8. We stop timer T4 and record the duration of the unavailability
event. T4 will record a time that is 3s less than the true duration
of the unavailability event. A date/time-stamped unavailability exit
event, backdated 13s, together with the unavailability duration
should now be recorded in the local registers. Optionally,
this information may also be sent to the NMS as an exception
report.
9. Any backward QoS metric aggregation can now be restarted,
noting that the last 13s belong to available time and so the
aggregate registers should be corrected accordingly.
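As an informal illustration only, steps 5 to 9 above can be sketched
in Python. This is not part of the specified behavior. The names
(process_far_end_unavailability, FarEndUnavailabilityRecord, the
constants) are hypothetical, time is modelled as plain seconds, and
the sketch assumes that a BDI OAM packet is being observed at the
instant the Unavailable State is entered.

```python
from dataclasses import dataclass
from typing import Iterable

# Values taken from the steps above; the constant names are hypothetical.
ENTRY_BACKDATE_S = 16   # entry report is backdated by 16s (step 5)
EXIT_BACKDATE_S = 13    # exit event is backdated by 13s (step 8)
DEFECT_CLEAR_S = 3      # 3s BDI-Free exits the Far-End Defect State
AVAILABLE_CLEAR_S = 10  # 10s continuous BDI-Free exits the Unavailable State


@dataclass
class FarEndUnavailabilityRecord:
    entry_detected: float   # instant T4 is started
    exit_detected: float    # instant T4 is stopped
    t4: float               # raw T4 reading
    entry_timestamp: float  # backdated entry date/time-stamp
    exit_timestamp: float   # backdated exit date/time-stamp
    duration: float         # duration between the backdated stamps
    defect_reentries: int   # re-entries into the Far-End Defect State


def process_far_end_unavailability(
    entry_detected: float, bdi_times: Iterable[float]
) -> FarEndUnavailabilityRecord:
    """Model steps 5 to 9 for one far-end unavailability event."""
    # Assumption: a BDI is being observed when the state is entered.
    last_bdi = entry_detected
    reentries = 0
    for t in sorted(bdi_times):
        if t < entry_detected:
            continue  # packets before entry belong to earlier steps
        quiet = t - last_bdi
        if quiet >= AVAILABLE_CLEAR_S:
            break  # 10s BDI-Free already elapsed: this event has ended
        if quiet >= DEFECT_CLEAR_S:
            reentries += 1  # defect state had been left; BDI re-enters it
        last_bdi = t
    exit_detected = last_bdi + AVAILABLE_CLEAR_S
    t4 = exit_detected - entry_detected
    entry_ts = entry_detected - ENTRY_BACKDATE_S
    exit_ts = exit_detected - EXIT_BACKDATE_S
    return FarEndUnavailabilityRecord(
        entry_detected, exit_detected, t4,
        entry_ts, exit_ts, exit_ts - entry_ts, reentries,
    )
```

For example, with entry detected at t=100s and further BDI OAM
packets seen at 101, 102, 103, 108 and 109s, the sketch stops T4 at
119s, so T4 reads 19s, while the backdated entry and exit stamps
(84s and 106s) give a duration of 22s, i.e. T4 plus 3s as noted in
step 8.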
6.6.6 A pictorial view of near-end and far-end state processing
The following figure is given to help clarify the temporal
relationships between the near-end and far-end state processing
given in the previous flow-charts for a short-break event and an
unavailability event.
Figure 7: Near-End and Far-End Temporal Processing of a Short-Break
and Unavailability event
7. Security Considerations
The OAM functions described in this document enhance the security of
MPLS networks by detecting mis-connections, thereby preventing
customers' traffic from being exposed to other customers.
The MPLS OAM functions as defined in this document do not raise any
new security issues for MPLS networks.
8. References
[1] Rosen E, et al, RFC 3032, "MPLS label stack encoding".
[2] Le Faucheur et al, "MPLS support of Differentiated Services",
draft-ietf-mpls-ext-08.txt, work in progress.
[3] Awduche et al, "RSVP-TE: Extensions to RSVP for LSP Tunnels",
draft-ietf-mpls-rsvp-lsp-tunnel-05.txt, work in progress.
[4] Hinden and Deering, RFC 1884, "IP Version 6 Addressing
Architecture".
9. Authors' Addresses
Neil Harrison
British Telecom Phone: 44-1604-845933
Heath Bank Email: neil.2.Harrison@bt.com
Rugby Road, Harleston
South Hampton, UK
Peter Willis
British Telecom Phone: 44-1473-645178
BT, PP RSB10/PP3 B81 Email: peter.j.willis@bt.com
Adastral Park
Martlesham, Ipswich, UK
Shahram Davari
PMC-Sierra
411 Legget Drive Phone: 1-613-271-4018
Kanata, ON, Canada Email: Shahram_Davari@pmc-sierra.com
Ben Mack-Crane
Tellabs
4951 Indiana Ave Phone: 1-630-512-7255
Lisle, IL, USA Email: ben.mack-crane@tellabs.com
Hiroshi Ohta
NTT
Y-709A, 1-1 Hikarino’ka Phone: 81-468-59-8840
Yokosuka-Shi Email: ohta.hiroshi@nslab.ntt.co.jp
Kanagawa, Japan