
RE: Three GMPLS related IDs



Hi Yangguang, you wrote 05 March 2001 14:29:
> FYI, three GMPLS related IDs have been submitted.
> 
> Architecture:
>
>http://www.ietf.org/internet-drafts/draft-xu-ccamp-gmpls-arch-intra-domain-00.txt
>
>Signaling:
>http://www.ietf.org/internet-drafts/draft-lin-ccamp-ipo-common-label-request-01.txt
>http://www.ietf.org/internet-drafts/draft-xu-ccamp-gmpls-sig-reorg-00.txt
>
>Comments are welcomed.

Here are my observations on the sig-reorg and arch-intra-domain IDs (I could
not open the label-request one):

sig-reorg ID:

1	I noticed that NSAPs are mentioned in 6.2.  Yes, I think this form
of global addressing will be of considerable interest and importance to
several carriers and should be included.  It is something we are currently
looking closely at and almost certainly will require to be supported.

2	The 'connection ID' noted in para 7.2.2 seems to be the same
(similar?) to a trail termination identifier and, as noted, requires global
uniqueness.  I noted that you have this 'TBD' wrt size/format.....this seems
a sensible decision since something larger than 32 bits will surely be
required here.

3	 I was a little unclear regarding the requirement given in the 1st
bullet in para 9.1.  Agree that user and control plane *networks* can (and
will usually) be disjoint....indeed the control-plane network *must* only
take its survivability design cues from the duct network, since only that
layer can define the inherited real disjoint connectivity.  Further, both
the control plane and user plane network need their own OAM....and in the
control plane not only is this for the 'physical transport' aspects, but
also on a per protocol-type, ie the signalling protocol will have different
failure modes to the routing protocol.  One feature that is required
however, is that user-plane failures can map to control-plane actions for
invoking restoration (the reverse of this is clearly not true)...for example
to restore S-PVC-like user-plane trails (see also point 5 below).   So I
would like to see these types of relationship made more clear.
 
4	A further observation wrt the above is that it seems you are
advocating that all failures be conveyed by control-plane message 'events'.
Are you suggesting that (at least) the 3 distinct areas of (i) user-plane
failures, (ii) signalling protocol failures and (iii) routing protocol
failures all get 'merged' into some generic form of event notification?  I
would have thought that one would need notification (and general defect
handling) per protocol type, and that this would be a function of that
protocol only since the failure modes of each would be different (and if one
changes one protocol one would not want to create backwards compatibility
issues with another).

Further, if some aspects of the control-plane fail I would still expect the
user-plane defect handling to carry on as an autonomous function....for
example, the sending of an FDI into client layers to suppress alarm storms
and protect user-plane traffic, eg squelch traffic on misrouting if trail ID
mismatch.  I am not sure we would want to make this low-level action
dependent on a 'healthy control-plane'....and this seems quite important
when one has a server->client adaptation function between 2 different
technologies, eg OTN layer->SDH layer, and where a 'common control-plane'
might not be appropriate across such a (different operator domain)
boundary....a point you indeed make in para 4 in the Arch-intra-domain ID.
Can you please explain the intent of this section more clearly and comment
on the above points, including the network management of the control-plane
....maybe you have some new ideas here that are worth exploring further?


Arch-intra-domain ID:
Note - I like this paper in general, and I guess the reason for this is that
several operators have input (requirements) to it and it is not just sourced
from vendors.  

5	A control-plane for CO transport networks is not a new idea.
Indeed, the original designers of SDH (and esp my BT colleague Andy Reid and
Mike Sexton now of Alcatel) tried to do this for SDH about 12 years
ago....maybe a bit ahead of its time then.  However, the key point I wanted
to make here is that simply automating the control-plane is not necessarily
the key issue for improving the speed of provision (this is certainly true
for us).....it is all the associated OSS/service-surround processes (eg order
handling, testing (which is more than a quick LMP-like connectivity
verification), billing, etc) that go along with this to create a service.
I note that these points are also very well made in section 3 of
draft-bms-optical-sdhsonet-mpls-control-frmwrk-00.txt.

BTW - we have far more interest in S-PVC equivalent services than SVC
equivalent services, at least initially.  I can't speak for other carriers
but I would be surprised if they thought otherwise.  The advantages of
S-PVCs are:
-	no need to develop new billing systems
-	no new interfaces to billing systems
-	no need to solve all the naming issues
-	low-risk security issues
-	no need to solve all the call admission control mechanisms
-	etc
Yet S-PVCs immediately open up the possibility of much faster provisioning;
though in BT's case what we have in our network management systems already
is pretty fast in itself.  And as a subset of the functions that are needed
to implement a switched service it will obviously be easier to implement.

Thinking about the further concerns a carrier would have to address in a
connection oriented switched-network, there needs to be a consideration of
blocking probability and users having connection requests refused, ie a GoS
metric. The result is that transport networks currently designed around high
utilisation with long-holding connections need to be redesigned around
ensuring a large enough pool of free capacity.  So, in addition to having to
address the above issues up-front with SVC-like services, carriers would
also be forced to define an 'Erlang/pricing model' for something akin to a
Gbit PSTN!  Depending upon the number of circuits and the distribution of
traffic holding times it may not be possible to simply apply Erlang theory
as is, but rather some modifications will be required.....and the demand
model will obviously be an interdependent function of the pricing model.  
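
To give a feel for the GoS point, here is a minimal sketch that assumes the
classic Erlang B model (Poisson arrivals, blocked requests cleared).  As said
above, that model may well not fit these holding-time distributions, so treat
it as a starting point only; the function names are purely illustrative.

```python
def erlang_b(offered_erlangs: float, circuits: int) -> float:
    """Blocking probability from the standard Erlang B recursion."""
    blocking = 1.0  # with zero circuits every request is blocked
    for n in range(1, circuits + 1):
        blocking = (offered_erlangs * blocking) / (n + offered_erlangs * blocking)
    return blocking


def circuits_for_gos(offered_erlangs: float, target_blocking: float) -> int:
    """Smallest number of circuits whose blocking does not exceed the target."""
    n = 0
    while erlang_b(offered_erlangs, n) > target_blocking:
        n += 1
    return n
```

Comparing circuits_for_gos() for the same offered load at, say, 10% and 0.1%
target blocking shows how much spare capacity a tight GoS target demands.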

6	Noted the point in para 4.2 regarding ENE 'having to maintain
source/sink LSP inventory', but the SNE also needs to maintain its
server->client layer adaptation mappings....so that on failure the correct
FDI information can be sent forward into all affected clients (and this
process should recurse for clients of those clients, etc).
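
A rough sketch of that recursion, assuming the SNE holds its adaptation
mappings as a simple server-trail -> client-trails table (the table layout
and the trail names are made up for illustration, not taken from the ID):

```python
from collections import deque


def affected_clients(adaptation, failed_trail):
    """Return every client trail that should receive FDI, nearest layer first.

    adaptation maps a server trail to the list of client trails it carries.
    """
    seen = []
    queue = deque(adaptation.get(failed_trail, []))
    while queue:
        trail = queue.popleft()
        if trail in seen:
            continue
        seen.append(trail)
        # recurse: clients of this client are affected too
        queue.extend(adaptation.get(trail, []))
    return seen


# hypothetical three-layer example
adaptation = {"och-1": ["vc4-a", "vc4-b"], "vc4-a": ["lsp-1"], "vc4-b": []}
fdi_targets = affected_clients(adaptation, "och-1")
```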

7	In section 5.3 on path selection, source (ERO) and hop-by-hop routing
are compared.  However, the biggest advantage of source routing is not noted
strongly enough.....and that is the fact that one does not need to wait
until a failure to calculate an alternative route from the link-state
database.  For very important routes one can pre-calculate an alternative
path and reserve resources, so restoration is very fast.  This leads me to
the next point here.
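
As an illustration of pre-calculation, the sketch below picks a primary route
by hop count and then a link-disjoint alternative by removing the primary's
links and searching again.  This is a deliberately naive method chosen for
brevity (it can miss a disjoint pair that a proper disjoint-path algorithm
would find), and real disjointness would have to come from duct-level
information rather than from the link list assumed here.

```python
from collections import deque


def shortest_path(links, src, dst, banned=frozenset()):
    """Fewest-hop path over undirected links, skipping any banned links."""
    adj = {}
    for a, b in links:
        if frozenset((a, b)) in banned:
            continue
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in adj.get(node, []):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None  # no route available


def primary_and_backup(links, src, dst):
    """Pre-calculate a primary route and a link-disjoint backup (or None)."""
    primary = shortest_path(links, src, dst)
    if primary is None:
        return None, None
    used = frozenset(frozenset(hop) for hop in zip(primary, primary[1:]))
    return primary, shortest_path(links, src, dst, banned=used)
```

The backup route would be computed and its resources reserved at set-up time,
so nothing has to be calculated after the failure.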

8	Although not covered explicitly in this paper, I think something
needs to be said about pre-emption/bumping.  We have extensive experience of
priority selected pre-emption/bumping schemes.  And this experience leads us
to have a requirement that we must be able to turn-off any such schemes.
Why?  Well, it only comes into any significant effect at high utilisations.
But, because even small changes of loading around the 'congestion knee' can
cause large swings in behaviour, it is at this time that predicting its
effects can be most difficult, and a single failure event can cause multiple
consequential failures as the bumping scheme ripples out.  We don't mind
dumping 'extra' traffic, but once dumped there is no way we would want this
to then dump some even lower priority traffic.  However, I clearly would not
wish to stop anyone wanting such a scheme to try it out and discover its
usefulness for themselves.  Our requirement therefore is (i) an ability to
turn-off any bumping based on priority, but (ii) an ability to select the
priority of restoration post failure between different trails.
BTW - Although the largest BW trails seem like they should be restored 1st
(if the most efficient packing density is the sole criterion), this may not
be the case and we would want to identify trail restoration priority
irrespective of BW.
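
A minimal sketch of requirement (ii) with bumping turned off per (i): failed
trails are restored in operator-assigned priority order using only capacity
that is already free, and nothing in service is ever displaced.  The single
capacity pool and the tuple layout are simplifying assumptions.

```python
def restore_in_priority_order(failed_trails, free_capacity):
    """Restore failed trails by assigned priority, never by pre-emption.

    failed_trails: list of (name, priority, bandwidth); a lower priority
    number means the trail is restored earlier, irrespective of bandwidth.
    """
    restored, unrestored = [], []
    for name, _priority, bandwidth in sorted(failed_trails, key=lambda t: t[1]):
        if bandwidth <= free_capacity:
            free_capacity -= bandwidth
            restored.append(name)
        else:
            # no bumping: the trail simply stays down until capacity frees up
            unrestored.append(name)
    return restored, unrestored
```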

9	In section 5.3.3 you are hinting at quite an important point
regarding routing protocols.  That is, the attributes of a routing protocol
for an OTN will be quite different to that needed for SDH....or indeed IP.
This leads to the question - do we create one all-embracing routing protocol
for all technologies (and have lots of extensions and redundancies) or do we
have different (specifically tailored) routing protocols for different
technologies?  We can also say the same, and perhaps more strongly, about
signalling protocols.  And indeed also addressing.....this facet definitely
requires independent address spaces per distinct layer network even if based
on the same generic structure.  The key point here is that not only are
these control-plane facets 'different' for different layer networks, but
they are also orthogonal *within* a layer network.....a point which seems to
get glossed over quite regularly.  This is not to decry any particular
choice of addressing/signalling/routing protocol combination, just to point
out that it is not a logical way of reasoning to say that either (i) all
layers get treated the same or (ii) *if* you choose addressing schema X you
*must* have signalling protocol Y and you *must* have routing protocol Z.
Indeed for true CO fabrics like an OTN or SDH, it is very hard to agree to
arguments that say it must be v4 (or v6) addressing and it must be RSVP
(indeed it isn't since we also have CR-LDP) and it must be OSPF (and again
indeed it isn't since we also have IS-IS).....but for the NNI BGP seems the
only choice.  I have yet to hear what is technically wrong with a
combination of NSAPs, PNNI signalling and IS-IS/BGP...and on the face of it,
these would appear far more logical choices given the nature of the beast
they are controlling;  noting that for S-PVC-like services this would also
seem like a more 'off-the-shelf' solution.

10	In section 5.4.1.5 'actions' on trails are defined.  For a CO
network with fixed BW selection at trail creation time it is very hard to
see what a 'modify' action could be....other than perhaps a change of
restoration priority and/or change of dedicated back-up trail/resource.  If
there is any notion of (working) BW change here (be it a scalar magnitude
change, ie same route, or a vector change, ie new route (with or without a
scalar magnitude change)), it would seem to me that this is really a new
pair of 'create_new->switch_to_new->delete_old' actions.  I note however
that this has been marked as TBD, but I think it would help if it was made
clear that BW changes cannot (IMO) be a single 'modify' action.
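
To show what I mean, a BW change can be sketched as a plan of three separate
actions rather than one 'modify'.  The dictionary fields and the action names
are invented for illustration and are not taken from the ID.

```python
def plan_bandwidth_change(old_trail, new_bw, new_route=None):
    """Expand a working-BW change into create_new -> switch_to_new -> delete_old."""
    new_trail = {
        "id": old_trail["id"] + "-new",
        # same route gives a 'scalar' change, a different route a 'vector' change
        "route": new_route if new_route is not None else old_trail["route"],
        "bw": new_bw,
    }
    return [
        ("create_new", new_trail),
        ("switch_to_new", old_trail["id"], new_trail["id"]),
        ("delete_old", old_trail["id"]),
    ]
```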

11	In section 5.4.3.1 you describe the 4 main stages of
prot-sw/restoration.  I agree.  But please don't overlook these facts that
are relevant to the first item of failure detection:
-	1st, we must identify all failure modes
-	2nd, we must describe their entry/exit criteria
-	3rd, we must take correct consequent actions, eg FDI upwards to stop
alarm storms, BDI backward (if single ended visibility of both directions
needed), squelch traffic if trail ID mismatch (important to protect customer
traffic...so this is a sort of security consideration, but it also impacts
billing etc)
-	4th, we need to know/define what constitutes up and down states of
the trail.....this defines the 2 aspects of availability SLA and the QoS
SLA, where the latter only has meaning once the former is defined (ie only
valid in up-state) and will generally be based on the defect handling
covered by points 1st-3rd above.
See our draft on MPLS OAM (packet level) where we cover the above for an
example of what is required in the user-plane:
http://www.ietf.org/internet-drafts/draft-harrison-mpls-oam-00.txt We need a
similar approach for the lower transport layers....and note that there
should be some attempt at metric/objective harmonisation across the layers
in order for any measurements/SLAs taken/applied at each layer to have some
cross-layer relative significance.
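
As a sketch of the up/down state point (4th above), one possible entry/exit
criterion is a run of consecutive defective (or defect-free) seconds.  The
threshold of 10 and the rule that the state changes on the second the run
completes are assumptions for illustration, not values from any of the drafts.

```python
def trail_states(defect_seconds, threshold=10):
    """Map a per-second defect flag sequence to 'up'/'down' trail states.

    The trail enters 'down' after `threshold` consecutive defective seconds
    and returns to 'up' after `threshold` consecutive defect-free seconds.
    """
    state, run, states = "up", 0, []
    for defective in defect_seconds:
        # count consecutive seconds that contradict the current state
        contradicts = defective if state == "up" else not defective
        run = run + 1 if contradicts else 0
        if run >= threshold:
            state = "down" if state == "up" else "up"
            run = 0
        states.append(state)
    return states
```

QoS measurements would then only be counted over the seconds marked 'up'.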

neil