
Re: [RRG] Re: Comments on the Design Goals I-D



    > From: Scott W Brim <swb@employees.org>

    > Some late comments

Ditto - sorry 'bout the delay, but life has been busy. (And basically all of
this I know you know already, but I put this text in for others, and perhaps
also you can use some of it.)

    > Your text is indented.
    >
    >   Long experience with inter-domain routing has shown us that the
    >   global BGP routing table is continuing to grow rapidly ...
    >   ... the first required goal is to provide significant improvement
    >   to the scalability of the routing plane.
    >
    > The last sentence is a fine end result, but carrying the state
    > information isn't the only problem. What about churn?

This is related to something I noticed when I read the draft, which is that
it talks a lot about growth, but only in general terms, and I'd like to see
this scalability point broken down into more detail.

Specifically, the overhead of routing has historically been measured by its
use of three disparate resources: memory, transmission bandwidth, and
computing power. Early work characterized routing algorithms by equations
which described how they used all three; these days, the picture is
simultaneously less and more complicated.

Less, because bandwidth isn't really an issue any more. AFAICT, the time
required to send data to neighbours, even in the worst cases, is a minuscule
percentage of the link bandwidths we have these days. (Although we do still
have plenty of slow-speed links in the network, which architects must
remember, I'm not sure there are very many in the DFZ.)

More, because there are two kinds of memory now: RIB and FIB (and because of
the way routers are designed now, there probably always will be this
distinction). Particularly with BGP, the usage of the two grows according to
different equations. Growth of the RIB is also important because it's a
stand-in for something else that's important, which is the time needed to
cold-start a router; I gather that many current routers can take quite a
while to load the full DFZ routing table from their neighbours. The issue
here, I gather, isn't really bandwidth, but rather computing power.
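To make the RIB/FIB distinction concrete, here's a toy sketch (not any real router's data structures; the prefixes, peers, and shortest-AS-path tie-break are all made up for illustration): the RIB retains every path learned for a prefix, while the FIB holds only the selected next-hop, so the two grow differently.

```python
# Toy model: RIB keeps all learned routes, FIB keeps only the best path.
rib = {}  # prefix -> list of (peer, as_path_len, next_hop)
fib = {}  # prefix -> next_hop of the selected best route

def rib_add(prefix, peer, as_path_len, next_hop):
    """Install a route learned from a peer, then re-run best-path selection."""
    routes = rib.setdefault(prefix, [])
    routes[:] = [r for r in routes if r[0] != peer]  # one route per peer
    routes.append((peer, as_path_len, next_hop))
    best = min(routes, key=lambda r: r[1])           # shortest AS path wins
    fib[prefix] = best[2]

rib_add("192.0.2.0/24", "peerA", 3, "10.0.0.1")
rib_add("192.0.2.0/24", "peerB", 2, "10.0.0.2")

# The RIB remembers both paths; the FIB keeps only the winner.
print(len(rib["192.0.2.0/24"]))  # prints 2
print(fib["192.0.2.0/24"])       # prints 10.0.0.2
```

With N peers each announcing the full table, RIB entries scale roughly as N times the prefix count, while the FIB scales with the prefix count alone, which is why the two are worth tracking separately.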


There's also a fourth "resource" which is being consumed, and it's one that
Scott's comment alluded to: real-time, a critical component of stability and
stabilization-time issues. A lot of commentary swirls around FIB size, but
to me, that's not really the important thing; I'm much more concerned about
stability/stabilization-time issues (I'll explain why in a bit).

The real-time that's being consumed comes from two sources: propagation
delays (about which we can do nothing), and serialization of updates, which
my impression is the far larger source. I.e. with one processor, if updates
A and B arrive back-to-back from a neighbour, update B can't be processed,
and forwarded on to other neighbours, until update A has been fully handled.
So this turns back into processing power (again).
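The serialization effect above can be sketched with a toy single-processor model (the 50 ms per-update processing time is a made-up constant, purely for illustration): a burst of updates arriving back-to-back queues behind the processor, so the last update in the burst leaves the router only after the whole burst has been chewed through.

```python
# Toy model: one processor, updates handled strictly in arrival order.
PROC_MS = 50  # assumed per-update processing time, milliseconds (illustrative)

def forwarding_times(arrivals_ms, proc=PROC_MS):
    """Given update arrival times at one router, return the times at which
    each update has been fully processed and forwarded on."""
    done = 0
    out = []
    for t in arrivals_ms:
        done = max(done, t) + proc  # B must wait until A is fully handled
        out.append(done)
    return out

burst = [0] * 10                  # ten updates arriving back-to-back
hop1 = forwarding_times(burst)
hop2 = forwarding_times(hop1)     # the next router sees the serialized stream

print(hop1[-1])  # prints 500: the last update leaves hop 1 after 500 ms
print(hop2[-1])  # prints 550: each later hop adds only one processing time
```

Note what the model shows: the burst pays the full serialization penalty at the first hop, and once smeared out in time, each subsequent hop adds only a single per-update processing delay, which is exactly why per-update processing cost (not bandwidth) dominates the real-time consumed.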

The consumption of real-time has two consequences: bad and worse. The bad one
is that after a major topology change, it can take the routing "a while" to
recover. The Renesys NANOG presentation on the Taiwan quake seemed to
indicate that "a while" is on the order of an hour or so. (However, it's not
clear if that was the time it took BGP to recover, or the time needed to
reroute some circuits. The Renesys presentation also covers much longer-term
outages, but those were caused by the need to set up new business
relationships - not really a routing problem!) But even if it was partially
circuit reconfiguration, still, I notice that whenever something on the East
Coast changes, my SSH connection from Virginia to MIT inevitably times out
before the routing stabilizes again, and it would be nice if it adjusted
faster than that - there's no reason it should be taking so long.

The worse one is that if the line for "time to adapt to a topology change"
(which goes up as the network gets bigger) crosses the line for "average
inter-event time between topology changes" (which goes down as the network
gets bigger), you can get serious instability.
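The crossing argument can be put in toy numbers (both constants below are invented purely to make the curves concrete; only the shapes matter): assume adaptation time grows linearly with network size n, while the mean interval between topology changes shrinks like 1/n, because n independently-failing links generate events n times as fast.

```python
# Toy model of the two curves crossing; all constants are assumptions.
ADAPT_PER_NODE = 0.01        # seconds of convergence time per node (made up)
EVENT_RATE_PER_LINK = 1e-5   # topology changes per second per link (made up)

def adaptation_time(n):
    """Time to re-stabilize after a change: grows with network size."""
    return ADAPT_PER_NODE * n

def inter_event_time(n):
    """Mean time between topology changes: shrinks with network size."""
    return 1.0 / (EVENT_RATE_PER_LINK * n)

# Instability threshold: solve ADAPT_PER_NODE * n = 1 / (EVENT_RATE_PER_LINK * n),
# i.e. n = sqrt(1 / (ADAPT_PER_NODE * EVENT_RATE_PER_LINK)).
critical_n = (1.0 / (ADAPT_PER_NODE * EVENT_RATE_PER_LINK)) ** 0.5
print(round(critical_n))  # prints 3162 under these made-up constants
```

Past that crossing point, a new change arrives, on average, before the previous one has finished propagating, so the network never fully converges - that's the "serious instability" case.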

So we need to characterize use of these resources (memory, processing power,
etc) not just for the static case (when the net is stable), but *especially*
for the dynamic case (when something changes), because that's what *really*
stresses the routing.


Which is a good place to stop and point out that FIB size, in the static
case, is a problem, but it's an easy one to i) understand (which is no doubt
a good part of why it generates so much commentary), and ii) solve - because
the solutions are *local* - i.e. inside each router, without regard to
what's going on at its neighbours.

Stability and stabilization-time are *system* problems, where no one node can
really do anything about them - and, if we *do* run into problems, it will be
difficult, and slow, to deploy an effective fix. So that's why I think the
dynamic case (i.e. response to topology changes) is much more important to
look at.

	Noel
