
Re: [RRG] Geoff Huston's article on BGP stability, update statistics and damping



    > From: Geoff Huston <gih@apnic.net>

    >> First, fixing this would be a *one-time* improvement in the growth
    >> curve for the update rate (versus long-term time). I.e. you'd see a
    >> flat spot (or a drop) in the curve, after which it would resume its
    >> previous growth.

    >> Second, 25% is not peanuts, but it's also not an order of magnitude
    >> ... I don't know what the current growth rate for BGP updates is, but
    >> if it's anything like the growth rate of the table itself .. 25% is
    >> not a long calendar time.

    > I'm not sure that I agree with either of your assertions

I'm a little surprised to hear that, because I thought the rationale behind
them was pretty non-contentious. Let me lay it out in a bit more detail, and
you can point out where you think I've missed something. (And please excuse
the level of detail; I'm not being snarky, I want to be able to see exactly
which step isn't justified.)


As I said, I don't know what the current growth rate for the BGP *update* rate
is (and see comment below about data), but I would assume that it is related
to the growth rate of the network (and therefore the table) itself. Perhaps
not a straight linear relationship, e.g. because as the network grows, paths
get longer, so there's a higher chance that a given connectivity change will
impact the path to a given destination. However, I think the relationship can
hardly be sub-linear (i.e. slower growth in the update rate than in the table
size).

Breaking it down a bit more, if the average rate of connectivity change (i.e.
crash/failure/etc), for each type of individual element (switch, link, etc),
is basically constant over time (and yeah, there will probably be
improvements, but I don't know how fast), then the overall connectivity change
rate, for the network as a whole, will scale linearly with the growth of the
network. I would also assume that the rate of update generation has some
reasonably simple relationship to the rate of topology change. That would mean
that the derivatives of all three (i.e. the growth rate of the network size,
the growth rate of the connectivity change rate, and the growth rate of the
rate at which updates are generated) also have some simple (i.e. linear)
relationship.

In other words, the long-term graph of the update rate would have the same
shape as the long-term graph of the table size - and we know what that curve
looks like.
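
As a rough illustration of the chain of assumptions above, here is a minimal
sketch in Python. The function name, the network sizes and the two constants
(failures per element per day, updates per connectivity change) are all
hypothetical, picked only to show the shape of the argument, not measured
values.

```python
import math

def update_rate(num_elements, failures_per_element_per_day, updates_per_change):
    """Updates per day, assuming a constant per-element failure rate and a
    linear relationship between connectivity changes and updates."""
    changes_per_day = num_elements * failures_per_element_per_day
    return changes_per_day * updates_per_change

# Hypothetical network sizes at three points in time.
sizes = [100_000, 200_000, 400_000]
rates = [update_rate(n, 0.01, 5.0) for n in sizes]

# Period-over-period growth factors for the size and for the update rate.
growth_in_size = [b / a for a, b in zip(sizes, sizes[1:])]
growth_in_rate = [b / a for a, b in zip(rates, rates[1:])]

# Under these assumptions the two curves have the same shape.
same_shape = all(math.isclose(s, r) for s, r in zip(growth_in_size, growth_in_rate))
print(growth_in_size, growth_in_rate, same_shape)
```

If path-length or interconnection effects make the relationship super-linear,
the update rate would grow faster than this sketch suggests, not slower.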

From there, it's a simple step to the first point. Just as a one-time
reduction in the number of destinations historically produced a flat spot (or
a decline) in the size of the routing table, after which it resumed its
inexorable and painful growth, we'd see the same kind of shape in this graph.

Similarly, for the second, just as with the fairly high growth rate of the
overall routing table, a one-time reduction of 25% in routing table size
would only buy us a limited amount of time there, so too a 25% reduction in
update rate would only buy us a limited amount of time on the update rate
growth.
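
To put a number on "a limited amount of time", here is a small sketch that
assumes the update rate grows exponentially at a fixed annual rate. The growth
rates tried below are hypothetical, since the actual figure is not known here,
and the function name is just for illustration.

```python
import math

def years_bought(reduction, annual_growth):
    """Years until steady exponential growth erases a one-time fractional
    reduction, e.g. reduction=0.25 for a 25% cut."""
    if not 0 < reduction < 1:
        raise ValueError("reduction must be strictly between 0 and 1")
    if annual_growth <= 0:
        raise ValueError("annual_growth must be positive")
    # Solve (1 - reduction) * (1 + annual_growth) ** t == 1 for t.
    return math.log(1 / (1 - reduction)) / math.log(1 + annual_growth)

# Hypothetical annual growth rates for the update rate.
for g in (0.10, 0.25, 0.50):
    print(f"growth {g:.0%}/yr: a 25% cut buys {years_bought(0.25, g):.1f} years")
```

Under these assumptions a 25% cut buys roughly three years at 10% annual
growth, and well under a year at 50%.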


    > the basic reason is a lack of an analysis of this over a long term over
    > a number of years (the data is there, but someone has to set up the
    > computational run across it).

I couldn't agree more completely. The above line of reasoning seems fairly,
well, reasonable, but it might be wrong, and real data beats theorizing any
day.


    > My intuition says that as the degree of interconnection in the inter-AS
    > cloud increases, then the BGP "amplification" of underlying events
    > increases

Not just degree of interconnection; as I mentioned above, as the network
grows, the average path length grows, so intuitively (and maybe I'm wrong
here, but I think this is likely) there's a slow growth in the likelihood
that any particular connectivity change will impact any given path.

Of course, in actuality average path lengths in the Internet aren't growing
as fast as they would in a random graph which is growing at the rate the
Internet is, but to achieve that we are seeing higher connectivity levels,
and that also has a potential impact, as you point out. So Mother Nature gets
you one way or the other!

    > and the same set of underlying events could be the cause of 26%,
    > 27%,... etc of all updates in the future.

I agree this is quite possibly true; but even if true, I don't think it
necessarily significantly conflicts with my points. Yes, if the share of
updates which are caused by this effect is slowly increasing, then getting
rid of them will not just produce a flat spot in the graph of the update
rate, but also slightly change the shape of the overall curve.

However, if the effect (numerically) is in the range you suggest, I don't
think it would make a *substantial* difference to the *overall shape* of the
curve, in the long term - that was my real, basic, point. Now, if it were
going to jump from 25% to 30%, 40%, 60%, then, yes, it would make a
significant difference.
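
A quick sketch of why the size of that drift matters. It assumes the removable
updates are a fraction of the total that moves from one value to another over
some period, while the total grows by a given factor. The factor of 2 and the
function name are hypothetical; the shares are the ones discussed above.

```python
def residual_growth(total_growth, share_start, share_end):
    """Growth factor of the updates left over once the removable share is
    taken out, given the growth factor of the total over the same period."""
    return total_growth * (1 - share_end) / (1 - share_start)

# Hypothetical period in which the total update rate doubles.
total_growth = 2.0

# Removable share drifting from 25% to 27%: the residual curve barely moves.
small_drift = residual_growth(total_growth, 0.25, 0.27)

# Removable share jumping from 25% to 60%: the residual curve nearly flattens.
large_drift = residual_growth(total_growth, 0.25, 0.60)

print(f"25% -> 27%: residual grows by a factor of {small_drift:.2f}")
print(f"25% -> 60%: residual grows by a factor of {large_drift:.2f}")
```

With a small drift the residual still nearly doubles; only a large shift in
the share changes the trajectory much.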


    > the question in my mind is: are the dynamics we've been seeing in
    > update load an artifact of the increasing size of the internet, the
    > increasing interconnection of the Internet, the increasing level of
    > policy diversity within the Internet or all three

I would say all three, but without data, I doubt we can say much about the
relative contribution of each, which is of course what we'd like to know.

    > or do all three growth elements tend to interact with the rather chatty
    > way in which BGP undertakes convergence to create a long term traffic
    > trend?

I think you're asking if there's a synergistic effect on update rate growth
among the three? Could be, but we'd need even more detailed analysis to know,
I suspect.

    > If this is the case and if we understand that we really cannot change
    > the dynamics of the first three elements, then what precisely is the
    > nature of the interaction with BGP, and is it possible that by altering
    > BGP update propagation behaviors is it possible to shift the BGP load
    > onto a different trajectory?

Right, but a trajectory that's only slightly different doesn't really help.
And without a more detailed understanding, not just of the basic curve itself
(which we don't know), but of how changes might affect it, we can't say much
definitively...


    > Your message says to me: "no, a modification of BGP creates a short
    > term reduction in update load, but the underlying growth factors are
    > independent of this." I am not sure that I agree with this proposition

Well, but looking at those growth factors (i.e. i) the increasing size of the
Internet, with its subsidiary ia) the growth in average path length; ii) the
increasing interconnection of the Internet - which does mitigate ia; iii) the
increasing level of policy diversity within the Internet), only one of them,
the policy stuff, can possibly have any connection with BGP.

In other words, to the extent that the growth curve is being shaped by the
other three (and I think I made a reasonable argument, at the top, that they
are fairly directly related to the growth in the update rate), we can't do
much to change it. I.e. even if we somehow dial out iii), the others will
combine to produce a growth curve which is likely to be problematic.

Now, I *can* think of ways to change the routing overhead, but they imply
more radical changes to the existing system (e.g. more abstraction, allowing
us to reduce the size of the routing tables).


    > more data and more experimentation is probably a good way to understand
    > this a little better.

Again, no disagreement with that! Real data is always better.

	Noel
