I've been mulling this over for a long time now, and I think we may
be trying to solve a number of problems at the wrong layer.
The routing problems we're trying to solve seem to stem primarily from
an inability to perform aggregation of routing information. There are
a number of causes for this, including:
- historical allocation policies (the swamp).
- inappropriate information hiding in BGP.
- multihoming.
- traffic engineering.
The first is not a serious problem, though it may become one when we run
out of IPv4 addresses. The second is one that RRG might take a
serious look at. I've made one attempt at this with HLP, but other
solutions are certainly possible too, even without replacing BGP. But
the current concerns mostly seem to stem from multihoming and traffic
engineering.
I believe that if we are ever to make routing scale a great deal
better, we should not be attempting to solve these last two problems
primarily within the routing system. Clearly if you present routing
people with a problem, we'll come up with routing solutions, but if
you take a broader view, we can probably do much, much better (Frank
Kelly deserves credit for the original idea I'm going to suggest). I'm
slowly writing a more detailed document discussing what I think the
solution space should be, but I'll try and give the general idea with
the simple example of a site that is dual homed to two ISPs.
Current practice attempts to hide this dual homing behind a single IP
address for each host. This then requires a long prefix to be
advertised via both ISPs, with appropriate AS prepending to balance
traffic. If either edge-link goes down, on average half the Internet
gets to process the routing update. Worse, this doesn't do a great
job of load balancing, so prefix splitting is sometimes performed to
better balance the load, resulting in even more stress on the global
routing system. In short, attempts at improving local robustness
create global stresses, and potentially global fragility, which is the
problem we're all concerned with.
So, what happens if we stop trying to hide the multihoming? Take a
server at this multi-homed site and give it two IP addresses, one from
each provider's aggregated prefix. Now we modify TCP to use both
addresses *simultaneously* - this isn't the same as SCTP, which
switches between the two. The client sets up a connection to one
address, but in the handshake learns about the other address too. Now
it runs two congestion control loops, one with each of the server's IP
addresses. Packets are shared between the two addresses by the two
congestion control loops - if one congestion-controlled path goes
twice as fast as the other, twice as many packets go that way.
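To make that concrete, here is a rough Python sketch of the idea (the
class and names are mine, purely illustrative, not a protocol spec):

  # One congestion window per server address; each new packet goes out on
  # whichever path currently has the most spare window, so a faster path
  # naturally carries proportionally more packets.

  class Path(object):
      def __init__(self, name, rtt):
          self.name = name        # e.g. the server address learned via ISP A
          self.rtt = rtt          # smoothed round-trip time; rate ~ cwnd / rtt
          self.cwnd = 1.0         # congestion window, in packets
          self.in_flight = 0.0    # packets sent but not yet acknowledged

      def spare_window(self):
          return self.cwnd - self.in_flight

      def on_ack(self):
          self.cwnd += 1.0 / self.cwnd         # additive increase, per path
          self.in_flight -= 1

      def on_loss(self):
          self.cwnd = max(1.0, self.cwnd / 2)  # halve only the lossy path

  def pick_path(paths):
      # If one loop runs twice as fast as the other, roughly twice as many
      # packets end up on that path.
      candidates = [p for p in paths if p.spare_window() > 0]
      if not candidates:
          return None             # both windows full: wait for an ACK
      return max(candidates, key=lambda p: p.spare_window())

  a = Path("server-address-via-ISP-A", rtt=0.02)
  b = Path("server-address-via-ISP-B", rtt=0.02)
  print(pick_path([a, b]).name)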
OK, so what is the emergent behaviour? The traffic load-balances itself
across the two links. If one link becomes congested, the remaining
traffic moves to the other link automatically. This is quite unlike
conventional congestion control, which merely spreads the traffic out
in time - this actually moves the traffic away from the congested path
towards the uncongested path. Traffic engineering in this sort of
scenario just falls out for free without needing to involve routing at
all. And more advanced traffic engineering is possible using local
rate-limiting on one path to move traffic away from that link towards
the other. Again, this falls out without stressing routing.
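A toy simulation makes the point (the numbers are mine and purely
illustrative): give each path its own AIMD loop, raise the loss rate
on one path, and the traffic share migrates to the other path with no
routing change at all.

  import random

  def traffic_share(loss_a, loss_b, rounds=20000):
      # One AIMD loop per path; the amount carried on each path is roughly
      # proportional to its congestion window over time.
      cwnd = {"via-ISP-A": 1.0, "via-ISP-B": 1.0}
      loss = {"via-ISP-A": loss_a, "via-ISP-B": loss_b}
      carried = {"via-ISP-A": 0.0, "via-ISP-B": 0.0}
      for _ in range(rounds):
          for path in cwnd:
              if random.random() < loss[path]:
                  cwnd[path] = max(1.0, cwnd[path] / 2)  # loss: halve the window
              else:
                  cwnd[path] += 1.0 / cwnd[path]         # ack: additive increase
              carried[path] += cwnd[path]
      total = sum(carried.values())
      return dict((p, round(carried[p] / total, 2)) for p in carried)

  print(traffic_share(0.01, 0.01))  # equal congestion: roughly a 50/50 split
  print(traffic_share(0.05, 0.01))  # congest the link via ISP A: traffic shifts to B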
Now, there's quite a bit more to it than this (for example, it's great
for mobile devices that want to use multiple radios simultaneously),
but there are also still quite a lot of unanswered questions. For
example, how far does this go towards solving backbone traffic
engineering problems? The theory suggests it might. I'm working on a
document that
discusses these issues in more depth. But I think the general idea
should be clear - with backwards-compatible changes to the transport
layer and using multiple aggregatable IP addresses for each
multi-homed system, we ought to be able to remove some of the main
drivers of routing stress from the Internet. That would then leave us
to tackle the real routing issues in the routing protocols.
I hope this makes some sort of sense,
Mark
--- Begin Message ---
To: ram@iab.org
Subject: Traffic engineering and the network model
From: Elwyn Davies <elwynd@dial.pipex.com>
Date: Thu, 21 Dec 2006 17:29:00 +0000
A while back Pekka Nikander wrote:

> Noel Chiappa wrote:
>> When I look at what the problem seems to be ..., it seems to be
>> growth in the size of the routing table (with dynamics of routing
>> table entries coming in second). The two main drivers of growth seem
>> to be i) multi-homing, and ii) general entropy and lack of
>> aggregation. Is this correct?
>
> I've been told that traffic engineering is another major factor.
> However, I have to admit not understanding what people mean with it.
> (A good reference would help.)
Networks use traffic engineering to maximise the utility (in the
economic sense) of their network assets. To do this they have to limit
the accepted portion of the offered traffic and direct the accepted
traffic towards its intended destination subject to a number of
constraints, especially:
- Capacity and availability of plant (routers, links etc)
- Policies (business, regulatory, political, etc.)
- Contractual (customer SLAs, transit agreements, etc)
It was clear from the IAB Routing and Addressing workshop, and from
other discussions I have had, that traffic engineering is very
important to network operators. Some of the ways in which it is done
exploit the capabilities of the routing system (deliberate deaggregation
of prefixes in particular), so traffic engineering contributes
significantly to the growth of the core routing tables.
Traffic engineering only becomes 'interesting' if there are multiple,
comparably effective paths between many pairs of points in the network:
traffic engineering on a simple network where there is exactly one path
between a pair of (end-)points degenerates into limiting the accepted
traffic to the capacity of the bottleneck link - the classical model for
which TCP is optimized!
The core network today is increasingly 'meshy' for economic and
robustness reasons, and the desirable but inexorable growth of traffic
means that link capacity is increasingly well- or over-used.
Accordingly, traffic engineering is an essential tool for network owners
and operators.
Observation 1: Multihoming is a form of traffic engineering.
On the architecture discussion list there was some limited discussion
of the model we use for the network in the core. I would
like to take this a bit further and see how the network model fits with
the classical and current reality, and how it interacts with
multihoming/traffic engineering.
Application View:
=================
Seen from the point of view of an application running on a node, the
network (internet) layer provides, at its most abstract:
- a path for packets from 'here-to-anywhere' which is not necessarily
reliable
- unconstrained capacity for delivery of packets
There are no assumptions about how the network implements the path and
there are no assumptions about constraints on the size of the path. In
particular
- there is no assumption about uniqueness of path (in time or topology)
- there is no specific guarantee of ordered delivery
Aside: At this level, the distinction between a 'user' application
running on an end-point and the 'routing application' running at
intermediate points is that the user application expects symmetric
'anywhere-to-here' connectivity whereas the routing application doesn't.
Interestingly, the transport layer may make additional assumptions about
and requirements on the internet layer that are not specifically
provided by the classical model of the internet layer:
- the transport layer (e.g., TCP) may assume that the network is of
limited capacity
- the transport layer may make relatively strong assumptions about
ordered delivery
The constraints applied by the transport layer reflect the classical
model of Internet routing where there was a unique best path which had a
bottleneck segment that might change relatively slowly with time, but
would generally be stable during the lifetime of most application flows.
The development of the modern Internet with multihoming and traffic
engineering across multiple equally capable paths appears to be well
matched to the basic assumptions (no assumption of unique path,
unconstrained capacity) on the internet layer but struggles to meet the
constraints applied by the transport layer.
Routing View
============
The routing system in the classical model:
- assumes there is a single best path from here-to-anywhere at any given
time
- does not worry about capacity constraints
Adapting the routing system to the newer model and the real world has
required a number of hacks such as the TE extensions (constraining the
single best path due to capacity limits) and ECMP, which allows some
utilization of multiple parallel paths. Overall this has not been very
satisfactory and it is certainly not an architecturally pure solution.
Moreover, the distribution of traffic across multiple paths by ECMP
typically relies on hashing the whole of the destination and maybe the
source addresses and other info in the packet to ensure that the
transport layer constraints on ordering are met. In many cases this
splitting is very fine grained and relies on using more information than
is required for routing. This may limit our ability to do information
hiding in the routing system.
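As a rough illustration (not any particular vendor's implementation),
per-flow ECMP selection amounts to something like this:

  import hashlib

  def ecmp_next_hop(src_ip, dst_ip, proto, src_port, dst_port, next_hops):
      # Hash the full 5-tuple so every packet of one flow takes the same
      # next hop (preserving the transport layer's ordering assumption),
      # even though only the destination prefix is needed for routing.
      key = ("%s|%s|%d|%d|%d" %
             (src_ip, dst_ip, proto, src_port, dst_port)).encode()
      digest = hashlib.sha1(key).digest()
      return next_hops[int.from_bytes(digest[:4], "big") % len(next_hops)]

  # Two equal-cost paths: a given TCP flow sticks to one of them, while
  # different flows spread across both.
  paths = ["interface-to-peer-1", "interface-to-peer-2"]
  print(ecmp_next_hop("192.0.2.10", "198.51.100.7", 6, 51234, 80, paths))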
The deliberate disaggregation of prefixes is a response to the need to
force the routing system to distribute traffic over multiple paths
leading to the same destination when the purpose-built tools (TE
extensions, ECMP) prove inadequate.
So what...
==========
Observation 2: The routing system today is not a very good match for the
reality of the network model today. This is partly caused by a mismatch
between the assumptions at the internet layer and the transport layer,
and partly by the development of meshy networks.
In today's meshy network, it is not just multihomed end sites that see
multiple useful routes from here-to-anywhere. Many routers in the core
network could, if the routing system allowed it, also see multiple
useful routes. I would therefore claim that traffic engineering and
multihoming share a common problem that does not just manifest itself
at the edge of the network.
Observation 3: A good solution to the routing (scalability) problem
needs to embrace the availability of (and need to use) multiple routes
between both edge and interior points in the network.
The id/locator split solution which we have been looking at attempts to
hide this situation at one particular locus in the network (somewhere
between the end host and the first provider edge), rather than solving
it generally. It would presumably leave the traffic engineering problem
in the core to be handled by the existing routing-based techniques.
In essence, unlike today's network, where the internet layer uses
reasonably uniform routing techniques throughout and crafts traffic
engineering/multihoming solutions from the same tools everywhere, the
network would be partitioned, with the edge and core using different
techniques to achieve the required traffic engineering.
Observation 4: Adopting multiple different solutions to the
multihoming/traffic engineering problems at different places in the
network is likely to lead to interactions between the solutions.
One area of interaction that I can see is the need to extract
information from inner headers in core routers to correctly execute ECMP
and other classification schemes if a form of encapsulation is used.
This will increase the processing and memory bandwidth burden on core
routers, particularly during the transition.
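As a sketch of that interaction (hypothetical packet layout, assuming
a simple IP-in-IP style map-and-encap scheme), consider what the ECMP
hash has to look at once the traffic is encapsulated:

  def flow_key(pkt, look_inside):
      # The outer header is the same for all traffic between two tunnel
      # endpoints; only the inner header distinguishes individual flows.
      outer = (pkt["outer_src"], pkt["outer_dst"])
      if not look_inside:
          return outer              # every tunnelled flow collapses to one key
      inner = pkt["inner"]
      return outer + (inner["src"], inner["dst"],
                      inner["proto"], inner["sport"], inner["dport"])

  pkt = {
      "outer_src": "203.0.113.1", "outer_dst": "203.0.113.9",  # tunnel endpoints
      "inner": {"src": "10.1.1.5", "dst": "10.2.2.8",
                "proto": 6, "sport": 40000, "dport": 443},
  }
  print(flow_key(pkt, look_inside=False))  # one ECMP bucket for the whole tunnel
  print(flow_key(pkt, look_inside=True))   # per-flow spreading, at extra parsing cost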
Observation 5: Disguising a problem rather than solving it will likely
lead to needing a more complex solution in the future.
There is a significant risk that network growth and increases in core
connectivity will lead to the routing table size problem reasserting
itself due to the use of complex traffic engineering. The problem would
thus be postponed rather than solved, and the result might be the need
to combine solutions rather than using a common one.
Conclusion:
===========
Whilst it is highly likely that the id/loc split is a good idea, we
shouldn't assume that it is a panacea for the multiple path problem. A
uniform routing solution which manages the existence of multiple paths
may well constrain the growth of the routing tables better than multiple
partial solutions and provide a solution to the multihoming aspects of
the core and edge problems.
In the longer term we need to look at the assumptions of the network
model we are using and determine if we can modify them to make the
multiple path problem more tractable.
--- End Message ---