I've been mulling this over for a long time now, and I think we may
be trying to solve a number of problems at the wrong layer.
The routing problems we're trying to solve seem to stem primarily from
an inability to perform aggregation of routing information. There are
a number of causes for this, including:
- historical allocation policies (the swamp).
- inappropriate information hiding in BGP.
- multihoming.
- traffic engineering.
The first is not a serious problem, though it may become one when we run
out of IPv4 addresses. The second is one that RRG might take a
serious look at. I've made one attempt at this with HLP, but other
solutions are certainly possible too, even without replacing BGP. But
the current concerns mostly seem to stem from multihoming and traffic
engineering.
I believe that if we are ever to make routing scale a great deal
better, we should not be attempting to solve these last two problems
primarily within the routing system. Clearly if you present routing
people with a problem, we'll come up with routing solutions, but if
you take a broader view, we can probably do much, much better (Frank
Kelly deserves credit for the original idea I'm going to suggest). I'm
slowly writing a more detailed document discussing what I think the
solution space should be, but I'll try and give the general idea with
the simple example of a site that is dual homed to two ISPs.
Current practice attempts to hide this dual homing behind a single IP
address for each host. This then requires a long prefix to be
advertised via both ISPs, with appropriate AS prepending to balance
traffic. If either edge-link goes down, on average half the Internet
gets to process the routing update. Worse, this doesn't do a great
job of load balancing, so prefix splitting is sometimes performed to
better balance the load, resulting in even more stress on the global
routing system. In short, attempts at improving local robustness
create global stresses, and potentially global fragility, which is the
problem we're all concerned with.
So, what happens if we stop trying to hide the multihoming? Take a
server at this multi-homed site and give it two IP addresses, one from
each provider's aggregated prefix. Now we modify TCP to use both
addresses *simultaneously* - this isn't the same as SCTP, which
switches between the two. The client sets up a connection to one
address, but in the handshake learns about the other address too. Now
it runs two congestion control loops, one with each of the server's IP
addresses. Packets are shared between the two addresses by the two
congestion control loops - if one congestion-controlled path goes
twice as fast as the other, twice as many packets go that way.
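To make that concrete, here is a rough Python sketch of the idea (the
class and names are mine, purely illustrative, not a protocol spec):

  # One congestion window per server address; each new packet goes out on
  # whichever path currently has the most spare window, so a faster path
  # naturally carries proportionally more packets.

  class Path(object):
      def __init__(self, name, rtt):
          self.name = name        # e.g. the server address learned via ISP A
          self.rtt = rtt          # smoothed round-trip time; rate ~ cwnd / rtt
          self.cwnd = 1.0         # congestion window, in packets
          self.in_flight = 0.0    # packets sent but not yet acknowledged

      def spare_window(self):
          return self.cwnd - self.in_flight

      def on_ack(self):
          self.cwnd += 1.0 / self.cwnd         # additive increase, per path
          self.in_flight -= 1

      def on_loss(self):
          self.cwnd = max(1.0, self.cwnd / 2)  # halve only the lossy path

  def pick_path(paths):
      # If one loop runs twice as fast as the other, roughly twice as many
      # packets end up on that path.
      candidates = [p for p in paths if p.spare_window() > 0]
      if not candidates:
          return None             # both windows full: wait for an ACK
      return max(candidates, key=lambda p: p.spare_window())

  a = Path("server-address-via-ISP-A", rtt=0.02)
  b = Path("server-address-via-ISP-B", rtt=0.02)
  print(pick_path([a, b]).name)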
OK, so what is the emergent behaviour? The traffic load-balances itself
across the two links. If one link becomes congested, the remaining
traffic moves to the other link automatically. This is quite unlike
conventional congestion control, which merely spreads the traffic out
in time - this actually moves the traffic away from the congested path
towards the uncongested path. Traffic engineering in this sort of
scenario just falls out for free without needing to involve routing at
all. And more advanced traffic engineering is possible using local
rate-limiting on one path to move traffic away from that link towards
the other. Again, this falls out without stressing routing.
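A toy simulation makes the point (the numbers are mine and purely
illustrative): give each path its own AIMD loop, raise the loss rate
on one path, and the traffic share migrates to the other path with no
routing change at all.

  import random

  def traffic_share(loss_a, loss_b, rounds=20000):
      # One AIMD loop per path; the amount carried on each path is roughly
      # proportional to its congestion window over time.
      cwnd = {"via-ISP-A": 1.0, "via-ISP-B": 1.0}
      loss = {"via-ISP-A": loss_a, "via-ISP-B": loss_b}
      carried = {"via-ISP-A": 0.0, "via-ISP-B": 0.0}
      for _ in range(rounds):
          for path in cwnd:
              if random.random() < loss[path]:
                  cwnd[path] = max(1.0, cwnd[path] / 2)  # loss: halve the window
              else:
                  cwnd[path] += 1.0 / cwnd[path]         # ack: additive increase
              carried[path] += cwnd[path]
      total = sum(carried.values())
      return dict((p, round(carried[p] / total, 2)) for p in carried)

  print(traffic_share(0.01, 0.01))  # equal congestion: roughly a 50/50 split
  print(traffic_share(0.05, 0.01))  # congest the link via ISP A: traffic shifts to B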
Now, there's quite a bit more to it than this (for example, it's great
for mobile devices that want to use multiple radios simultaneously),
but there are also still quite a lot of unanswered questions. For
example, how far does this go towards solving backbone traffic
engineering problems? The theory suggests it might. I'm working on a
document that
discusses these issues in more depth. But I think the general idea
should be clear - with backwards-compatible changes to the transport
layer and using multiple aggregatable IP addresses for each
multi-homed system, we ought to be able to remove some of the main
drivers of routing stress from the Internet. That would then leave us
to tackle the real routing issues in the routing protocols.
I hope this makes some sort of sense,
Mark
--- Begin Message ---
To: ram@iab.org
Subject: Traffic engineering and the network model
From: Elwyn Davies <elwynd@dial.pipex.com>
Date: Thu, 21 Dec 2006 17:29:00 +0000
A while back Pekka Nikander wrote:

> Noel Chiappa wrote:
>> When I look at what the problem seems to be ..., it seems to be
>> growth in the size of the routing table (with dynamics of routing
>> table entries coming in second). The two main drivers of growth seem
>> to be i) multi-homing, and ii) general entropy and lack of
>> aggregation. Is this correct?
>
> I've been told that traffic engineering is another major factor.
> However, I have to admit not understanding what people mean with it.
> (A good reference would help.)
Networks use traffic engineering to maximise the utility (in the
economic sense) of their network assets. To do this they have to limit
the accepted portion of the offered traffic and direct the accepted
traffic towards its intended destination subject to a number of
constraints, especially:
- Capacity and availability of plant (routers, links etc)
- Policies (business, regulatory, political, etc.)
- Contractual (customer SLAs, transit agreements, etc)
It was clear from the IAB Routing and Addressing workshop, and from
other discussions I have had, that traffic engineering is very
important to network operators. Some of the ways in which it is done
exploit the capabilities of the routing system (deliberate deaggregation
of prefixes in particular), so traffic engineering contributes
significantly to the growth of the core routing tables.
Traffic engineering only becomes 'interesting' if there are multiple,
comparably effective paths between many pairs of points in the network:
traffic engineering on a simple network where there is exactly one path
between a pair of (end-)points degenerates into limiting the accepted
traffic to the capacity of the bottleneck link - the classical model for
which TCP is optimized!
The core network today is increasingly 'meshy' for economic and
robustness reasons, and the desirable but inexorable growth of traffic
means that link capacity is increasingly well- or over-used.
Accordingly, traffic engineering is an essential tool for network owners
and operators.
Observation 1: Multihoming is a form of traffic engineering.
On the architecture discussion list there was some limited discussion
of the model we use for the network in the core. I would
like to take this a bit further and see how the network model fits with
the classical and current reality, and how it interacts with
multihoming/traffic engineering.
Application View:
=================
Seen from the point of view of an application running on a node, the
network (internet) layer provides, at its most abstract:
- a path for packets from 'here-to-anywhere' which is not necessarily
reliable
- unconstrained capacity for delivery of packets
There are no assumptions about how the network implements the path and
there are no assumptions about constraints on the size of the path. In
particular
- there is no assumption about uniqueness of path (in time or topology)
- there is no specific guarantee of ordered delivery
Aside: At this level, the distinction between a 'user' application
running on an end-point and the 'routing application' running at
intermediate points is that the user application expects symmetric
'anywhere-to-here' connectivity whereas the routing application doesn't.
Interestingly, the transport layer may make additional assumptions about
and requirements on the internet layer that are not specifically
provided by the classical model of the internet layer:
- the transport layer (e.g., TCP) may assume that the network is of
limited capacity
- the transport layer may make relatively strong assumptions about
ordered delivery
The constraints applied by the transport layer reflect the classical
model of Internet routing where there was a unique best path which had a
bottleneck segment that might change relatively slowly with time, but
would generally be stable during the lifetime of most application flows.
The development of the modern Internet with multihoming and traffic
engineering across multiple equally capable paths appears to be well
matched to the basic assumptions (no assumption of unique path,
unconstrained capacity) on the internet layer but struggles to meet the
constraints applied by the transport layer.
Routing View
============
The routing system in the classical model:
- assumes there is a single best path from here-to-anywhere at any given
time
- does not worry about capacity constraints
Adapting the routing system to the newer model and the real world has
required a number of hacks such as the TE extensions (constraining the
single best path due to capacity limits) and ECMP, which allows some
utilization of multiple parallel paths. Overall this has not been very
satisfactory and it is certainly not an architecturally pure solution.
Moreover, the distribution of traffic across multiple paths by ECMP
typically relies on hashing the whole of the destination and maybe the
source addresses and other info in the packet to ensure that the
transport layer constraints on ordering are met. In many cases this
splitting is very fine grained and relies on using more information than
is required for routing. This may limit our ability to do information
hiding in the routing system.
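As a rough illustration (not any particular vendor's implementation),
per-flow ECMP selection amounts to something like this:

  import hashlib

  def ecmp_next_hop(src_ip, dst_ip, proto, src_port, dst_port, next_hops):
      # Hash the full 5-tuple so every packet of one flow takes the same
      # next hop (preserving the transport layer's ordering assumption),
      # even though only the destination prefix is needed for routing.
      key = ("%s|%s|%d|%d|%d" %
             (src_ip, dst_ip, proto, src_port, dst_port)).encode()
      digest = hashlib.sha1(key).digest()
      return next_hops[int.from_bytes(digest[:4], "big") % len(next_hops)]

  # Two equal-cost paths: a given TCP flow sticks to one of them, while
  # different flows spread across both.
  paths = ["interface-to-peer-1", "interface-to-peer-2"]
  print(ecmp_next_hop("192.0.2.10", "198.51.100.7", 6, 51234, 80, paths))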
The deliberate disaggregation of prefixes is a response to the need to
force the routing system to distribute traffic over multiple paths
leading to the same destination when the purpose-built tools (TE
extensions, ECMP) prove inadequate.
So what...
==========
Observation 2: The routing system today is not a very good match for the
reality of the network model today. This is partly caused by a mismatch
between the assumptions at the internet layer and the transport layer,
and partly by the development of meshy networks.
In today's meshy network, it is not just multihomed end sites that see
multiple useful routes from here-to-anywhere. Many routers in the core
network could, if the routing system allowed it, also see multiple
useful routes. I would therefore claim that traffic engineering and
multihoming share a common problem that does not just manifest itself
at the edge of the network.
Observation 3: A good solution to the routing (scalability) problem
needs to embrace the availability of (and need to use) multiple routes
between both edge and interior points in the network.
The id/locator split solution which we have been looking at attempts to
hide this situation at one particular locus in the network (somewhere
between the end host and the first provider edge), rather than solving
it generally. It would presumably leave the traffic engineering problem
in the core to be handled by the existing routing-based techniques.
In essence, unlike today's network, where the internet layer uses
reasonably uniform routing techniques throughout and crafts traffic
engineering/multihoming solutions from the same tools everywhere, the
network would be partitioned, with the edge and core using different
techniques to achieve the required traffic engineering.
Observation 4: Adopting multiple different solutions to the
multihoming/traffic engineering problems at different places in the
network is likely to lead to interactions between the solutions.
One area of interaction that I can see is the need to extract
information from inner headers in core routers to correctly execute ECMP
and other classification schemes if a form of encapsulation is used.
This will increase the processing and memory bandwidth burden on core
routers, particularly during the transition.
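As a sketch of that interaction (hypothetical packet layout, assuming
a simple IP-in-IP style map-and-encap scheme), consider what the ECMP
hash has to look at once the traffic is encapsulated:

  def flow_key(pkt, look_inside):
      # The outer header is the same for all traffic between two tunnel
      # endpoints; only the inner header distinguishes individual flows.
      outer = (pkt["outer_src"], pkt["outer_dst"])
      if not look_inside:
          return outer              # every tunnelled flow collapses to one key
      inner = pkt["inner"]
      return outer + (inner["src"], inner["dst"],
                      inner["proto"], inner["sport"], inner["dport"])

  pkt = {
      "outer_src": "203.0.113.1", "outer_dst": "203.0.113.9",  # tunnel endpoints
      "inner": {"src": "10.1.1.5", "dst": "10.2.2.8",
                "proto": 6, "sport": 40000, "dport": 443},
  }
  print(flow_key(pkt, look_inside=False))  # one ECMP bucket for the whole tunnel
  print(flow_key(pkt, look_inside=True))   # per-flow spreading, at extra parsing cost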
Observation 5: Disguising a problem rather than solving it will likely
lead to needing a more complex solution in the future.
There is a significant risk that network growth and increases in core
connectivity will lead to the routing table size problem reasserting
itself due to the use of complex traffic engineering. The problem would
thus be postponed rather than solved, and the result might be the need
to combine solutions rather than using a common one.
Conclusion:
===========
Whilst it is highly likely that the id/loc split is a good idea, we
shouldn't assume that it is a panacea for the multiple path problem. A
uniform routing solution which manages the existence of multiple paths
may well constrain the growth of the routing tables better than multiple
partial solutions and provide a solution to the multihoming aspects of
the core and edge problems.
In the longer term we need to look at the assumptions of the network
model we are using and determine if we can modify them to make the
multiple path problem more tractable.
--- End Message ---