
Re: [RRG] comments on IVIP Conceptual Summary and Analysis documents



Short version:   Fast-push mapping changes for TE.

                 Who would run the "anycast ITR(s)"?

                 End users with PI space today converting it
                 to a MAB so Ivip can manage it as numerous
                 micronets.

                 End users in the future renting address space
                 within a MAB from an organisation which provides
                 the space, runs the ITRs for this MAB and conveys
                 the end-user's mapping changes to the global
                 fast-push system.

                 Why all packets addressed to micronet addresses
                 should go through an ITR and ETR, rather than
                 relying on (in the case of the ETR and destination
                 network being in the same ISP's network as the
                 sending host) the local routing system to achieve
                 the same outcome.

                 Why it is easier, better etc. to charge for
                 map-encap mapping changes than to charge for
                 changes to BGP advertisements.

                 Reasons ISPs would run ITRs in their own networks.

                 Interactions between the map-encap scheme and the
                 BGP system.



Hi Phil,

Further to my first response, which dealt with the question of how
many end-users could be served by each Mapped Address Block (MAB =
BGP advertised prefix sliced and diced into micronets), here are
some responses to your other questions.

> <Every micronet is mapped to a single ETR's address. Load sharing
> can be achieved as long as the load is spread over multiple IP
> addresses (or /64s), by making each one a separate micronet, and
> mapping each micronet to a different one of several ETRs.>
> 
> Today multihoming & route flapping are big causes of BGP updates
> - I think a good % come from sites that flap at whatever the max
> rate allowed is (partly this seems to be how some do load
> balancing /TE). How many updates can your global push system cope
> with? Does it assume that these are just for change of provider,
> or are TE updates also ok?

Pushing one more mapping change out to ITRDs and QSDs (full database
ITRs and Query Servers) will be a highly efficient process compared
to what happens with BGP when a router changes its advertisements.
With BGP, there is a ripple effect as routers compare notes with
their peers, change their mind about the best route, tell their
peers about it etc.  There is a lot of communication overhead,
protocol complexity, policy stuff, CPU work, memory etc, and the
information gets propagated as far as it needs to one hop at a time,
with various adjustments and eventually the whole system settling
down into a new state.

In Ivip, the multiple RUASes (Root Update Authorisation Systems)
work together with a bunch of distributed servers which launch
identical sets of mapping update packets into a "Replicator"
network.  Each Replicator receives two identical feeds (for
robustness against packet loss or link failure) and generates, for
instance, 20 feeds to downstream replicators.  This is a crosslinked
tree-like structure I need to do an illustration for, since it would
be tricky in ASCII art.
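To get a feel for the fan-out, here is a sketch (not the real
topology - the 20-feed figure is from the example above, and the
feed-sharing arithmetic is my simplification):

```python
# Sketch of Replicator tree fan-out.  Assumed parameters: each
# Replicator receives 2 redundant feeds and sends 20 downstream
# feeds, so the effective growth per level is 20 / 2 = 10x.
def levels_needed(targets, fanout=20, redundancy=2):
    """Levels of Replicators needed before the bottom level can feed
    `targets` ITRDs/QSDs, with every recipient getting `redundancy`
    identical feeds for robustness."""
    level, nodes = 0, 1
    growth = fanout // redundancy          # 10x growth per level
    while nodes * fanout // redundancy < targets:
        nodes *= growth
        level += 1
    return level + 1

# 200,000 full-database ITRs and Query Servers need only 6 levels.
print(levels_needed(200_000))  # -> 6
```

So even at the high end of "tens or hundreds of thousands" of
recipients, the tree stays shallow, which is what makes a few-second
global push plausible.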

The raw data for an Ivip IPv4 mapping change is only 12 bytes.  The
plan is to get it to all the ITRDs and QSDs within 5 seconds of the
end-user giving the map change command directly or indirectly to the
RUAS which manages the MAB in which the end-user's micronet(s) are
located.
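As an illustration of how little data is involved - the field layout
below is my assumption for the sketch, not a wire format from the
draft - a 12-byte IPv4 mapping change could be packed as micronet
start address, micronet length and new ETR address:

```python
import socket
import struct

def pack_update(start_ip, length, etr_ip):
    """Pack one hypothetical IPv4 mapping change into 12 bytes:
    micronet start address (4), micronet length in addresses (4),
    and the new ETR address (4), all network byte order."""
    return struct.pack(
        "!III",
        struct.unpack("!I", socket.inet_aton(start_ip))[0],
        length,
        struct.unpack("!I", socket.inet_aton(etr_ip))[0])

update = pack_update("203.0.113.0", 8, "198.51.100.1")
assert len(update) == 12
```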

A router in my ISP in Melbourne, Australia, can get a packet to a
router in bt.com in the UK - half-way round the planet - in about
156 msec.  (347 ms round trip from my ADSL service, minus 35 ms to
that router, divided by 2.)  So I think a well engineered fast-push
system
should be able to fan the mapping data out to tens or hundreds of
thousands of ITRDs and QSDs in five seconds.  Maybe the time could
be reduced to 2 or 3 seconds.

This fast-push system, or at least most of it, will be financed by
the RUASes which run it, by charging the end-users per update in
some way.

More information about the fast-push system is in:

http://www.firstpr.com.au/ip/ivip/draft-whittle-ivip-db-fast-push-00.pdf

In: http://psg.com/lists/rrg/2008/msg00535.html I contemplated some
update rates and costs.  With 228 updates a second, a fee of 5 cents
an update would earn the RUASes a million dollars a day, which is
more than sufficient to run this global system of servers.  They
would be plain commercial off the shelf (COTS) servers (but dual
power supply, high reliability, quad core, ECC memory etc.) with
some presumably open-source software.
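The arithmetic checks out (using integer cents to avoid rounding):

```python
# 228 updates/sec at 5 cents each, over a day of 86,400 seconds.
updates_per_sec = 228
fee_cents = 5
daily_usd = updates_per_sec * fee_cents * 86_400 // 100
print(daily_usd)  # -> 984960, i.e. close to a million dollars a day
```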

I can't predict what the update fee would be, but it seems it would
be in the range 0.5 cents to 10 cents.  If there are end-user
networks who find it cost-effective to use these updates to balance
their incoming traffic, then they will use them and so help pay for
the fast push system.  It is not hard to imagine this being an
automated system which generally enables traffic to be balanced in a
way which makes expensive data links much more productive.

(Generally I think map-encap address space is only for end-users,
but I think some of the activities ISPs engage in, including running
their web servers and servers for customers, could benefit from
using map-encap space too.  So I expect some ISPs to want some of
their space managed by Ivip.)


> I also worry about the single ETR approach (although I can see it
> has attractions from the simplicity point of view). What are the
> implications for resilience? Ie after there's a failure, how long
> will it be before a site gets connectivity /reachability again
> (do I have to wait for the global push to get everywhere)? 

Yes.  If the current ETR goes down, or becomes unreachable from the
Net or from the end-user's site, and the end-user has another link
to a reachable ETR, then there needs to be a mapping change for
every micronet currently mapped to the failed ETR.  That would cost
~5 cents (maybe more, probably less) per micronet, with a 5 second
delay for service restoration.

There would be some kind of automated multihoming monitoring system.
Most likely this would be an external, distributed system run by
some company other than the end-user.  The end-user would pay this
company to monitor the reachability of their network, and to issue
mapping changes as required so the traffic was tunnelled to
alternative ETRs.  That other company's system would also be given
some or all of the responsibility for controlling the mapping to
achieve any load balancing the end-user requires over their two or
more links to ISPs.

This means the other company's system can be extremely
sophisticated, since that is their core business, and they can have
a network of servers all over the world, constantly checking
connectivity of the ETRs used by their many customers, and
connectivity from those ETRs to the end-user's hosts or networks.
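A minimal sketch of that monitoring company's core loop, with
hypothetical names - probe() and send_mapping_change() stand in for
whatever real probing and RUAS-submission machinery would exist:

```python
# Hypothetical multihoming monitor: probe each customer's current
# ETR and, on failure, issue a mapping change (via the customer's
# RUAS) to a known-good alternative ETR.

def check_and_remap(customers, probe, send_mapping_change):
    """customers: micronet -> (current_etr, [alternative_etrs]).
    probe(etr) -> True if the ETR, and the path behind it to the
    end-user's network, respond.  send_mapping_change(micronet, etr)
    submits the change to the RUAS for the global fast push."""
    changes = []
    for micronet, (current, alternatives) in customers.items():
        if probe(current):
            continue                      # all well, nothing to do
        for alt in alternatives:
            if probe(alt):                # first reachable fallback
                send_mapping_change(micronet, alt)
                changes.append((micronet, alt))
                break
    return changes

# Toy run: ETR "a" is down, "b" is reachable.
up = {"a": False, "b": True}
issued = check_and_remap(
    {"203.0.113.0/29": ("a", ["b"])},
    probe=lambda etr: up[etr],
    send_mapping_change=lambda m, e: None,
)
assert issued == [("203.0.113.0/29", "b")]
```

In practice the probing would run from multiple vantage points, as
described above, but the decision logic is this simple in outline.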

Then, the end-user doesn't have to do any real-time monitoring or
mapping changes themselves.  Of course the end-user could roll their
own system, inside or outside their network, or have some mix of
their own system and that of an outside multihoming monitoring company.

One major advantage of the fast-push, real-time change of mapping
which Ivip provides is that all the world's ITRs will be switched
over to tunnel to the new ETR at once.  With LISP, APT or TRRP, each
ITR is on its own, having to somehow detect loss of reachability (of
the ETR or of the end-user's network).  This could involve hundreds
or thousands of ITRs having to probe the ETR and some router at the
end-user's site, which is inefficient and burdensome for the ETR and
the end-user network.

If the ITRs were required to respond to an ETR going down within 5
seconds, all ITRs currently sending packets to the ETR would have to
do some fancy probing, at least once every 5 seconds.  (Actually,
more often, since the loss of a single reachability probe or
acknowledgement, or a traffic packet and its acknowledgement, should
not be treated as an outage.)

With Ivip, the multihoming monitoring system would be doing the
probing, from potentially multiple locations - in a more frequent
and robust manner than any ITR could reasonably be expected to do.
Since there is just one system doing this probing, the volume of
packets would be low and not be a significant burden.  Also, by
removing this reachability stuff from the map-encap scheme,
end-users are free to use whatever techniques they like to test
reachability, to somehow detect imminent link failure etc.

For instance, if there was some way the end-user could be warned 10
seconds or more beforehand that a link was going down, then the
mapping could be changed within seconds to an alternative and there
would be no loss of connectivity.  LISP, APT and TRRP couldn't do
this, since the end-user has only slow (latency of tens of minutes
or more likely hours) control of the mapping, and has to instruct
the ITRs how to behave within simple, fixed, limits.

Consequently, with Ivip, the reachability probe and acknowledgement
load is much lower for the ETR and is zero for all ITRs.  Probing
can be done better and faster.  All ITRs can be changed to the
alternative ETR within 5 seconds, whereas with LISP, APT or TRRP,
each ITR with cached mapping which at some stage starts to tunnel
packets to the ETR will have to go through its own expensive,
time-consuming, reachability testing before it could decide that the
ETR was unreachable.


> For the most sensitive customers (say stock exchange trading
> floors) we probably want to aim at msecs.

There is no way of providing msec level service restoration, as far
as I know, with current BGP techniques or with a map-encap system.
Nearby changes to BGP routing might happen pretty quickly if the
loss of reachability was immediately apparent to the upstream peers.

A map-encap scheme can't generally respond so quickly.  Probably the
Ivip situation with a few seconds to detect the outage and then 5
seconds (ideally less) to change the mapping in all the world's ITRs
represents the best outcome of any of the current schemes.

With LISP, APT or TRRP, you couldn't really have every currently
active ITR probing the ETR and its link to the end-user's network
every fraction of a second, suddenly changing the mapping if there
was one missing acknowledgement.

I will write another message about high availability techniques
which could be used with any map-encap scheme.  These would be
brute-force approaches to create multiple packet flows to the
end-user network via multiple ISPs, so if one goes down, the packets
still arrive by the other.  Likewise, outgoing packets would be
duplicated and sent over multiple links in parallel.


> < In particular, ITRs (presumably ITRDs, but not necessarily) can
> be located in the DFZ, where they advertise the MAB prefixes and
> attract packets sent from networks which have no ITRs. This
> "anycast ITRs in the DFZ (or core)" approach makes Ivip
> incrementally deployable.>
> 
> First, Let me check. If one end (A) is legacy & the other end (B)
> is Ivip, then comms go A-> "anycast ITR"-> ETR-> B and the other
> direction goes B -> A. 

That's right.  There is some question about whether the term
"anycast" is appropriate.  For now I am using it, because the
multiple ITRs announce the same prefix, and for each IP address (or
micronet) within the prefix, the end-point is the same host (or
router).  However the next-hop IP address probably wouldn't be the
same in each advertisement, whereas with some people's understanding
of "anycast", the next-hop addresses must be identical.


> Second, what's the benefit for the network deploying the "anycast 
> ITR"?

My current thinking is that these "anycast ITRs in the DFZ/core"
would be operated by the RUASes collectively, or at least that an
RUAS would operate a global system of such ITRs advertising the MABs
that RUAS manages.  The better this system, the better support for
packets from networks without ITRs, and the better they are able to
rent out the address space in their MABs.  Alternatively, if the
MABs are "owned" by some other organisations, not the RUASes, then
the better the RUAS would be able to serve those other
organisations' needs if those organisations entrusted the mapping of
their MABs to this RUAS.


> Initially one end is legacy for all the Ivip end's comms, so they
> all have to go through the anycast ITR. 

Yes.

> But presumably the anycast ITRs have to be well distributed
> through the Internet

If the micronets in a MAB are mapped to ETRs all around the world,
and hosts are sending packets to these micronets from networks
without ITRs, also all around the world, then to ensure short paths
you need "anycast ITRs in the DFZ/core" in lots of places to
advertise this MAB.  (If all the ETRs were in one place, or all the
sending hosts in another place, then a single ITR in one place or
the other would provide optimal path lengths, even if the two places
were on opposite sides of the Earth.)


> (eg would it be any good if just the Ivip network put in an 
> anycast ITR? Would the legacy nws be able to reach it?)

I think this question assumes a particular model by which address
space is converted to Ivip management.  I will explore various
models before returning to your question.

Let's say the end-user is a company based in Melbourne Australia,
and currently has an ASN and a /20 of PI space.  It has three
physical sites here in Melbourne, each with two links to two ISPs.
Currently it splits its /20 into a /21 and two /22s and advertises
them separately in BGP.  This is fine for multihoming, but since
each physical site only has one prefix, there is no load sharing
over the two links to each site.

The IT department wants load sharing, and so splits the
advertisements into 6 smaller subnets:  2 /22s for head office and 2
/23s for each of the two other sites.

However, the Corporate Ethics Division people have read Bill
Herrin's treatise on the unreasonable costs each such new BGP
advertised prefix places on all organisations with DFZ routers:

  http://bill.herrin.us/network/bgpcost.html

and are horrified that this profligate splitting of PI space is
costing DFZ router operators collectively about USD$8,000 a prefix.

In the spirit of 2012 - International Year of the Beleaguered DFZ
Router - and coveting a national award for good corporate
governance, the company decides to convert its /20 to management by
Ivip.

The end-user contracts an RUAS to transmit its mapping changes to
the global fast-push network and is then able to split its /20 into
as many micronets as it chooses, mapping each one to ETRs in its
ISP's networks, and therefore load sharing with finesse and dynamic
control (5 cents a change, 5 second response time) which could not
be achieved with BGP.

The end-user's /20 is now a Mapped Address Block (MAB) and is
advertised as a single prefix.  The Corporate Ethics Division folks
are grinning like Cheshire Cats since the company's DFZ footprint
has been reduced from 6 to 1, while achieving better load balancing
and retaining multihoming robustness, probably with faster response
times.

The story is complete apart from the question of where the "anycast
ITRs in the DFZ/core" will be.   In practice, I would expect that
the RUAS which the end-user chose would also run a global system of
such ITRs, each of which would now advertise the end-user's /20 MAB.
 It is possible that the same RUAS would have another end-user
company on an adjacent /20 and could advertise both address ranges
as a single /19, which would further please the Corporate Ethics
Division, since the company's DFZ footprint would now be 0.5.

(In future, watch for companies spruiking their regulatory body
approvals, greenhouse gas footprint, energy efficiency rating, DFZ
footprint etc. etc. in all their advertisements and promotional
material.)

For various reasons, the end-user might have chosen this RUAS
because that RUAS was already doing the mapping and "anycast ITRs"
for neighbouring address space, but in practice, the end-user would
be able to choose any RUAS out of dozens (maybe hundreds, but I am
thinking a dozen or so would be good) in the world.

Returning to your question, which assumes the end-user runs their
own "anycast ITRs": the end-user in this example could run such
ITRs, which would reduce the services it requires from the RUAS.

As long as the end-user's sites are all in Melbourne, there only
needs to be a single "anycast ITR" - in Melbourne somewhere.
Probably there would be two, for redundancy and maybe load sharing,
but in principle one will do the trick.  So it isn't anycast - it is
just a single ITR advertising the MAB to the world.  Optimal paths
are achieved from any sending host due to the sending hosts either
being in networks with their own ITRs, or not - in which case the
packets flow in the DFZ to the Melbourne ITR, and then in
encapsulated form to whichever ISP's ETR the mapping specifies for
each micronet the MAB has been split into.

Now let's say the end-user sets up branches in Sydney.  There's no
fuss splitting the MAB into more micronets and doing multihoming and
load sharing via two links to Sydney ISPs.  Probably the single ITR
in Melbourne would be OK, but it is not quite ideal in terms of path
length when the sending host is closer to Sydney than Melbourne, and
the packet is addressed to a micronet which is currently mapped to
an ETR in Sydney.

When the company expands to branches in London, Honolulu and Kyoto,
the single ITR in Melbourne is clearly not going to work very well,
so the company could establish its own ITRs in these locations.

(The company doesn't really need a lot of address space at each
location.  8 IP addresses is generally fine, apart from head office.
So the /20 MAB can be split into 500 or so /29s, each with 8 IP
addresses, enabling the company not to use any more IPv4 address
space as it grows into a global corporation with offices all over
the planet.)
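The arithmetic behind "500 or so":

```python
# Splitting a /20 MAB into /29 micronets of 8 addresses each.
mab_prefix_len = 20   # the company's /20 MAB (4096 addresses)
micronet_len = 29     # /29 micronets
micronets = 2 ** (micronet_len - mab_prefix_len)
addresses_each = 2 ** (32 - micronet_len)
print(micronets, addresses_each)  # -> 512 micronets of 8 addresses
```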

More likely, since there would be many such end-users in a similar
situation, our example company would get some other company to run
anycast ITRs for its MAB.  Those could be in the cities just
mentioned, but then every time the company adds a new branch office,
the ITR company would need to change things.  So generally, the ITR
company would have a bunch of ITRs all over the Net, and each would
advertise the MAB of our example Melbourne-based company.

Still, I think the most likely situation would be the RUAS either
running its own global network of ITRs, or most likely, contracting
some other company, which works for other RUASes as well, to
advertise the RUAS's MABs on its large global system of ITRs.  There
could be multiple such global networks of ITRs, and an RUAS could
contract one or more of the companies which run them, according to
how well supported it wanted its MABs to be for traffic from
networks without ITRs.

Those ITR companies, in turn, would probably want to charge
according to traffic volume, so the end-users whose micronets
receive lots of traffic through these "anycast ITRs in the
core/DFZ", would wind up paying more to their RUASes for this ITR
capacity they are using in the global ITR systems.



The above example assumes the end-user was an existing end-user
large and established enough to already have its own PI space.

Initially, I would expect some such end-users to convert some or
all of their space to an Ivip MAB, as just described.  However, in
the long
term, the greatest number of end-users, and the greatest volume of
traffic, number of people served etc. will come from end-users who
either don't exist now and might have otherwise got PI space (if
space was available, which it probably won't be in /20 chunks) or
most likely will be too small to ever get PI space in the current
BGP-managed system.

So most end-users, over time, will not start with their own address
space.  They will rent (maybe on a long-term contract - as secure as
the company they rent it from) the space from some organisation X
who has already got one or more Ivip-managed MABs.  X may be an RUAS
itself, or an ISP (ISPs could be RUASes too).  X might in principle
be an RIR, if the RIR wanted to get into the business of renting out
generally smallish amounts of address space to hundreds of thousands
of end-users.

In general, in the long term, most end-users will gain their UAB
(User Address Block) from some company such as X.  X will provide,
as part of the package, global "anycast ITR" coverage for whichever
MAB the end-user's UAB is within.  This way, the end-user never
needs to worry about running ITRs.  It simply splits the UAB into as
many micronets as it likes and maps each one to any ETR in the world.

Meanwhile, ISPs do not absolutely need to install ITRs.  However, by
doing so, they assure their paying customers that the packets they
send to Ivip-mapped addresses will be encapsulated by the ISP's
ITRs, and not be dependent on ITRs outside the ISP.  This ensures
optimal path lengths and reduces the risk of packet loss due to
bottlenecks in the outside world.  The ISP may now gain some
benefits in terms of controlling the path of outgoing packets since
all packets leaving its network are going direct to their
destination networks, not to the nearest "anycast ITR" (or to one of
these ITRs, if some advertise some MABs and others advertise others).

Another benefit for the ISP of having its own ITRs is that packets
which are addressed to micronets mapped to ETRs which are located in
its own network do not need to leave the network.

I think the other map-encap proposals are relatively vague about how
this would be handled, but with Ivip, I think it would be best if
every packet addressed to a MAB went through an ITR and ETR, even
when the ETR and final destination are in the same ISP network.

This ensures simplicity and reliability, since all the world's ITRs
will be under direct control of the mapping system.  Without this,
the only alternative is to ensure the ISP's internal routing system
knew about every micronet of every end-user and somehow routed the
packets internally to the correct router or host, without going
through an ITR or ETR.  This sounds inordinately complex, and would
mean the local routing system had to be changed rapidly for any
mapping change concerning a micronet with a local destination, so I
think it is far better to ensure every packet addressed to a MAB
goes through an ITR.


> < The new type of space will be attractive to end-users because
> it can be used for multihoming and portability>
> 
> I'm not yet convinced about this. it seems to me that more of the
> benefit is for the transit nws. After all, the end-users can do
> portability (PI) today, it's just that Ivip hopefully reduces the
> burden on BGP.

Only a handful of end-users can do portability, multihoming, TE etc.
with PI space today - compared to the much larger number of
end-users now and in the future who the map-encap schemes will
primarily be serving.

There will be benefits to any organisation running a router in the
DFZ, as fewer end-users seek PI space and as some (as in the above
example) convert their split-up PI space into a single advertised
prefix when converting it to Ivip mapped address space.

The project of the RRG is primarily to devise a new routing and
addressing architecture which will relieve problems which today fall
directly on operators of DFZ routers.  Since all end-users, of any
kind whatsoever, pay indirectly for the costs of running DFZ
routers, there will be indirect benefits to all Internet users once
the map-encap scheme starts to achieve these goals.

However, we can't force a new architecture on anyone.  It has to be
adopted by end-users - current and future - and most of them don't
have any direct concern with the costs of running DFZ routers.

The trick will be to devise a new architecture which provides
immediate benefits to the ISPs and end-users who adopt it, and which
also achieves the goal of reducing the number of BGP advertised
prefixes and their rate of change.

I think we can do this.  While we are at it, we should ensure the
new system is clean, open-ended, elegant and so will provide the
best possible basis for whatever enhancements might be required in
the future.

As far as I know, a global ITR-ETR system has never been
contemplated before.  If it is scalable (not pure push or pull, but
a hybrid push-pull architecture such as APT or Ivip) and if it has
fast push (only Ivip currently aims for this) then it will be able
to support a new kind of mobility, which provides generally optimal
path lengths, without any fixed "home agent" and which works without
host changes for correspondent hosts, for both IPv4 and IPv6 - and
with minimal changes to the mobile hosts.

I think mobility is such a powerful and valuable concept which is
within reach of a map-encap scheme that we should embrace it as a
goal.  Also, mobility would be a major attraction - leading to many
more end-users adopting the new form of address space.

So even for a DFZ router operator which doesn't care at all about
end-users, having the map-encap scheme support mobility is a major
benefit, because it will drive adoption far more than any attempt at
coercion, persuasion etc.

Returning to your question, I think the benefits to an existing PI
using end-user of converting their space to Ivip include:

  1 - The space can easily be split into micronets of any size,
      including down to individual IPv4 addresses or IPv6 /64s
      (the protocol should support splits to /128, but I question
      the practicality and need for this).

      With BGP management, due to widespread filtering on
      routes with prefixes longer than /24, the end-user cannot
      in practice have their address space split into different
      prefixes in different locations finer than with this 256
      IPv4 address granularity.

  2 - For a few cents, the traffic for each micronet can be directed
      to any ETR in the world, within about 5 seconds.  This
      provides load splitting and real-time TE capabilities which
      are not possible today, even though there is no way of
      splitting the traffic of a single micronet between multiple
      ETRs (as there is in LISP, APT and TRRP, which therefore
      involve more complex mapping data, and much more complex ITR
      functions).

  3 - It would also be possible for an end-user with PI space today
      to rent out some of its current space to other end-users.
      This would significantly boost address utilisation, and so
      combat the problem of IPv4 address depletion - at least in
      respect of the many end-users who only need one or a few
      dozen IPv4 addresses.


> < With Ivip, mapping changes, or at least frequent mapping
> changes, should be charged for, so that endusers contribute to
> the cost of running the global push system,>
> 
> this is a nice idea, but the commercials of how to introduce this
> when others don't are tricky. Not necessarily insuperable - after
> all, you can give people a generous allowance as part of their
> contract. 

Yes.  I think for billing simplicity, most RUASes (or the UAS -
Update Authorisation Systems - which lie between an RUAS and
hundreds of thousands of end-users) would probably have some flat
rate billing system with a quota of updates per month.  Most
end-users probably wouldn't update their mapping from one month to
the next, unless they were doing mobility or fussing with load sharing.

However, there needs to be a way of charging end-users who generate
large numbers of updates.  Adding another 12 bytes (IPv4 - 48 or
maybe 32 bytes for IPv6) of information to the global push scheme
will have a pretty low marginal cost, and even with a low fee such
as 5 cents, I imagine money could be made by the RUASes.

Whatever the fee, the RUASes will be making money, since this is a
major part of their income stream, and the fast-push mapping system
is a unique service, with direct benefits to end-users.


> I wonder: if we assumed it was commercially doable
> (charging for over-frequent updates), why couldn't we do this
> today? If we charged today for sites sending over-frequent
> updates, and charged on the expected impact (*) they cause the
> global BGP (eg update of a PI goes everywhere, so charge more) -
> would you get enough of the benefit compared with doing Ivip?

I think this is an important question.

If we could somehow charge a fee for every BGP changed
advertisement, perhaps scaling it in some way to reflect how much it
ripples through the whole system, and then returned a major
proportion of that fee to the folks who run DFZ routers, then we
could keep some kind of damper on the growth in the number of DFZ
routes.

Even if we could, the current BGP management system does not suit
the needs of many end-users, because it only deals with IPv4 address
space in chunks of 256 addresses.  (Actually, with the above
charging scheme, it would probably be OK to get rid of the /24
filtering and allow people to advertise prefixes as long as /32 -
they would still be paying a fee per prefix and per change of the
prefix's advertisement.)

In order to meet the needs of millions to billions of end-users, we
need to slice and dice address space very finely - and that large
number of separate prefixes is what we are trying to avoid in the
global BGP system.

There are practical reasons why it is difficult or impossible to
introduce such charges for BGP advertisements and changes.  They are
physically introduced at multiple places all over the Net.  There is
no centralised gateway or authentication system which can tell BGP
routers which advertisements are approved and which are not.  So
charging could not be on a "turnstile" basis - pay per action - but
would have to be done on some kind of survey basis: monitoring
changes and invoicing whoever seemed to have made them.  This would
be extraordinarily messy and open to abuse.

With Ivip, charging on a turnstile basis is quite simple.  There is
(in the current model I am developing) one global stream of updates,
put together by multiple RUASes working together.  Each RUAS is
responsible for one or more MABs.  If an end-user wants their
mapping change sent out to the ITRDs and QSDs of the world, they
need to do it through this RUAS.  Whether they deal with the RUAS
directly or through one or more intermediary UASes doesn't really
matter.  The RUAS won't propagate the mapping change unless it knows
it will be paid for it.
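A minimal sketch of that turnstile, with hypothetical names - the
point is simply that the RUAS debits an account before a change
enters the global push stream:

```python
# Hypothetical RUAS "turnstile": a mapping change is propagated only
# if the submitting account has credit to pay the per-update fee.

def turnstile(account_credit_cents, fee_cents, updates):
    """account_credit_cents: account -> balance in cents (mutated).
    updates: list of (account, change).  Returns the changes which
    were paid for and so would be launched into the push stream."""
    accepted = []
    for account, change in updates:
        if account_credit_cents.get(account, 0) >= fee_cents:
            account_credit_cents[account] -= fee_cents
            accepted.append(change)
    return accepted

balances = {"acme": 12}   # 12 cents of credit, fee is 5 cents
out = turnstile(balances, 5,
                [("acme", "u1"), ("acme", "u2"), ("acme", "u3")])
assert out == ["u1", "u2"]   # third update rejected: only 2c left
```

Contrast this with BGP, where there is no such single gate through
which a change must pass before it affects the rest of the system.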


> would the migration difficulty of doing this be harder or easier
> than Ivip? 

Charging for BGP updates would be a band-aid solution, whereas Ivip
provides lasting benefits including a new form of widely usable
mobility, fast control of load balancing, very fine slicing and
dicing of address space etc.  It is not clear how anyone could
reliably charge end-users for BGP updates.


> (*) Similarly, 'charge' wouldn't have to mean '£ money
> per msg'; in fact, you could think of today's route flap damping
> as a form of charging, and this would just be a bit more
> sophisticated.

Route flap damping is a pretty ugly form of disincentive, since it
is a problematic algorithm, distributed over the network in the form
of router software and configuration items.  Furthermore, route flap
damping can't necessarily reliably distinguish between changes which
are (however decided) "legitimate" and "unreasonable, undesirable,
unwanted - to be suppressed".


> I have a slightly unfocussed concern about introducing a new
> 'layer' of the ITR-ETRs (unfocussed meaning I can't quite put my
> finger on it, so it's hopefully wrong). Basically, in addition to
> today's inter-domain 'cloud' we have on top an ITR-ETR cloud.
> Does this effectively add another layer of policy control,
> failure management, fault tracing, perhaps even topology mgt,
> etc? 

I don't think there's anything unfocussed about your concern!

We are planning to introduce a whole new architectural arrangement,
incrementally, but ultimately, ideally ubiquitously.  This will
involve a bunch of ITRs, ETRs and some kind of mapping system.

All this involves extra costs, extra forms of unreliability etc.

There is lots to be concerned about.  So I think we need to make the
whole thing as robust, useful and as extensible as possible.


> do these things operate separately for each (are their
> operations hidden from each other)? Or does the mapping system
> effectively expose the inter-domain 'cloud' to the ITR-ETR cloud?
> Even the latter would seem to mean a big change from a policy mgt
> perspective, in that policy control shifts from BGP into the
> mapping system?

With Ivip, and I think the other schemes, the BGP system forms a
"black box" which provides carriage of packets from one network to
another.  The restrictions on this include:

1 - We currently don't allow divisions smaller than 256 IPv4
    addresses to be part of this global system.

2 - We want to reduce the current 250k advertised prefixes, not
    increase them - as would happen if we tried to do all the things
    we want to do with current BGP techniques rather than with a
    new map-encap system.

3 - A map-encap system needs to use BGP in some way if it is to
    attract packets from networks which lack ITRs into ITRs so
    they can be tunnelled to the correct ETR.  This means that
    whatever manages the map-encap system must advertise its
    managed address space in some form to BGP.  (Ivip's "anycast
    ITRs in the core/DFZ" and LISP's "Proxy Tunnel Routers".)


Other than that, the map-encap scheme doesn't care at all that the
BGP network is abuzz with messages changing the way individual DFZ
routers choose to send packets addressed to any one prefix out to
one peer rather than another.

If the BGP system fails to cope with outages, causing lost
connectivity to an ISP or any other AS network, then the map-encap
scheme will need to respond to this in order to restore service to
multihomed end-users who have an ETR at an ISP not affected by the
outage.

In LISP, APT and TRRP, it is the job of the ITRs working
individually, with the ETRs, to determine that something is wrong
and to decide how to tunnel packets to an alternative ETR.  So a
great deal of complexity and fixed functionality is built into the
ITRs, ETRs and the protocols of the map-encap scheme.

Ivip is different.  The Ivip system itself takes no interest in
reachability (except as discovered during the unfortunately tricky
business of ITRs determining PMTU to ETRs).  It is the
responsibility of the end-user to do their own reachability testing
and make their own multihoming service restoration decisions.
Likewise, Ivip does
not have any TE functions.  The end-user can achieve load sharing or
whatever by splitting the traffic over multiple micronets and
controlling the mapping of each micronet to one of various ETRs in
real-time.
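The split-and-map approach described above can be sketched as follows.
This is a hypothetical illustration with the Python stdlib - the
function name, the mapping-table layout and the ETR addresses are my
assumptions, not part of Ivip:

```python
# Hypothetical sketch: an end-user splits a /24 of Ivip-mapped space
# into micronets and maps them round-robin across ETRs at two ISPs,
# achieving load sharing by controlling the mapping of each micronet.
import ipaddress

ETRS = ["203.0.113.10", "198.51.100.20"]  # ETRs at two ISPs (examples)

def split_into_micronets(mab_slice, n):
    """Split an address block into n equal micronets and map each one
    to one of the available ETRs."""
    net = ipaddress.ip_network(mab_slice)
    new_prefix = net.prefixlen + (n - 1).bit_length()
    micronets = list(net.subnets(new_prefix=new_prefix))[:n]
    return {str(m): ETRS[i % len(ETRS)] for i, m in enumerate(micronets)}

mapping = split_into_micronets("192.0.2.0/24", 4)
# Each /26 micronet now tunnels via one of the two ETRs; shifting load
# between ISPs is just a real-time mapping change for one micronet.
```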

So Ivip is modular, rather than monolithic - enabling and requiring
the end-user to do these things themselves, or to contract a company
to do them on their behalf.

As long as the BGP system provides continual connectivity to each
ISP network - assuming this involves suitably low rates of packet
loss and no unreasonably long paths or delays - the map-encap scheme
should be unaffected by whatever is going on in the BGP system itself.

I think your question relates to things such as:

1 - ISPs currently use BGP to load balance their links to other
    ISPs.  While they can still do that with a map-encap scheme
    in operation, how would the map-encap scheme affect traffic
    flows associated with each BGP-advertised prefix - so
    potentially complicating their efforts?

2 - To what extent would ISPs have hooks into the map-encap scheme
    so they could affect the path of packets?

There has been discussion recently about both the ISP (provider) and
end-user having a say in traffic engineering and, I guess,
multihoming service restoration.

SHIM6 enables TE and multihoming with all control being exercised at
the host.   This is good in some ways, but not in others.  Many
end-users would prefer to manage these things at the level of their
router, rather than having to have hooks into every server, desktop
machine etc.

I understand that Christian Vogt's Six/One proposal (not Six/One
Router, which is more recent and quite different) involves a nifty
use of IPv6 address bits so that both the host and a router (the
end-user's or the ISP's?) can influence TE (multihoming restoration
too?).

With Ivip, there is no such shared management.  The end-user has
complete control over which ETR their micronet is addressed to.

The ETRs are in RLOC space, which is advertised as prefixes in BGP
by ISPs.  So ISPs still have complete control, with their existing
BGP mechanisms, over which links are used for incoming traffic for
each such prefix.  Generally speaking, the end-users won't care what
the ISP does, as long as the ETR is reachable from the rest of the
net, and as long as the ISP enables their outgoing packets to be
forwarded properly.

However, it is possible to imagine some situations in which the
end-users might care about how the ISP uses BGP.  If the ISP had a
national network, all around Australia, and the end-user's ETR is in
Melbourne, the end-user would be perfectly happy if the ETR was in a
prefix which was advertised by the ISP's border routers at peering
points in Melbourne.  Then, packets from sending hosts in Melbourne
would have optimal path lengths to the ETR - via ITRs in their own
networks, or by an "anycast ITR in the core/DFZ" somewhere in Melbourne.

But suppose the ISP ran its national network as a closed unit, with
the only link to the outside world in Perth, 3400km away.  This
would be suboptimal for end-users with an ETR in Melbourne.

This use of BGP by the ISP wouldn't affect the fundamental ability
of the map-encap scheme to work - it would just degrade its service
to end-users whose ETRs are far from Perth.

Assuming a multihomed end-user in Melbourne has links to two ISPs
with ETRs in Melbourne, I think the end-user is primarily interested
in load balancing over its two links to those ISPs, and has little
or no interest in whether the packets flow over various links from
each ISP to other ISPs.

So as long as the BGP system provides robust connectivity, I don't
think it affects the map-encap scheme directly - or Ivip and the
end-user's multihoming service restoration and TE system.

To what extent glitches in connectivity affect the map-encap
system, in terms of the system itself (LISP, APT or TRRP)
determining reachability and changing mapping as a result, is a good
question.  Likewise with Ivip: glitches in connectivity affecting
the multihoming and TE decision-making systems of end-users.

One of the good things about Ivip compared to LISP, APT or TRRP is
that the end-user can fine tune the methods by which reachability is
monitored and by which mapping decisions are made.  If their service
often has 2 second glitches of lost connectivity, and they don't
care about this, then they can have relaxed probing and decisions,
so as not to change the mapping.  If they are fussier, and any
outage is likely to be long, then they could have more frequent
probing and a quicker decision to switch the mapping to the other ETR.
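The tunable probing and decision-making just described could look
something like this.  The monitor loop, the probe() and
push_mapping_change() hooks and the parameter names are hypothetical -
the point is only that the interval and threshold belong to the
end-user, not to the map-encap protocol:

```python
# Illustrative sketch (not from any spec): an end-user's own
# multihoming monitor with a tunable probe interval and failure
# threshold.  A relaxed user sets a long interval and high threshold
# (so 2-second glitches are ignored); a fussier user sets short/low
# ones for a quicker switch to the other ETR.
import time

def monitor(probe, push_mapping_change,
            interval_s=10.0, failures_before_switch=3):
    """Probe the primary ETR; after N consecutive failures, push a
    mapping change so the micronet tunnels via the backup ETR."""
    failures = 0
    while True:
        if probe("primary-etr"):
            failures = 0
        else:
            failures += 1
            if failures >= failures_before_switch:
                push_mapping_change(micronet="192.0.2.0/26",
                                    etr="backup-etr")
                return
        time.sleep(interval_s)
```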

The monolithic map-encap systems couldn't reasonably build such
flexibility into their protocols or into all the world's ITRs.


Any map-encap system which provides backwards compatibility for
packets sent from hosts in networks without ITRs will need to
advertise the mapped address space in BGP.  Ivip does this by the
"anycast ITRs in the core/DFZ" advertising MABs which they have the
mapping data for.  (I imagine these will be full database ITRs, but
they could be caching ITRs with a nearby full database query
server.)  Whether every such ITR advertises every MAB or whether
some ITRs advertise some MABs and others advertise others depends on
the business model by which Ivip is deployed.  Each MAB typically
includes micronets for anywhere from dozens up to tens or hundreds
of thousands of end-users.

Likewise LISP's "Proxy Tunnel Routers", except I think the LISP
model implies that each EID prefix is more likely to serve a single
end-user.

Likewise the way APT islands advertise mapped address space to their
non-APT peers.  However, this generally only works when each
end-user gets a /24 or shorter, since if two end-users with adjacent
/25s were on different APT islands, there is no way each island
could advertise the one covering /24 and provide service for the two
physically separated end-users.  (All this could be solved by making
APT mapping a single global system, independent of BGP links - then
there is no such thing as an "APT island".)
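The adjacent-/25s problem above can be shown with a small worked
example (Python stdlib; the addresses are documentation examples, and
the "island" labels are just illustrative):

```python
# Worked illustration of the APT /24 problem: two end-users' adjacent
# /25s share one covering /24, so two different APT islands cannot
# each advertise that /24 for its own half.
import ipaddress

a = ipaddress.ip_network("198.51.100.0/25")    # end-user on island 1
b = ipaddress.ip_network("198.51.100.128/25")  # end-user on island 2

# Both /25s have the same covering /24:
covering = a.supernet(new_prefix=24)
assert covering == b.supernet(new_prefix=24)
# Whichever island advertises 198.51.100.0/24 attracts packets for
# BOTH /25s, but can only deliver to the end-user on its own island.
```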

The BGP system will be affected by changes in how these prefixes
(MABs in Ivip) are advertised.  An "anycast ITR in the core/DFZ"
should not advertise a MAB to its peers in the DFZ if it cannot reliably
tunnel the packets.  So if it somehow loses connection with its
query server (there should be several to choose from, all reasonably
local) or if it somehow loses its feed of mapping changes, or if it
somehow has got corrupted mapping information for one or more MABs,
and needs a few minutes to download a recent dump of this MAB's data
and to bring itself up to date with incoming mapping changes . . .
then arguably the ITR should stop advertising this MAB.
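That advertise-or-withdraw policy reduces to a simple predicate.  The
sketch below assumes the ITR exposes three health flags - the flag
names and the function are my invention, not anything from the Ivip
drafts:

```python
# A sketch of the advertise/withdraw policy described above, under the
# assumption that an anycast ITR tracks three simple health flags.
def should_advertise_mab(has_query_server,
                         mapping_feed_ok,
                         mab_data_intact):
    """An anycast ITR should only advertise a MAB into the DFZ when it
    can reliably tunnel packets for it: a reachable query server (or
    full database), a live feed of mapping changes, and uncorrupted
    mapping data for that MAB.  If any of these fail - e.g. while
    re-downloading a recent dump of the MAB's data - withdraw."""
    return has_query_server and mapping_feed_ok and mab_data_intact
```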

Ideally, these "anycast ITRs" should be reasonably stable and should
not advertise and withdraw these MAB prefixes.  To the extent that
they do, then this is a burden on the BGP system.  However, the
impact of these changes will be quite limited compared to similar
changes in conventional BGP usage, because there will be hundreds,
later perhaps thousands, of ITRs around the Net also advertising
this MAB.

If the Ivip map-encap scheme advertises relatively few MABs and
keeps those advertisements relatively stable and if this happens
from a large number of "anycast ITRs in the core/DFZ", then I think
the burden Ivip places on the BGP system will be pretty small.

Ideally, Ivip should be managed so that there are not a plethora of
MABs.  Ideally, if there were four end-users with contiguous slices
of PI space, who decide to convert it to Ivip management, they would
unify the space into a single MAB, and therefore choose to use a
single RUAS - rather than keep them separate.  That way, only one
MAB would be advertised in BGP, rather than four.
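The amalgamation can be shown concretely with the Python stdlib (the
four /24s are documentation example addresses, chosen to be
contiguous and /22-aligned):

```python
# Worked example of the amalgamation above: four end-users with
# contiguous /24s of PI space unify them into one /22 MAB, so BGP
# carries one advertisement instead of four.
import ipaddress

pi_blocks = [ipaddress.ip_network(p) for p in
             ("198.51.100.0/24", "198.51.101.0/24",
              "198.51.102.0/24", "198.51.103.0/24")]
mabs = list(ipaddress.collapse_addresses(pi_blocks))
# One MAB (198.51.100.0/22), one RUAS, one BGP advertisement carrying
# all four end-users' micronets.
```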

Such amalgamations could be encouraged by some global system which
authenticates legitimate BGP advertisements and charges per
advertisement (and perhaps per change).  Then, each end-user would
only be paying a fraction of the cost of a BGP advertisement, rather
than the full cost with their own MAB.


I don't have a clear idea of the business models and deployment
plans for LISP.  In principle, the same things could be done as I
just described for Ivip - few EID prefixes serving many end-users.

Ivip or LISP could be used in such a way that there were a very
large number of MABs (EID prefixes) each serving only one or a few
end-users.  This would be of scarcely any benefit to the BGP system,
except to the extent that each such end-user would otherwise have
advertised multiple prefixes.

 - Robin               http://www.firstpr.com.au/ip/ivip/



--
to unsubscribe send a message to rrg-request@psg.com with the
word 'unsubscribe' in a single line as the message text body.
archive: <http://psg.com/lists/rrg/> & ftp://psg.com/pub/lists/rrg