[RRG] Re: Comparing APT & Ivip



Hi Robin,

This thread is getting a little scattered, so let me consolidate.

Below, you describe your doubts about how a single ISP could deploy APT
unilaterally, without the involvement of their customers. Allowing for
this, and giving ISPs an incentive to do so, is perhaps *the* primary
goal of our incremental deployment scheme. As I mentioned before, we
should have a new document describing the updated details sometime in
the next few months. But, to summarize, you can think of a single ISP
deploying APT as similar (in concept) to a single ISP deploying MPLS, or
some other internal efficiency improvement. The difference is, APT
allows for a potential increase in benefits with every other ISP that
deploys it.

This relates to another point below regarding non-addressability of
internal routers in an APT ISP. Put simply, since all traffic through an
APT ISP is encapsulated (even "legacy" traffic), only the border routers
of that ISP can possibly be addressed from outside that ISP.

Regarding our discussion on economics and policy: I think your
description is backwards. Nothing will be deployed on the Internet as it
exists today unless someone has an economic incentive to do so. So, we
must constrain our technical solutions to meet the needs of real-world
economics, not vice versa.

When I say that your solution is an economic one, I mean that it depends
upon the success of a number of business models and economic structures
that do not exist in the Internet today. If no one thinks they can make
a profit off of being a RUAS, MRD, or so on, and you cannot convince
them otherwise, then Ivip won't work. Even if you do convince them
otherwise, if no one is willing to pay for the services of a RUAS, MRD,
or so on, Ivip also won't work.

With APT, we have focused on providing a solution that can be deployed
*within the economic constraints that exist on the Internet today*. That
is, if an ISP feels that deploying APT is a good investment in their
business, they can do so without any new business partners, contracts,
payment plans, or so on.

I believe that, though the economic solutions proposed in Ivip may be
profitable or useful to someone, it is nearly impossible they will be
profitable or useful to *everyone*. We would be better off incorporating
these solutions into other systems that do not *depend* upon them.

One example is charging for BGP updates (see other thread); another is
charging edge networks to update the TE info in their mappings under
APT. This is certainly possible: since we haven't specified any
automated way for an edge network to update its TE info, each provider
can choose for itself whether this is something it should charge its
customers for. But providers would never be *required* to do so in order
for APT to function efficiently.

-Michael


Robin Whittle wrote:
> Short version:   From the thread: Re: [RRG] Separation vs. Elimination
>                  http://psg.com/lists/rrg/2008/msg02583.html
> 
>                  Continuing the previous discussion comparing APT
>                  and Ivip.
> 
>                  I also give a detailed description of how Ivip's
>                  multihoming reachability testing and decision
>                  making might work, contrasting it with how I
>                  understand APT Default Mappers would determine
>                  reachability of each ETR, depending in part on
>                  BGP to tell them the ETR's ISP network is
>                  unreachable, which could involve significant
>                  delays.
> 
> Hi Michael,
> 
> You wrote, in part:
> 
>>> Ivip pushes the mapping fast and APT pushes it slowly.  Ivip has a
>>> distributed, but still reasonably centralised, push system which
>>> fans out to all full database query servers. APT has multiple
>>> islands, with more diffuse pushing of the updates from multiple
>>> sources of mapping data within the islands.  (I don't see any
>>> benefit in APT islands - I think there should be one APT system.)
>> I think the larger difference between the mapping systems is what I
>> pointed out below -- the difference in what information is distributed.
>>
>> Regarding APT, I think you will find that APT islands that are not
>> physically connected can be logically connected in the next version of
>> our incremental deployment scheme. However, we can't force ISPs that
>> don't already have business relationships to create them, so there is
>> always the (as you say, probably undesirable) possibility of multiple,
>> disconnected islands.
> 
> OK.
> 
> 
>>> OK - but I don't understand how APT (with APT islands) can robustly
>>> support packets from non-upgraded networks when there are two EIDs
>>> such as /25 or longer, in the one /24, and the two end-user networks
>>> are using ISPs in different APT islands, in a setting where it is
>>> impossible to have advertisements for prefixes longer than /24
>>> propagate across the DFZ.
>> I assume this is not possible. But that means it provides an incentive
>> for the separate islands to converge to one. =)
> 
> Indeed.
> 
> 
>>>> We are trying to make
>>>> the point in the paper that transit networks are the ones that need the
>>>> routing table to scale, and it is possible for transit networks to
>>>> deploy separation schemes themselves, in theory, in order to directly
>>>> address that issue. This is not possible for elimination schemes.
>>> I agree it is not possible for elimination schemes, but I can't
>>> think how transit networks could deploy any of the current
>>> separation schemes without involving the end-user networks.  Can you
>>> give an example?
>> Sure: APT. APT is deployed by an ISP by turning their border routers
>> into APT EDRs (or TRs or whatever you want to call them). Customers'
>> outgoing packets are encapped by the ISP, their incoming packets are
>> decapped by the ISP. If the customer is multihomed, they can ask their
>> providers to put their TE preferences into the mapping info for their
>> prefix(es), but that's totally optional.
> 
> I will think this through:  Let's say I have an edge network -
> an end-user network, not an ISP network - and I have my own PI
> space used as it is today, with BGP.  I have links to two ISPs
> and I advertise the whole address space as one prefix, through
> either ISP-1 or ISP-2.
> 
> It is fine for ISP-1 and/or ISP-2 to pass my outgoing packets
> through an ITR (or whatever it is called) arrangement - to find
> packets which are addressed to prefixes which are mentioned in
> APT's mapping system, and to encapsulate them and tunnel them
> to whatever ETR that ISP's Default Mapper decides should get
> the packets.  That can be done for outgoing packets from any
> edge network, PI or PA, without the edge network needing to do
> anything, and without it upsetting anything.
> 
> Actually, if there are multiple APT islands, then this is only
> going to work for packets addressed to edge networks which use
> ETRs in ISPs in the island of which ISP-1 and ISP-2 are a part.
> Maybe they are in different islands.  Edge networks in islands
> other than the one the outgoing ISP is in must have their own
> arrangements in their own islands to attract packets which make
> it past my ISP-1 or ISP-2 ITRs into the DFZ.
> 
> 
> But let's consider the options for one or both ISPs using APT
> for my address space.  There are two cases:
> 
>  1 - Either ISP-1 or ISP-2 does it, but not the other.
> 
>  2 - Both do it together.  Within this scenario, there are two
>      options:
> 
>      a - They are part of the same APT island.
> 
>      b - They are in different APT islands.
> 
> In case 1, let's say ISP-1 is a member of Island-1 and adds my
> prefix to the APT mapping system, so it is an APT EID in Island-1.
> 
> As long as I want to use ISP-1 for incoming traffic, this will
> work.  I would have my CE router accept packets from an ETR in
> ISP-1's network.  However, this arrangement cannot improve
> scalability and support my multihoming arrangements at the same
> time.
> 
> Let's say I want to use ISP-2 instead, for instance because
> my link to ISP-1 is down.  Because my prefix is an APT EID in
> Island-1, the border routers of ISP-1 will be advertising my
> prefix to the DFZ.  My understanding of APT's support for packets
> from non-upgraded networks is that the border routers of all ISPs
> in Island-1 would also be advertising my prefix, acting as ITRs
> to collect packets emitted from non-upgraded networks, as close
> as possible to where they are emitted, and tunneling the packets
> to the correct ETR.
> 
> If I wanted to use ISP-2 instead, ISP-1 and any other border
> routers of ISPs in Island-1 must stop advertising my prefix, and
> ISP-2's border router(s) must start advertising it.
> 
> This could be done, I suppose, but I don't think APT is intended
> to have such real-time change of the prefixes advertised by
> border routers of ISPs in the APT island.
> 
> There is no scalability benefit in this arrangement - since my
> prefix is still advertised in the DFZ, and the advertisement
> still changes when, for multihoming reasons, I have the packets
> come in via ISP-2.
> 
> If there was a scalability gain in that the address range of my
> prefix was advertised as part of a shorter prefix by the
> Island-1 border routers - assuming that there were other
> networks on adjoining address ranges which were also using APT
> in Island-1 - then the above arrangement couldn't work, since I
> assume it would be unworkable for these border routers to stop
> advertising the encompassing shorter prefix, and then advertise
> the space in that prefix minus my space - which could involve
> multiple prefixes, every time my multihoming arrangement needed
> to use ISP-2.
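
A small illustration of the "prefix minus my space" point above, using
Python's standard ipaddress module. The /20 covering prefix and the /24
edge prefix are hypothetical values chosen only for illustration; they do
not come from this discussion or from any APT document.

```python
import ipaddress

# Hypothetical covering prefix advertised by Island-1's border routers.
covering = ipaddress.ip_network("10.0.0.0/20")
# Hypothetical edge-network prefix that would have to be carved out of it.
my_space = ipaddress.ip_network("10.0.5.0/24")

# address_exclude() yields the prefixes that together cover everything
# in `covering` except `my_space`.
remaining = sorted(covering.address_exclude(my_space))

for prefix in remaining:
    print(prefix)
print(len(remaining), "prefixes replace the single covering prefix")
```

With these example values, one advertisement turns into four each time
the edge prefix has to be withdrawn from the covering prefix, which is
the churn the paragraph above describes as unworkable.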
> 
> In case 2a, ISP-1 and ISP-2 are both part of Island-1.  For the
> purposes of collecting packets from non-upgraded networks, the
> border routers of both ISPs and all other ISPs in Island-1, will
> advertise my prefix to BGP, and these routers will act as ITRs
> to encapsulate packets and tunnel them to the correct ETR.
> 
> Perhaps my prefix is advertised as it is, in which case there is
> no scaling advantage in terms of the number of prefixes in the
> DFZ.  However, I could split my space up into multiple EIDs and
> this would provide a scaling advantage, compared to me splitting
> the space and advertising them conventionally as separate prefixes.
> 
> Perhaps my prefix is part of the address range of a larger
> prefix advertised by the border routers of all the ISPs in
> Island-1.  Then there are scaling advantages, in terms of reduced
> number of advertisements.  This would mean that other edge
> networks on adjacent space above and/or below my space also use
> APT with Island-1.
> 
> As long as my network multihomes with ISPs in Island-1, there are
> scaling advantages in terms of my multihoming changes not causing
> any changes to prefixes advertised in the DFZ.
> 
> In this case 2a situation, I guess the system works fine.  My
> decision to advertise my prefix to one ISP or another, or the
> physical fact of each ISP's ETR finding it has or does not have
> connectivity to my CE router(s) would cause those ETRs to be able to
> send messages to any ITR (or Default Mapper?) which sent packets to
> it about it not being able to reach my network, the destination
> network.  Therefore, the ITRs (with their Default Mappers?) would
> figure out, independently, which of these two ETRs of ISP-1 and ISP-2
> they should send packets to.
> 
> Case 2b won't work - ISP-1 being part of Island-1 and ISP-2 being part
> of Island-2.  The only way of forcing it to work would be something
> ugly and impractical such as case 1, in which all of one island's
> ISPs' border routers must stop advertising my prefix and the border
> routers of the other island's ISPs must start advertising it,
> according to which ISP's ETR my network was using at the time.
> 
> So this means a multihomed edge network needs all its upstream ISPs to
> be in the one APT island.
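
The conclusion of the case analysis above can be written as a simple
check. This is only a sketch of the reasoning in this message; the
function name, the membership table and the ISP-3/ISP-4 entries are
hypothetical and are not part of APT.

```python
def shared_apt_island(upstream_isps, island_of):
    """Return the single APT island shared by every upstream ISP, or None.

    island_of maps an ISP name to its island name; an ISP missing from
    the map is treated as not running APT at all.
    """
    islands = {island_of.get(isp) for isp in upstream_isps}
    if len(islands) == 1 and None not in islands:
        return islands.pop()
    return None


# Hypothetical membership table, for illustration only.
island_of = {"ISP-1": "Island-1", "ISP-2": "Island-1", "ISP-3": "Island-2"}

print(shared_apt_island(["ISP-1", "ISP-2"], island_of))  # same island (case 2a)
print(shared_apt_island(["ISP-1", "ISP-3"], island_of))  # different islands (case 2b)
print(shared_apt_island(["ISP-1", "ISP-4"], island_of))  # only one ISP runs APT (case 1)
```

Only the first call returns an island name; the other two return None,
matching the cases above in which multihoming and APT cannot be combined.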
> 
> It also means that a single ISP can't introduce APT for a multihomed
> customer, unless it does so in concert with the one or more other
> upstream ISPs which must also be in the same APT island.
> 
> As far as I can see, ISP-1 or ISP-2 can't act alone - without the
> involvement of my network - to handle my prefix with APT.  They would
> need to check with me and verify that every upstream ISP I use was
> both using APT and was in the same APT island.  Then they would need
> to reconfigure the APT island's mapping system to include my prefix
> as an EID, and make sure that it was advertised by all of Island-1's
> ISPs' border routers, either on its own or as part of a shorter,
> encompassing prefix.
> 
> So I still can't think how transit networks could deploy any of the
> current separation schemes without involving the end-user networks.
> 
> 
>>>> If your *provider* is encapsulating your packets (as in APT), of course
>>>> you can't address anything in the *upgraded* transit core. But
>>>> non-upgraded transit networks are effectively in edge space, so they can
>>>> still be addressed.
>>> I would have thought it more correct to say:
>>>
>>>   non-upgraded transit networks are effectively in *core* space, so
>>>   they can still be addressed.  The Default Mapper has no mapping
>>>   for any address in a non-upgraded edge network, and to ensure
>>>   it can be reached from an upgraded edge network, it must forward
>>>   the packets without encapsulation.
>>>
>>>   (Of course, for hosts in non-upgraded edge networks to be able
>>>   to send packets to hosts in upgraded edge networks, one or
>>>   more border routers in the APT island need to advertise either
>>>   the upgraded edge network's EID prefix, or some other prefix
>>>   which covers this EID prefix.)
>>>
>>>   Only once all edge networks adopt APT will the core be truly
>>>   "separated" from all edge networks.
>> Continuing from where I left off above, all packets that get sent to the
>> ISP from then on are either already encapped (by another ISP in the same
>> island), or get encapped by that ISP. Same goes for decap. No non-border
>> router inside that APT island is directly addressable by devices outside
>> the island.
> 
> I am finding this hard to follow.  Are you referring to the initial
> situation where you need to be able to send packets to edge networks
> which have not yet adopted APT, or after APT is adopted by all edge
> networks?
> 
> 
> 
>>> handling of changes is intended to be much faster and less expensive
>>> - while also making it easy to charge a small fee per update, a few
>>> cents for instance.
>>>
>>>> APT only distributes topology information in the mapping system.
>>> For each micronet, Ivip's mapping system distributes the address of
>>> the ETR to which packets addressed to that micronet should be sent
>>> by every ITR.  It is the responsibility of the end user to change
>>> the mapping to some other ETR which works if the current ETR is not
>>> working, or is not connected to their network.  This requires Ivip's
>>> mapping to be sent out fast - effectively in real-time - and it
>>> simplifies the ITRs and ETRs and reduces the size of the mapping
>>> information, since ITRs and ETRs are not involved in testing
>>> reachability.
>>>
>>> APT, like LISP or TRRP, has mapping information with two or more ETR
>>> addresses (assuming a multihomed end-user network).
>>>
>>> This mapping information is assumed not to need to be changed very
>>> much, since APT's push (to the Default Mappers, from which ITRs pull
>>> the mapping information they need) is slow by comparison to Ivip.
>>>
>>> Consequently, each APT (or LISP or TRRP) ITR (or the ITR's Default
>>> Mapper) needs to do its own probing of ETRs or use whatever
>>> techniques to determine ETR reachability - then the ITRs (actually,
>>> I think it is APT's DMs) need to make their decisions, based on the
>>> mapping information, which ETR to tunnel the packets to.
>>>
>>>
>>>> So the
>>>> increased number of edge prefixes that an APT-based network can handle
>>>> is relative to the (increasing) amount of BGP traffic that is due to
>>>> reachability changes in edge prefixes.
>>> Are you saying that a major efficiency - that is, scaling -
>>> advantage of APT is that it doesn't convey reachability information?
>>>
>>> Assuming this is the case, then I see it is true to this extent:
>>> Your mapping doesn't need to be pushed as fast or as frequently as
>>> Ivip's, or as fast or frequently as BGP would ideally propagate its
>>> changes.
>> Yes, exactly!
> 
> OK.
> 
>>> No-one else has drawn parallels between Ivip and BGP - so this is
>>> interesting.  Clearly they are completely different, and Ivip
>>> assumes a core routing system - BGP.  However, they are alike in
>>> that both are in a hurry to communicate "reachability" information
>>> across the Net.
>>>
>>> Every time a BGP prefix is advertised, or no-longer advertised at a
>>> given router, changes to this effect need to be propagated.  The
>>> changes may not need to go very far, or they may involve changing
>>> the best path decision of every router in the DFZ.
>>>
>>> Every time an Ivip micronet is mapped to some ETR, or not mapped to
>>> it (typically it would simply be mapped to another ETR, but it could
>>> be mapped to NULL, or its space could be reassigned to one or more
>>> different micronets), the Ivip system needs to convey this quickly
>>> to every full database query server.  (And to any full database ITR,
>>> but my current thinking is that there will be few or none of these -
>>> just caching ITRs, some of which have a full database query server
>>> integrated into them, or have one in the same rack.)
>>>
>>> APT is not in a hurry to push mapping information to all the
>>> island's Default Mappers.
>>>
>>> Being not in a hurry is arguably a scaling benefit, as is not having
>>> to push the mapping very often.  However, one extra cost (compared
>>> to Ivip) is that APT needs to push more complex, more voluminous,
>>> mapping information.  Another major extra cost is that your ITRs/DMs
>>> and ETRs need to be much more complex than Ivip's because they must
>>> do all the reachability testing and multihoming service restoration
>>> decision making.
>> This is all true. But we argue this is necessary complexity. The
>> Internet is a complex system. I think it was Einstein that said: as
>> simple as possible, but no simpler.
> 
> OK - I am pursuing a system with a more ambitious mapping
> system, to save complexity and protocol overhead in ITRs and ETRs.
> 
> You are aiming to simplify the mapping system in terms of the
> frequency of mapping updates, but one price you pay is more
> complex mapping information (addresses for the various ETRs,
> with preferences for load sharing and for which one to choose
> in a multihoming service restoration setting).
> 
> You are also paying a price in complexity of ITRs and ETRs,
> and in inflexibility regarding reachability detection and
> decisions resulting from those reachability findings.  Ivip
> puts that stuff outside of the system and both enables and
> requires end-users to do their own reachability testing and
> consequent decision making.
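
A minimal sketch of the contrast drawn above, under stated assumptions:
a multi-ETR mapping entry (as described here for APT) leaves the choice
to the ITR or Default Mapper, while a single-ETR mapping (as described
here for Ivip) leaves the choice to the end-user, outside the system.
The class, the function names, the addresses and the "lower number is
preferred" convention are all hypothetical; neither proposal's actual
data format is shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EtrEntry:
    address: str
    priority: int  # assumption: lower number means more preferred


def choose_etr_multi(entries, reachable):
    """Multi-ETR style: pick the most preferred ETR believed reachable."""
    candidates = [e for e in entries if e.address in reachable]
    if not candidates:
        return None
    return min(candidates, key=lambda e: e.priority).address


def choose_etr_single(mapped_etr):
    """Single-ETR style: tunnel to whatever the mapping currently names."""
    return mapped_etr


# Hypothetical mapping for one multihomed edge network.
entries = [EtrEntry("192.0.2.1", 1), EtrEntry("198.51.100.1", 2)]

print(choose_etr_multi(entries, reachable={"192.0.2.1", "198.51.100.1"}))
print(choose_etr_multi(entries, reachable={"198.51.100.1"}))
print(choose_etr_single("198.51.100.1"))
```

In the first style the reachability set has to be maintained by the
ITR/DM itself; in the second the end-user's own monitoring decides when
the single mapped address changes.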
> 
> 
>>>>> The Ivip approach is to provide firstly a more efficient mapping
>>>>> system relative to BGP - as does APT - but also to technically
>>>>> structure the mapping system so end-users can be made to pay for
>>>>> each mapping change.  This enables quite a lot of the mapping system
>>>>> (not all, but most of it) to be run as a series of businesses.  More
>>>>> details are in:
>>>>>
>>>>>   http://tools.ietf.org/html/draft-whittle-ivip4-etr-addr-forw-01
>>>>>
>>>>>
>>>>> Then, there is funding for the mapping system, so there are incentives
>>>>> to build it, extend it, run it, make it more efficient etc.  This
>>>>> should greatly extend the maximum number of micronets, updates etc.
>>>>> the whole system can handle, compared to what I understand is the
>>>>> APT approach of a more efficient mapping system with costs falling
>>>>> unfairly on the ISPs, with no backpressure on end-users to reduce
>>>>> the number of mapping changes or the number of EIDs in the mapping
>>>>> system.
>>>> The costs are not falling *unfairly* on the ISPs -- they are the ones
>>>> that stand to benefit. Again, the primary goal of APT is better routing
>>>> scalability, which is a DFZ problem.
>>> But what if an end-user network issued a mapping update every 10
>>> seconds, 24 hours a day?
>> Edge networks don't send mapping updates directly under APT. ISPs send
>> mapping updates that contain their customers' prefixes.
> 
> Ahh - OK.
> 
>> We also plan to
>> have the protocol limit the frequency of updates to something on the
>> order of (very roughly) once per hour. Because we don't carry
>> reachability information, there is no need for frequent updates. We're
>> currently working on some simulations to see exactly what time scale is
>> workable.
> 
> Still, the edge network which pushes its luck changing its
> preferences every hour, for instance to dynamically adjust its
> incoming traffic according to circumstances which vary significantly
> hour-by-hour, is placing a considerably greater burden on the whole
> APT island than another network which only changes its preferences
> every few months.  You have no economic arrangement to make the
> hourly changing edge network pay more money which is somehow
> distributed to the ISPs in the APT island.
> 
> When you get to a single APT island - which I think is the only way
> APT can work properly - then you have these hourly changers requiring
> their mapping change to be pushed to every ISP on the planet.  You
> might bet that with 100 million edge networks, you are only going to
> have a small number who do this.  But if it costs them nothing, and
> it gives them some means of dynamically responding to incoming traffic
> and/or congestion in their various upstream ISPs, then what is to
> stop 10 million edge networks changing their mapping every hour?
> You are back to the tragedy of the commons, because you can't stop
> this and you can't charge them for it.
> 
> 10 million updates per hour would be unscalable - since you have to
> push this to every ISP in the world. (I am assuming 100% APT adoption,
> all in one island.)
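
For scale, the figures in the paragraphs above work out as follows. This
is plain arithmetic on the numbers already given (10 million updates per
hour, 100 million edge networks); nothing here is measured data.

```python
UPDATES_PER_HOUR = 10_000_000   # figure used in the text above
EDGE_NETWORKS = 100_000_000     # figure used in the text above
SECONDS_PER_HOUR = 3600

# Average rate every ISP would have to receive and process.
updates_per_second = UPDATES_PER_HOUR / SECONDS_PER_HOUR
print(f"{updates_per_second:.0f} updates per second on average")

# Share of edge networks that would be changing their mapping hourly.
hourly_changers = UPDATES_PER_HOUR / EDGE_NETWORKS
print(f"{hourly_changers:.0%} of edge networks")
```

That is roughly 2778 updates per second, sustained, delivered to every
ISP, if one edge network in ten changes its mapping hourly.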
> 
> Ivip has no such problem, or at least has a greatly reduced problem,
> because the fee per mapping change will reflect the burden that
> change places on most of the mapping distribution system - and may
> help pay for the mapping distribution system to be upgraded to handle
> whatever volume of changes people want to make at the current cost
> per update.
> 
> 
>>> That would be as much of a burden on the ISPs in the APT island as
>>> hundreds of thousands of ordinary end-user networks which only
>>> changed their mapping every month or so.  (In terms of mapping
>>> traffic and processing it at each DM.  In terms of storage, the
>>> burden is the same.)
>>>
>>> In this respect, APT achieves no benefits over BGP.  Both APT and
>>> BGP have to accept the arguably excessively frequent changes and
>>> propagate them - and in both cases there is no way of charging the
>>> originator of these changes to help deter them from making so many,
>>> or to help pay for their cost across all the ISPs.
>>>
>>> Ivip is completely different in this regard - there will be a small
>>> fee per update.
>> Yes, our solution is technical (frequent updates are not necessary for
>> desired functionality), yours is economic (frequent updates are
>> necessary, but can be costly). The way I see it, we are technicians, not
>> policy makers, so I don't see how an economic solution is enforceable.
> 
> My solution is technical and economic.  Yours is purely technical.
> 
> Sure we are policy makers - or at least policy prototypers!  We are
> designing a new architecture to be added to the Internet.  That
> involves technical, commercial and policy arrangements.  We design
> the whole lot together.
> 
> We don't actually make policy, of course.  We design a complete
> integrated system of protocols, functionalities, design principles and
> suggestions for how the thing should be administered and operated in a
> business sense.  Then, if the IETF likes it, it is developed.  Then if
> business-people and policy-makers like it, they adopt it.
> 
> 
>>> So I think APT continues this problem of having to propagate
>>> end-user initiated changes across many devices, without a fee.  This
>>> is what bedevils BGP - and is a significant part of the heart of
>>> the routing scaling problem.
>>>
>>>> I suspect you will have a lot to say on this, so perhaps it should be
>>>> moved to a separate thread, but I am curious: in regards to your
>>>> economic model, what is stopping an ISP from charging their BGP-speaking
>>>> customer for each BGP update today?
>>> Other folks on the list would have a better idea of this, but I
>>> think the current BGP system relies on trust and unsecured
>>> announcements.  To put up some kind of fence, administered in some
>>> way to reject updates from networks which don't in fact pay a fee
>>> per update, would require a major upgrade to all participating
>>> routers.  It would require some pretty fancy security arrangement
>>> and my guess is that it is all too much of a headache.  It would
>>> also require some kind of organisation, accounting system, and
>>> presumably some way of distributing monies to help ISPs pay for
>>> bigger DFZ routers.
>> I was trying to say that you could charge edge networks for sending BGP
>> updates to their providers, not charge in the core. As we've discussed,
>> this is where most of the updates originate. These BGP connections are
>> manually configured, and, AFAIK, generally need to match up with the
>> physical connection that the customer's data flows across. So it seems
>> to me that's pretty hard to fake.
>>
>>> I figure it could be done.  If this was all that was required to fix
>>> the routing scaling problem, I think people would be working on it.
>>>
>>> However, it doesn't alter the fact that the BGP system can't
>>> reasonably be expected to scale to as many prefixes as there are
>>> end-user networks which need portability and multihoming.
>>>
>>> Fees for each BGP prefix and for each update would help deter those
>>> who advertise and change their advertisement of prefixes for
>>> arguably spurious reasons, so it would marginally reduce the scaling
>>> problem.  However it wouldn't solve the scaling problem assuming we
>>> want to have 10 million, 100 million or whatever end-user networks
>>> with portable, multihomable address space.
>> So, if I am understanding correctly, your claim is the following: Ivip
>> solves routing scalability by (a) proactively distributing the same edge
>> network topology and reachability information as BGP, except more
>> efficiently, and (b) enforcing a charge per update?
> 
> Yes, but I never use the word "proactive".  Ivip is clearly a very
> different system than BGP - it is an overlay system for the
> interdomain network which happens to run BGP.  The fast push system
> should be a lot more efficient than BGP's approach of conversations
> between neighbours about individual prefixes potentially resulting
> in changed decisions in each router, and therefore changed
> announcements by that router.
> 
> Ivip is intended to give each end-user network effectively real-time
> control of how their space is split into micronets - which are any
> contiguous range of addresses, not just binary boundary prefixes as
> with BGP and the other core-edge separation schemes.  The real-time
> control enables the packets to be sent to any ETR in the world, and
> for this to be changed in a matter of seconds for a small fee.
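
To illustrate the difference between an arbitrary contiguous range and
"binary boundary" prefixes, the standard ipaddress module can list the
CIDR prefixes needed to cover a range. The address range below is a
hypothetical example (from the 192.0.2.0/24 documentation block), not a
micronet from any Ivip document.

```python
import ipaddress

# Hypothetical micronet: 16 contiguous addresses not aligned to a prefix.
first = ipaddress.ip_address("192.0.2.10")
last = ipaddress.ip_address("192.0.2.25")

# summarize_address_range() yields the minimal list of CIDR prefixes
# that exactly covers first..last inclusive.
prefixes = list(ipaddress.summarize_address_range(first, last))

for prefix in prefixes:
    print(prefix)
print(len(prefixes), "prefixes needed to express one contiguous range")
```

A scheme limited to prefixes needs four mapping entries for this
example range, where a start-and-length micronet needs one.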
> 
> In principle, end-user networks can have real-time control over
> their existing PI prefixes, but BGP is slower at propagating the
> changes over longer distances, and each change burdens potentially
> thousands of routers, in principle perhaps every router in the DFZ,
> which is not reasonable or scalable.
> 
> This real-time, direct, control over the tunneling of traffic
> packets is not the aim of APT or LISP-NERD.  Those systems let the
> end-user specify ETRs and preferences, and have the ITRs (or Default
> Mappers) make the decisions from moment-to-moment.  LISP-ALT and
> TRRP enable the end user to change their mapping as often as they
> like, but unless there are unscalably short caching times on the
> mapping replies, this does not translate into real-time control of
> the ITRs in the way Ivip is intended to achieve.
> 
> 
>>>>>>> Your description:
>>>>>>>
>>>>>>>    LISP-CONS and LISP-ALT build a DNS-like hierarchical overlay to
>>>>>>>    retrieve mapping data when needed.
>>>>>>>
>>>>>>> strikes me as wrong.  Neither has much resemblance to DNS.  ALT is a
>>>>>>> completely separate network, with its own BGP instance, using a
>>>>>>> different but parallel address space, for sending mapping queries,
>>>>>>> which are typically actually traffic packets.
> 
> . . .
> 
>>>>> In the ALT system, a single query gets to the appropriate server -
>>>>> there is no recursion.
>>>> I think you mean, there is *only* recursion, in the sense of recursive
>>>> DNS queries.
>>> I mean "no recursion".
> 
> . . .
> 
>>> LISP-ALT has no concept of recursion, as far as I know.
>> Ok, I see what you mean.
> 
> OK!
> 
> 
> 
>>>> No, we don't rely on ICMP in APT. We have our own control messages that
>>>> are generated and processed at DRs and DMs. I believe the details (at
>>>> least of previous versions of our failure handling) are described in
>>>> most (if not all) of the APT-related documents.
>>> OK - there is a message from the ETR to the DM of the sending
>>> network that the ETR can't reach the destination network:
>>>
>>>   http://tools.ietf.org/html/draft-jen-apt-01#section-11.4
>>>
>>> If the ETR's unreachability is reflected in its BGP prefix no longer
>>> being reachable, this is handled in section 6.1.1.
>>>
>>> But what if there is some network failure between the ITR and the
>>> ETR?  Section 6.1.2 handles this, on the assumption that the ITR can
>>> send packets to the network in which the ETR resides, and that the
>>> default mapper there, working with the internal routing system, will
>>> be able to detect that the ITR sent a packet to the no-longer
>>> reachable ETR.
>>>
>>> OK - I see how you do this without relying on ICMP messages.  You do
>>> however rely on:
>>>
>>> 1 -  BGP to tell the sending network's border router, and therefore
>>>      the ITR, that the ETR's network is unreachable.
>> As is true in the Internet today.
> 
> Yes.
> 
>>> 2 -  The DM in the ETR's network to tell the ITR (or the ITR's DM?)
>>>      that the ETR is unreachable.
>>>
>>> 3 -  The reachable ETR to tell the ITR (or the ITR's DM?) that the
>>>      destination network is unreachable.
>>>
>>>
>>> I think 3 should be OK - and probably 2.
>>>
>>> However, there can be long delays across the Net with BGP
>>> propagating a notion of unreachability.  The destination network
>>> could go off the air and nearby routers cancel their advertisements,
>>> but other routers think their neighbour has a path, and advertise
>>> that path's length.  This does not necessarily get propagated
>>> quickly, due to delays in each router, including if there is a
>>> flurry of such announcements when a major link dies.  Also, there is
>>> MRAI path hunting, depending on the structure of the routers.
>>>
>>> http://www.firstpr.com.au/ip/sram-ip-forwarding/#BGP_hunting_MRAI_disc
>>>
>>> which can delay the propagation of an unreachable condition by ~30
>>> seconds for however many depths of this process there are.
>> All of this is true, but APT isn't meant to fix or avoid BGP's problems,
>> just to limit how much of the network BGP routers have to deal with.
> 
> The same is true of Ivip.
> 
> In both Ivip and APT, we are relying on the BGP system to maintain
> connectivity between ISPs.  No-one has an incrementally deployable
> alternative to BGP for this task, and my impression is that it does
> the job very well.  (The MRAI timer path hunting problem could
> be fixed, in my opinion - but BGP is a very deep thing and I don't
> know much about it.)
> 
> Ivip has a potential advantage over APT regarding rapid response to
> reachability problems.  This example is perhaps a little contrived -
> but it illustrates the point.
> 
> 
> Let's say end-user network N1 is multihomed to ISP-1 and ISP-2.  It
> is currently getting its packets via ISP-1.  In APT, this means that
> ISP-1's ETR's address is the one in the mapping with top priority.
> In Ivip, this means that the last mapping update N1 sent for its
> micronet was to map it to ISP-1's ETR.
> 
> Now let's say ISP-1's border router dies, or some link dies or
> whatever.  Let's also assume that there is an ugly arrangement of
> routers near ISP-1 such that some ISP far away - ISP-9 - doesn't
> find out via BGP that ISP-1 is unreachable until 2 minutes after
> the problem occurs.  This could be 4 levels of 30 second MRAI timer
> path hunting, or some other delays in BGP.  In particular it could
> be many levels of BGP router coping with a flood of changes for
> hundreds of thousands of prefixes affected by the same outage
> which affects ISP-1.
> 
> Hosts in any edge network which relies on ISP-9's ITRs and Default
> Mappers are going to be sending their packets to a black hole for
> these two minutes, because ISP-1's ETR is unreachable, and because
> ISP-9's DM and ITRs haven't yet figured this out, due to the BGP
> delays.
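
To put rough numbers on the black-hole window in the example above, here
is a minimal Python sketch.  It assumes the simplified model used in the
example, where each level of path hunting adds one full MRAI interval of
about 30 seconds.  The function names and the traffic rate of 100 packets
per second are illustrative assumptions, not measurements of real BGP
behaviour.

```python
def worst_case_withdrawal_delay(levels: int, mrai_seconds: float = 30.0) -> float:
    """Upper bound on delay if every path-hunting level waits one full MRAI interval."""
    if levels < 0:
        raise ValueError("levels must be non-negative")
    return levels * mrai_seconds


def packets_black_holed(delay_seconds: float, packets_per_second: float) -> float:
    """Packets sent towards the dead ETR before the sender learns it is unreachable."""
    return delay_seconds * packets_per_second


delay = worst_case_withdrawal_delay(4)      # the "4 levels" from the example
print(delay)                                # 120.0 seconds, i.e. 2 minutes
print(packets_black_holed(delay, 100.0))    # 12000.0 (assumed 100 packets/s)
```
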
> 
> You might at this point decide to change APT to rely on ICMP
> messages, but you would need a way of securing those, to prevent
> spoofers.  That would involve either a nonce and therefore extra
> processing and packet length overhead in every traffic packet,
> and/or extra processing and overhead due to probe packets at some
> rate which would detect the loss of reachability in less than 2
> minutes.  You couldn't very well pepper the ETR with probes just
> because you got an unsecured ICMP destination unreachable packet,
> since that opens up a DDoS pathway.
> 
> There would be major scaling difficulties with ITRs frequently
> testing reachability to ETRs, since the one ETR might be getting
> such probes from tens of thousands of ITRs.  Whatever you do to
> determine reachability, it needs to be hard coded into the ITR and
> DM functions and also into the ETR functions.  You don't have a way
> of letting N1's administrators directly control where ITRs tunnel
> their packets, depending on *their* ideas of reachability and
> whatever it is they want to do regarding packets coming in via the
> ETRs of ISP-1 and ISP-2.
> 
> 
> In the Ivip setting, I assume that N1's administrator hires some
> specialised company MRD (Multihoming Reachability and Decision-making
> Inc.) to continually monitor the reachability of its network via the
> two ISPs' ETRs.   MRD has sites all over the world for this purpose,
> and sends a stream of nonce-protected probe packets from all these
> sites, to some node in N1 (probably multiple nodes, for redundancy)
> which acknowledge the probe with a packet containing the nonce.
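
The nonce-protected probe exchange described above could look roughly
like the following Python sketch.  It only models the nonce bookkeeping,
with no real packets or sockets.  The class and function names are
hypothetical, and the 16-byte nonce length is an assumption.

```python
import secrets


class Prober:
    """Probing site: remembers the nonces it sent and checks acknowledgements."""

    def __init__(self) -> None:
        self._outstanding = set()

    def make_probe(self) -> bytes:
        # A fresh random nonce that an off-path spoofer cannot guess.
        nonce = secrets.token_bytes(16)
        self._outstanding.add(nonce)
        return nonce

    def accept_ack(self, ack: bytes) -> bool:
        # Accept only an acknowledgement that echoes an outstanding nonce,
        # and accept each nonce at most once (so replays are rejected).
        if ack in self._outstanding:
            self._outstanding.discard(ack)
            return True
        return False


def responder(probe: bytes) -> bytes:
    """Node inside N1: acknowledges a probe by echoing its nonce."""
    return probe


prober = Prober()
probe = prober.make_probe()
print(prober.accept_ack(responder(probe)))   # True: genuine acknowledgement
print(prober.accept_ack(probe))              # False: replayed acknowledgement
```
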
> 
> N1's administrators are free to choose the frequency of these probe
> packets, and so trade off the traffic and load they carry - which
> could be small - against how quickly MRD's system could decide that
> reachability had been lost.  MRD provides a sophisticated language,
> or series of options, by which N1's administrators specify what
> criteria are used for deciding such things as:
> 
>    ETR-1 is unreachable, so change the mapping to ETR-2 if it is
>    reachable.
> 
>    After an outage, ETR-1 is reachable for a sufficiently long
>    period of time that it is best to change the mapping back to
>    ETR-1 again.
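
The two criteria above can be sketched as a small state machine.  This is
only an illustration of the kind of logic N1's administrators might
specify.  The class name, the 4-second failover threshold and the
60-second failback threshold are assumptions, not part of any Ivip
specification.

```python
from dataclasses import dataclass


@dataclass
class MappingPolicy:
    fail_after: float = 4.0       # seconds without acks before failover (assumed)
    restore_after: float = 60.0   # seconds of restored reachability before failback (assumed)
    active: str = "ETR-1"
    _down_for: float = 0.0
    _up_for: float = 0.0

    def observe(self, etr1_ok: bool, etr2_ok: bool, dt: float = 1.0) -> str:
        """Feed one probe interval of results; return the ETR the mapping should use."""
        if etr1_ok:
            self._up_for += dt
            self._down_for = 0.0
        else:
            self._down_for += dt
            self._up_for = 0.0

        if self.active == "ETR-1":
            # Fail over only if ETR-1 has been down long enough AND ETR-2 is reachable.
            if self._down_for >= self.fail_after and etr2_ok:
                self.active = "ETR-2"
        elif self._up_for >= self.restore_after:
            # Fail back only after ETR-1 has been reachable for a sustained period.
            self.active = "ETR-1"
        return self.active


policy = MappingPolicy()
for _ in range(3):
    policy.observe(etr1_ok=False, etr2_ok=True)   # a 3-second glitch is ignored
print(policy.active)                              # ETR-1
policy.observe(etr1_ok=False, etr2_ok=True)       # fourth second of failure
print(policy.active)                              # ETR-2
```
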
> 
> N1's space could be split into multiple micronets, for the purpose of
> load sharing over the two ETRs.  MRD's system is in charge of the
> mapping of these micronets, and it could also communicate with nodes
> in N1 which report on traffic loads, congestion etc. so that by
> specified decision criteria, MRD would dynamically adjust the mapping
> of multiple micronets to load share the traffic however N1's
> administrators choose.  This could be an automated process,
> or N1's administrators could take manual charge for a while.  MRD has
> the username and password it needs to change N1's mapping, via the
> RUAS (or some other related company) which handles the mapping for all
> micronets in the Mapped Address Block which N1's Scalable PI space is
> part of.  N1 pays for these updates, so N1's administrators will
> optimise the decision logic they specify for MRD's operations, to avoid
> too much chopping and changing, but to achieve whatever multihoming
> service restoration and TE goals they desire.
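
As one example of such decision criteria, here is a minimal Python sketch
of load sharing micronets across two ETRs.  It uses a simple greedy rule
(heaviest micronet first, onto the least loaded ETR), which is my own
choice for illustration.  The micronet prefixes and load figures are
made up, and a real policy would also weigh the cost of each mapping
update.

```python
from typing import Dict, List


def share_load(micronet_load: Dict[str, float], etrs: List[str]) -> Dict[str, str]:
    """Greedy assignment: map each micronet, heaviest first, to the least loaded ETR."""
    totals = {etr: 0.0 for etr in etrs}
    mapping = {}
    by_load = sorted(micronet_load.items(), key=lambda item: item[1], reverse=True)
    for micronet, load in by_load:
        target = min(etrs, key=lambda etr: totals[etr])
        mapping[micronet] = target
        totals[target] += load
    return mapping


# Hypothetical micronets (documentation prefixes) with reported loads in Mbit/s.
loads = {"192.0.2.0/26": 50.0, "192.0.2.64/26": 30.0, "192.0.2.128/26": 20.0}
print(share_load(loads, ["ETR-1", "ETR-2"]))
# {'192.0.2.0/26': 'ETR-1', '192.0.2.64/26': 'ETR-2', '192.0.2.128/26': 'ETR-2'}
```
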
> 
> In this setting, assuming N1 has got MRD testing reachability every
> second or two, MRD will be able to detect ETR-1 becoming unreachable
> from some, many or all of its probing sites within a second or two.
> Depending on how N1 sets up the logic, for instance to ignore 3
> second glitches, but to change the mapping to ETR-2 if there is
> failure to acknowledge probes for 4 seconds, MRD could change the
> mapping within a few seconds, and the Ivip system propagates that
> change to all ITRs which need it within another 2 or 3 seconds.
> 
> This doesn't rely on BGP or ICMP messages at all.  Probably MRD would
> ramp up the number of probes from one a second, to 10 a second, if one
> or two seconds' worth were not acknowledged.  That would avoid
> changing the mapping due to just a few unfortunately lost probe or
> acknowledgement packets in succession.
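
The effect of ramping up the probe rate can be shown with a short Python
sketch.  It counts how many probes go unacknowledged before the mapping
is changed, stepping in whole seconds.  The function name and default
thresholds (ramp after 1 second, fail over after 4 seconds) are
assumptions chosen to match the figures in the text.

```python
def probes_before_failover(fail_after: int = 4, ramp_after: int = 1,
                           base_rate: int = 1, burst_rate: int = 10) -> int:
    """Count probes sent (and lost) between the last ack and the mapping change."""
    sent = 0
    for elapsed in range(fail_after):          # whole seconds since the last ack
        rate = burst_rate if elapsed >= ramp_after else base_rate
        sent += rate
    return sent


print(probes_before_failover())                # 31 probes with the ramp-up
print(probes_before_failover(ramp_after=4))    # 4 probes if the rate never ramps up
```

With the ramp-up, the decision to change the mapping rests on 31 lost
probes rather than 4, for the same 4-second detection time.
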
> 
> Multiple companies such as MRD would exist, so any network such as
> N1 would have the choice between a number of flexible, potentially
> highly sophisticated, distributed systems for testing the
> reachability of their network via various ETRs and for changing
> the mapping accordingly.
> 
> Even if this Ivip system wasn't inherently faster than APT's
> reliance on BGP, the separation of this function from the
> core-edge separation scheme is an important benefit, since it
> scales better than having 10,000 ITRs trying to determine
> reachability to one ETR, and because networks would often want
> more control over mapping than whatever could be provided in
> APT's (or LISP's or TRRP's) fixed functionality for ITRs and ETRs.
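
A back-of-the-envelope comparison of the probe load in the two cases, as
a Python sketch.  The 10,000 ITRs figure is from the text.  The 20
monitoring sites, the 1 probe per second rate and the 64-byte probe size
are assumptions for illustration only.

```python
def probe_rate_at_target(probers: int, rate_per_prober: float) -> float:
    """Aggregate probes per second arriving at the probed ETR or network."""
    return probers * rate_per_prober


def bandwidth_bps(packets_per_second: float, packet_bytes: int = 64) -> float:
    """Bits per second consumed by probes of the given size."""
    return packets_per_second * packet_bytes * 8


itr_pps = probe_rate_at_target(10_000, 1.0)   # every ITR probes the ETR itself
mrd_pps = probe_rate_at_target(20, 1.0)       # hypothetical 20 monitoring sites

print(itr_pps, bandwidth_bps(itr_pps))        # 10000.0 5120000.0
print(mrd_pps, bandwidth_bps(mrd_pps))        # 20.0 10240.0
```
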
> 
>   - Robin
> 
> 
