Re: [RRG] Taxonomy: 25 questions - updates & web page

To: "Robin Whittle" <rw@firstpr.com.au>
Subject: Re: [RRG] Taxonomy: 25 questions - updates & web page
From: "William Herrin" <bill@herrin.us>
Date: Thu, 24 Apr 2008 18:43:36 -0400
Cc: "Routing Research Group" <rrg@psg.com>
In-reply-to: <480C08AA.2050903@firstpr.com.au>
References: <47F2CD78.2010806@firstpr.com.au> <3c3e3fca0804012107q7851011av60095f572a56aff@mail.gmail.com> <480C08AA.2050903@firstpr.com.au>

On Sun, Apr 20, 2008 at 11:23 PM, Robin Whittle <rw@firstpr.com.au> wrote:
>  I added to both the LISP-ALT and TRRP sections saying that each
>  proposal was probably not "pure pull", because both had planned
>  extensions by which the ITRs handling traffic packets for some
>  ETR could be given, or induced to seek, freshly updated mapping
>  information.
>
>  For TRRP, I linked to your description of the 2 methods and to my
>  critique of these:
>
>   http://bill.herrin.us/network/trrp-preempt.html
>   http://psg.com/lists/rrg/2008/msg00532.html
>
>  I think there is more food for discussion in this, and I also want
>  to find out more about what Dino said about ETRs prompting ITRs
>  to request fresh mapping. (Friday 14th March RRG meeting - no audio
>  archive, and my live recording ended in the middle of this part of
>  Dino's presentation.)

Hi Robin,

I think there's an advantage to the ETR being able to tell the ITR
that it is no longer a valid selection for a particular destination.
That having been said, there are a number of cautions:

1. The ETR can't be relied upon to correctly tell the ITR anything. As
near as I can tell, most if not all protocols that rely on an
intermediate node telling the originator, or some other intermediate
node, about its own failure tend to run into trouble.

Perhaps an example where it almost works would be more instructive
than the many discarded protocols where it fails outright: Path MTU
discovery. Need I say more?

2. In a pull-based protocol, I think there is some value in making
notification elements like this optional. Doing so both allows
compliant implementations that aren't excessively complex and serves
to remind us that the preemptive notification is a frequently useful
speed-up, not a reliable core part of the protocol.

>  > TRRP has a notification system: http://bill.herrin.us/network/trrp-preempt.html
>  >
>  > Notification is an optional component. The system is designed to work
>  > acceptably without it but work faster with it.
>  >
>  > Without notifications, the maximum effective change rate per map is
>  > about once every 10 seconds. That interval has enough overhead that
>  > folks who are just doing multihoming service restoration will normally
>  > back it off further. For every communication where notification is
>  > supported at both the ITR and map server, TRRP can in the average case
>  > switch to the new map within a fraction of a second after the map is
>  > published.
>
>  I included your text, and noted that I don't yet fully understand it.

The speed with which a TRRP map can be updated depends most directly
on the originator of the map and the completely arbitrary timeout
value he gives it. My comments above were primarily an observation
that once you get below about 10 seconds, other factors besides that
TTL start to provide a noticeable percentage of the update delay.

When the preemptive notifications are implemented and the notification
packet isn't lost, cutover to the updated map will be faster, often
less than a second, regardless of the timeout. The preemptive
notifications should usually work but they're not guaranteed to.
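
A minimal sketch of the two mechanisms described above, as they might look
in an ITR's map cache. This is an illustration only, not TRRP code: the
class names and the fetch_map callable are hypothetical, and the real
protocol's message formats are not modelled.

```python
from dataclasses import dataclass


@dataclass
class CachedMap:
    etrs: list         # ETR addresses from the map reply
    fetched_at: float  # time (seconds) at which the map was fetched
    ttl: float         # timeout chosen by the originator of the map


class ItrMapCache:
    """Hypothetical ITR-side cache: timeout re-request plus optional notify."""

    def __init__(self, fetch_map):
        # fetch_map is an assumed callable: prefix -> (etrs, ttl_seconds)
        self._fetch_map = fetch_map
        self._cache = {}

    def lookup(self, prefix, now):
        """Reliable path: re-request the map once its timeout has expired."""
        entry = self._cache.get(prefix)
        if entry is None or now - entry.fetched_at >= entry.ttl:
            etrs, ttl = self._fetch_map(prefix)
            entry = CachedMap(list(etrs), now, ttl)
            self._cache[prefix] = entry
        return entry.etrs

    def notify(self, prefix):
        """Optional speed-up: a notification drops the cached entry so the
        next lookup re-requests at once. If the notification packet is
        lost, the timeout path above still picks up the change."""
        self._cache.pop(prefix, None)
```

With a 10 second timeout and no notification, a change published just
after a fetch is picked up roughly 10 seconds later; with a notification
that arrives, it is picked up on the next packet.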

>  > or http://lists.arin.net/pipermail/ppml/2008-March/010475.html
>  > comes in to play and he'll  pay the aggregator.
>
>  I included your text, but have not read the PPML message yet.

This was a brainstorm on how to implement long prefixes without the
"everybody else pays" problem. It's not a part of TRRP or any other
proposal.

One idea has the long PI prefixes paying into a fund and all backbones
who agree to carry those prefixes receiving distributions from the
fund.

The other implements a partially-meshed Internet in which the
operators of long prefixes make private arrangements with the dozen or
so providers necessary to get back to the tier-1's who advertise a
supernet prefix. They still get full connectivity and most of the
multihomed reliability but the routes themselves only propagate nearby
and only carry the cost of propagating nearby.

Both were intended to be approaches that could be implemented without
new routing protocols.

>  Doesn't the mapping information give the ITR options for ETRs to use
>  when one fails?  I thought it was like LISP or APT in this respect.

Yes, it does. The map can specify multiple ETRs, multiple
encapsulation protocols, and a ranked priority for each.

>  Since I don't think either of your Notification systems is reliable,

Correct; only the timeout-rerequest is deemed reliable. The ITR
regularly re-requests map entries from the authoritative map server
for any communications which are ongoing. At that time it picks up any
map changes designated by the originator of the map, which will
include any changes because of link failures or whatever.

The connectivity or routing tests that the originator uses to
determine what maps to include in the map reply are beyond the scope
of the TRRP protocol itself. It could range from "I'll change it by
hand if I need to" to an Akamai-like system that takes packet loss and
other complex factors into account.

>  and since you say TRRP is supposed to work OK without them, I don't
>  see how an ITR can decide which alternative ETR to use if that ETR
>  goes down, since the ITR doesn't have a reliable way of knowing
>  the ETR is down, or that the ETR is not able to reach the
>  destination network, or any other reason why the end-user wants the
>  ITRs to use a different ETR.

IF the ITR receives a preemptive (early) notification from the ETR (or
a router in the path to the ETR) that a problem has occurred, the ITR
can decide prior to the map update to fall back on a lower priority
ETR. Regardless of that decision, it will still seek the originator's
updated map and follow the originator's instructions once it gets
them.
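
The behaviour described above can be sketched as follows. This is a
hedged illustration, not the TRRP specification: MapEntry, EtrSelector,
the encapsulation names and the "lower number is more preferred"
convention are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapEntry:
    etr: str       # ETR address
    encap: str     # encapsulation protocol name (placeholder values)
    priority: int  # assumption: lower number means more preferred


class EtrSelector:
    """Hypothetical ITR logic: the originator's map always wins, but a
    preemptive notification may cause an early fallback."""

    def __init__(self, entries):
        self._entries = sorted(entries, key=lambda e: e.priority)
        self._suspect = set()

    def on_preemptive_notification(self, etr):
        # Unreliable hint that this ETR has a problem; remember it until
        # the next authoritative map arrives.
        self._suspect.add(etr)

    def on_map_update(self, entries):
        # The originator's updated map replaces the old one and overrides
        # any local guesses made from notifications.
        self._entries = sorted(entries, key=lambda e: e.priority)
        self._suspect.clear()

    def choose(self):
        # Prefer the best-ranked ETR that has not been reported as failed.
        for entry in self._entries:
            if entry.etr not in self._suspect:
                return entry
        # Every ETR is suspect: fall back to the originator's first choice.
        return self._entries[0] if self._entries else None
```

The selector never decides reachability on its own authority; without a
notification it simply follows the map until the next re-request.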

This is almost the exact opposite of, "the map-encap system must do
all reachability testing and make all decisions."

>  >>  Does the proposal tackle the PMTU and Fragmentation problem?
>  >>
>  >>   TRRP       No.
>  >
>  > Yes, it does. See http://bill.herrin.us/network/trrp.html subheading
>  > "Fragmentation."
>
>  I added:
>
>    Bill refers to the Fragmentation section of:
>
>       bill.herrin.us/network/trrp.html
>
>    but this looks inadequate to me.
>
>    For instance, there is no suggestion of how the ITR
>    can discover the Path MTU to each ETR.

Bog-standard path MTU discovery: an overlarge packet with the DF flag
set induces a too-big message back to the ITR.
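
A rough sketch of the bookkeeping this implies on the ITR, assuming IPv4
and a per-ETR table. The class and method names are hypothetical; only
the idea (lower the recorded path MTU when a too-big message reports a
smaller next-hop MTU) comes from ordinary path MTU discovery.

```python
IPV4_MIN_MTU = 68  # smallest MTU an IPv4 link is allowed to have


class EtrPathMtu:
    """Hypothetical per-ETR path MTU table kept by an ITR."""

    def __init__(self, default_mtu=1500):
        self._default = default_mtu
        self._mtu = {}

    def path_mtu(self, etr):
        return self._mtu.get(etr, self._default)

    def on_too_big(self, etr, next_hop_mtu):
        # A too-big message can only lower the estimate, never raise it.
        new_mtu = max(IPV4_MIN_MTU, min(self.path_mtu(etr), next_hop_mtu))
        self._mtu[etr] = new_mtu
        return new_mtu

    def max_inner_packet(self, etr, encap_overhead):
        # Largest original packet that fits in the tunnel unfragmented.
        return self.path_mtu(etr) - encap_overhead
```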

>    Also, the suggestion that the ITR alter traffic
>    packets to adjust TCP MSS in any SYN packets sounds
>    really undesirable and impractical to me.

It's one of the solutions Cisco came up with in their
currently-deployed GRE implementation in order to deal with broken
PMTUD. If you know that packets beyond a certain size are likely to be
a problem, just tweak the MSS option that occurs near the start of a
TCP connection so that you never get packets that large.

Obviously there are pluses and minuses to this, which is why I offer
it as a recommendation rather than a requirement.
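
For illustration, the MSS tweak can be sketched like this. It is a
sketch under stated assumptions, not the TRRP text or any vendor's
implementation: IPv4 with 20-byte IP and TCP headers, an encapsulation
overhead passed in by the caller, and hypothetical function names. A
real ITR would also have to recompute the TCP checksum after the edit.

```python
import struct

IPV4_HEADER = 20
TCP_HEADER = 20


def max_safe_mss(path_mtu, encap_overhead):
    """Largest MSS whose segments still fit the tunnel without fragmenting."""
    return path_mtu - encap_overhead - IPV4_HEADER - TCP_HEADER


def clamp_mss_option(options, max_mss):
    """Return the TCP options from a SYN with the MSS option lowered to
    max_mss when it is larger. Options are walked kind by kind."""
    out = bytearray(options)
    i = 0
    while i < len(out):
        kind = out[i]
        if kind == 0:        # End of Option List
            break
        if kind == 1:        # No-Operation, a single byte
            i += 1
            continue
        if i + 1 >= len(out):
            break            # truncated option, stop
        length = out[i + 1]
        if length < 2 or i + length > len(out):
            break            # malformed option, stop
        if kind == 2 and length == 4:   # Maximum Segment Size
            (mss,) = struct.unpack_from("!H", out, i + 2)
            if mss > max_mss:
                struct.pack_into("!H", out, i + 2, max_mss)
        i += length
    return bytes(out)
```

With a 1500-byte path and 24 bytes of GRE-over-IPv4 overhead,
max_safe_mss(1500, 24) gives 1436, so a SYN advertising 1460 would be
rewritten to 1436 and smaller values left alone.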

>     Bill discussed TRRP's support for mobility:
>
>        psg.com/lists/rrg/2008/msg00766.html
>
>     but I don't yet know how fast it could get fresh
>     mapping to all ITRs, since I don't believe either
>     of the TRRP approaches to Notification are really
>     useable.

Assuming "fast"=fraction of a second and "not fast"=10 seconds, the
point of my linked article is that TRRP doesn't need to distribute the
update "fast" in order to support mobility. All it needs to do is
communicate with both the approaching and receding towers during the
overlap period.

Since only the ITRs actively communicating with me care what my map
is, and they will re-lookup my map at whatever interval I, the
originator of the map, tell them to, I can guarantee that those ITRs
will get the new map between the time I decide that a tower is
receding and the time that it's no longer reachable to me.
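
The timing argument above reduces to a simple inequality, sketched here
with hypothetical names. It assumes the worst case is an ITR that
fetched the old map just before the change, so it waits one full timeout
and then needs one lookup round trip to obtain the new map.

```python
def worst_case_switchover(ttl_s, lookup_s):
    """Longest time until an active ITR is using the new map."""
    return ttl_s + lookup_s


def handoff_is_safe(ttl_s, lookup_s, overlap_s):
    """True if every active ITR re-requests the map before the receding
    tower becomes unreachable, i.e. within the overlap period."""
    return worst_case_switchover(ttl_s, lookup_s) <= overlap_s


def max_ttl_for_overlap(overlap_s, lookup_s):
    """Largest timeout the originator can set for a given overlap period."""
    return max(0.0, overlap_s - lookup_s)
```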

>  How fast could a mobile TRRP end-user could have their mapping
>  changed when they want to use a new TTR?

As previously discussed, a user who knows he's mobile should be able
to set a timeout that guarantees update propagation within 10 seconds.

>  My understanding is that
>  without a reliable Notify system, you would depend on short caching
>  times for your pull mapping responses, which would be extremely
>  onerous.

My back-of-the-napkin calculations suggest that an active mobile
station would have as much as a 5% traffic overhead in order to keep
the mapping updated and propagate the changes to the interested ITRs
when the delay is dropped to 10 seconds.

Is that onerous? That depends on your perspective. A CD-ROM has about
a 15% overhead for error correction bits. Satellite Internet has
similar error correction overheads. That's simply what makes it work.

If it takes a 5% traffic overhead, mostly concentrated in the
land-bound fiber optic lines, in order to make mobility work smoothly
and eliminate the handoff problem, that's not necessarily a
dealbreaker.
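
One way to frame such an estimate, as a sketch only. It does not
reproduce the back-of-the-napkin calculation above: the model (each
active ITR re-requests the map once per timeout) and every parameter
name are assumptions, and the inputs are placeholders to be replaced
with measured values.

```python
def refresh_overhead(active_itrs, bytes_per_refresh, ttl_s,
                     payload_bytes_per_s):
    """Map-refresh traffic as a fraction of payload traffic.

    active_itrs          ITRs currently re-requesting this map
    bytes_per_refresh    request plus reply size for one re-request
    ttl_s                timeout set by the originator of the map
    payload_bytes_per_s  user traffic exchanged with the mobile station
    """
    refresh_bytes_per_s = active_itrs * bytes_per_refresh / ttl_s
    return refresh_bytes_per_s / payload_bytes_per_s
```

In this model the overhead is inversely proportional to the timeout, so
halving the timeout doubles the refresh traffic.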

>  >>  What is the granularity of address management?
>  >>   APT        } Not yet clear, but I think they can all support
>  >>   LISP-NERD  } EID prefixes starting at any IPv4 address or
>  >>   LISP-ALT   } IPv6 /64, in CIDR prefix format and therefore
>  >>   TRRP       } powers of two lengths on binary boundaries.
>  >
>  > TRRP: primary granularity is one IP address. Administrative grouping
>  > can be arbitrary; its an implementation detail in the map server.
>  > Efficiencies are gained at power-of-two boundaries, 4-bit boundaries
>  > and 8-bit boundaries.
>
>  I added your text and the note:
>
>    I think this relates to how many DNS servers the
>    ITR needs to query before one of them responds with
>    authoritative mapping.

Generally no. It's more a question of how many queries the ITR has to
make to get the full span of IP addresses covered by the map and how
many map slots the result will take in the ITR's lookup table. If it's
not on a power-of-two boundary then the full span of addresses will
take multiple slots in the lookup table and if it's not on a DNS
boundary (4 or 8 bit) then the full span will take multiple lookups.
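
A small sketch of that counting, using Python's standard ipaddress
module. The slot count follows directly from CIDR arithmetic; the lookup
count rests on an assumption made for this example, namely that one
query covers one block whose prefix length is a multiple of the DNS
boundary (4 or 8 bits). The function names are hypothetical.

```python
import ipaddress


def table_slots(first, last):
    """CIDR blocks (lookup-table slots) needed to cover first..last."""
    return list(ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)))


def lookups_needed(blocks, step=4):
    """Queries needed if each query covers one block aligned to `step` bits
    (assumed model of the DNS boundary, not taken from the TRRP spec)."""
    total = 0
    for net in blocks:
        # Round the prefix length up to the next multiple of `step`.
        aligned = -(-net.prefixlen // step) * step
        total += 2 ** (aligned - net.prefixlen)
    return total
```

For example, 192.0.2.0 through 192.0.2.255 is a single /24: one slot and,
in this model, one lookup. The 96-address span 192.0.2.0 through
192.0.2.95 is not a power of two, so it takes two slots (a /26 and a
/27) and several lookups.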

>  I added your text with the note:
>
>     TRRP, like Ivip, enables the sending host to be its
>     own ITR, as long as it is on a public IP address.
>
>     I think this includes addresses which are mapped by
>     TRRP or Ivip.

Correct.

>  Do you expect all packets from Sending Hosts outside a given
>  TRRP-mapped edge network to flow through an ITR and the ETR,
>  including those sent from Sending Hosts in the same ISP network the
>  ETR is located in?

No.

I also expect it'll be asymmetric: if a TRRP-mapped host communicates
with a non-mapped host, the packets from the TRRP host will be bare on
the Internet while the packets returning from the non-mapped host will
have to find an ITR in order to get to the TRRP-mapped host.

>  > Note #2: ICMP unreachables as commonly implemented include enough of
>  > the original packet that the ITR can translate it into valid
>  > unreachable for the source with no particular heroics. Not every case
>  > to be sure and the standards don't require it, but when I looked at
>  > the packets they usually had enough data.
>
>  I added your text, with a note:
>
>    If this is true to a large extent, then maybe the same
>    is true for Packet Too Big messages from these same
>    routers.  If so, then this would be contrary to my
>    arguments about it being impractical to have ITRs
>    respond to PTBs from the tunnel in a way which
>    reliably created a PTB the Sending Host would
>    recognise:
>
>  http://www.firstpr.com.au/ip/ivip/pmtud-frag/
>  http://www.ietf.org/mail-archive/web/ram/current/msg01766.html
>  http://www.ietf.org/mail-archive/web/ram/current/msg01769.html

I haven't rigorously explored this. I spent a few minutes poking at it
with tcpdump and noticed that instead of seeing just the N bytes of
payload that my TCP/IP Illustrated book told me to expect for an ICMP
unreachable, I was usually seeing payload out to the 500+ byte minimum
MTU cutoff.

Regards,
Bill Herrin

-- 
William D. Herrin ................ herrin@dirtside.com bill@herrin.us
3005 Crane Dr. ...................... Web: <http://bill.herrin.us/>
Falls Church, VA 22042-3004
