
Granularity (was Re: [RRG] ALT + NERD is inelegant & inefficient, compared to APT or Ivip)



Tony Li wrote:
Again, the real question here is about whether we need host granularity or not.
It may be that, rather than one problem/solution match-up to consider, there are two:

dual-homed
multi-homed (N>2)

I'd argue that the requirements of each differ, and there can be significant scaling benefits from splitting them out, and handling each separately.


However, if we do need host granularity, then I think you need about 3 orders of magnitude more scale, and pure push approaches simply won't get you there.
For that reason, I think it would be wise to explore further how to provide "host-like" granularity without going to full host granularity.

So, toward that end, here I go with thoughts about 2-homed and N-homed...

N-homed (N>2)
The presumptions are that a site with >2 homes will have the following characteristics:

  1. EID prefixes assigned from central authority/authorities (e.g. RIRs)
  2. has a network administrator assigning (directly, or via automated
     mechanism) EIDs
  3. handles its own TE issues
  4. handles its own EID->RLOC mapping policies
  5. has delegated PA space from upstreams for RLOCs, with PA blocks
     hierarchically aggregated by upstreams (e.g. by POP)
  6. quantities X of such sites: (current number of ASNs) << X <<
     (number of 2-homed sites)

This can scale well enough by itself for things like NERD.

Similarly, the presumptions and requirements on 2-homed sites could be:

  1. no network administrator - just normal users with minimal
     networking knowledge
  2. prefer not to have "centrally assigned" EID prefixes
  3. want lightweight mechanism for EID->RLOC mapping, preferably automagic
  4. want scalable aggregation of EID->RLOC mappings, if possible
  5. want site EID prefixes, with EID assignment within the site via
     something like DHCP
  6. don't want actual host granularity on EIDs
  7. has delegated PA space from upstreams for RLOCs, with PA blocks
     hierarchically aggregated by upstreams (e.g. by POP)
  8. ability to "authenticate" EID->RLOC mappings to prevent spoofing
  9. needs to scale to arbitrary pairings of upstream ISP, presumably
     based on 32-bit ASNs

Putting these requirements together puts the need for addressing schemes, and enough bits in the address fields, into perspective.

For instance, combining requirements #2 and #3 suggests a preference for a deterministic mapping scheme for (2 x RLOCs)->EID. Adding #9 to this means that EID prefixes must be much longer than 64 bits, so this address space must diverge from the standard IPv6 use of 64-bit Interface Identifiers.

I'll take a stab at how this might look.

Let's assume first, that we are using new prefix space from IPv6.

The prefix space itself would be, for instance, the first 4 bits, such as 4000::/4.
Call this prefix "X:4" (meaning value X, 4 bits long).
(I'll use :N rather than /N, since /N is anchored and absolute in length; :N means "the next N bits", and is relative.)

The ASN-specific RLOC blocks would be constructed as:
X:4 + A:32, forming RLOC-block-A/36, and similarly RLOC-block-B/36, for ASNs A and B.

The actual RLOC values would be prefixes assigned from sub-blocks under this, constructed like: RLOC-block-A:36 + a:28 to form RLOC-A-a/64. RLOC-B-b/64 is created similarly with sub-block b:28. The assignments would presumably be done by ISP A and B, however they choose, probably hierarchically.
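
The RLOC construction above can be sketched in a few lines of Python. This is only an illustration under assumptions: X is taken to be the nibble 0x4 (matching the "for instance" /4 above), and the function names rloc_block and rloc_prefix are made up for this sketch, not part of any proposal.

```python
import ipaddress

X = 0x4  # assumed value of the 4-bit prefix X:4 (illustrative only)

def rloc_block(asn: int) -> ipaddress.IPv6Network:
    """X:4 + A:32 -> RLOC-block-A/36 (hypothetical helper)."""
    if not 0 <= asn < 2**32:
        raise ValueError("ASN must fit in 32 bits")
    # X occupies bits 127..124, the ASN occupies bits 123..92.
    value = (X << 124) | (asn << 92)
    return ipaddress.IPv6Network((value, 36))

def rloc_prefix(asn: int, sub: int) -> ipaddress.IPv6Network:
    """RLOC-block-A:36 + a:28 -> RLOC-A-a/64 (hypothetical helper)."""
    if not 0 <= sub < 2**28:
        raise ValueError("sub-block must fit in 28 bits")
    # The 28-bit sub-block occupies bits 91..64, leaving 64 bits below it.
    value = int(rloc_block(asn).network_address) | (sub << 64)
    return ipaddress.IPv6Network((value, 64))
```

For example, with ASN 64512 and sub-block 1, rloc_block(64512) is 4000:fc0::/36 and rloc_prefix(64512, 1) is 4000:fc0:0:1::/64.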

Now for the tricky part -- Constructing EID prefixes.
This is done as follows:
X:4 + A:32 + B:32 + a:28 + b:28 to form EID-AB-ab/124.
Another EID prefix (for reasons of symmetry) is also created, EID-BA-ba/124.
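
As a rough sketch of the EID prefix layout just described, assuming X is the nibble 0x4 and using a made-up helper name eid_prefix, the two symmetric /124 prefixes could be built like this:

```python
import ipaddress

X = 0x4  # assumed value of the 4-bit prefix X:4 (illustrative only)

def eid_prefix(asn1: int, asn2: int, sub1: int, sub2: int) -> ipaddress.IPv6Network:
    """X:4 + asn1:32 + asn2:32 + sub1:28 + sub2:28 -> /124 EID prefix.

    Hypothetical helper; the first (asn, sub) pair is the preferred upstream.
    """
    if not (0 <= asn1 < 2**32 and 0 <= asn2 < 2**32):
        raise ValueError("ASNs must fit in 32 bits")
    if not (0 <= sub1 < 2**28 and 0 <= sub2 < 2**28):
        raise ValueError("sub-blocks must fit in 28 bits")
    value = (
        (X << 124)        # bits 127..124
        | (asn1 << 92)    # bits 123..92
        | (asn2 << 60)    # bits 91..60
        | (sub1 << 32)    # bits 59..32
        | (sub2 << 4)     # bits 31..4; bits 3..0 are the host part
    )
    return ipaddress.IPv6Network((value, 124))

# A site with sub-block a=3 under ASN A=1 and sub-block b=4 under ASN B=2:
eid_ab = eid_prefix(1, 2, 3, 4)  # prefers A over B
eid_ba = eid_prefix(2, 1, 4, 3)  # prefers B over A
```

Each of the two prefixes covers 16 addresses, since only the low 4 bits remain for the host portion.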

The "knob" for TE purposes, is the choice of which upstream is preferred: EID-AB-ab prefers A over B.

Note that this allows for *lots* of RLOC assignments (and thus end sites), but does so at the expense of the available EIDs per site (absolute max of 16; max usable under normal circumstances is 13).
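
The bit budget behind those numbers can be checked with a few lines of arithmetic. The variable names below are just labels for the fields in the layout above; nothing here goes beyond the field widths already given.

```python
# Field widths, in bits, of the EID prefix layout X:4 + A:32 + B:32 + a:28 + b:28.
FIELD_BITS = {"X": 4, "A": 32, "B": 32, "a": 28, "b": 28}

EID_PREFIX_LEN = sum(FIELD_BITS.values())   # length of the EID prefix
HOST_BITS = 128 - EID_PREFIX_LEN            # bits left for hosts within a site
eids_per_site = 2 ** HOST_BITS              # absolute maximum EIDs per site
sites_per_isp = 2 ** FIELD_BITS["a"]        # one 28-bit sub-block per site

print(EID_PREFIX_LEN, HOST_BITS, eids_per_site, sites_per_isp)
```

This gives a /124 prefix, 4 host bits, 16 EIDs per site, and 268,435,456 sub-blocks per ISP.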

Global reachability of RLOC blocks is aggregated. How more-specific information on unreachability would be carried (e.g. as opaque BGP data used only at the edge, by the ITR) is left unspecified. EID->RLOC mappings are deterministic: parse the prefix portion and append the host portion, so that
X:4 + A:32 + B:32 + a:28 + b:28 + host:4 maps onto two RLOCs:

   * X:4 + A:32 + a:28 + 0:60 + host:4 (preferred)
   * X:4 + B:32 + b:28 + 0:60 + host:4 (backup)

The alternate EID has the *same* RLOCs, but in reverse preference.
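
To make the deterministic mapping concrete, here is a minimal sketch of what an ITR-side lookup could reduce to under this scheme. The function name eid_to_rlocs is hypothetical, and the example addresses assume X is the nibble 0x4.

```python
import ipaddress

def eid_to_rlocs(eid):
    """Split an EID into its fields and return (preferred, backup) RLOCs.

    Hypothetical helper: no mapping database is consulted, only bit fields.
    """
    v = int(ipaddress.IPv6Address(eid))
    x = v >> 124                      # X:4
    asn1 = (v >> 92) & 0xFFFFFFFF     # first ASN, 32 bits (preferred)
    asn2 = (v >> 60) & 0xFFFFFFFF     # second ASN, 32 bits (backup)
    sub1 = (v >> 32) & 0xFFFFFFF      # first sub-block, 28 bits
    sub2 = (v >> 4) & 0xFFFFFFF       # second sub-block, 28 bits
    host = v & 0xF                    # host:4

    def rloc(asn, sub):
        # X:4 + asn:32 + sub:28 + 0:60 + host:4
        return ipaddress.IPv6Address((x << 124) | (asn << 92) | (sub << 64) | host)

    return rloc(asn1, sub1), rloc(asn2, sub2)

# EID-AB-ab with A=1, B=2, a=3, b=4, host=5:
preferred, backup = eid_to_rlocs("4000:0:1000:0:2000:3:0:45")
```

With those values, preferred is 4000:0:1000:3::5 and backup is 4000:0:2000:4::5. Feeding in the alternate EID (4000:0:2000:0:1000:4:0:35) returns the same two RLOCs in the opposite order.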

Note that this means that the mappings themselves are completely deterministic, and as such, the requirement for push/pull of mapping database goes away. It scales to 2^28 sites per ISP (268M), with arbitrary ISP/ISP pairings for upstreams. It still allows for non-EID use of RLOC prefixes as individual /64's, with the requirement that a few fixed values be set aside (::0/4 specifically).

Presumably prefix filtering on ingress by ISPs A and B, would permit only things that match (X:4 + A:32 + *:32 + a:28 + *:28)/124 as the encapsulated EID value, or the alternate encapsulated EID value (X:4 + *:32 + A:32 + *:28 + a:28)/124.
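
One way to picture that ingress check, as a sketch only: the ISP knows its own ASN and the sub-block it assigned to the site, and accepts an encapsulated EID if that pair appears in either the first or the second position. The name permitted_by_isp and the value of X (0x4) are assumptions for illustration.

```python
import ipaddress

X = 0x4  # assumed value of the 4-bit prefix X:4 (illustrative only)

def permitted_by_isp(eid, isp_asn: int, site_sub: int) -> bool:
    """Return True if the EID embeds (isp_asn, site_sub) in either position.

    Hypothetical filter matching X:4 + A:32 + *:32 + a:28 + *:28 or
    X:4 + *:32 + A:32 + *:28 + a:28; the 4 host bits are ignored.
    """
    v = int(ipaddress.IPv6Address(eid))
    if v >> 124 != X:
        return False
    asn1 = (v >> 92) & 0xFFFFFFFF
    asn2 = (v >> 60) & 0xFFFFFFFF
    sub1 = (v >> 32) & 0xFFFFFFF
    sub2 = (v >> 4) & 0xFFFFFFF
    wanted = (isp_asn, site_sub)
    return (asn1, sub1) == wanted or (asn2, sub2) == wanted
```

For the EID 4000:0:1000:0:2000:3:0:45 (A=1, B=2, a=3, b=4), ISP 1 would accept it for sub-block 3 and ISP 2 for sub-block 4, while any other (ASN, sub-block) pair would be rejected.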

I know it's not what was necessarily envisioned by the current strategies, but hey, it scales pretty well, IMHO.

What do folks think?

Brian Dickson
