
Notes about identifier - locator separation



[During the last couple of days there have been quite a lot
 of comments, by various people, on the ideas of separating
 identifiers and locators (see the end of this message for
 some quotations).  Based on our experience with Mobile IPv6,
 the late Homeless Mobile IPv6 suggestion, and our current
 work on Mobile IPv6 security and HIP, I'd like to present
 a few observations.  Disclaimer:  I am far from being a
 routing expert, so take this with the usual grain of salt.]

I think that there are a number of dimensions that we
have to keep in mind when discussing separating identifiers
from locators.  Furthermore, I think that it is important
first to understand these issues, and only once we really
understand the dimensions to start taking positions on
specific solutions.  Maybe this is no news to you, but
writing this has certainly helped my own thinking :-)

The immediate issues that I see are the following:
  1. Architectural "structure" of the identifiers.
  2. Where to perform identifier->locator and/or locator->locator
     translations.
  3. Identifier -> locator resolution (+ reverse resolution for ops)
  4. Backwards compatibility
     a) application level backwards compatibility
     b) routing level backwards compatibility
  5. Security & Privacy issues
  6. Issuing and uniqueness of identifiers
There are most probably others, but I think these six
might work as a starting point.


1. Architectural "structure" of the identifiers

   From my point of view, there are two different basic
   approaches:

     a) make the end-point identifiers a completely
        separate name space

     b) divide the IPv6 address space into locator
        and identifier space

   Due to the chartering of this working group, I think it is
   natural that people here tend to look at solutions from the
   b) category.  However, I personally think that a completely
   separate name space might be better.  Furthermore, if we
   create a completely new name space, that would leave
   the IP address space more or less intact, allowing routing
   technology to be developed independently of end-host mobility
   and end-host multi-homing issues.  (I do realise that network
   level mobility and multi-homing are different.  They
   probably require changes on the routing level too, and
   end-point identifiers might just act as a helping tool.)

2. Translations

   Once the end-point identifiers have been separated from
   locators, we need to translate the identifiers to locators
   at some point, and it also becomes possible to translate
   between locators.  That is, the apps will only know the
   identifiers, not locators.  Thus, at some point, the
   identifiers must be translated into locators so that the
   packets can be routed to their destination.

   Here I see two basic solutions, again:

     a) perform the identifier -> locator translation at
        the end-host so that all packets leaving the end-host
        have a proper locator,

     b) allow the identifiers to leave the end-host without
        locators, and perform the translation within the
        network, e.g., at a site border router.

   Independent of that, it will be possible to translate
   locators to locators within the network, as long as the
   end-points are able to determine the identifiers either
   implicitly (based on state), explicitly (based on
   information carried in the packet), or by a combination of
   state & per-packet information.
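
To make option a) a bit more concrete, here is a minimal sketch of
what an end-host translation table might look like.  Everything in it
(the class name, the method names, the "first locator is preferred"
policy) is my own assumption for illustration, not a proposal for an
actual mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class EndHostMapper:
    """Hypothetical per-host table mapping an identifier to its locators."""
    table: dict = field(default_factory=dict)

    def bind(self, identifier, locators):
        # Record the locators currently known for a peer identifier.
        self.table[identifier] = list(locators)

    def translate(self, identifier):
        # Outgoing packet: pick the first (assumed preferred) locator.
        locators = self.table.get(identifier)
        if not locators:
            raise LookupError(f"no locator known for {identifier}")
        return locators[0]

    def identify(self, locator):
        # Incoming packet: recover the identifier implicitly, from state.
        for identifier, locators in self.table.items():
            if locator in locators:
                return identifier
        return None


mapper = EndHostMapper()
mapper.bind("id-A", ["2001:db8:1::1", "2001:db8:2::1"])
print(mapper.translate("id-A"))
print(mapper.identify("2001:db8:2::1"))
```

Since the reverse lookup works on any of the bound locators, a locator
rewritten inside the network still maps back to the same identifier, as
long as the end-host has state for it.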

3. Resolution

   Obviously there must also be some way of finding the
   locators based on the identifier or a name.  There is
   a huge number of possible solutions to this.
   One of the most obvious ones is to store both identifiers
   and locators in the DNS, and have the name resolution
   library fetch both.
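
As a rough sketch of the DNS idea: the dictionary below merely stands
in for the DNS, and the "ID" record type is purely hypothetical (no
such record type is assumed to exist); AAAA records carry the locators.

```python
# In-memory stand-in for a DNS zone.  "ID" is a hypothetical record
# type holding the end-point identifier; AAAA holds the locators.
ZONE = {
    "host.example.": {
        "ID": ["id-A"],
        "AAAA": ["2001:db8:1::1", "2001:db8:2::1"],
    },
}


def resolve(name):
    """Return (identifier, locators) for a name, fetching both at once."""
    records = ZONE.get(name)
    if records is None:
        raise LookupError(f"unknown name: {name}")
    return records["ID"][0], list(records["AAAA"])


identifier, locators = resolve("host.example.")
print(identifier, locators)
```

The application would be handed only the identifier; the locators would
stay inside the resolver library and the stack.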

4. Backwards compatibility

   The basic issue here is to keep the semantics of the
   fields in the application level data structures and
   the address fields in IPv6 header sufficiently intact.
   However, we must understand that the address semantics
   are different from the application point-of-view and
   from the routing point-of-view.

   a) application level backwards compatibility

   For application level compatibility, the identifiers
   must *look* like IP addresses.  That is, the current
   name resolution functions (getaddrinfo) must return
   something that looks enough like an address, and the
   kernel must be able to understand these.  However, if
   the identifier name space is separate from the locator
   space, and if the translation is performed at the
   end-host, these identifiers might never leave the
   end-host in the IP address fields.  Thus, it is
   possible to *separate* application level and routing
   level backwards compatibility.
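
One way to get an identifier that *looks* like an address is to hash
some key material down to 128 bits.  The sketch below assumes SHA-256
and simple truncation; both choices, and the function name, are my
assumptions for illustration only.  A real design would also have to
keep such values distinguishable from routable addresses.

```python
import hashlib
import ipaddress


def identifier_from_key(public_key: bytes) -> ipaddress.IPv6Address:
    """Derive a 128-bit identifier that has the shape of an IPv6 address."""
    digest = hashlib.sha256(public_key).digest()
    # Truncate the hash to 16 bytes so it fits an IPv6 address field.
    return ipaddress.IPv6Address(digest[:16])


print(identifier_from_key(b"example public key bytes"))
```

An application that only stores and compares the returned value would
not need to know that it is never used for routing.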

   b) routing level backwards compatibility

   I'm afraid I can't say much that is intelligent here.  You
   folks know this subject much better than I do.

5. Security and privacy

   I might as well write a separate message about this.
   (Would a draft be useful?)  However, briefly:  From
   the security point of view, there are two concerns:

    a) We want to make sure that the multi-homing
       mechanism cannot be used to "steal" addresses
       or connections.  That is, the solution must
       make sure that the end-points talking to each
       other remain the same even if the underlying
       locators are changed.

    b) We want to make sure that the multi-homing
       mechanism cannot be used as a vehicle in
       DoS attacks.  This is the "flooding" attack
       problem that I've described in other messages.

   From the privacy point of view, it would be a
   definite plus if the actual, long-lasting identifiers
   were not visible in packets.  There are
   techniques by which identifiers can be cryptographically
   masked in a way that still allows middle boxes
   (such as firewalls) and end-points to recognize
   valid identifiers but makes it hard for outsiders
   to track them.  However, these naturally require
   that the identifiers are completely separated from
   locators, i.e., that they are not being used for routing.
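
To illustrate what such masking could look like, here is one possible
construction, assumed purely for the sake of example: the parties that
are allowed to recognize an identifier share a secret, and each packet
carries a fresh nonce plus an HMAC over nonce and identifier instead of
the identifier itself.  The function names and the choice of
HMAC-SHA-256 are mine, not part of any specific proposal.

```python
import hashlib
import hmac
import os


def mask_identifier(shared_secret: bytes, identifier: bytes):
    """Return (nonce, tag); the tag differs for every packet."""
    nonce = os.urandom(16)
    tag = hmac.new(shared_secret, nonce + identifier, hashlib.sha256).digest()
    return nonce, tag


def recognize(shared_secret: bytes, candidates, nonce: bytes, tag: bytes):
    """Return the known identifier matching the tag, or None."""
    for identifier in candidates:
        expected = hmac.new(
            shared_secret, nonce + identifier, hashlib.sha256
        ).digest()
        if hmac.compare_digest(expected, tag):
            return identifier
    return None


secret = os.urandom(32)
nonce, tag = mask_identifier(secret, b"id-A")
print(recognize(secret, [b"id-B", b"id-A"], nonce, tag))
```

An outsider without the secret sees only a nonce and a tag that change
from packet to packet, so the packets cannot easily be linked to one
long-lasting identifier.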

6. Issuing and uniqueness of identifiers

   Here we once more seem to have two possibilities:

    a) The identifiers are issued hierarchically

    b) The identifiers are issued by a random number
       generator and are long enough so that the
       probability of collision is low enough.

   I know that there are doubts about b).  However,
   if the identifiers are long enough, the chance of
   an accidental collision can be so low that it really
   can be ignored: with n identifiers drawn uniformly
   from a space of N values, the collision probability
   is roughly n^2 / (2N), which is negligible for, say,
   128-bit identifiers.  (The quality of random number
   generators would still be an issue, though.)
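
For those who want to play with the numbers, the standard birthday
approximation can be evaluated directly.  The identifier length and
host count below are example values I picked, not figures from any
proposal, and the formula assumes a perfectly uniform generator.

```python
import math


def collision_probability(n: int, bits: int) -> float:
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / (2N)), N = 2**bits."""
    space = 2 ** bits
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-n * (n - 1) / (2 * space))


# Example: 10**12 randomly issued identifiers of 128 bits each.
print(collision_probability(10**12, 128))
```

With these example values the probability of any collision at all comes
out on the order of 1e-15, whereas with 64-bit identifiers the same
population would collide almost certainly.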

----

Those are the dimensions as I see them.  There are
probably others.  As should be evident, my personal position
is that I'd like to see that

  - identifiers are completely different from
    IP addresses, and IP addresses gradually
    become pure locators,

  - identifiers are translated into locators
    at the end-hosts, and the network performs
    locator-locator translations if needed
    e.g. for traffic engineering,

  - both identifiers and locators are stored in
    the DNS; additionally there might be dynamic
    locator->locator translation services to
    help mobility,

  - the application level and routing level
    compatibility issues are separated,

  - identifiers are cryptographic in nature so
    that it becomes easier to solve the security
    problems,

  - identifiers are usually not carried in the
    packets, just locators; this helps privacy
    and keeps the header size from growing,

  - it is *possible* to generate totally
    random identifiers; if some people want
    to use hierarchically assigned identifiers,
    I have nothing against it.

----

Some related quotations from the discussion in the
last two days.  Apologies for those occasions where
I haven't got the attribution right.

Tony Hain wrote:
I have not been opposed to GSE, but I think the time has passed for it
to be adopted without coupling it with something like MIPv6 to provide a
stable identifier to the pseudo-header calculation and multi-party apps
that insist on referring addresses rather than name strings.

Craig A. Huegen wrote:
With that said, let's find some solutions.  I also agree with the idea of
separating router goop from identities; however, I also am a pragmatist
and am concerned with the existing applications that would break with
this.

Tony Li wrote:
>> You are also assuming that aggregation and multihoming are somehow
>> at odds.  They are not.  All you have to do is to disconnect
>> the locator and the identifier as GSE does and aggregate only the
>> locators.  Each host in a multihomed domain has a single
>> identifier, but multiple locators.  No deaggregation happens
>> because the locators are bound to the provider and thus we
>> have good topological 'addressing'.  The enterprise doesn't
>> lose because we tweak the transport to base the pseudoheader
>> on the host identifier and then let the locator be free to be
>> replaced by border routers.  Connections flip back and forth
>> between locators easily.

To which Tony Hain replied:
This model was probably possible 4 years ago, but at this point I doubt
tweaking the pseudoheader can even be on the table. Also, 'freely
replaced' usually sets off the alarm bells of the spoof-sensitive
security types. That does not mean GSE is hopeless, just that we will
probably have to use something like MIPv6 to mask the label swapping. In
the abstract, there is not much difference between GSE and a Care-of
Address. So for routing & packet forwarding, if the CoA had finer
grained pattern replacement happening the end systems would not be aware
of it. The missing link would be letting the host know what the current
publicly visible address is so it could pass that to the CN.

In a later message, Tony Hain wrote:
But since there is no way to ensure that an identifier is globally
unique, the only way to accomplish the goal is to couple them to
describe 'this identifier at this location'. If the location is not
stable, the static description is not stable.
...
Because the identifier is not stable & globally unique, and even if we
had a way to ensure that, the privacy advocate's alarms would be going
off. Despite the desire to avoid it, the current reality is that to
create a globally unique identifier we bound the problem by coupling it
with its locator.