
Re: [RRG] Re: Six/One Router revised 2008-07-12 - IPsec



Short version:   Further discussion of the Flow Label as the only
                 available place to snaffle a header bit for the
                 Bilateral / Unilateral bit.

                 Diagrams depicting the actions of Bilateral
                 and Unilateral translation shown in Figs 2, 4
                 and 5.

                 The awkwardness of rotating 16 bits of the
                 address during translation to avoid the need
                 to change the checksum which covers the address.
                 This seems to limit the granularity of the
                 system to /48.  Ivip6 goes to /64.

                 The awkwardness of choosing Transit prefixes
                 for each End user prefix such that translation
                 will not affect the checksum.

                 The awkwardness of doing neither, but of updating
                 the checksum instead.

                 Unilateral translation breaks IPsec AH.  This is
                 not just during transition until all hosts are
                 using the new kind of address space.  It will
                 affect communications with all hosts which never
                 change to the new style of address space - and if
                 people are happy having their hosts on their ISP's
                 conventional PA address space, why should they
                 have to change to the new Six/One Router kind of
                 space?


Hi Christian,

Thanks for your replies, in which you wrote:

> You are suggesting that the Bilateral/ Unilateral bit, which
> Six/One Router requires, could be taken from the IPv6 Flow
> Label field.  This would be possible, right.
>
> I was myself considering taking the bit from the Traffic Class
> field.  This would have two advantages:
>
> - Some bits in this fields are still unused.

OK - I will return to this.


> - Backwards compatibility:  The IPv6 spec states that bits from
>   the Traffic Class field can be set by routers, and that hosts
>   and routers must ignore bits they do not recognize.
>
>   Consequently, nothing would break if one of the bits were
>   allocated for use by Six/One routers.
>
> Think it makes sense?

In principle.  I assumed all 8 bits in the Traffic Class octet were
used - here is my detailed investigation.

While I was researching this, Brian Carpenter wrote:

> Not true, unfortunately. They are all allocated to either ECN
> or diffserv. These are both encoded fields, so there are no spare
> bits.

FWIW, here is my research:

<research>

    RFC 3168 (2001-09) states that in both IPv4 and IPv6, the two
    ECN bits (6 and 7) are used - with all four codepoints.  Section
    22 has a history of this octet.

    RFC 2474 (1998-12) and RFC 2780 (2000-03) specify that bits 0 to
    5 are for DiffServ (Differentiated Services Field - DSCP =
    Differentiated Services Code Point) with bits 6 and 7 unused.
    However, this was before RFC 3168.

    RFC 2474 apparently allows routers to change these 8 bits.  From
    RFC 3168:

       Prior to RFC 2474, routers were not permitted to modify
       bits in either the DSCP or ECN field of packets
       forwarded through them . . .

    So bits 6 and 7 are tied up with Explicit Congestion
    Notification.  What about the DiffServ bits?

      http://tools.ietf.org/html/rfc2474#section-6

        012345
        xxxxx0   Pool 1    For codepoints recommended in standards.

        xxxx11   Pool 2    Experimental / Local use.

        xxxx01   Pool 3    Initially available for experimental /
                           Local use - but which should be
                           preferentially utilized for standardized
                           assignments if Pool 1 is ever exhausted.

    Counting in network byte/bit order (backwards binary) RFC 2474
    defines:

        000000   Default Per Hop Behavior (PHB)

        100000   } Class Selector PHBs.
        010000   }
        110000   }
        001000   }
        101000   }
        011000   }
        111000   }

    RFC 2597 specifies 12 of the 32 Pool 1 codepoints, using all
    five available bits.

      http://tools.ietf.org/html/rfc2597#section-6
      Assured Forwarding Per Hop Behavior codepoints:

        100100
        010100
        001100
        011100

        100010
        010010
        001010
        011010

        100110
        010110
        001110
        011110

    RFC 3246 specifies another codepoint:

        101110  Expedited Forwarding PHB

    I may have missed some, but that is 21 of the 32 codepoints in
    bits 0 to 4 taken.

</research>
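
The tally above can be re-checked with a short sketch.  It assumes the
codepoints are written exactly as in the lists above (bit 0 on the
left, bit 5 on the right); the function and variable names are
illustrative only and come from no RFC.

```python
# Codepoints copied from the research notes, bit 0 leftmost, bit 5 rightmost.
DEFAULT = ["000000"]
CLASS_SELECTOR = ["100000", "010000", "110000", "001000",
                  "101000", "011000", "111000"]
ASSURED_FORWARDING = ["100100", "010100", "001100", "011100",
                      "100010", "010010", "001010", "011010",
                      "100110", "010110", "001110", "011110"]
EXPEDITED_FORWARDING = ["101110"]


def pool(codepoint: str) -> int:
    """Return the RFC 2474 pool (1, 2 or 3) of a 6-character codepoint."""
    if codepoint[5] == "0":
        return 1          # xxxxx0
    if codepoint[4] == "1":
        return 2          # xxxx11
    return 3              # xxxx01


listed = DEFAULT + CLASS_SELECTOR + ASSURED_FORWARDING + EXPEDITED_FORWARDING
pool1_used = sorted({c for c in listed if pool(c) == 1})
print(len(pool1_used), "of the 32 Pool 1 codepoints appear in the lists above")
```

Every listed codepoint has bit 5 clear, so all of them fall in Pool 1.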


I think the only way you could get a bit here for the Six/One Router
Bilateral/Unilateral bit is either to:

  1 - Redefine the "Flow Label" so you can use one bit of this.

  2 - Have DiffServ redefined (for IPv6 at least) not to use bit 5,
      so there are no pools 2 or 3.  Then use this bit 5.

      (This might require router changes, so they treat all values
       nnnnn1 the same as nnnnn0.)

Alternatively, I think you could:

  3 - Decide that Six/One Router won't support DiffServ fully or
      perhaps at all.  For instance, the DiffServ RFCs remain, but
      DiffServ wouldn't be able to be used as is over Six/One
      router, due to problems such as:

      a - After the first translation router, the packet having a
          different bit 5 state - unless all core routers were
          upgraded to ignore it.  Still, this would wrap pool 2 and
          3 values to pool 1 values - unless the first translation
          router converted all values zzzzz1 to 00000B, where
          B is the Bilateral/Unilateral bit.

      b - After the recipient translation router, the packet having
          bit 5 changed from the state it was set by the sending
          host or routers prior to the first translation router.

Sending hosts can't tell - and shouldn't have to worry about -
whether the destination host is reached via Six/One Router - and
since Six/One Router is supposed to carry traffic for most end-user
networks, any limitation you place on DiffServ will become a
limitation for all uses of DiffServ for any communication which
crosses the inter-domain core.


As far as I can see, you need to have the "Flow Label" redefined if
you need an extra bit in the header.
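
A minimal sketch of what taking one Flow Label bit could look like on
the first 32 bits of the IPv6 header (4 bits version, 8 bits Traffic
Class, 20 bits Flow Label).  The bit position chosen here, the most
significant Flow Label bit, is purely an assumption for illustration;
nothing in this thread or in any RFC assigns it.

```python
import struct

FLOW_LABEL_MASK = 0xFFFFF      # low 20 bits of the first header word
BU_BIT_MASK = 1 << 19          # hypothetical Bilateral/Unilateral bit


def set_bu_bit(header: bytes, bilateral: bool) -> bytes:
    """Return a copy of the IPv6 header with the hypothetical bit set or cleared."""
    (word,) = struct.unpack("!I", header[:4])
    flow = word & FLOW_LABEL_MASK
    flow = (flow | BU_BIT_MASK) if bilateral else (flow & ~BU_BIT_MASK)
    # Keep version and Traffic Class, replace only the Flow Label bits.
    word = (word & ~FLOW_LABEL_MASK & 0xFFFFFFFF) | (flow & FLOW_LABEL_MASK)
    return struct.pack("!I", word) + header[4:]


def get_bu_bit(header: bytes) -> bool:
    """Read the hypothetical bit back from the header."""
    (word,) = struct.unpack("!I", header[:4])
    return bool(word & BU_BIT_MASK)
```

The Flow Label is not part of the RFC 2460 pseudo-header, so flipping a
bit there does not disturb TCP, UDP or ICMPv6 checksums.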


> Regarding your proposal, Robin, for flow-label-based forwarding:
> A circumstance that mitigates the disruptive impact of re-defining
> the Flow Label field is that (i) use of the Flow Label field is
> currently pretty rare, and that (ii) hosts not using a flow label
> must ignore the field [RFC 3697] anyway.

Yes.  I think that it would be possible for the IETF to agree to
redefine the function of the Flow Label bits if it was crucial to
the best scalable routing solution.


Stepping back a little, I don't clearly understand the reason for
needing this "Unilateral/Bilateral" bit, since I don't clearly
understand the need for Unilateral mode between two hosts, both in
upgraded edge networks.  (Section 2.4.2 and Fig 5.)

Why don't you simply make it a rule that (referring to Fig 5) if the
host on the left (ABC:A) sends a packet to the host on the right
(DEF:B) that it always makes the destination address equal to DEF:B?

In other words, why don't you insist that upgraded hosts never send
packets to transit addresses - only to end-user addresses?

Perhaps this is because you want to have the one end-user address
and one or more transit addresses in the DNS AAAA record, and it is
possible that ABC:A will try to use these in some arbitrary order,
and find that the first one it tries works, with the first one
happening to be a transit address.

My understanding of the need for putting transit addresses, such as
 (taken from the example below):

  2001 0200 0400 29FF 1111 2222 3333 4444  Transit address for:
  4000 0000 1000 0000 1111 2222 3333 4444  End-user network address.

in the AAAA records is so that when a host in a non-upgraded network
tries to contact the destination host, identifying it by its FQDN,
it will try these addresses and try again if one doesn't work.

The second one won't work, because it is not in an advertised
prefix.  So it tries the first one, which is the transit address,
which works.

So perhaps you could create some way that hosts in upgraded networks
only get DNS results with the end-user network address, or that they
can easily identify which of multiple IP addresses is the end-user
one.  Then maybe you wouldn't need Unilateral mode between two
upgraded networks?

One way of doing this might be to say that all end-user address
space in the new Six/One Router style of end-user network comes from
the prefix 4000::/3 - or some other simple prefix.  Then - and you
need upgraded host operating systems to do this - the host can know
from its own address that it is in an upgraded network and see from
the various addresses in an AAAA record which is also an end-user
address, since none of the transit addresses will be within 4000::/3.
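
A rough sketch of how an upgraded host could sort AAAA results under
that rule.  It assumes the end-user prefix is 4000::/3 (the /3 that
contains the example address 4000:0000:1000:...), which is only a
suggestion here and not an actual allocation; the function names are
hypothetical.

```python
import ipaddress

# Assumed prefix for all Six/One Router end-user space (illustrative only).
END_USER_SPACE = ipaddress.ip_network("4000::/3")


def split_aaaa(addresses):
    """Split AAAA results into (end-user addresses, transit addresses)."""
    end_user, transit = [], []
    for text in addresses:
        addr = ipaddress.ip_address(text)
        (end_user if addr in END_USER_SPACE else transit).append(addr)
    return end_user, transit


def choose_destination(own_address, aaaa):
    """Upgraded hosts prefer an end-user address; others use a transit one."""
    end_user, transit = split_aaaa(aaaa)
    upgraded = ipaddress.ip_address(own_address) in END_USER_SPACE
    if upgraded and end_user:
        return end_user[0]
    return transit[0] if transit else None
```

With this rule a host in an upgraded network would never pick a transit
address when an end-user address is offered, which is the condition
under which Unilateral mode between two upgraded networks might be
avoidable.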

I think that for IPv6 scalable routing, modest upgrades to hosts and
routers should be considered - if they lead to a significant
long-term reduction in overhead and/or complexity.


Returning to this Bilateral/Unilateral business, I still find it
confusing.

Here is my summary of the three diagrams.  The host on the left is
always in an upgraded network.  For Fig 2 and 5 the right one is in
an upgraded network too.

  X = translated.

  T = Transit address of a host in a Six/One Router user network.

  E = Real address of a host in a Six/One Router user network.

  N = Real address of host in ordinary network.

  U/B = Unilateral / Bilateral bit state.


Fig 2 Bilateral  Both networks upgraded

 L --> R    Src  X           Src  X
            Dest X           Dest X

 Src:   E             T               E
 Dest:  E             T               E
                      B

            Src  X           Src  X        L <-- R
            Dest X           Dest X

 Src:   E             T               E
 Dest:  E             T               E
                      B



Fig 5 Unilateral  Both networks upgraded

 L --> R    Src  X           Src
            Dest             Dest X

 Src:   E             T               T
 Dest:  T             T               E
                      U

            Src              Src  X        L <-- R
            Dest X           Dest

 Src:   T             T               E
 Dest:  E             T               T
                      U



Fig 4 Unilateral  Right host not in upgraded network


 L --> R    Src  X
            Dest

 Src:   E             T
 Dest:  N             N
                      U

            Src                             L <-- R
            Dest X

 Src:   N             N               N
 Dest:  E             T               T
                      ?


In all three examples, when the left translation router receives the
packet from its left host, how does it know whether or not to do
Bilateral or Unilateral translation?

If it can figure this out - presumably from what it knows about the
address space and which prefixes are Transit or End user prefixes,
then why do you need a flag bit in the packet sent to the right
translation router?  Can't both the left and right routers in Fig 2
and Fig 5 figure out the Transit or End nature of the source and
destination addresses they receive, and then whether or not to
translate, if necessary, to both Transit or both End user, depending
on whether the packet is going to the core or to an Edge network?

The state of the Unilateral/Bilateral bit would need always to be B
when raw packets (having not been through any translation router)
arrive at the left translation router in Fig 4.

I am still not sure I understand your proposal fully.




>>    MSB                                 LSB
>>
>>    xxxx xxxx xxxx yyyy zzzz zzzz zzzz zzzz
>>
>> z bits are unaffected by address translation.
>>
>> My understanding is that you want to rewrite all of x and
>> y bits, but you think that in order to keep the checksum the
>> same.  So the best approach is to rewrite the most important
>> (most significant) bits the way you want, and not try to be
>> specific about the yyyy 16 bits, because you are going to have
>> to set them to some value based on the other changes
>> (difference between the 16 bit sum of the original xxxx
>> xxxx xxxx bits and the 16 bit sum of their translated values)
>> to keep the 16 bit checksum from changing.
>>
>> This therefore would limit the use of your system to prefixes of
>> /48 or shorter - which is not much of a problem, I think.
>
> No, not at all.  You can continue to use the y-bits in the address
> as usual.  (These bits are called "subnet ID" in RFC 4291
> terminology.)

> (A "routing prefix" is what RFC 4291 calls the x-bits in the
> address.)

> The subnet ID simply changes by a constant offset as a packet
> traverses a Six/One router.  (And FWIW, this constant is the same
> for all packets that a given Six/One router forwards.)

I don't understand how this can be the case.  The offset is only
constant for a particular pair of Transit and End user addresses.

Let's say an End-user prefix 4000:0000:1000::/48 uses a provider
prefix 2001:0200:0400::/48 as its Transit prefix, then when the
translation router receives a packet from an End-user in its network
and the source address is:

   xxxx xxxx xxxx yyyy zzzz zzzz zzzz zzzz
   4000 0000 1000 0000 1111 2222 3333 4444

it needs to translate it to a Transit prefix (no matter whether this
is Unilateral or Bilateral).  The result, ideally, would be:

   xxxx xxxx xxxx yyyy zzzz zzzz zzzz zzzz
   2001 0200 0400 0000 1111 2222 3333 4444

However, to keep the 16 bit checksum the same, you would need to
make the yyyy bits equal to:


yyyy + (4000 + 0000 + 1000)
     - (2001 + 0200 + 0400)  =  yyyy + 29FF

So within this /48 prefix, there is a rotated translation between
the yyyy bits of all hosts' addresses in the End-user network and
the yyyy bits of the Transit addresses at which they appear.  For
the recipient host on the right, in Figure 4 (Unilateral translation
for supporting hosts in non-upgraded networks) this is the address
it must use to send packets to or receive packets from the host on
the left.
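
The arithmetic for this example can be sketched as follows.  Ones'
complement 16 bit sums are congruent modulo 0xFFFF, so the sketch works
modulo 0xFFFF and assumes (this is an assumption of the sketch, not
something stated in the proposal) that subnet ID FFFF is never
assigned, since it would alias 0000.  All names are illustrative.

```python
MOD = 0xFFFF  # ones' complement 16 bit sums are congruent modulo 0xFFFF


def words_sum(words):
    """Checksum-relevant sum of a list of 16 bit words."""
    return sum(words) % MOD


def subnet_offset(end_prefix, transit_prefix):
    """Amount to add to yyyy when going from End-user to Transit prefix."""
    return (sum(end_prefix) - sum(transit_prefix)) % MOD


def translate(addr, new_prefix, delta):
    """Overwrite the three prefix words and shift the yyyy word by delta."""
    return list(new_prefix) + [(addr[3] + delta) % MOD] + list(addr[4:])


END = [0x4000, 0x0000, 0x1000]
TRANSIT = [0x2001, 0x0200, 0x0400]

offset = subnet_offset(END, TRANSIT)                 # 0x29FF
host = END + [0x0000, 0x1111, 0x2222, 0x3333, 0x4444]
outbound = translate(host, TRANSIT, offset)          # towards the core
inbound = translate(outbound, END, -offset)          # back at the far edge

print(" ".join(f"{w:04X}" for w in outbound))
# prints: 2001 0200 0400 29FF 1111 2222 3333 4444
```

The sum over all eight address words is the same before and after
translation, which is why the transport checksum needs no update.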

Assuming the upgraded network on the left is multihomed, translating
for a second provider prefix would result in a different rotation value.

Translating back to the original address at the recipient Six/One
Router, you need to ignore the yyyy bits for a moment and translate
the xxxx xxxx xxxx bits back to their value for the appropriate
end-user network, and then (for the above example) subtract 29FF
from the yyyy bits.

This would not be possible if your end-user network had a prefix
length of /49 or more.

So I think it limits the granularity of the system to /48 - as long
as you use this technique of avoiding changing the checksum.  That
is probably fine, but for Ivip6, I want the granularity to be /64.

Recent RRG rough consensus is for a mapping granularity (and
therefore the granularity with which EID prefixes AKA micronets can
be defined) of one IP address: for IPv6: a /128.  Or at least that
is my interpretation of:

  http://psg.com/lists/rrg/2008/msg01299.html

    for the mapping function, we have consensus that we should
    support host specific identifiers, as well as blocks.

I think /64 is a much better granularity for IPv6 than /128.


> The reason why the subnet ID doesn't lose its meaning is that it
> is used only within an edge network, and packets always have the
> original (unchanged) subnet ID while within a given edge network.

I understand that there are no translations within any given edge
network.

I wasn't suggesting the rotation of the yyyy range (subnet ID) would
 not work - just observing that it is a messy arrangement having a
given end-user host appear at transit addresses with differing yyyy
values for every one of the transit prefixes it uses (such as with
multihoming).

In Fig 4 - Unilateral rewriting to make a user network host reachable
from a host in a non-upgraded network, via a transit address - the
non-upgraded host would be able to send a packet to:

  xxxx xxxx xxxx yyyy zzzz zzzz zzzz zzzz
  2001 0200 0400 29FF 1111 2222 3333 4444

and the translation router would look at the xxxx xxxx xxxx bits, do
the translation, subtract 29FF from yyyy and so cause the packet to
be addressed to the proper end-user network host:

  xxxx xxxx xxxx yyyy zzzz zzzz zzzz zzzz
  4000 0000 1000 0000 1111 2222 3333 4444

The non-upgraded host could have gained this 2001.... address either
by being sent a packet by the host in the end-user network, by
finding it was an address which worked, out of several in an AAAA
record from a DNS lookup, or by being told the address by some other
means.

I am not saying it can't be done, just that having the transit
addresses being more than a simple translation is a confusing aspect
of this arrangement.  One can't just overwrite bits, one must look
at what the bits were beforehand, and then do some hex arithmetic on
the difference, and add or subtract that from the yyyy bits.  This
would need to be done, for instance, to generate entries in the DNS
AAAA record.


> Of course, the proposed method is just one example of how checksum
> changes could be avoided.  Others are possible -- like
> re-computing the checksum, or obtaining transit routing prefixes
> that are checksum-equivalent with the corresponding edge routing
> prefix.

I will return to this second point below.

My understanding of "recomputing the checksum" is that you would
need to find and change the checksum for the whole packet, which is
not in the IPv6 header, but which may be in whatever follows:

          Use IPv6 Pseudoheader?
          | http://tools.ietf.org/html/rfc2460#section-8.1
          |
Protocol  |     Checksum location or notes
       |  |     |

     TCP  Y     Bits 128 to 143.

     UDP  Y     Bits 48 to 63.

    SCTP  N     32 bit "checksum" - but I don't see any
                reference to the IP header.

    DCCP  Y     Bits 48 to 63 (RFC 4340 section 9.1).

    ICMP  Y     Bits 16 to 31.

    IGMP  N     Checksum is only for the IGMP payload.

But for those packets with checksums, you also need to step past any
extension headers.

This sounds really tricky and expensive to do on every translation.
 Map-encap involves extra bytes in the header, but at least it
delivers the desired packet verbatim, without the need for any of
this translation and checksum work.
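
For the arithmetic part only, a sketch of updating an existing 16 bit
checksum incrementally (RFC 1624, equation 3) rather than re-summing
the whole packet.  Locating the checksum field behind any extension
headers is not shown, and the sample words in the usage lines are made
up for illustration.

```python
def fold(value):
    """Fold carries back in until the value fits in 16 bits."""
    while value >> 16:
        value = (value & 0xFFFF) + (value >> 16)
    return value


def internet_checksum(words):
    """Full ones' complement checksum over a list of 16 bit words."""
    return fold(sum(words)) ^ 0xFFFF


def update_checksum(old_checksum, old_words, new_words):
    """RFC 1624 eqn 3: HC' = ~(~HC + ~m + m') for each changed word."""
    acc = old_checksum ^ 0xFFFF
    for old, new in zip(old_words, new_words):
        acc += (old ^ 0xFFFF) + new
    return fold(acc) ^ 0xFFFF


# Usage: only the three prefix words of one address change.
before = [0x4000, 0x0000, 0x1000, 0x0000, 0x1111, 0x2222, 0x3333, 0x4444,
          0x1234, 0x5678]                      # last two words: made-up payload
after = [0x2001, 0x0200, 0x0400] + before[3:]
patched = update_checksum(internet_checksum(before), before[:3], after[:3])
```

Even with the incremental form, the router still has to parse far
enough into each packet to find the checksum, which is the expensive
part described above.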



> obtaining transit routing prefixes that are checksum-equivalent
> with the corresponding edge routing prefix.

I think this would be highly restrictive.  For a given end-user
prefix - for example any particular /48 - you have to choose a
provider prefix:

    aaaa bbbb cccc /48

However, to make this choice result in no change being required to
the yyyy bits in order to keep the checksum value unchanged, then
for any particular combination of values aaaa and bbbb, you can't
choose any value cccc - there is only one of the 64k values of cccc
which can be chosen.
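
To illustrate how tight this constraint is, a small sketch using the
End-user prefix 4000:0000:1000::/48 from the earlier example and,
as an arbitrary assumption, aaaa = 2001 and bbbb = 0200.  The function
name is hypothetical.

```python
MOD = 0xFFFF  # ones' complement 16 bit sums are congruent modulo 0xFFFF


def equivalent_cccc(end_prefix, aaaa, bbbb):
    """The cccc value (modulo 0xFFFF) that leaves the prefix sum unchanged."""
    return (sum(end_prefix) - aaaa - bbbb) % MOD


END = [0x4000, 0x0000, 0x1000]
cccc = equivalent_cccc(END, 0x2001, 0x0200)

# Brute-force confirmation over every possible 16 bit value of cccc.
matches = [c for c in range(0x10000)
           if (0x2001 + 0x0200 + c) % MOD == sum(END) % MOD]

print(f"{cccc:04X}", len(matches))
# prints: 2DFF 1
```

So for this End-user prefix, a provider offering 2001:0200::/32 could
only hand out 2001:0200:2DFF::/48 as a checksum-equivalent Transit
prefix.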


> Since the method is transparent to the remote peer, a Six/One
> router can do whatever it likes as long as the checksum in the
> indirected packet is consistent with the packet.
>
> And worthwhile to emphasize, you only need any of these methods
> for backwards compatibility, not for packet exchanges between
> upgraded edge networks.
>
> Does this makes things clearer?

Yes.  "Only for backwards compatibility" is no small thing, because
this is the condition imposed on virtually all communications for
early adopters - and if these restrictions are unpalatable, then
there will only ever be a few adopters.  Then, the system would
never be widely enough adopted either to make a difference to the
routing scaling problem or to enable the adopters to reap the
benefits by enjoying more traffic which is free of the "backwards
compatibility" restrictions.

Also, isn't there a continuing role for end-users being on ordinary
PI addresses, which are not subject to translation?  If so, then
there will always be lots of space used which has no translation
router in its network.




>>                 I can't see how Six/One Router's Unilateral mode
>>                 could support IPsec authentication of the IPv6
>>                 header.
> 
> You are correct.  In backwards compatibility mode, Six/One Router
> breaks IPsec Authentication Header.  It's the same as with NATs.
> Packet exchanges between two upgraded edge networks don't have this
> limitation.  And packet exchanges in IPsec ESP don't have it either.
> But packet exchanges in IPsec AH with legacy edge networks do.  One
> may argue that this is acceptable because IPsec AH is not widely used,
> but that is clearly a personal opinion.


OK - but what about my previous point that not all end-users might
be on the new kind of address space?

With map-encap such as LISP with PTRs or Ivip4 with OITRDs, or with
Ivip6's "Label Forwarding" approach (also with OITRDs) it is fine
for there to be plenty of end-users on conventional PI space.  They
can communicate fine with the new kinds of end-user networks - hosts
on ordinary PI addresses would need to have their packets go through
an ITR in order for the packet to reach the EID destination host
(LISP) or the micronet addressed destination host (Ivip).  That ITR
could be in the ISP network of the PI sending host, or if that ISP
for some reason had no ITRs (ISPs should provide them) then the
packets will find their way to the core, where they will go to the
nearest OITRD (PTR for LISP) and then be tunneled to the ETR.

So there is no loss of connectivity between hosts on ordinary
address space and on the new kind of address space, irrespective of
whether there are ITRs or not in the network of the sending host
which is on a conventional address.  IPsec will work fine.
(Assuming there is no NAT at either end.)

With Six/One Router, the inability for IPsec AH to work is a
permanent restriction on communication between hosts with the new
kind of address space and those with conventional addresses.

I don't know much about IPsec, but if it is not unthinkable to break
 IPsec's AH mode, then that gives me some ideas for using bits in
the IPv4 header for purposes different than they are currently used.

  - Robin





