
Re: about draft-ietf-shim6-proto-01.txt



marcelo bagnulo braun wrote:
Hi,

i think that the document is improving rapidly, i send some comments below...

Thanks for your comments.

I've addressed the comments if they are not included in this response.
Thus I'm just responding with clarifications and discussion points here.

In section 2.1  Definitions

   Host-pair context   The state that the multihoming shim maintains for
                       a particular peer.  The context is for a ULID
                       pair, as is identified by a context tag for each
                       direction.

I would use the expression shim-context rather than host-pair context, because each context is associated with a given ULID pair, so there may be more than one context for a given host pair, right?

If we've nailed that down it might make sense to rename it ULID-pair
context. A "shim context" name doesn't indicate much about its scope.

But I'm not sure we've nailed that down completely; there might be some
sharing of reachability state across different ULID-pairs for the same
host-pair.

(moreover, as we consider the forking concept in more depth, there may be more than one context for a given ULID pair) so using the vague expression shim context would suit whatever this evolves into.

But even with forking one context applies to only one ULID-pair. (Having
multiple contexts per ULID pair doesn't change this.)

Similarly, i wouldn't say that the Host-pair context is "The state that the multihoming shim maintains for a particular peer." but rather that it is "The state that the multihoming shim maintains for a particular ULID pair." (at least for now)

I'll fix this.


   Even though we do not overload the flow label field to carry the
   context tag, any protocol (such as RSVP or NSIS) which signals
   information about flows from the host stack to devices in the path,
   need to be made aware of the locator agility introduced by a layer 3
   shim, so that the signaling can be performed for the locator pairs
   that are currently being used.

I would remove the first part of the sentence of this paragraph. I mean, in the future when people read the spec, they won't be familiar with our flow label discussions, so this initial part of the sentence may sound a bit strange to them imho.

Well, yes, but I'm more concerned about people understanding it now
since we've changed direction a bit on this point. We can adjust this
later (as in -03).

I guess that the common shim header would be the following:

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Next Header  |  Hdr Ext Len  |P|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

with P being a flag that indicates whether this is a payload message (P set) or a control message (P clear)

Good suggestion.

and then the payload and control messages can be defined.
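
As a rough illustration of the header proposed above, the following sketch splits the first 17 bits into the three fields. It assumes, from the diagram only, that P is bit 16, i.e. the most significant bit of the third octet; the function name is hypothetical and nothing here is taken from the draft itself.

```python
def parse_common_shim_header(data: bytes):
    """Split the first 17 bits of the proposed common shim header.

    Assumed layout (from the diagram in this thread, not from the spec):
    octet 0 = Next Header, octet 1 = Hdr Ext Len, top bit of octet 2 = P.
    """
    if len(data) < 3:
        raise ValueError("need at least 3 octets")
    next_header = data[0]
    hdr_ext_len = data[1]
    # P set means payload message, P clear means control message.
    is_payload = bool(data[2] & 0x80)
    return next_header, hdr_ext_len, is_payload
```

With this layout a control message would carry Next Header 59 (no next header) and P clear, while a payload message would carry the ULP protocol number and P set.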

Then in section 5.2  Common Shim6 Control header

   The common part of the header has a next header and header extension
   length field which is consistent with the other IPv6 extension
   headers, even if the next header value is always "NO NEXT HEADER" for
   the control messages

but before this header there can be other headers, right? Like the routing header or hop-by-hop header.
My point is that this header is also located before
   any endpoint extension headers (fragmentation headers, destination
   options header, AH, ESP), but after any routing related headers (hop-
   by-hop extensions header, routing header, a destinations options
   header which precedes a routing header).
The difference is that for now, no piggybacking of ulp in the shim control messages is defined yet.

And none planned! ;-)
We know from the endless MIPv6 discussions that piggybacking e.g. TCP
payload on some other semantics-carrying protocol is problematic, since
things like packet policies (IPsec, firewalls) might want to let one part
of the packet through and not another part of it.

So i would state the location of the shim header and distinguish the case of payload messages where piggybacking of ulp is supported and control messages, where it is not. This description would fit in the description of the common header (since it is always placed in the same place)

I don't understand what text you are suggesting to change, given that we
do not allow anything but nexthdr=59 for control messages.

In section 5.4  R1 Message Format

It may make sense to allow the responder to include the locator set in the R1 message, so that the initiator can acknowledge the reception of the locator set (otherwise, the locator set of the responder is sent in R2 which remains unacknowledged).

This isn't a problem AFAICT. If the initiator doesn't receive R2 it will
retransmit I2. The ULP packets from responder to initiator can still
flow, since they haven't failed over to use an alternate locator pair
yet. Worst case would be a packet loss when we want to switch to
alternate locators immediately as part of the context establishment
(when the responder might start sending payload messages before the
initiator has received the lost R2.)

If we want to fix that, it might make more sense to make the receiver
demux packets solely on the context tag, thus the source locator isn't
used to lookup the context state. (This also allows router rewriting of
source locators, which might be useful flexibility for the future.)
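
To make the difference between the two demux options concrete, here is a minimal sketch. The class, method and field names are hypothetical, and a real context would hold much more state; the only point is that a tag-only lookup still finds the context when the source locator is one the receiver has not learned yet.

```python
class ContextTable:
    """Toy receiver-side context table keyed by context tag (illustrative only)."""

    def __init__(self):
        self._by_tag = {}

    def add(self, tag, ulid_pair, peer_locators):
        if tag in self._by_tag:
            raise ValueError("context tag already in use")
        self._by_tag[tag] = {
            "ulid_pair": ulid_pair,
            "peer_locators": set(peer_locators),
        }

    def lookup_tag_and_source(self, tag, src_locator):
        # Demux on tag plus source locator: unknown sources miss.
        ctx = self._by_tag.get(tag)
        if ctx is None or src_locator not in ctx["peer_locators"]:
            return None
        return ctx

    def lookup_tag_only(self, tag):
        # Demux solely on the tag: the source locator is ignored.
        return self._by_tag.get(tag)
```

The tag-only variant is what would tolerate a lost R2 or a rewritten source locator, at the price of relying entirely on the tag to identify the context.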


The potential problem with this is that a small I1 packet may result in a somewhat big R1 packet, which an attacker may use as an amplifier (but i am not sure whether this is a big issue, since the amplification won't be very significant)

If you are going to send the locator list, wouldn't you also need to
send what the peer needs in order to verify the list? For CGA that would
imply generating a PK signature in order to send R1.

In section 5.6  R2 Message Format

   CGA Parameter Data Structure: Included when the locator list is
                  included and the PDS was not included in the context
                  establishment messages, so the receiver can verify the
                  locator list.
This is kind of strange since this is one of the context establishment messages...
I guess it should read the same as in I2, i.e.

   CGA Parameter Data Structure: Included when the locator list is
                  included so the receiver can verify the locator list.

Yes. Cut and paste error.

In section 5.8  Update Request Message Format

As i understand the locator set contained in this message contains the complete set of locators available, as opposed to including a differential list of what are the new locators to be added/removed to the locator list, right?

Correct.

If this is so, i think this should be explicitly stated in this section.

In section 5.9  Update Acknowledgement Message Format

   This message is sent in response to a Update Request message.  It
   implies that the Update Request has been received, and that any new
   locators in the Update Request can now be used as the source locators
   of packets.  But it does not imply that the (new) locators have been
   verified

I am not sure which verifications are needed to accept a new locator as a valid src address and which are needed to use it as a valid dst address. I mean, i clearly see that the reachability test used to prevent flooding attacks is required before using the new locator as a dest address, but it is not needed before accepting packets with this locator as the src address. But are the HBA/CGA verifications required before accepting packets with the new locator as the src address? In other words, does the receiver of an Update Request message need to perform HBA/CGA validation before sending the Update Ack packet?

I don't think so. That would imply that the same requirement would apply
to the context establishment. It makes sense to be able to defer the
HBA/CGA verification as well as the return routability check until the
locator is about to be used as a destination. That way we can get the RR
check for free as part of the reachability testing that will happen in
any case.

Let's look at this in a bit more detail... Accepting packets from unverified locators in the src address would allow an attacker to inject packets into a communication associated with a shim context, even though the peer will still be sending packets to the verified address. Of course, to do that, the attacker needs to determine the context tag, so it needs to have been on path once (and also needs to use a sequence number in the locator update higher than the one currently used), so it doesn't seem so easy. However, accepting an unverified locator set may cause problems. In particular, suppose that the attacker sends a packet with a very high seq number and the correct context tag, but includes a locator list that contains only invalid locators. The victim accepts the list and replaces the old (valid) list with the new (false) list. The result is a DoS attack, since there won't be any valid locator in the new list. Basically, accepting unverified lists of locators would enable time-shifted DoS attacks. This can be prevented by performing HBA/CGA verification before accepting the locator list. This has a cost though, since we need to spend all the computation effort of verifying the locator list upon reception of the list, even if we may never need the alternative locators.

Either the host needs to track that it has an old (potentially still
unverified) list of locators from the context establishment, and a new
(unverified) list of locators from one (or more) update messages, and
accept any one of those locators as a source.

Or, we bite the bullet and have a larger context tag and ignore the
source locator in the lookup.
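
The first alternative above could look roughly like the following sketch: the receiver keeps every locator list it has been given for the peer and accepts a source locator if it appears in any of them. The class and method names are hypothetical, and the sketch deliberately says nothing about when a list gets verified or when older lists are dropped, since that is exactly the open question.

```python
class PeerLocatorState:
    """Tracks all (possibly unverified) locator lists received from a peer."""

    def __init__(self, establishment_locators):
        # List 0 is the one learned during context establishment.
        self.lists = [list(establishment_locators)]

    def on_update(self, locators):
        # Keep the new list alongside the older ones instead of replacing them.
        self.lists.append(list(locators))

    def acceptable_source(self, src_locator):
        # A source is accepted if any tracked list contains it.
        return any(src_locator in locs for locs in self.lists)
```

Keeping the old list around also means that a bogus Update carrying only invalid locators does not by itself remove the locators that were working before.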

But in both cases we have an issue with injection of bad Update
messages. What prevents the on-path attacker from injecting such a
message and replacing the CGA parameter set with something, which when
verified, will fail the verification?

In section 5.10  Reachability Probe Message Format

   This message includes the ULID pair as well as the context tag, so
   that the peer can indeed verify that it has that ULID and that the
   context tag is correct.

Just to keep in mind Iljitsch's suggestion to be able to use a single reachability probe when multiple contexts are using the same locator pair, so that multiple probes to the same locator pair are not needed. For this, we need something different from simply the context tag and the ULID pair. Perhaps all the ULIDs and context tags of the involved contexts would be enough, rather than moving to new namespaces such as hostids...

Yes. I haven't yet read Jari's and Iljitsch's draft to see what will happen
here. Perhaps we'll end up removing the definition of the reachability
related packet formats from the proto document and have them in the
reachability document.

In section 5.14  Option Formats

it is stated that:

   Total Length = 11 + Length - (Length + 3) % 8;

wouldn't that be

   Total Length = 12 + Length - (Length + 4) % 8;

I just stole that from the HIP spec, so how would I know ;-)

When Length = 4, no padding is needed so the Total should be 8.
But your formula ends up with 16 in that case.
When Length = 5 we need 7 bytes of padding for a total of 16, which we
get from both formulas.
When Length=12 no padding is needed so Total should be 16. Etc.
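
The three cases above can be checked mechanically. The sketch below compares both formulas against a direct "round up to a multiple of 8" computation; it assumes a 4-octet Type/Length prefix in front of the option content, which is inferred from the examples in this mail rather than quoted from the draft, and the function names are made up for the comparison.

```python
def total_length_draft(length):
    # Formula currently in the draft (taken from the HIP spec).
    return 11 + length - (length + 3) % 8


def total_length_proposed(length):
    # Formula suggested in the review comment.
    return 12 + length - (length + 4) % 8


def total_length_expected(length):
    # Assumed 4 octets of Type/Length plus content, padded up to a multiple of 8.
    return (4 + length + 7) // 8 * 8


for length in (4, 5, 12):
    print(length,
          total_length_draft(length),
          total_length_proposed(length),
          total_length_expected(length))
```

Under that assumption the draft formula matches the padded size for every Length, while the proposed one overshoots by 8 whenever no padding is needed (Length = 4, 12, 20, ...).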

In section 5.14.1  Validator Option Format

wouldn't it make sense to define different option types for different hash functions used for the validator? like 1 for sha1 validator, 2 for md5 and so on, so that different hash functions are supported (and it is possible to move from one hash function to another by supporting several of them)?

The responder can use whatever way it wants, including the phase of the
moon, as the validator. The initiator doesn't do anything with it other
than return it in I2.

In section 5.14.3  Locator Preferences Option Format

I am not sure i fully understand the format of this option...

For instance, all elements have the same length i.e. the Element length field message applies to all the elements

Yes.

, or there is an Element Length
field for each Element item that follows? (i think it is the first option, but the figure confuses me, because the Element[2] seems to have a different Length than Element[1] and Element[3]; other thing that confuses me about the figure is that the Element Length field seems to have variable length or 2 octets (because it has two rows))

The picture was messed up a bit, but all the Elements are two octets in
it. Perhaps I need a different picture for each different Element Len?

Besides, the signature defined in RFC3972 requires a 128-bit type tag for the message from the CGA Message Type name space that is specific to the usage of the CGA. In this case we would need to define such for shim and declare it in the IANA considerations, so that IANA assigns it. This 128 bit type tag is concatenated to what is being signed, so i guess this needs to be included here.

Does that mean we need to repeat the format from 3972?
Send text.

Furthermore, i guess we need to specify what is being signed here. I guess it doesn't make sense to sign the whole locator list, since some of them are validated using HBAs. So a possible approach would be to extract from the locator list all of those that are validated through the CGA and sign those (ordered as they are included in the locator list),

I was assuming that would be specified in the HBA draft. But we can
specify it in either draft I guess.

How does Send/CGA prevent replays? Wouldn't that imply we need to sign
more (the generation number?).
What is easiest to describe and perhaps implement is to sign the whole
Update Message (except the signature option itself, which we could
require to be last). But that would be a bit odd in the sense that a
locator preference option, when sent together with a locator list option
using CGA, would be signed. But when the locator preference option is
sent by itself it wouldn't be signed.

In any case, I don't see a problem with signing the whole Locator List
option (plus whatever other pieces are needed) i.e. include any
HBA-verified prefixes.


In section 7.5  Sending I1 messages

   If, after several retransmissions, there is no response, then most
   likely the peer does not implement the shim6 protocol, or there could
   be a firewall that blocks the protocol.

I guess that this depends on whether there is an ongoing communication or not. If there is an ongoing communication, then we can assume that the ULID is reachable, so if no answer is received then it is because of one of the two reasons above. However, if there is no ongoing communication or if a failure has just occurred, maybe the selected ULID is unreachable and other locators/ULIDs should be tried. Not sure what to do in this case...

Failures before the context is up would need to be handled using
"failures during initial contact" wouldn't they?
Or is there a middle place where
 - ULP has communicated a bit, but no context is created yet
 - a failure occurs
This would be complex to handle I think.

In section 9.  Handling ICMP Error Messages

   But when the shim on the transmitting side
   replaces the ULIDs in the IP address fields with some other locators,
   then an ICMP error coming back will have a "packet in error" which is
   not a packet that the ULP sent.  Thus the implementation will have to
   apply the reverse mapping to the "packet in error" before passing the
   ICMP error up to the ULP.

but the ICMP error won't have a shim payload header, so it won't be processed by the shim, right? Or are we supposing that the shim will inspect all packets (having the shim header or not), identify ICMP error packets, and in those inspect whether the inner packet contains a shim header and then do the processing? I am especially concerned with the inspect-all-packets part, since i think i remember discussion in the meeting that this would result in significant performance issues...

It's not all packets. An implementation which conforms to RFC 1122 needs
a mechanism to pass ICMP errors (as some form of error notifications) to
the ULPs on the host based on the next hdr values in the packet in error.
Thus if the packet in error was TCP it passes an event to a TCP specific
handler. And if the packet is SHIM6, it passes an event to a SHIM6
specific handler. This handler can then figure out what to do next,
based on the following next hdr value in the packet in error.
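
A minimal sketch of that dispatch, to show that only ICMP errors are touched and only via the normal per-protocol handlers. Packets are modelled as plain dicts, the handler names are invented, and the SHIM6 protocol number is a placeholder (253 is one of the values reserved for experimentation), since no value is given in this thread. A real shim would also look at the context tag rather than just the locator pair.

```python
TCP = 6
SHIM6 = 253  # placeholder protocol number, NOT an assigned value for shim6


def deliver_icmp_error(packet_in_error, handlers):
    """Pass an ICMP error to the handler for the packet-in-error's next header."""
    handler = handlers.get(packet_in_error["next_header"])
    if handler is None:
        return None
    return handler(packet_in_error, handlers)


def tcp_error_handler(pkt, handlers):
    # Stand-in for notifying TCP about an error on this address pair.
    return ("tcp", pkt["src"], pkt["dst"])


def make_shim6_error_handler(locators_to_ulids):
    def handler(pkt, handlers):
        # Reverse mapping: locator pair in the packet in error -> ULID pair.
        ulids = locators_to_ulids.get((pkt["src"], pkt["dst"]))
        if ulids is None:
            return None
        restored = dict(pkt, src=ulids[0], dst=ulids[1],
                        next_header=pkt["inner_next_header"])
        # Dispatch again on the next header that follows the shim header.
        return deliver_icmp_error(restored, handlers)
    return handler
```

For example, an error whose packet in error is SHIM6 carrying TCP ends up at the TCP handler with the ULIDs restored, while an error for a plain TCP packet never goes near the shim handler.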

Suggestions on how I can clarify this?

10.  Teardown of the Host Pair Context

   Each host can unilaterally decide when to tear down a host-pair
   context.  It is RECOMMENDED that hosts not tear down the context when
   they know that there is some upper layer protocol that might use the
   context.

Not sure i agree here. I recall Iljitsch's suggestion that servers could tear down contexts prematurely and leave the recovery effort up to the clients, using the context loss recovery mechanisms, in order to reduce the load imposed by the shim. This sounds like a very nice approach to me, so i am not sure i agree with the RECOMMENDED here (perhaps i fail to see what the term RECOMMENDED implies...)

Added as an open issue.

Section 16.  Open Issues

   The following open issues are known:
   o  Is there need for keeping the list of locators private between the
      two communicating endpoints?  We can potentially accomplish that
      when using CGA but not with HBA, but it comes at the cost of doing
      some public key encryption and decryption operations as part of
      the context establishment.

I think that the option selected for this is a future extension of the protocol

Yep

   o  Forking the context state.  On the mailing list we've discussed
      the need to fork the context state, so that different ULP streams
      can be sent using different locator pairs.  No protocol extensions
      are needed if any forking is done independently by each endpoint.
      But if we want A to be able to tell B that certain traffic (a
      5-tuple?) should be forked, then we need a way to convey this in
      the shim6 protocol.  The hard part would be defining what
      selectors can be specified for the filter which determines which
      traffic uses which of the forks.  So the question is whether we
      really need signaling for forking, or whether it is sufficient to
      allow each endpoint to do its own selection of which locator pair
      it is using for which traffic.

imho what is needed for this is the capability of setting up multiple contexts with the same ULID pair but with different context tags, and an API that allows the ULP to signal to the shim which context is to be used (what the doc calls primary vs other contexts i think)

Yes.

   Erik