Re: review of draft-ietf-shim6-failure-detection-03.txt

To: Jari Arkko <jari.arkko@piuha.net>
Subject: Re: review of draft-ietf-shim6-failure-detection-03.txt
From: Iljitsch van Beijnum <iljitsch@muada.com>
Date: Mon, 26 Jun 2006 15:25:06 +0200
Cc: shim6-wg <shim6@psg.com>
In-reply-to: <449FCB03.6070607@piuha.net>
References: <615BD9B54CB01B41838C323DB9E91AAB4075EB@esebe100.NOE.Nokia.com> <446DC7E4.3090501@piuha.net> <F9F7CB0B-567C-4CB0-9775-63098CEBBD22@muada.com> <4497FBEA.7020908@piuha.net> <099F9743-9BD2-43B8-B53F-A1644D1D81C9@muada.com> <449FCB03.6070607@piuha.net>

On 26-jun-2006, at 13:54, Jari Arkko wrote:

One of the reasons why I think generality is something that we
need to leave for future verification is that other contexts like
HIP have significantly more complex requirements than Shim6.
In particular, HIP needs to work with IPv4 and NATs.

One way to address this is to split the reachability detection into a shim-specific part (the keepalives) and a generic part (the path exploration), so that the generic part can evolve independently of shim6.

Note that I'm not saying this is necessarily the path we should take, but seeing that there is other stuff that can also benefit from reachability detection, it certainly seems prudent to think about this now while we still have the opportunity to easily go in another direction.

I am uncertain if the ICMP mechanism for the path exploration
part is the best way forward. The entire REAP protocol could
certainly be encapsulated in any "user" protocol.

Yes, but that way it's hard to keep it from being implemented more than once, and it's impossible to keep it from running more than once.

But many
scenarios that I can see do not necessarily work well with ICMP,
particularly with one version of ICMP. In MOBIKE, for instance,
it would have been inappropriate to use anything else than the
regular NAT-T UDP at the bottom, because only that can show
whether actual MOBIKE/IKEv2/IPsec traffic will get through. And
what about HIP running over IPv4?

Since we're determining unidirectional reachability, NAT is not actually as big a problem as it would ordinarily seem. The big issue with NAT is making sure that both ends know the addresses on both sides and that there are translation rules that allow incoming packets to be delivered. But you're right, ICMP won't work with NAT; we'd have to use UDP for that. I guess it would make sense to develop a non-NAT IPv6 version first and then see what needs to be done to make it work over IPv4 with NAT.

But as a general rule, I'd like to
get a working, as simple as possible Shim6 mechanism out there.
Even if it's not optimized for all situations. We can work on extensions
like fate sharing between contexts later, too.

I disagree with that approach; I think we should make the first version as good as it can be. I don't see much added value in finishing this work a little earlier. We're obviously too late to avoid PI in IPv6 now (if that can be avoided it won't be because of shim6) or to avoid having legacy non-shim IPv6 implementations out there, but at the same time IPv6 isn't deployed on any measurable scale yet, so there is still (some) time.

Were there other issues related to probe storms? We do have
exponential back-off.

Not really. There is no description of how this works.
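
For illustration only, since the draft does not describe it: one common form of exponential back-off doubles the wait after each unanswered probe until it reaches a cap. The initial interval, multiplier and cap below are assumed values chosen for the sketch, not numbers from the draft.

```python
def probe_intervals(initial=0.5, factor=2.0, cap=60.0, count=8):
    """Return the wait (in seconds) before each of `count` successive probes.

    All defaults are hypothetical; the draft does not specify them.
    """
    intervals = []
    delay = initial
    for _ in range(count):
        intervals.append(delay)
        # Grow the delay geometrically, but never beyond the cap.
        delay = min(delay * factor, cap)
    return intervals


if __name__ == "__main__":
    print(probe_intervals())
```

With a cap in place, the probe rate per context is bounded from below as well as from above, which is the property that matters when many contexts fail at the same time.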

Another thing that's missing completely from this draft is a
discussion of how to use address pair preference information. This
makes it impossible to address traffic engineering needs.

This is important, but can be addressed separately.

No, that would make it MUCH harder, as this will only work well if both ends implement it. And we're getting flak for not paying attention to traffic engineering as it is.

This doesn't say what shim6 implementers should do. In my opinion:
keep using deprecated addresses as the ULID/primary locator as long
as possible, but prefer non-deprecated addresses when selecting
alternative locators.

Right. But I actually already deleted Section 4.5 and the discussion
of deprecated addresses. IPv6 specifications already call for use
of non-deprecated addresses for new communications, and disallow
the use of invalid addresses. So it's not clear that we need to say more.

Leaving out all mention of deprecated addresses is ok by me, but as soon as you bring it up it's a good idea to say what you want to happen with them.

Data packets as opposed to what other types of packets?

I added a clarification. Basically, it's all packets including both
ULP packets and SHIM6 control messages, but NOT keepalives
or probes.

I think it's a good idea to consider TCP ACKs with no user data as non-data packets that don't need to generate return traffic as well.
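
A minimal sketch of the classification being discussed, assuming a simplified packet record. The `Packet` type, its field names and the `treat_pure_acks_as_non_data` switch are hypothetical and exist only to make the rule concrete.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    # kind is one of "ulp", "shim6_control", "keepalive", "probe" (hypothetical labels)
    kind: str
    # True for a TCP segment that carries only an ACK and no user data
    tcp_pure_ack: bool = False


def expects_return_traffic(pkt, treat_pure_acks_as_non_data=True):
    """Decide whether a sent packet should cause us to expect traffic back."""
    if pkt.kind in ("keepalive", "probe"):
        # Excluded in the clarified text: keepalives and probes are not data packets.
        return False
    if treat_pure_acks_as_non_data and pkt.kind == "ulp" and pkt.tcp_pure_ack:
        # The suggestion above: a bare TCP ACK does not need a reply either.
        return False
    # Everything else (ULP payload, SHIM6 control messages) counts as data.
    return True
```

Whether inspecting TCP headers at the shim layer is acceptable is a separate question; the sketch only shows where such a check would sit.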

So when we receive a keepalive from the other side, _we_ stop sending
keepalives? This may be the right thing to do, but it's not obvious
to me why. Some explanation would help.

Keepalives are only used if there's one-way communication. Since the
other side sends a keepalive, it's not sending anything else at that
time. Hence we have no need for keepalives.

But do we need this rule? It may make implementations more complex without any benefit.

The keepalives are sent at an interval of 3 seconds (or shorter, I
imagine that an implementation isn't going to keep an exact timer for
each context, any rounding must obviously be in the down direction)
and the timeout is 10 seconds. In these 10 seconds you'd normally
receive 3 keepalives, while 1 is enough to indicate that the other
side is still alive. The other 2 are only there in case of packet
loss. I think that's excessive. Starting the full reachability
exploration because of incidental packet loss isn't such a big deal
that it warrants sending three times as many packets as necessary.

The question is what the right number is. We want to avoid
entering exploration needlessly, so I'd rule 1 keepalive out.
We now have 3, are you arguing for 2? I'd be fine with that,
but I note that we don't have a lot of evidence to support
either view. We're going to have to revisit this after we get
the experience.

Why not leave it up to the implementers? If we say that after 10 seconds the full path exploration starts, implementers are free to experiment with what works well (3, 4, 8 seconds between keepalives) without any need to revisit the spec.
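
To make the trade-off concrete, here is a small sketch of the arithmetic for the intervals mentioned above, with the 10-second timeout fixed. It ignores jitter and transit delay, and the function name is made up for the example.

```python
def keepalive_budget(interval_s, timeout_s=10):
    """Return (keepalives arriving within the timeout, losses tolerated).

    Simplified model: keepalives arrive exactly every `interval_s` seconds
    after the last data packet, and one arrival is enough to stay out of
    exploration.
    """
    arriving = timeout_s // interval_s
    tolerated_losses = max(arriving - 1, 0)
    return arriving, tolerated_losses


if __name__ == "__main__":
    for interval in (3, 4, 8):
        arriving, tolerated = keepalive_budget(interval)
        print(f"{interval}s interval: {arriving} keepalives, "
              f"{tolerated} consecutive losses tolerated")
```

Under this model a 3-second interval tolerates two lost keepalives, 4 seconds tolerates one, and 8 seconds tolerates none.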

Why would a keepalive need an id field?

So that a probe reception report can indicate
seeing a recent keepalive.

I see. But what about the case where there are no keepalives, only data packets? In that case, there's no id field either.

I believe that since the id of the last received probe is included,
the iseeyou flag is unnecessary.

But we also have the case where you report seeing data packets but
no probes.

Good point. But couldn't that be solved by using a special value for the last seen id?

But if you have ideas on how this could be simplified -- perhaps by
not thinking about the data packets during exploration -- those
would be welcome.

Way ahead of you - I wasn't thinking about data packets during exploration until now. :-)

Although copying back the last seen id seems to do the job, I can't
help but feel that it would be preferable to add timers to reach
round trip times and copy back more received ids and also sent ids.
This allows the receiver of a probe to determine which of the probes
that made it to the other side did so faster, so it can select the
address pair with the shortest round trip time.

Right. But all that can go into extensions. I'd like to have the
minimum necessary to get this spec done.

Let's split the difference and specify the fields, but make (most of) their use optional.
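
As a rough sketch of how the extra fields could be used if an implementation chose to: suppose each report echoes several received probe ids together with how long the peer held each one before reporting. That report layout, the `held` value and all names below are assumptions for illustration, not fields defined in the draft.

```python
def best_pair(sent, echoed, report_rx_time):
    """Pick the address pair with the shortest estimated round trip time.

    sent:    {probe_id: (address_pair, send_time)} kept by the prober
    echoed:  [(probe_id, held)] from the report, where `held` is the time
             the peer waited between receiving that probe and reporting
    All times are in the same unit (milliseconds here).
    Returns (address_pair, rtt) or None if no echoed id is known.
    """
    best = None
    for probe_id, held in echoed:
        if probe_id not in sent:
            continue  # unknown or already forgotten id
        pair, send_time = sent[probe_id]
        rtt = report_rx_time - send_time - held
        if best is None or rtt < best[1]:
            best = (pair, rtt)
    return best
```

Because every echoed id shares the same return path (the report itself), the comparison mostly reflects differences in the forward direction.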

I severely dislike having fixed length data in TLV format, because that makes parsing much harder. If you look at Van Jacobson's work with TCP you'll see that a fixed header allows extremely streamlined implementations.
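
The difference in parsing effort can be sketched as follows. The three-field layout (probe id, last seen id, timestamp) and the one-byte type and length TLV encoding are invented for the comparison and are not the draft's message format.

```python
import struct

# Hypothetical fixed layout: three unsigned 32-bit fields in network byte order.
FIXED_PROBE = struct.Struct("!III")


def parse_fixed(data):
    """Fixed header: one unpack call, every field at a known offset."""
    probe_id, last_seen_id, timestamp_ms = FIXED_PROBE.unpack_from(data)
    return {"probe_id": probe_id,
            "last_seen_id": last_seen_id,
            "timestamp_ms": timestamp_ms}


def parse_tlv(data):
    """TLV: walk the buffer, validating each length before reading a value."""
    fields = {}
    offset = 0
    while offset + 2 <= len(data):
        ftype, flen = struct.unpack_from("!BB", data, offset)
        offset += 2
        if offset + flen > len(data):
            raise ValueError("truncated TLV")
        fields[ftype] = data[offset:offset + flen]
        offset += flen
    return fields
```

The TLV walker needs a loop, bounds checks and a policy for unknown or repeated types, all of which the fixed layout avoids for fields that are always present.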

Including sent ids along with the addresses the probe with that id
was sent to helps the receiver determine that some probes didn't make
it (yet). If a probe didn't work in one direction of an address pair,
it's reasonable to assume that it may also not work in the other
direction and try other pairs first.

True as well, but again perhaps material for future
optimizations.

I really like having multiple ids for earlier probes in there; it should cut down on the number of packets exchanged. I also suspect that there could be race conditions when certain packets are lost and not others, so that an implementation that only echoes the last seen id may stay in an oscillating state, but I haven't been able to think of an example so far.

