Re: review of draft-ietf-shim6-failure-detection-03.txt
On 22-jun-2006, at 16:59, marcelo bagnulo braun wrote:
This draft works per-context exclusively. So if there are 2, 5 or
10 contexts between two hosts, this means 2, 5 or 10 times the
amount of work is done.
i agree that this would be a nice feature. the problem with this is
how do you identify the peer in such a way that you can probe all
the existing contexts.
Have a look at the revision of my reachability detection draft:
http://www.muada.com/drafts/draft-van-beijnum-shim6-reach-detect-00.txt
Note that this is an update of draft-ietf-shim6-reach-detect-01.txt
and it's not yet posted by the secretariat.
The other option would be to use a single probe/keepalive for all
the contexts between two peers. In order to do that we need a means
to identify the peer so that the receiver of the packet can
identify all the contexts corresponding to the same hosts and apply
the received packet to all the contexts.
Indeed.
Basically this would introduce the notion of endpoint in the shim
context/protocol (which is not present today), since today the
granularity is ULID pairs (as opposed to endpoint pairs)
Not necessarily, read my draft.
this would be a considerable change in the protocol i guess, but
may be explored if people deem it relevant.
It makes the protocol a bit more complex, but it does allow it to be
used by many different protocols at the same time.
As a general comment, i am kind of worried about the complexity of
the resulting protocol, including shim protocol and the failure
detection protocol and i would really prefer to try to simplify
the protocol rather than making it more complex, even if this means
losing some optimization for some cases.
I suppose the case where there are multiple contexts between two hosts
won't be common enough to be worth much effort to deal with.
But if other protocols also need this, then it would be MUCH better
to have a single code base that's shared by all of them rather than
have essentially the same thing pop up in different places.
I am concerned about having a complex protocol that may become
error prone (we already have feedback expressing this concern BTW)
I hate complexity as much as the next IETFer, but leaving the last
10% out just because it's simpler is generally not a good solution.
However, it's important that there is fate sharing between the
reachability protocol and the user protocol (shim in our case). I
think this can be solved by having the quick reachability
verification stuff (= FBD) encapsulated in the user protocol, but
let the full path exploration be a protocol of its own or live
under ICMPv6 or some such.
not sure why you think this is needed. Defining the protocol
messages in a way that they can be included in the shim6 header as
well as in the mobility header or the hip header would be good
enough to allow using the failure detection protocol in other
protocols.... what am i missing?
See the discussion above, and the need for fate sharing between the
reachability protocol and the "user" protocol. If we want the
reachability detection to be shared by different users, then it can
happen that one protocol is filtered and another isn't. So we
probably want the reachability detection to be independent of the
"user" protocols and then when the reachability protocol says that
something is reachable, the user protocol does a quick check using
its own protocol number to be sure it actually works.
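A minimal sketch of this two-stage check, in Python. The names used here (usable_pairs, shared_reachable, user_probe) are hypothetical and do not come from any draft; the two callables stand in for the shared reachability protocol and for a quick probe sent under the user protocol's own protocol number.

```python
from typing import Callable, Iterable, List, Tuple

AddressPair = Tuple[str, str]  # (local locator, remote locator), illustrative only


def usable_pairs(
    candidate_pairs: Iterable[AddressPair],
    shared_reachable: Callable[[AddressPair], bool],
    user_probe: Callable[[AddressPair], bool],
) -> List[AddressPair]:
    """Return the pairs that pass both checks, preserving input order.

    shared_reachable: verdict of the protocol-independent reachability
        detection (assumed interface).
    user_probe: quick check using the user protocol's own protocol
        number, to catch filtering that only affects that protocol.
    """
    usable = []
    for pair in candidate_pairs:
        if not shared_reachable(pair):
            continue  # the shared protocol already says this pair is down
        if user_probe(pair):
            usable.append(pair)
    return usable
```

The second check is only attempted for pairs the shared protocol reports as reachable, so a pair where the user protocol is filtered but the reachability protocol is not gets rejected at that step.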
Another thing that's missing completely from this draft is a
discussion of how to use address pair preference information. This
makes it impossible to address traffic engineering needs.
well, i have been working on this and i have submitted a draft
about how to perform locator pair selection, including reachability
information and also preference information from the shim protocol
you can find it at:
http://www.ietf.org/internet-drafts/draft-ietf-shim6-locator-pair-selection-00.txt
of course your feedback would be very welcome
I'll have a look at it.
i think that the definition section is very useful, because the
insight it provides about the different states of an address and
address pairs are very important.
I agree, but my problem with the definition section is that it
contains too much stuff that shouldn't be there. It's not unusual to
have to go back to the definition section several times during
reading, so a definition section needs to be as concise as possible.
I suggest tightening the use of words like "operational", "work",
"reachable". They're mostly used interchangeably in the draft.
i don't think this is the case.
i find these differences relevant imho
I'm not sure there is a difference, and if there is, what it is...
This doesn't say what shim6 implementers should do. In my opinion:
keep using deprecated addresses as the ULID/primary locator as
long as possible, but prefer non-deprecated addresses when
selecting alternative locators.
i think this should belong to the locator selection document...
Is that a separate document???
2. Whenever outgoing data packets are generated
Data packets as opposed to what other types of packets?
signalling packets, such as keepalives or probes (is my understanding)
Sure, but the draft doesn't say that.
4. The reception of a REAP keepalive packet leads to stopping the
timer associated with the return traffic from the peer.
So when we receive a keepalive from the other side, _we_ stop
sending keepalives
as i understand it, this means that we are not expecting another
packet (until we send a new packet, of course)
I guess. But shouldn't this follow from the general rules rather than
be a specific one?
The keepalives are sent at an interval of 3 seconds (or shorter, I
imagine that an implementation isn't going to keep an exact timer
for each context, any rounding must obviously be in the down
direction) and the timeout is 10 seconds. In these 10 seconds
you'd normally receive 3 keepalives, while 1 is enough to indicate
that the other side is still alive. The other 2 are only there in
case of packet loss. I think that's excessive.
would you suggest reducing it to 2 packets every 10 secs?
That's a bit better, but actually I think 1 in 10 seconds is enough,
although that means you need to take a few extra seconds before you
can time out. If you want to time out after 10 seconds then sending a
keepalive after 8 would probably be a good choice.
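The arithmetic behind these numbers can be sketched in Python. This is only an illustration under stated assumptions: keepalives go out at exact multiples of the interval after the last data packet, and a keepalive only counts if it is sent strictly before the timeout expires. The function name is hypothetical.

```python
def keepalives_before_timeout(interval_s: int, timeout_s: int) -> int:
    """Count keepalives sent strictly before the timeout expires.

    Assumes the first keepalive is sent one full interval after the
    last data packet, and then one every interval after that.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    count = 0
    k = 1
    while k * interval_s < timeout_s:
        count += 1
        k += 1
    return count


# Intervals discussed in this thread, against a 10 second timeout.
for interval in (3, 4, 8):
    print(interval, keepalives_before_timeout(interval, 10))
```

With a 3 second interval this gives the 3 keepalives per timeout period mentioned above; an 8 second interval gives a single keepalive, leaving 2 seconds for it to arrive before the timeout fires.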
I mean, i think this protocol will require quite a lot of fine
tuning based on experience and simulations of the load... i guess
that what's in the current spec are reasonable values for the time
being (i have no problem with changing them a bit, but as i said i
guess in depth fine tuning will be needed once we have more
experience...)
How is experience going to tell us anything that we don't know
already in this case? If we go for one missed keepalive before a
timeout that would be a new approach that may not work out well and
then we can go back to 5 seconds or 3 seconds, but starting at 3
means a lot of packets but as good as no unnecessary triggering of
path exploration, there won't be any surprises there.
I believe that since the id of the last received probe is
included, the iseeyou flag is unnecessary.
you mean that if the id field is empty, this means iseeyou=no?
No, what I mean is that the value of this bit doesn't convey any
interesting information.
Or maybe it really is a "reply requested" bit in disguise, like we
discussed earlier.
Although copying back the last seen id seems to do the job, I
can't help but feel that it would be preferable to add timers to
reach round trip times and copy back more received ids and also
sent ids. This allows the receiver of a probe to determine which
of the probes that made it to the other side did so faster, so it
can select the address pair with the shortest round trip time.
i would suggest to leave this for future work, since it is added
complexity and it is not obvious to me that selecting the fastest
one is always the best choice.... (e.g. bandwidth is not considered)
I'd say: put in the fields, this is very little extra work, and the
values can be ignored for simplicity when desired. Then, implementers
can experiment with how they use them if they like.
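As a rough sketch of how such fields could be used, assume each probe reply echoes back several received probe ids together with how long the peer held each one before replying. Neither that hold-time field nor the names below are taken from the draft; they are assumptions made for illustration only.

```python
from typing import Dict, Iterable, Optional, Tuple

AddressPair = Tuple[str, str]  # (local locator, remote locator), illustrative only


def fastest_pair(
    sent: Dict[int, Tuple[AddressPair, int]],
    echoed: Iterable[Tuple[int, int]],
    reply_received_ms: int,
) -> Optional[Tuple[AddressPair, int]]:
    """Pick the address pair with the shortest estimated round trip time.

    sent: probe id -> (address pair it was sent on, send time in ms).
    echoed: (probe id, hold time at the peer in ms) entries copied
        back in a reply (assumed format).
    reply_received_ms: local time at which the reply arrived.
    Returns (address pair, rtt in ms), or None if nothing matches.
    """
    best = None
    for probe_id, hold_ms in echoed:
        if probe_id not in sent:
            continue  # unknown or expired id, ignore it
        pair, sent_ms = sent[probe_id]
        rtt_ms = reply_received_ms - sent_ms - hold_ms
        if best is None or rtt_ms < best[1]:
            best = (pair, rtt_ms)
    return best
```

An implementation that wants to stay simple can skip this step and ignore the extra fields; round trip time is also only one possible selection criterion and says nothing about bandwidth.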
The keepalive is a fairly long packet. I think just a shim header
as would be used for data packets but with no ULP following the
shim header would be sufficient.
not sure what you would omit from the current packet format... i
mean, we need the context tag and the identifier and we need it to
make it extensible in the header....
No we don't. Data packets don't have these fields either and also
indicate that the current context is working. Moreover: data packets
that haven't been rewritten don't even have a shim header!
Requiring random numbers in packets that are sent rather
frequently is a bad idea, because it depletes the typically
limited amount of entropy that's available for strong random
number generation rather quickly and semi-random number generation
may be somewhat expensive (and not that good). And I don't see
what good an id does in a keepalive anyway... Also, there may be
reasons to have non-random numbers, such as ease of lookup.
i guess this is needed to indeed verify that the reply was
generated as a response to the initial packet,
Keepalives are generated autonomously, not in response to other shim
packets, so this is not relevant in this case.
I don't have a good feeling about this... It's too hard to
determine what should be happening. Maybe it would be better
rather than go down the list of packets that are sent/received and
describe the behavior in each state, to take one state at a time
and describe what happens with packets in that state.
that would be the state machine i guess, right?
I don't know.
Then I'm ignoring this too.
But I would be happier if they'd be removed, because either
they're superfluous as they're not normative, or they're actually
necessary to understand the protocol, which is even worse because
they're not part of the normative text.
i think state machines are very useful to understand how the
protocol works and to verify that it is working and i think these
should be included in the documents
Is it really not possible to express them in ASCII so they can be
made part of the normative text?