Re: flow label demultiplexing
On 18-apr-05, at 16:42, Pekka Savola wrote:
This has a couple of generic issues:
- what if the same flow label is used for some other address, and that
address later gets added to this locator set?
Host A has locators 2001::1 and 2001::2. When it establishes
communications to B, it uses 2001::1 as source and 2002::B1 as
destination locator (and 2002::B2 is backup). That communication uses
flow label = "1".
Now, Host A (and presumably, also B depending on how it allocates flow
labels) marks flow label = 1 as reserved for
({2001::1,2001::2},{2002::B1,2002::B2}). Normally, the flow label
implementation either doesn't do reservation at all or would just mark
(2001::1,2002::B1,1) as reserved.
Ok.
Now, multihoming context is exchanged.
After that, the host tries to establish a new flow from 2001::1 to
2002::B2; unless there is reservation, it could end up picking "1" as
well, which could create issues when either 2002::B1 or 2002::B2 fails
because the flow labels overlap.
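To make the scenario above concrete, here is a minimal sketch of the two reservation styles. The class and method names are hypothetical and not taken from any draft or implementation; the only point is that a per-tuple reservation leaves label 1 free for (2001::1, 2002::B2), while a per-locator-set reservation does not.

```python
import ipaddress


def _addrs(locators):
    """Normalize textual IPv6 locators into a hashable set."""
    return frozenset(ipaddress.IPv6Address(a) for a in locators)


class FlowLabelReservations:
    """Hypothetical reservation table (illustration only)."""

    def __init__(self):
        # Each entry is (source locator set, destination locator set, label).
        self._entries = []

    def reserve(self, src_locators, dst_locators, label):
        self._entries.append((_addrs(src_locators), _addrs(dst_locators), label))

    def is_reserved(self, src, dst, label):
        src = ipaddress.IPv6Address(src)
        dst = ipaddress.IPv6Address(dst)
        return any(
            label == reserved and src in srcs and dst in dsts
            for srcs, dsts, reserved in self._entries
        )


# Reservation of just the tuple actually in use.
per_tuple = FlowLabelReservations()
per_tuple.reserve(["2001::1"], ["2002::B1"], 1)

# Reservation covering the whole locator sets of both hosts.
per_set = FlowLabelReservations()
per_set.reserve(["2001::1", "2001::2"], ["2002::B1", "2002::B2"], 1)

# A new flow from 2001::1 to 2002::B2 that would like to use label 1:
print(per_tuple.is_reserved("2001::1", "2002::B2", 1))  # label still looks free
print(per_set.is_reserved("2001::1", "2002::B2", 1))    # label is blocked
```

With the per-tuple table the new flow can pick label 1 again, which is the overlap described above; with the per-set table it cannot.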
Hm, this isn't how current implementations work. They pick a more or
less random flow label for each new session. This is encouraged because
it makes for better hash bucket utilization in boxes that want to do
something useful with the flow label.
For multihoming, we would probably want to use the same flow label
instead in order to reuse a previously established context. If having
all sessions between two hosts use the same flow label is undesirable,
it may be necessary to tie the multihoming state to ranges of flow
labels rather than a single one...
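One way to picture the "ranges of flow labels" idea is the following sketch. It assumes, purely for illustration, that the 20-bit flow label is split into a context index (upper bits) and a per-session part (lower 8 bits); the split and the function names are my own invention, not something from the draft.

```python
import random

FLOW_LABEL_BITS = 20   # size of the IPv6 flow label field
RANGE_BITS = 8         # assumption: each multihoming context owns 256 labels


def labels_for_context(context_index):
    """Return the block of flow labels tied to one multihoming context."""
    base = context_index << RANGE_BITS
    return range(base, base + (1 << RANGE_BITS))


def pick_label(context_index, rng=random):
    """Pick a random label inside the context's block for a new session."""
    return rng.choice(labels_for_context(context_index))


def context_for_label(label):
    """Map a received flow label back to the context that owns it."""
    return label >> RANGE_BITS


label = pick_label(5)
print(label, context_for_label(label))
```

Sessions between the same two hosts then still get different labels, so hashing on the flow label keeps some spread, while any label in the block leads back to the same context.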
Of course, there is still a race condition between the moment the
communication is established and the moment the multihoming context is
set up -- there's no way to do the reservation for destination locators
before you know which ones they are.
Ok, so you're saying a user visits a website at www.ietf.org and the
HTML links to an image on images.ietf.org. Both names have addresses
2001::1 and 2002::2, and due to DNS load balancing the session to www
uses 2001::1 and the one to images uses 2002::2. Then, when the sessions
are set up with the same flow label for each, the shim does its thing
and...?
The sessions will be merged into a single association so there isn't a
problem, I'd think. Rather the opposite: normal behavior would almost
surely (99.9999%, yes I counted the nines) result in two different flow
labels, but now there is one association with two flow labels... But
that shouldn't be too problematic.
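For anyone who wants to recount the nines: assuming both sessions draw their labels independently and uniformly from the full 20-bit flow label space (an idealization, real stacks may differ), the chance that the two labels differ works out as follows.

```python
from fractions import Fraction

FLOW_LABEL_BITS = 20  # size of the IPv6 flow label field

# Probability that two independent, uniformly random labels are different.
p_differ = 1 - Fraction(1, 2 ** FLOW_LABEL_BITS)

print(p_differ)                   # exact fraction
print(f"{float(p_differ):.4%}")   # as a percentage
```

That is one collision in roughly a million pairs of sessions.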
In any event, you wouldn't want to reuse the same flow label any time
soon irrespective of the addresses.
The question is whether we absolutely, positively need to be able to
determine which association a packet belongs to based on addresses
and the flow label, or whether it's just a very good guess that may
turn out wrong in later steps. I.e., the machine may have 1000 sessions
open, and being able to narrow that down to two or three candidate
sessions by looking at the flow label is probably good enough. If so,
then having the source cycle through the entire flow label space
without even managing collisions with existing sessions would be
enough.
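The "good guess" approach can be sketched as below. This is only an illustration under my own assumptions: the class name is made up, labels are handed out by a simple counter that skips 0 (the value meaning "no flow label"), and the label space is shrunk to 3 in the example so that collisions actually show up.

```python
from collections import defaultdict


class FlowLabelIndex:
    """Hypothetical index: flow label -> list of candidate sessions."""

    def __init__(self, label_count=2 ** 20 - 1):
        self.label_count = label_count          # usable labels are 1..label_count
        self._next_label = 1
        self._sessions_by_label = defaultdict(list)

    def open_session(self, session_id):
        """Assign the next label in the cycle, without any collision check."""
        label = self._next_label
        self._next_label = label % self.label_count + 1
        self._sessions_by_label[label].append(session_id)
        return label

    def candidates(self, label):
        """Sessions a packet with this label might belong to."""
        return list(self._sessions_by_label.get(label, []))


index = FlowLabelIndex(label_count=3)   # tiny space to force label reuse
for n in range(7):
    index.open_session(f"session-{n}")

print(index.candidates(1))
print(index.candidates(2))
```

A lookup returns a short list rather than a single session, so something else still has to pick among the remaining candidates.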
This is a good point, and I don't recall seeing it in the draft.
However, I would be interested in hearing if you have ideas about what
the protocol should do after the flow label has narrowed the candidates
down to 3 or 4. How does it disambiguate further from those 3 or 4?
Would you rely on the implementation taking a peek at (say) layer 4
headers or some other information?
There are two obvious options:
1. The flow label is relevant regardless of the addresses: this can't
work because of the possibility of clashes with flow labels in
unrelated packets.
2. The flow label is only relevant with certain source/dest address
combinations: but then what's the additional benefit of the flow label,
since we can already demultiplex on the addresses?
I guess the only way the flow label could be useful is as a last resort
to correlate packets with unknown addresses to existing sessions. But
this is insecure, so we need to trigger some kind of negotiation to get
to know the unknown addresses. This could still be somewhat useful
because if we're 99% sure who we're talking to we can do a one RTT
challenge/response rather than something more involved.
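A rough sketch of what such a one RTT challenge/response could look like, under a big assumption that is not in this thread: that the two hosts already share a secret key from an earlier context establishment. The function names and the use of HMAC-SHA256 are illustrative choices only, not a proposal for the actual protocol.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    """Receiver: fresh random nonce sent to the unknown source address."""
    return secrets.token_bytes(16)


def respond(context_key, nonce, new_locator):
    """Peer: prove knowledge of the (assumed) shared key for this locator."""
    message = nonce + new_locator.encode("ascii")
    return hmac.new(context_key, message, hashlib.sha256).digest()


def verify(context_key, nonce, new_locator, response):
    """Receiver: accept the new locator only if the response checks out."""
    expected = respond(context_key, nonce, new_locator)
    return hmac.compare_digest(expected, response)


# The flow label suggested which session this is, so the receiver knows
# which key to test against; the exchange then confirms or rejects the guess.
key = secrets.token_bytes(32)          # assumed to exist from context setup
nonce = make_challenge()
answer = respond(key, nonce, "2001::2")
print(verify(key, nonce, "2001::2", answer))
```

The flow label only selects which key to try, so a wrong guess costs one failed exchange rather than a hijacked session.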
Or maybe I'm just going down the wrong path here... Why do we need a
bit in the IP header again?