Re: flow label demultiplexing



On 18-apr-05, at 16:42, Pekka Savola wrote:

This has a couple of generic issues:
- what if the same flow label is used for some other address, and that
address later gets added to this locator set?

Host A has locators 2001::1 and 2001::2. When it establishes communications to B, it uses 2001::1 as source and 2002::B1 as destination locator (and 2002::B2 is backup). That communication uses flow label = "1".

Now, Host A (and presumably, also B depending on how it allocates flow labels) marks flow label = 1 as reserved for ({2001::1,2001::2},{2002::B1,2002::B2}). Normally, the flow label implementation either doesn't do reservation at all or would just mark (2001::1,2002::B1,1) as reserved.
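To make the difference between the two reservation styles concrete, here is a minimal sketch. The class and method names are made up for illustration and addresses are treated as opaque strings; it is not taken from any real flow label implementation.

```python
from typing import FrozenSet, Iterable, List, Tuple

# (flow label, local locator set, remote locator set)
Reservation = Tuple[int, FrozenSet[str], FrozenSet[str]]


class FlowLabelTable:
    """Toy reservation table (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._reservations: List[Reservation] = []

    def reserve(self, label: int, local: Iterable[str], remote: Iterable[str]) -> None:
        # Reserve the label for every (local, remote) pair in the two sets.
        self._reservations.append((label, frozenset(local), frozenset(remote)))

    def is_reserved(self, label: int, src: str, dst: str) -> bool:
        # A label is taken if any reservation covers this src/dst pair.
        return any(
            label == res_label and src in loc and dst in rem
            for res_label, loc, rem in self._reservations
        )


# Per-tuple reservation: only (2001::1, 2002::B1, 1) is marked.
per_tuple = FlowLabelTable()
per_tuple.reserve(1, {"2001::1"}, {"2002::B1"})

# Per-locator-set reservation: label 1 covers both locator sets.
per_set = FlowLabelTable()
per_set.reserve(1, {"2001::1", "2001::2"}, {"2002::B1", "2002::B2"})

# A later flow from 2001::1 to 2002::B2 that would like to use label 1:
tuple_blocks_reuse = per_tuple.is_reserved(1, "2001::1", "2002::B2")
set_blocks_reuse = per_set.is_reserved(1, "2001::1", "2002::B2")
```

With per-tuple reservation the later flow is free to pick label 1 again; with per-set reservation it is not.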

Ok.

Now, multihoming context is exchanged.

After that, the host tries to establish a new flow from 2001::1 to 2002::B2; unless there is reservation, it could end up picking "1" as well, which could create issues when either 2002::B1 or 2002::B2 fails because the flow labels overlap.

Hm, this isn't how current implementations work. They pick a more or less random flow label for each new session. This is encouraged because it spreads flows more evenly over the hash buckets in boxes that want to do something useful with the flow label.
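As a rough sketch of that behavior: the function names below are hypothetical, the only fixed fact is that the IPv6 flow label field is 20 bits wide, and the "hash" is deliberately trivial.

```python
import secrets
from collections import Counter

FLOW_LABEL_BITS = 20  # width of the IPv6 flow label field


def pick_flow_label() -> int:
    """Return a uniformly random nonzero flow label for a new session."""
    # randbelow(n) yields 0 .. n-1, so the result is 1 .. 2**20 - 1.
    return 1 + secrets.randbelow((1 << FLOW_LABEL_BITS) - 1)


def bucket_for(label: int, n_buckets: int) -> int:
    """Stand-in for a middlebox hash: here simply label modulo bucket count."""
    return label % n_buckets


# Random labels for 1000 new sessions, spread over 16 buckets.
labels = [pick_flow_label() for _ in range(1000)]
usage = Counter(bucket_for(label, 16) for label in labels)
```

Because the labels are random, the sessions end up spread over the buckets rather than piling into one, which is the property that would be lost if all sessions between two hosts shared a single label.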


For multihoming, we would probably want to use the same flow label instead in order to reuse a previously established context. If having all sessions between two hosts use the same flow label is undesirable, it may be necessary to tie the multihoming state to ranges of flow labels rather than a single one...

Of course, there is still a race condition between the time the communication is established and the time the multihoming context is set up -- there's no way to do the reservation for destination locators before you know which ones they are.

Ok, so you're saying a user visits a website at www.ietf.org and the HTML links to an image on images.ietf.org. Both have addresses 2001::1 and 2002::2, and due to DNS load balancing the session to www uses 2001::1 and the one to images uses 2002::2. Then, when the sessions are set up with the same flow label for each, the shim does its thing and...? The sessions will be merged into a single association, so there isn't a problem, I'd think. Rather the opposite: normal behavior would almost certainly (99.9999% sure, yes I counted the nines) result in two different flow labels, but now there is one association with two flow labels... But that shouldn't be too problematic.
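The 99.9999% figure can be checked with a quick calculation, assuming both labels are drawn independently and uniformly from the full 20-bit flow label space (an assumption; real stacks may exclude zero or use a different generator):

```python
FLOW_LABEL_BITS = 20  # width of the IPv6 flow label field

label_space = 2 ** FLOW_LABEL_BITS   # 1,048,576 possible labels
p_same = 1 / label_space             # second session picks the same label
p_different = 1 - p_same             # the two sessions get different labels

print(f"chance of two different labels: {p_different:.4%}")
```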


In any event, you wouldn't want to reuse the same flow label any time soon irrespective of the addresses.

The question is whether we absolutely, positively need to be able to determine which association a packet belongs to based on addresses and the flow label, or whether it's just a very good guess that may turn out wrong in later steps. I.e., the machine may have 1000 sessions open, and being able to narrow the set of sessions a packet could belong to down to two or three by looking at the flow label is probably good enough. If so, then having the source cycle through the entire flow label space without even managing collisions with existing sessions would be enough.
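A minimal sketch of the "good guess" approach: index sessions by flow label and accept that a lookup returns a short candidate list rather than a single session. The `Session` and `LabelIndex` names are hypothetical, and the tiny label space of 400 values is used only to force collisions in the demo.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Session:
    session_id: int
    flow_label: int


class LabelIndex:
    """Maps a flow label to every session using it; collisions are allowed."""

    def __init__(self) -> None:
        self._by_label: Dict[int, List[Session]] = defaultdict(list)

    def add(self, session: Session) -> None:
        self._by_label[session.flow_label].append(session)

    def candidates(self, flow_label: int) -> List[Session]:
        # Return a copy so callers cannot modify the index by accident.
        return list(self._by_label.get(flow_label, []))


# 1000 open sessions whose labels simply cycle through 400 values,
# with no attempt to avoid labels that are already in use.
index = LabelIndex()
for i in range(1000):
    index.add(Session(session_id=i, flow_label=i % 400))

matches = index.candidates(7)  # sessions 7, 407 and 807
```

Picking the right session out of the remaining candidates is a separate step that this sketch does not attempt.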

This is a good point, and I don't recall seeing it in the draft.

However, I would be interested in hearing whether you have ideas about what the protocol should do after narrowing the candidates down to 3 or 4 by flow label. How does it disambiguate further among those 3 or 4? Would you rely on the implementation taking a peek at (say) layer 4 headers or some other information?

There are two obvious options:

1. The flow label is relevant regardless of the addresses: this can't work because of the possibility of clashes with flow labels in unrelated packets.
2. The flow label is only relevant with certain source/dest address combinations: but what's the additional benefit of the flow label here? We can already demultiplex on the addresses.


I guess the only way the flow label could be useful is as a last resort to correlate packets with unknown addresses to existing sessions. But this is insecure, so we need to trigger some kind of negotiation to get to know the unknown addresses. This could still be somewhat useful, because if we're 99% sure who we're talking to we can do a one-RTT challenge/response rather than something more involved.
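Roughly, the receive path I have in mind would look like the sketch below. Everything here is hypothetical: the function names, the two lookup tables and the 8-byte nonce are placeholders, not part of any shim specification.

```python
import secrets
from typing import Dict, Optional, Tuple


def classify(
    src: str,
    dst: str,
    flow_label: int,
    by_addr: Dict[Tuple[str, str], str],
    by_label: Dict[int, str],
) -> Tuple[str, Optional[str]]:
    """Decide what to do with an incoming packet.

    by_addr maps known (src, dst) pairs to a context name;
    by_label maps flow labels to the context that last used them.
    """
    context = by_addr.get((src, dst))
    if context is not None:
        # Known addresses: the flow label adds nothing here.
        return ("deliver", context)
    guess = by_label.get(flow_label)
    if guess is not None:
        # Unknown addresses but a familiar label: only a hint, so verify it.
        return ("challenge", guess)
    return ("unknown", None)


def make_challenge() -> bytes:
    """Nonce for a one-round-trip challenge/response (size is arbitrary)."""
    return secrets.token_bytes(8)


by_addr = {("2002::B1", "2001::1"): "ctx-B"}
by_label = {1: "ctx-B"}

known = classify("2002::B1", "2001::1", 1, by_addr, by_label)
hinted = classify("2002::B2", "2001::1", 1, by_addr, by_label)
stranger = classify("2002::C1", "2001::1", 9, by_addr, by_label)
```

The label never causes a packet to be accepted by itself; it only selects which context to try the cheap verification against.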

Or maybe I'm just going down the wrong path here... Why do we need a bit in the IP header again?