
Re: delay as a metric, aggregation



Jari Arkko wrote:

Anyway, I see two bigger issues here. The first one is the division
of work between IP and transports. If we are sure that we
can develop a simple scheme that works well, it would make sense
adding that to the IP layer. But if we find out later that
we need to take into account variations in congestion level,
packet loss, etc., we may find ourselves adding a lot of
functionality to try to mimic what transports are already
doing. An approach that does not have this drawback would
be providing feedback from transports and ULPs to the
shim6 so that when they say "unacceptable", Shim6 would
attempt to find another path. This would still leave the
problem of the exploration process finding the right
alternative with a high probability of success. I'm not sure
we know how to do that. For instance, if the transports say
that while the current path works, its bandwidth is too small,
we test another one, switch to that, and then find out that
it is also too small. Perhaps delay can act as a metric here,
but I'd prefer to see some experimental results to understand
how well it works.

One way to carve this up between the transports and the shim is to have the shim be able to tell its peer the priority and weight for its locators. This would then allow the transports on A to tell the shim on A that A1 is now higher priority than A2, and the shim will inform B of this change.

In such a scheme we can end up with multiple sets of locators:

	locator   priority   weight
	A1        1          50
	A2        1          50
	A3        2          33
	A4        2          33
	A5        2          33

which would imply that if either A1 or A2 is working they would be preferred over the locators in the second set. (And within each set the weights suggest an even split.) If A then says that A5 now has priority 0, then presumably B should try to switch to A5.
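
As a rough sketch of how B might interpret such a table: this assumes (as the A5 example suggests) that a lower priority value is more preferred, and that weights drive a proportional random choice within the preferred set, with all weights positive. The names Locator, preferred_set and pick_locator are made up for illustration and are not from any draft.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Locator:
    name: str
    priority: int  # assumption: lower value = more preferred
    weight: int    # assumption: positive, relative share within a set


def preferred_set(locators, working):
    """Return the working locators that share the lowest priority value."""
    candidates = [loc for loc in locators if loc.name in working]
    if not candidates:
        return []
    best = min(loc.priority for loc in candidates)
    return [loc for loc in candidates if loc.priority == best]


def pick_locator(locators, working, rng=random):
    """Pick one locator from the preferred set, in proportion to weight."""
    candidates = preferred_set(locators, working)
    if not candidates:
        return None
    weights = [loc.weight for loc in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With the table above and all locators working, only A1 and A2 are candidates. If A1 and A2 both fail, the choice falls to A3, A4 and A5; if A5 is re-advertised with priority 0, it becomes the only candidate.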

The second issue is our ambition level with load balancing
functionality in Shim6. If the ambition level is so high that
we try to get the same session over two paths, then this
impacts transports. But even if the ambition level is
running different sessions in different paths, we
would still have to deal with congestion in some manner.
Ideally, we'd be adding sessions to a path so that
the existing sessions do not have to slow down.

I've never had that fancy level of ambition.
(1) The minimum (which the proto draft is alluding to) is that two separate hosts in site A which communicate with site B (or a particular host in site B) should not be forced to use the same locator prefix as the destination when sending to B.

(2) I think we can easily provide the ability for different contexts that use different ULID pairs between two hosts to use different locators. For instance, A (locators A1, A2) and B (locators B1, B2, B3) might end up with TCP connections between ULID <A1, B1> and ULID <A2, B3>. There is no reason we need to force those ULID pairs (between the same hosts) to use the same locator pair; they'll be different host-pair contexts with different context tags in the current model.
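
To make (2) concrete, here is a minimal sketch of per-ULID-pair state. It assumes a context starts out using its ULID pair as its locator pair, and it uses a simple counter for context tags, which is a simplification for illustration only. Context, ContextTable, lookup_or_create and rehome are hypothetical names.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Context:
    tag: int              # context tag (a plain counter in this sketch)
    ulid_pair: tuple      # (local ULID, peer ULID), fixed for the context
    locator_pair: tuple   # (local locator, peer locator) currently in use


class ContextTable:
    """One context per ULID pair, each with its own locator pair."""

    def __init__(self):
        self._next_tag = itertools.count(1)
        self._by_ulid = {}

    def lookup_or_create(self, local_ulid, peer_ulid):
        key = (local_ulid, peer_ulid)
        if key not in self._by_ulid:
            # Assumption: the initial locator pair is the ULID pair itself.
            self._by_ulid[key] = Context(next(self._next_tag), key, key)
        return self._by_ulid[key]

    def rehome(self, local_ulid, peer_ulid, locator_pair):
        """Switch one context to a new locator pair; others are untouched."""
        ctx = self._by_ulid[(local_ulid, peer_ulid)]
        ctx.locator_pair = locator_pair
        return ctx
```

Rehoming the <A1, B1> context to, say, (A2, B2) leaves the <A2, B3> context on its own locator pair, since the two are separate entries with separate tags.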

There are at least two more levels of ambition I can distinguish:
(3) Different ULP sessions/connections between the same ULID pair can use different locator pairs.

(4) A single ULP session/connection spread over different locator pairs.

FWIW I think we should leave #4 out of scope. If the transport area wants to explore this that's ok with me, but it feels way too early to think about how to add support for this in the shim.

#3 might require some care in the reachability detection, since it requires a ULID pair to be able to verify that two different locator pairs are working at the same time, without accidentally concluding that just because some traffic gets through on one locator pair, both pairs are working. But apart from that it seems like a local matter between the transports and the shim. (For instance, a transport can optionally indicate its preferred locator pair for each packet that is passed down to IP.)
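
A sketch of the kind of care I mean for #3: reachability is tracked per locator pair, so traffic arriving on one pair says nothing about another. The timeout value, the fallback to a default pair, and the names PairReachability and choose_pair are all assumptions for illustration, not anything specified.

```python
class PairReachability:
    """Tracks liveness separately for each locator pair."""

    def __init__(self, timeout=10.0):
        self.timeout = timeout    # assumed value, in seconds
        self._last_heard = {}     # locator pair -> time of last inbound packet

    def note_received(self, locator_pair, now):
        # Evidence applies only to the pair the packet actually arrived on.
        self._last_heard[locator_pair] = now

    def is_working(self, locator_pair, now):
        last = self._last_heard.get(locator_pair)
        return last is not None and now - last <= self.timeout


def choose_pair(reach, default_pair, now, preferred_pair=None):
    """Honor the transport's per-packet hint only if that pair is verified."""
    if preferred_pair is not None and reach.is_working(preferred_pair, now):
        return preferred_pair
    return default_pair
```

Here a transport hint for (A2, B2) is ignored until something has actually been heard on (A2, B2); packets seen on (A1, B1) do not count toward it.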

For #3 I'm far from convinced that just because the transport prefers A1->B2 for some connection, this should imply that the connection will use B2->A1 in the reverse direction. I see several issues with this:
- It doesn't seem useful if we are going to handle the unidirectional locator pair reachability problem, since the locator pairs in the two directions must be able to be different in that case.
- The shim would need a "language" to express which ULP packets the peer should send over which locator pair, which certainly is added complexity.
- The shim would have to cope with the transport passing down preferences with each packet (as suggested above under #3) *and* the peer shim expressing which ULP packets should use which locator pair. Thus the shim (which has no ULP knowledge) needs to be able to resolve those potentially conflicting preferences.

  Erik