Re: delay as a metric, aggregation
Jari Arkko wrote:
Anyway, I see two bigger issues here. The first one is the division
of work between IP and transports. If we are sure that we
can develop a simple scheme that works well, it would make sense
to add that to the IP layer. But if we find out later that
we need to take into account variations in congestion level,
packet loss, etc., we may find ourselves adding a lot of
functionality to try to mimic what transports are already
doing. An approach that does not have this drawback would
be providing feedback from transports and ULPs to the
shim6 so that when they say "unacceptable", Shim6 would
attempt to find another path. This would still leave the
problem of the exploration process finding the right
alternative with a high probability of success. I'm not sure
we know how to do that. For instance, the transports might say
that while the current path works its bandwidth is too small;
we test another one, switch to that, and then find out that
it is also too small. Perhaps delay can act as a metric here,
but I'd prefer to see some experimental results to understand
how well it works.
One way to carve this up between the transports and the shim is to have
the shim be able to tell its peer the priority and weight for its
locators. This would then allow the transports on A to tell the shim on
A that A1 is now higher priority than A2, and the shim will inform B of
this change.
In such a scheme we can end up with multiple sets of locators:

    locator  priority  weight
    A1       1         50
    A2       1         50
    A3       2         33
    A4       2         33
    A5       2         33

which would imply that if either A1 or A2 is working they would be
preferred over the locators in the second set. (And within each set the
weights suggest an even split.)
If A then says that A5 now has priority 0, then presumably B should try
to switch to A5.
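
To make the selection rule concrete, here is a minimal sketch in Python of how B might pick a locator from such a table. It assumes that a lower priority value means more preferred (as the A5 example implies) and that weights give a relative share within one priority set. The names Locator, preferred_set and pick_locator, and the "working" flag, are hypothetical and not taken from any draft.

```python
import random
from dataclasses import dataclass


@dataclass
class Locator:
    name: str
    priority: int         # assumption: lower value = more preferred
    weight: int           # relative share within one priority set
    working: bool = True  # hypothetical result of reachability detection


def preferred_set(locators):
    """Return the working locators that share the best (lowest) priority."""
    usable = [loc for loc in locators if loc.working]
    if not usable:
        return []
    best = min(loc.priority for loc in usable)
    return [loc for loc in usable if loc.priority == best]


def pick_locator(locators, rng=random):
    """Pick one locator from the preferred set, at random by weight."""
    candidates = preferred_set(locators)
    if not candidates:
        return None
    weights = [loc.weight for loc in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With the table above, pick_locator returns A1 or A2 with roughly equal probability; once A advertises priority 0 for A5, it returns A5 as long as A5 is working.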
The second issue is our ambition level with load balancing
functionality in Shim6. If the ambition level is so high that
we try to get the same session over two paths, then this
impacts transports. But even if the ambition level is
running different sessions in different paths, we
would still have to deal with congestion in some manner.
Ideally, we'd be adding sessions to a path so that
the existing sessions do not have to slow down.
I've never had that fancy level of ambition.
(1) The minimum (which the proto draft is alluding to) is that two
separate hosts in site A which communicate with site B (or a particular
host in site B) should not be forced to use the same locator prefix as
the destination when sending to B.
(2) I think we can easily provide the ability for different contexts that
use different ULID pairs between two hosts to use different locators.
For instance, A (locators A1, A2) and B (locators B1, B2, B3) might end
up with TCP connections between ULID <A1, B1> and ULID <A2, B3>.
There is no reason we need to force those ULID pairs (between the same
hosts) to use the same locator pair; they'll be different host-pair
contexts with different context tags in the current model.
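
As a rough illustration of (2), the sketch below keeps one context per ULID pair, each with its own tag and its own current locator pair, so that moving one context to new locators leaves the other alone. The class and method names are hypothetical, the counter-based tags are a simplification, and starting each context with the ULID pair as its locator pair is an assumption of the sketch.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Context:
    tag: int             # simplification: a counter stands in for a real tag
    ulid_pair: tuple     # (local ULID, peer ULID), fixed for the context
    locator_pair: tuple  # (local locator, peer locator), may change


class Shim:
    def __init__(self):
        self._tags = count(1)
        self.contexts = {}  # ULID pair -> Context

    def context_for(self, local_ulid, peer_ulid):
        """Return the context for a ULID pair, creating it on first use."""
        key = (local_ulid, peer_ulid)
        if key not in self.contexts:
            # Assumption: a new context starts out using the ULIDs as locators.
            self.contexts[key] = Context(next(self._tags), key, key)
        return self.contexts[key]

    def rehome(self, local_ulid, peer_ulid, locator_pair):
        """Switch one context to a new locator pair; others are untouched."""
        self.context_for(local_ulid, peer_ulid).locator_pair = locator_pair
```

For the example above, the contexts for <A1, B1> and <A2, B3> get different tags, and rehoming the first one to (A2, B2) does not change the locator pair used by the second.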
There are at least two more levels of ambition I can distinguish:
(3) Different ULP sessions/connections between the same ULID pair can
use different locator pairs.
(4) A single ULP session/connection spread over different locator pairs.
FWIW I think we should leave #4 out of scope. If the transport area wants
to explore this that's ok with me, but it feels way too early to think
about how to add support for this in the shim.
#3 might require some care in the reachability detection, since it
requires a ULID pair to be able to verify that two different locator
pairs are working at the same time, without accidentally concluding that
just because some traffic gets through on one locator pair, both pairs
are working.
But apart from that it seems like a local matter between the transports
and the shim. (For instance, a transport can optionally indicate its
preferred locator pair for each packet that is passed down to IP.)
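
The reachability caveat for #3 can be sketched as follows: evidence is recorded per locator pair rather than per ULID pair, so traffic arriving on one pair says nothing about the other. This is only an illustration. The class name, the 10 second timeout and the injected clock are assumptions, and the sketch only records that packets arrived on a pair; it does not address unidirectional reachability.

```python
import time


class PairReachability:
    """Track recent inbound evidence separately for each locator pair."""

    def __init__(self, timeout=10.0, clock=time.monotonic):
        self.timeout = timeout  # hypothetical freshness window, in seconds
        self.clock = clock      # injectable so the logic can be tested
        self.last_seen = {}     # (local locator, peer locator) -> timestamp

    def note_inbound(self, locator_pair):
        """Record that a packet arrived on exactly this locator pair."""
        self.last_seen[locator_pair] = self.clock()

    def is_working(self, locator_pair):
        """True only if this specific pair has fresh evidence of its own."""
        seen = self.last_seen.get(locator_pair)
        return seen is not None and self.clock() - seen <= self.timeout
```

Here traffic seen on (A1, B2) marks only that pair as working; (A2, B1) stays unverified until it has evidence of its own.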
For #3 I'm far from convinced that just because the transport prefers
A1->B2 for some connection, that this should imply that this connection
will use B2->A1 in the reverse direction. I see several issues with this:
- It doesn't seem useful if we are going to handle the unidirectional
locator pair reachability problem, since the locator pairs in the two
directions must be able to be different in that case.
- The shim would need a "language" to express what ULP packets the
peer should send over which locator pair, which certainly is added
complexity.
- The shim would have to cope with the transport passing down
preferences with each packet (as suggested above under #3) *and* the
peer shim expressing which ULP packets should use which locator pair.
Thus the shim (which has no ULP knowledge) needs to be able to resolve
those potentially conflicting preferences.
Erik