Re: comments on draft-ietf-psamp-sample-tech-04.txt
Benoit,
please see inline. I think I've understood all your concerns, except one.
Maurizio
Benoit Claise wrote:
Maurizio Molina wrote:
Benoit Claise wrote:
Hi Maurizio,
My point is that, if systematic time based sampling is implemented,
will you do it like 1. or like 2.?
1. Case Systematic Time Based: - Interval length (in usec), Spacing (in usec)
2. Case Systematic Time Based: - Interval length (in usec), Spacing (# packets)
Option 1 has the big drawback that we have no idea how many
packets will be inspected and, as a consequence, we don't know what
the bandwidth requirements are for the export link(s). And if we do
sampling, it's typically because we have a bottleneck on the export
link(s) bandwidth or on the collector side...
Benoit,
I'm not 100% sure I understand what type of sampling you're speaking
for or against.
However, in your sentence above you say that with Systematic Count based
sampling (which includes 1 out of N), you don't have firm limits on
the exported bandwidth.
That's true, but this type of sampling (*) allows you to estimate the
rate of the link. With Systematic Time based you cannot see any rate
variation on the link because you always export a packet each T sec.
I don't think this is correct, unless I completely misunderstood
everything.
You're right. I provided an example in a hurry. If you use Systematic
Time based sampling, with parameters t and T, where t is the interval
during which you sample and T is the one during which you don't sample
(see below, where I clarify this issue), the bandwidth can be estimated as
[E(X)/(t+T)]*[(t+T)/t], i.e. E(X)/t
where E(X) is the average number of packets that you sample in each cycle t+T.
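As a minimal sketch of the estimate above, assuming the sampler reports how many packets it saw in each sampling interval (the function name and the per-interval counts are illustrative, not taken from the draft):

```python
def estimate_packet_rate(packets_per_interval, t_usec):
    """Estimate the link packet rate (packets/s) as E(X)/t.

    packets_per_interval: hypothetical list with the number of packets
        sampled in each sampling interval (one entry per cycle t+T).
    t_usec: length t of the sampling interval, in microseconds.
    """
    if not packets_per_interval or t_usec <= 0:
        raise ValueError("need at least one interval and t > 0")
    # E(X): average number of packets sampled per cycle
    mean_x = sum(packets_per_interval) / len(packets_per_interval)
    # E(X)/t is in packets per usec; convert to packets per second
    return mean_x / t_usec * 1_000_000


# Made-up counts from four cycles with t = 10 usec
rate = estimate_packet_rate([12, 8, 10, 10], t_usec=10)
```

Note that T drops out of the estimate, as in the formula: only the time actually spent sampling matters.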
Let me re-explain what the issue is; maybe I took a shortcut.
The draft says:
SELECTOR_PARAMETERS For sampling processes the SELECTOR
PARAMETERS define the input parameters for the process. _Interval
length in systematic sampling means, that all packets that arrive
in this interval are selected._ The spacing parameter defines the
spacing in time or number of packets between the end of one
sampling interval and the start of the next succeeding interval.
Case n out of N: - Population size N, Sample size n
Example: we randomly select n packets out of N. No problem with this one.
Case Systematic Count Based: - Interval length (in packets), Spacing (in packets)
Note: I start with "Case Systematic Count Based" to illustrate my point.
Example: if Interval length = 10 packets, Spacing = 100 packets
This means: I select 10 packets, I don't select the next 90 packets,
I select 10 packets, etc...
Note2: it is not clear from the draft whether this means the previous
line's example or...
I select 10 packets, I don't select the next 100 packets, I
select 10 packets, etc...
This must be clarified with an example.
Taking your 10/100 example, the intention was to define the second case
you mention, that is:
I select 10 packets, I don't select the next 100 packets, I select 10
packets, etc...
Likewise, for Systematic Time Based we meant:
I select all packets during 10 usec, I don't select packets during the
next 100 usec, etc...
I agree that the text must be improved so that we don't leave any doubt,
and that an example should be added.
I'll provide a proposal to Tanja, OK?
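To make the intended reading concrete (spacing counted from the end of one interval to the start of the next, so the cycle is interval + spacing), here is a rough sketch. The function names are hypothetical, and it assumes the first sampling interval starts at packet 0 or at time 0:

```python
def select_count_based(num_packets, interval, spacing):
    """Indices of selected packets: `interval` selected, then `spacing` skipped."""
    period = interval + spacing  # one full cycle, e.g. 10 + 100 = 110 packets
    return [i for i in range(num_packets) if i % period < interval]


def select_time_based(arrival_times_usec, interval_usec, spacing_usec):
    """Arrival times (usec since the first interval began) that fall
    inside a sampling interval."""
    period = interval_usec + spacing_usec  # e.g. 10 + 100 = 110 usec
    return [ts for ts in arrival_times_usec if ts % period < interval_usec]


# Interval length = 10, Spacing = 100, as in the example above
selected = select_count_based(230, interval=10, spacing=100)
sampled_times = select_time_based([0, 5, 10, 50, 109, 110, 115, 125], 10, 100)
```

With these inputs the count-based selector picks packets 0-9, 110-119 and 220-229, and the time-based one keeps the arrivals at 0, 5, 110 and 115 usec. Under the other reading (10, 90, 10, 90, ...) the period would simply be the spacing itself.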
Case Systematic Time Based: - Interval length (in usec), Spacing (in usec)
Example: if Interval length = 10 usec, Spacing = 100 usec
This means: I select _X_ packets during 10 usec,
I don't select packets during the next 90 usec, etc...
BTW, my Note2 above applies equally here: is it 10, 90, 10,
90, ... or 10, 100, 10, 100, ...?
And this is my entire point, you select X packets during an interval.
And you don't know how many.
You might know it with the ratio 10/100 * bandwidth. BUT you have _no
clue_ about the number of flow records and, as a consequence, we don't
know what the bandwidth requirement for the export link(s) is. And if we
do sampling, it's typically because we have a bottleneck on the export
link(s) bandwidth or on the collector side... So this way of sampling
is dangerous. The only application I see for such a sampling
scheme is when the bottleneck is the interface or the line card
resources, typically the memory.
As you say, the maximum (average) export bandwidth will be bounded by
(t/(t+T))*link_bandwidth[pkt/s]*export_size[bytes/exported packet]. So you
can bound it.
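A small sketch of that bound, with made-up numbers (the function name, the link rate and the 64-byte export size are assumptions for illustration only):

```python
def max_export_bandwidth(t, T, link_rate_pps, export_bytes_per_packet):
    """Upper bound on the average export bandwidth, in bytes/s.

    t, T: sampling interval and spacing, in the same time unit.
    link_rate_pps: maximum packet rate of the observed link (packets/s).
    export_bytes_per_packet: bytes exported per sampled packet.
    """
    if t <= 0 or T < 0:
        raise ValueError("need t > 0 and T >= 0")
    # Fraction t/(t+T) of the time is spent sampling, so at most that
    # fraction of the link's packets can be selected on average.
    return (t * link_rate_pps * export_bytes_per_packet) / (t + T)


# t = 10 usec, T = 100 usec, a link carrying at most 1.1 Mpkt/s,
# and a hypothetical 64 bytes exported per sampled packet
bound = max_export_bandwidth(10, 100, 1_100_000, 64)
```

For these values the bound works out to 6.4 Mbytes/s, i.e. 10/110 of what exporting every packet would need.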
The fact that you sample all the packets during t doesn't mean that you
must export them within t seconds. In general, you'll have t+T seconds
to do so (i.e. you can shape the export traffic). Or, even without
shaping, you can set t and T small enough so that in each interval t at
most 1 packet is sampled.
Was that your concern? The burstiness? In that case, I agree that in the
draft we must describe it, explain how to avoid it, and give an
example, but I don't think that this hinders the usefulness of this
(very simple) type of sampling.
Regards,
Maurizio
If you keep this mechanism (anyway this is a MAY requirement), you
must add a remark about it.
Now, you speak above about "With Systematic Time based you cannot see
any rate variation on the link because you always export a packet each
T sec."
If you want to do that, and I agree it makes sense (actually a lot
more sense than the previous scheme), then you will have a sampling
scheme like this:
Case Systematic Time Based: - Interval length (# packets), Spacing (usec)
Example: if Interval length = 10 packets, Spacing = 100 usec
This means: I select 10 packets, I don't select packets during 90
usec, I select 10 packets, etc...
BTW, the Note2 still applies here.
Regards, Benoit.
(actually, you can only tell whether the rate drops below 1/T).
So, Systematic Time based can be useful for a lot of applications
(e.g. random packet content inspection), but not for understanding
the dynamics on a link.
Maurizio.
(*) and probabilistic sampling as well
--
to unsubscribe send a message to psamp-request@ops.ietf.org with
the word 'unsubscribe' in a single line as the message text body.
archive: <http://ops.ietf.org/lists/psamp/>