
Re: comments on draft-ietf-psamp-sample-tech-04.txt

To: Maurizio Molina <molina@ccrle.nec.de>
Subject: Re: comments on draft-ietf-psamp-sample-tech-04.txt
From: Benoit Claise <bclaise@cisco.com>
Date: Fri, 05 Mar 2004 14:52:56 +0100
Cc: psamp <psamp@ops.ietf.org>
In-reply-to: <4044CA0E.7090703@ccrle.nec.de>
References: <4043516A.3030907@cisco.com> <4043885E.9080808@ccrle.nec.de> <40449611.9080101@cisco.com> <4044CA0E.7090703@ccrle.nec.de>
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.6) Gecko/20040113

Maurizio Molina wrote:

Benoit Claise wrote:

Hi Maurizio,

My point is: if systematic time based sampling is implemented, would you do it as in 1 or as in 2?
1. Case Systematic Time Based: - Interval length (in usec), Spacing (in usec)
2. Case Systematic Time Based: - Interval length (in usec), Spacing (# packets)

Option 1 has the big drawback that we have no idea how many packets will be inspected, and as a consequence we don't know the bandwidth requirement for the export link(s). And if we do sampling, it's typically because we have a bottleneck on the export link(s) bandwidth or on the collector side...

Benoit,
I'm not 100% sure I understand what type of sampling you're speaking for or against.
However, in your sentence above you say that with Systematic Count Based sampling (which includes 1 out of N), you don't have firm limits on the exported bandwidth.
That's true, but this type of sampling (*) allows you to estimate the rate of the link. With Systematic Time Based you cannot see any rate variation on the link because you always export one packet every T sec.

I don't think this is correct, unless I completely misunderstood everything.
Let me reexplain what the issue is, maybe I took some shortcut.
The draft says:

SELECTOR_PARAMETERS
For sampling processes the SELECTOR PARAMETERS define the input
parameters for the process. Interval length in systematic
sampling means, that all packets that arrive in this interval
are selected. The spacing parameter defines the spacing in time
or number of packets between the end of one sampling interval
and the start of the next succeeding interval.

Case n out of N:
- Population size N, Sample size n

Example: we select randomly n packets out of N.
No problem on this one

Case Systematic Count Based:
- Interval length(in packets), Spacing (in packets)

Note: I start with "Case Systematic Count Based" to illustrate my point.
Example: if Interval length = 10 packets, Spacing = 100 packets
This means: I select 10 packets, I don't select the next 90 packets, I select 10 packets, etc...
Note2: it is not clear from the draft whether this means the example on the previous line or...
I select 10 packets, I don't select the next 100 packets, I select 10 packets, etc...
This must be clarified with an example.
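
To make the two readings concrete, here is a minimal Python sketch. It is not from the draft: the function name count_based_selected and the spacing_is_period flag are made up for illustration, and each reading is only my interpretation of the parameter text.

```python
def count_based_selected(num_packets, interval, spacing, spacing_is_period):
    """Return 0-based indices of the packets picked by systematic count-based sampling.

    spacing_is_period=True  : spacing is measured start-to-start
                              (select 10, skip 90, select 10, ...).
    spacing_is_period=False : spacing is measured from the end of one interval
                              to the start of the next (select 10, skip 100, ...).
    Both readings are assumptions; the draft text does not settle it.
    """
    period = spacing if spacing_is_period else interval + spacing
    return [i for i in range(num_packets) if i % period < interval]


# Interval length = 10 packets, Spacing = 100 packets, 220 packets observed.
reading_a = count_based_selected(220, 10, 100, spacing_is_period=True)
reading_b = count_based_selected(220, 10, 100, spacing_is_period=False)
print(len(reading_a), len(reading_b))
```

The sampling fraction is 10/100 under the first reading and 10/110 under the second, so an exporter and a collector that disagree on the reading would scale their estimates differently.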

Case Systematic Time Based:
- Interval length (in usec), Spacing (in usec)

Example: if Interval length = 10 usec, Spacing = 100 usec
This means: I select X packets during 10 usec, I don't select packets during the next 90 usec, etc...
BTW, my Note2 above applies here as well: is it 10, 90, 10, 90, ... or 10, 100, 10, 100, ...
And this is my entire point: you select X packets during an interval, and you don't know how many.
You might estimate it as the ratio 10/100 * bandwidth. BUT you have no clue about the number of flow records, and as a consequence we don't know the bandwidth requirement for the export link(s). And if we do sampling, it's typically because we have a bottleneck on the export link(s) bandwidth or on the collector side... So this way of sampling is dangerous.
The only application I see for such a sampling scheme is when the bottleneck is the interface or the line card resources, typically the memory.
If you keep this mechanism (anyway this is a MAY requirement), you must add a remark about it.
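
A small Python sketch of why X is unknown in advance. The helper names (constant_rate_arrivals, time_based_selected) and the evenly spaced arrivals are assumptions made for illustration, and I assume the 10, 90, 10, 90, ... reading, i.e. a 100 usec period.

```python
def constant_rate_arrivals(gap_usec, duration_usec):
    """Hypothetical traffic: one packet every gap_usec, timestamps in usec."""
    return list(range(0, duration_usec, gap_usec))


def time_based_selected(timestamps_usec, interval_usec, period_usec):
    """Keep packets arriving in the first interval_usec of every period_usec."""
    return [t for t in timestamps_usec if t % period_usec < interval_usec]


# Same selector (10 usec on, 90 usec off), two different link loads over 1000 usec.
slow = time_based_selected(constant_rate_arrivals(5, 1000), 10, 100)
fast = time_based_selected(constant_rate_arrivals(1, 1000), 10, 100)
print(len(slow), len(fast))
```

With identical selector parameters, the number of selected packets follows the link load, so the export volume is not bounded by the configuration alone.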

Now, you speak above about "With Systematic Time based you cannot see any rate
variation on the link because you always export a packet each T sec."
If you want to do that, and I agree it makes sense (actually a lot more sense than the previous scheme), then you will have a sampling scheme like this:
Case Systematic Time Based: - Interval length (# packets), Spacing (usec)
Example: if Interval length = 10 packets, Spacing = 100 usec
This means: I select 10 packets, I don't select packets during 90 usec, I select 10 packets, etc...
BTW, the Note2 still applies here.
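
A rough Python sketch of this variant, under assumptions of mine: the function name hybrid_selected is hypothetical, arrivals are evenly spaced, and the gap is taken as 90 usec counted from the last selected packet, which is only one possible reading.

```python
def hybrid_selected(timestamps_usec, interval_packets, gap_usec):
    """Select interval_packets packets, then ignore traffic for gap_usec, repeat."""
    selected = []
    taken = 0        # packets taken in the current burst
    resume_at = 0    # earliest arrival time at which selection may resume
    for t in timestamps_usec:
        if t < resume_at:
            continue                  # still inside the gap
        selected.append(t)
        taken += 1
        if taken == interval_packets:
            taken = 0
            resume_at = t + gap_usec  # gap starts at the last selected packet
    return selected


# Interval length = 10 packets, gap = 90 usec, two link loads over 1000 usec.
slow = hybrid_selected(list(range(0, 1000, 5)), 10, 90)  # one packet per 5 usec
fast = hybrid_selected(list(range(0, 1000, 1)), 10, 90)  # one packet per 1 usec
print(len(slow), len(fast))
```

In this sketch each burst is followed by at least gap_usec of silence, so no more than interval_packets packets are selected per gap_usec, whatever the link load. That cap is what allows the export link to be dimensioned.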

Regards, Benoit.

(actually, you can only tell whether the rate drops below 1/T).
So, Systematic Time based can be useful for a lot of applications (e.g. random packet content inspection), but not for understanding the dynamics on a link.
Maurizio.

(*) and probabilistic sampling as well
