comments on draft-ietf-psamp-framework-03.txt
Hi Nick,
The draft looks good.
Here are my comments from the first half of the draft.
I'll try to finish the second half by this evening.
Thanks
Ganesh
2 Elements, Terminology, and Architecture
This section defines the basic elements of the PSAMP framework. At
the highest level, the architecture comprises observation points
(at which packets are observed), measurement processes (which
select packets and construct reports on them) and export processes
(which export reports to collectors). The full definitions of these
terms now follow.
* Observation Point: the observation point is a location in the
network where a packet stream is observed. Examples are, a line
Duffield (Ed.) draft-ietf-psamp-framework-03.txt [Page 3]
Internet-Draft Passive Packet Measurement June 2003
to which a probe is attached, a shared medium, such as an
Ethernet-based LAN, a single port of a router, or set of
interfaces (physical or logical) of a router, an embedded
measurement subsystem within an interface.
* Measurement Process: the combination of a selection process
followed by a reporting process.
* Packet Stream: a sequence of packets, each of which was observed
at the observation point. Note that when packets are sampled
from a stream, the selected packets usually do not have common
properties by which they can be distinguished from packets that
have not been selected. Therefore we define here the term
stream instead of flow, which is defined as set of packets with
common properties [QuZC02].
G: Please mention explicitly that these are packets that pass the selection
process.
* Packet Content: the union of the following: packet header, packet
payload, encapsulation headers, and link layer headers.
G: What is the difference between packet header and encapsulation headers?
A better way to put it is "Packet headers, which include link layer,
network layer and other encapsulation headers, and packet payload".
* Observed Packet Stream: the packet stream comprising all packets
observed at the observation point.
G: This definition should come before "Packet Stream". Can you change this to
"Set of all packets observed at the observation point"?
* Selection Process: a selection process selects a substream of
G: It is not a "substream" but it is a "Packet Stream" as defined by the
terminology above.
packets from the observed packet stream. A selection process
entails the composition of one or more selectors in succession,
acting on each packet in the observed packet stream. When
selectors are composed, the output packet stream issuing from
one selector forms the input packet stream for the succeeding
selector.
* Selector (or selection operation): a configurable packet
selection operation that acts on single packets. It takes as
its input, the content of a single packet from a packet stream,
information derived from the packet's treatment at the
observation point, and selection state that may be maintained
at the observation point. If the packet is selected, this same
information may be considered as the output. Selectors may
change the selection state.
G: Selector definition should come before "Selection Process".
* Composite Selector: an ordered composition of selectors.
* Primitive Selector: a selector that is not a composition of
multiple selectors.
* Selection State: the selection process may maintain state
information for use by the selection process and/or the
reporting process. At a given time, the selection state may
depend on packets observed up that time and/or other variables.
G: Did you mean "packets observed at that time" ?
Examples include sequence numbers of packets at the input of
selectors, timestamps, iterators for pseudorandom number
generators, calculated hash values, and indicators of whether a
packet was selected by a given selector.
* Reporting Process: the creation of a report stream of information
G: "report stream" definition should come first?
on packets selected by a selection process, in preparation
for export. The input to a reporting process comprises that
information available to a selection process, for the selected
packets.
G: Does the input not include the "selection state" also?
The report stream contains two distinguished types of
information: packet reports, and report interpretation.
* Packet Reports: a configurable subset of the per packet input to
the reporting process.
G: Can you add some more verbiage to this definition? For example:
"the packet reports include a packet fragment and some interesting fields
like TCP flags."
* Report Interpretation: subsidiary information relating to one or
more packets, that is used for interpretation of their packet
reports. Examples include configuration parameters of the PSAMP
device, and configuration parameters of the selection and
reporting process.
* Export Process: sends the output of one or more reporting processes
to one or more collectors.
* Collector: a collector receives a report stream exported by one
or more measurement processes. In some cases, the entity that
G: Isn't it export process instead of measurement process?
hosts the measurement and/or export process may also serve as
the collector.
* Measurement packets: one or more packet reports, and perhaps report
interpretation, are bundled by the export process into a
measurement packet for export to a collector.
G: Isn't it the same as a report stream encapsulated with transport and
lower layer headers?
Various possibilities for the high level architecture of these
elements are as follows.
OP = Observation Point, MP = Measurement Process, EP = Export Process
+---------------------+ +------------------+
|Observation Point(s) | | Collector(1) |
|MP(s)--->EP----------+---------------->| |
|MP(s)--->EP----------+-------+-------->| |
+---------------------+ | +------------------+
|
+---------------------+ | +------------------+
|Observation Point(s) | +-------->| Collector(2) |
|MP(s)--->EP----------+---------------->| |
+---------------------+ +------------------+
+---------------------+
|Observation Point(s) |
|MP(s)--->EP---+ |
| | |
|Collector(3)<-+ |
+---------------------+
3 Requirements
3.1 Selection Process Requirements.
* Ubiquity: The selectors must be simple enough to be implemented
ubiquitously at maximal line rate.
* Applicability: the set of selectors must be rich enough to
support a range of existing and emerging measurement based
applications and protocols. This requires a workable trade-off
between the range of traffic engineering applications and
operational tasks it enables, and the complexity of the set of
capabilities.
* Extensibility: to allow for additional packet selectors to
support future applications.
* Flexibility: to support selection of packets using different
network protocols or encapsulation layers (e.g. IPv4, IPv6,
MPLS, etc), and under packet encryption.
* Robust Selection: packet selection MUST be robust w.r.t. attempts
to craft a packet stream from which packets are selected
G: Isn't it "to craft the observed packet streams"
disproportionately (e.g. to evade selection, or overload the
measurement system).
* Parallel Measurements: multiple independent measurement
processes at the same entity.
G: What is meant by entity and why is this a requirement?
* Non-contingency: in order to satisfy the ubiquity requirement,
the selection decision for each packet MUST NOT depend on
future packets. Rather, the selection decision MUST be capable
of being made on the basis of the selection process input up to
and including the packet in question. This excludes selection
functions that require caching of packet for selection
contingent on subsequent packets. See also the timeliness
requirement following.
G: Does this mean that "selection functions that require caching of packet"
need not be ubiquitous? It makes sense though.
Selectors are outlined in Section 4, and described in more detail in
the companion document [ZMRD03].
3.2 Reporting Process Requirements
* Transparency: allow transparent interpretation of measurements as
communicated by PSAMP reporting, without any need to obtain
G: Please change PSAMP reporting to "reporting process" for consistency.
additional information concerning the observed packet stream.
* Robustness to Information Loss: allow robust interpretation of
measurements with respect to reports missing due to data loss,
e.g. in transport, or within the measurement, reporting or
exporting processes. Inclusion in reporting of information
that enables the accuracy of measurements to be determined.
* Faithfulness: all reported quantities that relate to the packet
treatment MUST reflect the router state and configuration
G: I think "MUST" is too strict, especially in cases like exporter
overloading. Can this be changed to "SHOULD"?
encountered by the packet at the time it is received by the
measurement process.
* Privacy: selection of the content of packet reports will be
cognizant of privacy and anonymity issues while being
responsive to the needs of measurement applications, and in
accordance with RFC 2804. Full packet capture of arbitrary
packet streams is explicitly out of scope.
A specific reporting process meeting these requirements, and the
requirement for ubiquity, is described in Section 5.
3.3 Export Process Requirements
* Timeliness: reports on selected packets MUST be made available
to the collector quickly enough to support near real time
applications. Specifically, any report on a packet MUST be
dispatched within 1 second of the time of receipt of the packet
by the measurement process.
G: Does the exporter drop the report stream if it could not dispatch the
packet in 1 sec? If so the time should be larger. Seconds of delay
could be caused if the system is overloaded.
* Congestion Avoidance: export of a report stream across a network
MUST be congestion avoiding in compliance with RFC 2914.
* Secure Export:
- confidentiality: the option to encrypt exported data MUST be
provided.
- integrity: alterations in transit to exported data MUST be
detectable at the collector
- authenticity: authenticity of exported data MUST be
verifiable by the collector in order to detect forged data.
The motivation here is the same as for security in IPFIX
export; see Sections 6.3 and 10 of [QZCZ03].
3.4 Configuration Requirements
* Ease of Configuration: of sampling and export parameters,
e.g. for automated remote reconfiguration in response to
measurements.
* Secure Configuration: the option to configure via protocols that
prevent unauthorized reconfiguration or eavesdropping on
configuration communications MUST be available. Eavesdropping
on configuration might allow an attacker to gain knowledge that
would be helpful in crafting a packet stream to (for example)
evade subversion, or overload the measurement infrastructure.
Configuration is discussed in Section 8. Feasibility and complexity
of PSAMP operations is discussed in Section 9.
Reuse of existing protocols will be encouraged provided the
protocol capabilities are compatible with the requirements laid out
in this document.
4 Packet Selection
4.1 Packet Selection Terminology.
* Filtering: a filter is a selection operation that selects a
packet deterministically based on the packet content, its
treatment, and functions of these occurring in the selection
state. Examples include match/mask filtering, and hash-based
selection.
* Sampling: a selection operation that is not a filter is called a
sampling operation. This reflects the intuitive notion that if
the selection of a packet cannot be exactly predicted from its
G: Please change this to "the selection of a packet cannot be
exactly predicted in all cases from its" . Otherwise the terminology
"Content-dependent Sampling" becomes a contradiction.
content, there must be some type of sampling taking place.
* Content-independent Sampling: a sampling operation that does not
use packet content (or quantities derived from it) as the basis
for selection is called a content-independent sampling
operation. Examples include systematic sampling, and uniform
pseudorandom sampling driven by a pseudorandom number whose
generation is independent of packet content. Note that in
independent sampling it is not necessary to access the packet
content in order to make the selection decision.
* Content-dependent Sampling: a sampling operation where selection
is dependent on packet content is called a content-dependent
sampling operation. Examples include pseudorandom selection
according to a probability that depends on the contents of a
packet field; note that this is not a filter.
* Emulated Sampling: selection operations in any of the above four
categories may be emulated by operations in the same or another
category for the purposes of implementation. For example,
uniform pseudorandom sampling may be emulated by hash-based
selection, using suitable hash function and hash domain.
G: Need to forward reference "hash-based selection"
* Hash-based selection: a filter specified by a hash domain, a hash
function, a hash range, and a hash selection range.
* Hash domain: a subset of the packet content and the packet
treatment, viewed as an N-bit string for some positive integer
N.
* Hash range: a set of M-bit strings for some positive integer M.
* Hash function: a deterministic map from the hash domain into the
hash range.
G: Please move "Hash domain", "Hash range" and "Hash function" above
"Hash-based Selection".
* Selection range: a subset of the hash range. The packet is
selected if the action of the hash function on the hash domain
for the packet yields a result in the hash selection range.
* Pool size: the size of a set of packets in a packet stream.
G: Did you mean the # of packets in a packet stream?
* Sample size: the size of a set of packets selected by a sampling
operation.
G: Did you mean the # of packets in a packet stream which got selected
as a result of the sampling operation?
* Target Sampling Frequency: a configurable sampling frequency in a
sampling operation.
G: Just mention that it could be time or packet based.
* Attained Sampling Frequency: Given a subset of packets in a stream
input to a sampling operation, the attained sampling frequency is
the ratio of the sample size to the pool size.
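Illustrative sketch (not part of the draft): the relation between target and attained sampling frequency for a count-based uniform probabilistic sampler. The function names, the seed, and the use of integers as stand-ins for packets are assumptions made only for this example.

```python
import random

def attained_sampling_frequency(sample_size: int, pool_size: int) -> float:
    """Ratio of sample size to pool size, following the definition above."""
    if pool_size <= 0:
        raise ValueError("pool size must be positive")
    return sample_size / pool_size

def uniform_sample(packets, target_frequency, rng):
    """Select each packet independently with probability target_frequency."""
    return [p for p in packets if rng.random() < target_frequency]

rng = random.Random(1)        # fixed seed so the run is repeatable
pool = list(range(10_000))    # stand-in for the packets input to the sampler
sample = uniform_sample(pool, 0.01, rng)
attained = attained_sampling_frequency(len(sample), len(pool))
print(f"target=0.01 attained={attained:.4f}")
```

The attained value generally differs a little from the target, which is why the two terms are defined separately.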
4.2 Packet Selection Operations for a PSAMP
A spectrum of packet selection operations is described in detail in
[ZMRD03]. Here we only briefly summarize the meanings for
completeness.
A PSAMP selection process MUST support at least one of the
following selectors.
* Systematic Time Based:
packet selection is triggered at periodic instants separated
by a time called the Spacing. All packets that arrive within a
certain time of the trigger (called the Interval Length) are
selected.
* Systematic Count Based:
similar to systematic time based except that selection is
reckoned w.r.t. packet count rather than time. Packet
selection is triggered periodically by packet count, a number
of successive packets being selected subsequent to each trigger.
* Uniform Probabilistic: packets are selected independently with
fixed sampling probability p.
* Non-uniform Probabilistic:
packets are selected independently with probability p that
depends on packet content.
* Probabilistic n-out-of-N:
from each count-based successive block of N packets, n are
selected at random.
* Match/Mask Filtering:
This entails taking the masking portions of the packet
G: "This entails taking the masking portions of the packet,
parameters for selectors, and functions of these occurring
in the selection state"
(i.e. taking the bitwise AND with a binary mask) and selecting
the packet if the result falls in a specified range. This
specification doesn't preclude the future definition of a high
level syntax for defining filtering in a concise way (e.g. TCP
port taking a particular value) providing that syntax can be
compiled into the bitwise expression.
Match/mask operations SHOULD be available for different
protocol portions of the packet:
o the IP header (excluding options in IPv4, stacked headers in
IPv6)
o transport header
o encapsulation headers (including MPLS label stack, ATOM)
When an entity offers Match/Mask filtering in the selection
process and, in its usual capacity other than in performing
PSAMP functions, identifies or processes information from one
or more of the above protocols, then the information SHOULD be
made available for filtering. For example, when an entity
routes based on destination IP address, that field should be
made available. Conversely, an entity that does not route is
not expected to be able to locate an IP address within a
packet, or make it available for filtering, although it MAY do
so.
* Hash-based Selection:
Hash-based selection will employ one or more hash functions to
be standardized. The hash domain is specified by bitmaps on
the IP packet header and the IP payload.
G: Just curious- Can't we hash based on selector parameters and
intermediate results also?
When the hash function is sufficiently good, hash-based
selection can be used to emulate uniform random sampling over
the hash domain. The target sampling frequency is then the
ratio of the size of the selection range to the hash range.
Applications of hash-based selection include:
o Trajectory Sampling: all routers use the same hash selector;
the hash domain includes only portions of the packet that do
not change from hop to hop (e.g. TTL is excluded). Hence
packets are consistently selected in the sense that they
are selected at all routers on their path or none. Reports
also include a second hash (the label hash) that
distinguishes different packets. Reports of a given packet
reaching the collector from different routers can be used to
reconstruct the path taken by the packet. Trajectory
Sampling is proposed in [DuGr01]; further description is
found in [ZMRD03]; some applications are described in
Section 10.
o Consistent Flow Sampling: the hash domain is a flow key. For
a given flow, either all or none of its packets are
sampled. This is accomplished without the need to maintain
flow state.
Some applications need to calculate packet hashes for purposes
other than selection (e.g. the label hash in Trajectory
Sampling). This can be achieved by placing a calculated hash
in the selection state, and setting the selection range to be
the whole of the hash range.
* Router State Filtering:
This class of filters selects a packet based on the following
conditions, combined with the AND, OR or NOT operators:
o Ingress interface at which packet arrives equals a specified
value
o Egress interface to which packet is routed equals a
specified value
o Origin AS equals a specified value or lies within a given
range.
o Destination AS equals a specified value or lies within a given
range
o Packet violated ACL on the router
o Failed RPF
o Failed RSVP
o No route found for the packet
Router architectural considerations may preclude some
information concerning the packet treatment, e.g. routing
state, being available at line rate for selection of
packets. However, if selection not based on routing state has
reduced down from line rate, subselection based on routing
state may be feasible.
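Illustrative sketch (not part of the draft): match/mask filtering and hash-based selection as described in the list above. CRC32 is used purely as a stand-in for a hash function, since the draft leaves the functions to be standardized; the function names, the 16-bit hash range, and the sample bytes are assumptions.

```python
import ipaddress
import zlib

def match_mask(field: int, mask: int, low: int, high: int) -> bool:
    """Select if (field AND mask) falls in the inclusive range [low, high]."""
    return low <= (field & mask) <= high

def hash_select(hash_domain: bytes, selection_size: int, hash_bits: int = 16) -> bool:
    """Select if the hash falls in the selection range [0, selection_size).

    The hash range is the set of hash_bits-bit values. CRC32 is only a
    placeholder, not a function the draft specifies.
    """
    h = zlib.crc32(hash_domain) & ((1 << hash_bits) - 1)
    return h < selection_size

# Match/mask: destination address inside 192.0.2.0/24 (a documentation prefix).
net = int(ipaddress.IPv4Address("192.0.2.0"))
dst = int(ipaddress.IPv4Address("192.0.2.17"))
in_prefix = match_mask(dst, 0xFFFFFF00, net, net)

# Hash-based: the hash domain holds only bytes that do not change hop to hop,
# so every router using the same selector reaches the same decision.
invariant = b"\xc0\x00\x02\x11\xc6\x33\x64\x07" + b"payload"
router_a = hash_select(invariant, selection_size=655)
router_b = hash_select(invariant, selection_size=655)
target_frequency = 655 / (1 << 16)   # selection range size / hash range size
```

Setting selection_size to the full hash range selects every packet, which corresponds to the case where a hash is computed only to be placed in the selection state.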
4.3 Input Sequence Numbers for Primitive Selectors.
Each instance of a primitive selector MUST maintain a count of
packets presented at its input. The counter value is to be included
as a sequence number for selected packets. This enables
applications to determine the attained frequency at which packets
are selected, and hence correctly normalize network usage estimates
regardless of loss of information, whether this occurs because of
discard of packet reports in the measurement or reporting process
(e.g. due to resource contention), or loss of measurement packets
in transmission or collection; see [PPM01]. The sequence numbers
are considered as part of the packet's selection state.
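Illustrative sketch (not part of the draft): how input sequence numbers let a collector normalize a usage estimate even when some packet reports are lost. The class and function names, the fixed packet length, and the simple scaling estimator are assumptions chosen for the example.

```python
from dataclasses import dataclass

@dataclass
class PacketReport:
    seq: int      # input sequence number assigned by the primitive selector
    length: int   # packet length in bytes

class CountingSelector:
    """Systematic count-based selector that numbers every input packet."""
    def __init__(self, period: int):
        self.period = period
        self.count = 0            # packets presented at the input so far

    def offer(self, length: int):
        self.count += 1
        if self.count % self.period == 0:
            return PacketReport(seq=self.count, length=length)
        return None

def estimate_total_bytes(reports):
    """Scale reported bytes by 1 / attained frequency over the spanned input."""
    if len(reports) < 2:
        raise ValueError("need at least two reports to span an interval")
    seqs = [r.seq for r in reports]
    pool = max(seqs) - min(seqs) + 1      # input packets spanned by the reports
    attained = len(reports) / pool
    return sum(r.length for r in reports) / attained

selector = CountingSelector(period=10)
reports = [r for r in (selector.offer(100) for _ in range(1000)) if r]
received = reports[::2]    # pretend half of the reports were lost in transit
estimate = estimate_total_bytes(received)
```

Because the attained frequency is computed from the sequence numbers of the reports that actually arrived, the estimate stays close to the true total without the collector knowing how many reports were lost.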
4.4 Composite Selectors
The ability to compose selectors in a selection process SHOULD be
provided. The following combinations appear to be most useful for
applications:
* filtering followed by sampling
* sampling followed by filtering
Composite selectors are useful for drill down applications. The
first component of a composite selector can be used to reduce the
load on the second component. In this setting, the advantage to be
gained from a given ordering can depend on the composition of the
packet stream.
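Illustrative sketch (not part of the draft): composing a filter and a sampler in both orders and counting how many packets reach each component. The packet representation, the port-53 filter, and all names are assumptions for the example.

```python
import random
from typing import Callable, Dict

Packet = Dict[str, int]
Selector = Callable[[Packet], bool]

class Counted:
    """Wraps a selector and counts how many packets reach its input."""
    def __init__(self, fn: Selector):
        self.fn = fn
        self.calls = 0

    def __call__(self, packet: Packet) -> bool:
        self.calls += 1
        return self.fn(packet)

def compose(*selectors: Selector) -> Selector:
    """Ordered composition: each selector sees only what the previous one chose."""
    def composite(packet: Packet) -> bool:
        # all() stops at the first selector that rejects the packet.
        return all(s(packet) for s in selectors)
    return composite

def is_dns(packet: Packet) -> bool:
    return packet["dst_port"] == 53

def make_sampler(p: float, seed: int) -> Selector:
    rng = random.Random(seed)
    return lambda packet: rng.random() < p

# One packet in ten goes to port 53.
stream = [{"dst_port": 53 if i % 10 == 0 else 80} for i in range(1000)]

flt, smp = Counted(is_dns), Counted(make_sampler(0.1, seed=1))
filter_then_sample = compose(flt, smp)
chosen_a = [p for p in stream if filter_then_sample(p)]

smp2, flt2 = Counted(make_sampler(0.1, seed=1)), Counted(is_dns)
sample_then_filter = compose(smp2, flt2)
chosen_b = [p for p in stream if sample_then_filter(p)]

print("filter first: sampler saw", smp.calls, "packets")
print("sample first: filter saw", flt2.calls, "packets")
```

In both orderings the second component sees only a fraction of the stream; which ordering is cheaper depends on the relative cost of the two components and on how much of the traffic the filter matches.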
4.5 Constraints on the Sampling Frequency
Sampling at full line rate, i.e. with probability 1, is not
excluded in principle, although resource constraints may not
support it in practice.
4.6 Criteria for Choice of Selection Operations
In current practice, sampling has been performed using particular
algorithms, including:
- pseudorandom independent sampling with probability 1/N;
- systematic sampling of every Nth packet.
The question arises as to whether both of these should be
standardized as distinct selection operations, or whether they can
be regarded as different implementations of a single selection
operation.
To determine the answer to this question, we need to consider
(a) measured or assumed statistical properties of the packet
stream, e.g., one or more of the following:
- contents of different packets are statistically independent
- correlations between contents of different packets decay
at a specified rate
- contents of certain fields within the same packet are
significantly variable and exhibit small cross correlation
(b) the desired reference sampling model, e.g., one of:
- sample packets with long term probability 1/N
- sample packets independently with probability 1/N
(c) the set of possible alternatives and implementations, e.g.,
one of:
- pseudorandom independent sampling with probability 1/N
- systematic sampling with period N
- hash-based sampling with target probability 1/N
(d) the tolerance for error in the applications that use the
measurements.
We can say that a given alternative from (c) reproduces a reference
model (b) for the applications if the results obtained using them
are sufficiently accurate in (d) for traffic satisfying the assumed
statistical properties in (a). Clearly, application to evaluate
methods in (c) requires developing agreement on the relevant
properties in (a), (b) and (d).
Example: systematic sampling with period N will not count the
occurrence of closely spaced packets (less than N counts apart) from
the same flow. Thus for applications that are concerned with the
joint statistics of multiple packets within flows, systematic
sampling may not reproduce the results obtained with random
sampling sufficiently accurately.
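Illustrative sketch (not part of the draft): the example above in code. Systematic sampling with period N can never select two adjacent packets, while independent sampling with probability 1/N selects adjacent pairs at a rate of roughly 1/N^2. The function names, N = 4, the stream length, and the seed are assumptions.

```python
import random

def systematic(n_packets: int, period: int) -> set:
    """Indices selected when every period-th packet is taken."""
    return {i for i in range(n_packets) if (i + 1) % period == 0}

def independent(n_packets: int, period: int, rng: random.Random) -> set:
    """Indices selected independently with probability 1/period."""
    return {i for i in range(n_packets) if rng.random() < 1.0 / period}

def adjacent_pairs_selected(selected: set) -> int:
    """Number of positions i where packets i and i+1 were both selected."""
    return sum(1 for i in selected if i + 1 in selected)

N = 4
sys_sel = systematic(100_000, N)
rnd_sel = independent(100_000, N, random.Random(7))

print("systematic adjacent pairs:", adjacent_pairs_selected(sys_sel))
print("independent adjacent pairs:", adjacent_pairs_selected(rnd_sel))
```

An application that estimates joint statistics of consecutive packets in a flow would therefore see no adjacent pairs at all under systematic sampling, regardless of how long it measures.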
5 Reporting Process
5.1 Mandatory Contents of Packet Reports (MUST)
The reporting process MUST include the following in each packet report:
(i) the input sequence number(s) of any sampling operation
that acted on the packet in the instance of a measurement
process of which the reporting process is a component.
The reporting process MUST be able to include the following
in each packet report, as a configurable option:
(ii) some number of contiguous bytes from the start of the
packet.
Some devices may not have the resource capacity or functionality to
provide more detailed reports than those in (i) and (ii)
above. Using this minimum required reporting functionality, the
reporting process places the burden of interpretation on the collector,
or on applications that it supplies.
5.2 Recommended Contents for Packet Reports (SHOULD)
G: Isn't SHOULD a hard requirement? I would think the only hard
requirements are (i) and (ii). The rest is up to the capacity of the
reporting process.
The reporting process SHOULD provide for the inclusion in packet
reports of the following information, inclusion of any or all being
configurable as an option.
(iii) fields relating to the following protocols
used in the packet, specifically: IPv4, IPv6, transport
protocols, MPLS, ATOM.
G: This is valid only if it is not already covered by ii)
(iv) packet treatment, including:
- identifiers for any input and output interfaces of the
observation point that were traversed by the packet
- source and destination AS
(v) selection state associated with the packet, including:
- timestamps
- hashes, where calculated.
The specific fields will include those set out as requirements for
IPFIX [QZCZ03], with modifications appropriate to reporting on
single packets rather than flows.
When an entity that hosts a reporting process, in its usual
capacity other than performing PSAMP functions, identifies or
processes one or more of the above fields, then the contents of each
such field(s) SHOULD be made available for optional reporting. For
example, when a device routes based on destination IP address, that
field should be made available. Conversely, an entity that does
not route is not expected to be able to locate an IP address within
a packet, or make it available for reporting, although it MAY do
so.
5.3 Report Interpretation
Information for use in report interpretation MUST include (i)
configuration parameters of the selectors of the packets reported
on; (ii) format of the packet reports (iii) configuration
parameters and state information of the network element; (iv)
G: What is "network element" ? Isn't this the observation point(s) or
PSAMP device?
indication of the inherent accuracy of the reported quantities,
e.g., of timestamps; (v) identifiers for observation point,
measurement process, and export process.
The requirements for robustness and transparency are motivations
for including report interpretation in the report stream. Inclusion
makes the report stream self-defining. The PSAMP framework
excludes reliance on an alternative model in which interpretation
is recovered out of band. This latter approach is not robust with
respect to undocumented changes in selector configuration, and may
give rise to future architectural problems for network management
systems to coherently manage both configuration and data collection.
It is not envisaged that all report interpretation be included
in every packet report. Many of the quantities listed above are
expected to be relatively static; they could be communicated
periodically, and upon change.
To conserve network bandwidth and resources at the collector, the
measurement packets may be compressed before export. Compression
is expected to be quite effective since the sampled packets may
share many fields in common, e.g. if a filter focuses on packets
with certain values in particular header fields. Using compression,
however, could impact the timeliness of reports. Any consequent
delay MUST NOT violate the timeliness requirement for availability
of packet reports at the collector.
6 Parallel Measurement Processes
Because of the increasing number of distinct measurement
applications, with varying requirements, it is desirable to set up
parallel measurement processes on a stream of packets. A PSAMP
device SHOULD support more than one independently configurable
G: Definition of PSAMP device is missing from terminology.
measurement process. The measurement process may have an exclusive
export process, or may share it with other measurement processes.
G: exclusive reporting process and/or export process.
Each of the parallel measurement processes SHOULD be
independent. However, resource constraints may prevent complete
reporting on a packet selected by multiple selection processes. In
this case, reporting for the packet MUST be complete for at least
one measurement process; other measurement processes need only
report that they selected the packet. The priority amongst
measurement processes to report packets MUST be configurable.
It is not proposed to standardize the number of parallel
measurement processes.
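Illustrative sketch (not part of the draft): one way a device might apply a configurable priority when several measurement processes select the same packet but resources allow only a limited number of complete reports. The function name, the process names, the budget parameter, and the report labels are all hypothetical.

```python
def plan_reports(selecting_processes, priority, full_report_budget=1):
    """Decide which measurement processes get a complete report.

    selecting_processes: names of the processes that selected the packet.
    priority: maps process name to rank; a lower rank means higher priority.
    full_report_budget: how many complete reports resources allow (at least 1).
    """
    if full_report_budget < 1:
        raise ValueError("at least one complete report is required")
    ordered = sorted(selecting_processes, key=lambda name: priority[name])
    plan = {}
    for rank, name in enumerate(ordered):
        # Lower-priority processes only record that they selected the packet.
        plan[name] = "full" if rank < full_report_budget else "selected-only"
    return plan

priority = {"security": 0, "billing": 1, "research": 2}
plan = plan_reports(["billing", "research", "security"], priority)
```

With a budget of one, only the highest-priority process produces a complete report, and the others report selection only, which matches the minimum behavior the section requires.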