comments on draft-ietf-psamp-framework-03.txt



Hi Nick,
  The draft looks good.
  Here are my comments on the first half of the draft.
  I'll try to finish the second half by this evening.
Thanks
Ganesh

2 Elements, Terminology, and Architecture

  This section defines the basic elements of the PSAMP framework. At
  the highest level, the architecture comprises observation points
  (at which packets are observed), measurement processes (which
  select packets and construct reports on them) and export processes
  (which export reports to collectors). The full definitions of these
  terms now follow.

  * Observation Point: the observation point is a location in the
      network where a packet stream is observed. Examples are a line

Duffield (Ed.)       draft-ietf-psamp-framework-03.txt          [Page 3]

Internet-Draft          Passive Packet Measurement             June 2003

      to which a probe is attached; a shared medium, such as an
      Ethernet-based LAN; a single port of a router, or a set of
      interfaces (physical or logical) of a router; an embedded
      measurement subsystem within an interface.

  * Measurement Process: the combination of a selection process
      followed by a reporting process.

  * Packet Stream: a sequence of packets, each of which was observed
      at the observation point. Note that when packets are sampled
      from a stream, the selected packets usually do not have common
      properties by which they can be distinguished from packets that
      have not been selected. Therefore we define here the term
      stream instead of flow, which is defined as a set of packets with
      common properties [QuZC02].


G: Please mention explicitly that these are packets that pass the selection
  process.

  * Packet Content: the union of the following: packet header, packet
      payload, encapsulation headers, and link layer headers.

G: What is the difference between packet header and encapsulation headers?
  A better way to put it is "Packet headers, which include link layer,
  network layer and other encapsulation headers, and packet payload".

  * Observed Packet Stream: the packet stream comprising all packets
      observed at the observation point.
G: This definition should come before "Packet Stream". Can you change this to
"Set of all packets observed at the observation point"?

  * Selection Process: a selection process selects a substream of
G: It is not a "substream" but it is a "Packet Stream" as defined by the
  terminology above.

      packets from the observed packet stream. A selection process
      entails the composition of one or more selectors in succession,
      acting on each packet in the observed packet stream. When
      selectors are composed, the output packet stream issuing from
      one selector forms the input packet stream for the succeeding
      selector.


  * Selector (or selection operation): a configurable packet selection
      operation that acts on single packets. It takes as its input,
      the content of a single packet from a packet stream, information
      derived from the packet's treatment at the observation point,
      and selection state that may be maintained at the observation
      point. If the packet is selected, this same information may be
      considered as the output. Selectors may change the selection
      state.

G: Selector definition should come before "Selection Process".

  * Composite Selector: an ordered composition of selectors.

  * Primitive Selector: a selector that is not a composition of
      multiple selectors.

  * Selection State: the selection process may maintain state
      information for use by the selection process and/or the
      reporting process. At a given time, the selection state may
      depend on packets observed up that time and/or other variables.


G: Did you mean "packets observed at that time" ?

      Examples include sequence numbers of packets at the input of
      selectors, timestamps, iterators for pseudorandom number
      generators, calculated hash values, and indicators of whether a
      packet was selected by a given selector.

  * Reporting Process: the creation of a report stream of information

G: "report stream" definition should come first?

      on packets selected by a selection process, in preparation
      for export. The input to a reporting process comprises that
      information available to a selection process, for the selected
      packets.

G: Does the input not include the "selection state" also?

      The report stream contains two distinguished types of
      information: packet reports, and report interpretation.

  * Packet Reports: a configurable subset of the per packet input to
      the reporting process.
G: Can you add some more verbiage to this definition? For example:
"the packet reports include packet fragments and some interesting fields
like TCP flags."

  * Report Interpretation: subsidiary information relating to one or
      more packets, that is used for interpretation of their packet
      reports. Examples include configuration parameters of the PSAMP
      device, and configuration parameters of the selection and
      reporting process.

  * Export Process: sends the output of one or more reporting processes
      to one or more collectors.

  * Collector: a collector receives a report stream exported by one
      or more measurement processes. In some cases, the entity that

G: Isn't it export process instead of measurement process?

      hosts the measurement and/or export process may also serve as
      the collector.

  * Measurement packets: one or more packet reports, and perhaps report
      interpretation, are bundled by the export process into a
      measurement packet for export to a collector.

G: Isn't this the same as a report stream encapsulated with transport and
lower layer headers?

  Various possibilities for the high level architecture of these
  elements are as follows.

= Observation Point, MP = Measurement Process, EP = Export Process

  +---------------------+                 +------------------+
  |Observation Point(s) |                 | Collector(1)     |
  |MP(s)--->EP----------+---------------->|                  |
  |MP(s)--->EP----------+-------+-------->|                  |
  +---------------------+       |         +------------------+
                                |
  +---------------------+       |         +------------------+
  |Observation Point(s) |       +-------->| Collector(2)     |
  |MP(s)--->EP----------+---------------->|                  |
  +---------------------+                 +------------------+

  +---------------------+
  |Observation Point(s) |
  |MP(s)--->EP---+      |
  |              |      |
  |Collector(3)<-+      |
  +---------------------+


3 Requirements


3.1 Selection Process Requirements.

  * Ubiquity: The selectors must be simple enough to be implemented
      ubiquitously at maximal line rate.

  * Applicability: the set of selectors must be rich enough to
      support a range of existing and emerging measurement based
      applications and protocols. This requires a workable trade-off
      between the range of traffic engineering applications and
      operational tasks it enables, and the complexity of the set of
      capabilities.

  * Extensibility: to allow for additional packet selectors to
    support future applications.

  * Flexibility: to support selection of packets using different
      network protocols or encapsulation layers (e.g. IPv4, IPv6,
      MPLS, etc), and under packet encryption.

  * Robust Selection: packet selection MUST be robust w.r.t. attempts
      to craft a packet stream from which packets are selected
G: Isn't it "to craft the observed packet streams"?
      disproportionately (e.g. to evade selection, or overload the
      measurement system).

  * Parallel Measurements: multiple independent measurement
      processes at the same entity.
G: What is meant by entity and why is this a requirement?

  * Non-contingency: in order to satisfy the ubiquity requirement,
      the selection decision for each packet MUST NOT depend on
      future packets.  Rather, the selection decision MUST be capable
      of being made on the basis of the selection process input up to
      and including the packet in question. This excludes selection
      functions that require caching of packets for selection
      contingent on subsequent packets. See also the timeliness
      requirement following.
G: Does this mean that "selection functions that require caching of packets"
need not be ubiquitous? It makes sense though.

  Selectors are outlined in Section 4, and described in more detail in
  the companion document [ZMRD03].

3.2 Reporting Process Requirements

  * Transparency: allow transparent interpretation of measurements as
      communicated by PSAMP reporting, without any need to obtain

G: Please change PSAMP reporting to "reporting process" for consistency.

      additional information concerning the observed packet stream.

  * Robustness to Information Loss: allow robust interpretation of
      measurements with respect to reports missing due to data loss,
      e.g. in transport, or within the measurement, reporting or
      exporting processes.  Inclusion in reporting of information
      that enables the accuracy of measurements to be determined.

  * Faithfulness: all reported quantities that relate to the packet
      treatment MUST reflect the router state and configuration

G: "MUST" is too strict, I think, especially in cases such as exporter
  overload. Can this be changed to "SHOULD"?

      encountered by the packet at the time it is received by the
      measurement process.

  * Privacy: selection of the content of packet reports will be
      cognizant of privacy and anonymity issues while being
      responsive to the needs of measurement applications, and in
      accordance with RFC 2804.  Full packet capture of arbitrary
      packet streams is explicitly out of scope.

  A specific reporting process meeting these requirements, and the
  requirement for ubiquity, is described in Section 5.

3.3 Export Process Requirements

  * Timeliness: reports on selected packets MUST be made available
      to the collector quickly enough to support near real time
      applications. Specifically, any report on a packet MUST be
      dispatched within 1 second of the time of receipt of the packet
      by the measurement process.
G: Does the exporter drop the report stream if it cannot dispatch the
packet within 1 second? If so, the limit should be larger. Delays of
seconds could occur if the system is overloaded.

  * Congestion Avoidance: export of a report stream across a network
      MUST be congestion avoiding in compliance with RFC 2914.

  * Secure Export:
       - confidentiality: the option to encrypt exported data MUST be
         provided.
       - integrity: alterations in transit to exported data MUST be
         detectable at the collector
       - authenticity: authenticity of exported data MUST be
         verifiable by the collector in order to detect forged data.

       The motivation here is the same as for security in IPFIX
       export; see Sections 6.3 and 10 of [QZCZ03].


3.4 Configuration Requirements

  * Ease of Configuration: of sampling and export parameters,
      e.g. for automated remote reconfiguration in response to
      measurements.

  * Secure Configuration: the option to configure via protocols that
      prevent unauthorized reconfiguration or eavesdropping on
      configuration communications MUST be available.  Eavesdropping
      on configuration might allow an attacker to gain knowledge that
      would be helpful in crafting a packet stream to (for example)
      evade subversion, or overload the measurement infrastructure.


  Configuration is discussed in Section 8. Feasibility and complexity
  of PSAMP operations are discussed in Section 9.

  Reuse of existing protocols will be encouraged provided the
  protocol capabilities are compatible with the requirements laid out
  in this document.

4 Packet Selection

4.1 Packet Selection Terminology.

  * Filtering: a filter is a selection operation that selects a
      packet deterministically based on the packet content, its
      treatment, and functions of these occurring in the selection
      state. Examples include match/mask filtering, and hash-based
      selection.

  * Sampling: a selection operation that is not a filter is called a
      sampling operation. This reflects the intuitive notion that if
      the selection of a packet cannot be exactly predicted from its

G: Please change this to "the selection of a packet cannot be
exactly predicted in all cases from its". Otherwise the term
"Content-dependent Sampling" becomes a contradiction.

      content, there must be some type of sampling taking place.

  * Content-independent Sampling: a sampling operation that does not
      use packet content (or quantities derived from it) as the basis
      for selection is called a content-independent sampling
      operation. Examples include systematic sampling, and uniform
      pseudorandom sampling driven by a pseudorandom number whose
      generation is independent of packet content. Note that in
      independent sampling it is not necessary to access the packet
      content in order to make the selection decision.

  * Content-dependent Sampling: a sampling operation where selection
      is dependent on packet content is called a content-dependent
      sampling operation. Examples include pseudorandom selection
      according to a probability that depends on the contents of a
      packet field; note that this is not a filter.

  * Emulated Sampling: selection operations in any of the above four
      categories may be emulated by operations in the same or another
      category for the purposes of implementation. For example,
      uniform pseudorandom sampling may be emulated by hash-based
      selection, using a suitable hash function and hash domain.
G: Need to forward reference "hash-based selection"

  * Hash-based selection: a filter specified by a hash domain, a hash
      function, a hash range and a hash selection range.

  * Hash domain: a subset of the packet content and the packet
      treatment, viewed as an N-bit string for some positive integer
      N.

  * Hash range: a set of M-bit strings for some positive integer M.

  * Hash function: a deterministic map from the hash domain into the
      hash range.
G: Please move "Hash domain", "Hash range" and "Hash function" above "Hash-based
Selection"


  * Selection range: a subset of the hash range. The packet is
      selected if the action of the hash function on the hash domain
      for the packet yields a result in the hash selection range.

  * Pool size: the size of a set of packets in a packet stream.
G: Did you mean the # of packets in a packet stream?

  * Sample size: the size of a set of packets selected by a sampling
      operation.
G: Did you mean the # of packets in a packet stream which got selected
as a result of the sampling operation?


  * Target Sampling Frequency: a configurable sampling frequency in a
      sampling operation.
G: Just mention that it could be time or packet based.

  * Attained Sampling Frequency: given a subset of packets in a stream
      input to a sampling operation, the attained sampling frequency is
      the ratio of the sample size to the pool size.
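
To make the hash-based selection terms above concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the draft does not name a hash function, so CRC-32 is used here purely as a stand-in, and the names `hash_domain`, `hash_select` and `target_sampling_frequency`, as well as the byte offsets in the usage note, are hypothetical.

```python
import zlib

# Assumption: CRC-32 stands in for the (unspecified) hash function,
# so the hash range is the set of 32-bit strings (M = 32).
HASH_RANGE_SIZE = 2 ** 32


def hash_domain(packet: bytes, offsets: range) -> bytes:
    """Extract the hash domain: the chosen bytes of the packet content."""
    return bytes(packet[i] for i in offsets if i < len(packet))


def hash_select(packet: bytes, offsets: range, selection_range_size: int) -> bool:
    """Select the packet if its hash lies in the selection range.

    The selection range is taken to be [0, selection_range_size),
    a subset of the hash range.
    """
    h = zlib.crc32(hash_domain(packet, offsets))  # deterministic map into the hash range
    return h < selection_range_size


def target_sampling_frequency(selection_range_size: int) -> float:
    """Ratio of the size of the selection range to the size of the hash range."""
    return selection_range_size / HASH_RANGE_SIZE
```

For example, with `offsets = range(12, 20)` (the source and destination address bytes of an IPv4 header without options) the decision does not depend on fields outside that span, such as the TTL byte, which is the property that consistent selection across routers relies on.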



4.2 Packet Selection Operations for a PSAMP


  A spectrum of packet selection operations is described in detail in
  [ZMRD03]. Here we only briefly summarize the meanings for
  completeness.

  A PSAMP selection process MUST support at least one of the
  following selectors.

  * Systematic Time Based:
       packet selection is triggered at periodic instants separated
       by a time called the Spacing. All packets that arrive within a
       certain time of the trigger (called the Interval Length) are
       selected.


  * Systematic Count Based:
       similar to systematic time based except that selection is
       reckoned w.r.t. packet count rather than time. Packet
       selection is triggered periodically by packet count, a number
       of successive packets being selected subsequent to each trigger.

  * Uniform Probabilistic: packets are selected independently with
       fixed sampling probability p.

  * Non-uniform Probabilistic:
       packets are selected independently with probability p that
       depends on packet content.

  * Probabilistic n-out-of-N:
       from each count-based successive block of N packets, n are
       selected at random.

  * Match/Mask Filtering:
       This entails taking the masking portions of the packet

G: "This entails taking the masking portions of the packet,
parameters for selectors, and functions of these occurring
in the selection state"

       (i.e. taking the bitwise AND with a binary mask) and selecting
       the packet if the result falls in a specified range.  This
       specification doesn't preclude the future definition of a high
       level syntax for defining filtering in a concise way (e.g. TCP
       port taking a particular value) providing that syntax can be
       compiled into the bitwise expression.

       Match/mask operations SHOULD be available for different
       protocol portions of the packet:

       o the IP header (excluding options in IPv4, stacked headers in
         IPv6)
       o transport header
       o encapsulation headers (including MPLS label stack, ATOM)


       When an entity offers Match/Mask filtering in the selection
       process and, in its usual capacity other than in performing
       PSAMP functions, identifies or processes information from one
       or more of the above protocols, then the information SHOULD be
       made available for filtering. For example, when an entity
       routes based on destination IP address, that field should be
       made available.  Conversely, an entity that does not route is
       not expected to be able to locate an IP address within a
       packet, or make it available for filtering, although it MAY do
       so.

  * Hash-based Selection:
       Hash-based selection will employ one or more hash functions to
       be standardized.  The hash domain is specified by bitmaps on
       the IP packet header and the IP payload.

G: Just curious- Can't we hash based on selector parameters and
intermediate results also?

       When the hash function is sufficiently good, hash-based
       selection can be used to emulate uniform random sampling over
       the hash domain. The target sampling frequency is then the
       ratio of the size of the selection range to the hash range.

       Applications of hash-based selection include:

       o Trajectory Sampling: all routers use the same hash selector;
         the hash domain includes only portions of the packet that do
         not change from hop to hop (e.g. TTL is excluded). Hence
         packets are consistently selected in the sense that they
         are selected at all routers on their path or none. Reports
         also include a second hash (the label hash) that
         distinguishes different packets. Reports of a given packet
         reaching the collector from different routers can be used to
         reconstruct the path taken by the packet. Trajectory
         Sampling is proposed in [DuGr01]; further description is
         found in [ZMRD03]; some applications are described in
         Section 10.

       o Consistent Flow Sampling: the hash domain is a flow key. For
         a given flow, either all or none of its packets are
         sampled. This is accomplished without the need to maintain
         flow state.

       Some applications need to calculate packet hashes for purposes
       other than selection (e.g. the label hash in Trajectory
       Sampling). This can be achieved by placing a calculated hash
       in the selection state, and setting the selection range to be
       the whole of the hash range.

  * Router State Filtering:
       This class of filters selects a packet based on the following
       conditions, combined with the AND, OR or NOT operators:

       o Ingress interface at which packet arrives equals a specified
         value
       o Egress interface to which packet is routed equals a
         specified value
       o Origin AS equals a specified value or lies within a given
         range
       o Destination AS equals a specified value or lies within a given
         range
       o Packet violated ACL on the router
       o Failed RPF
       o Failed RSVP
       o No route found for the packet

       Router architectural considerations may preclude some
       information concerning the packet treatment, e.g. routing
       state, being available at line rate for selection of
       packets. However, if selection not based on routing state has
       reduced the packet rate below line rate, subselection based on
       routing state may be feasible.
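
Three of the sampling selectors summarized above (systematic count based, uniform probabilistic, and probabilistic n-out-of-N) can be sketched in a few lines of Python. This is an illustrative sketch, not the draft's specification: the function names and parameters (`period`, `run_length`, `rng`) are assumptions. The n-out-of-N variant draws the selected positions at the start of each block, so the decision for a packet never waits on later packets.

```python
import random
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def systematic_count_based(packets: Iterable[T], period: int,
                           run_length: int = 1) -> Iterator[T]:
    """Trigger every `period` packets; select `run_length` successive packets."""
    for count, pkt in enumerate(packets):
        if count % period < run_length:
            yield pkt


def uniform_probabilistic(packets: Iterable[T], p: float,
                          rng: random.Random) -> Iterator[T]:
    """Select each packet independently with fixed probability p."""
    for pkt in packets:
        if rng.random() < p:
            yield pkt


def n_out_of_N(packets: Iterable[T], n: int, N: int,
               rng: random.Random) -> Iterator[T]:
    """From each successive block of N packets, select n at random."""
    chosen = set()
    for count, pkt in enumerate(packets):
        position = count % N
        if position == 0:
            # Positions are drawn before the block arrives, so no packet
            # has to be cached while waiting for later ones.
            chosen = set(rng.sample(range(N), n))
        if position in chosen:
            yield pkt
```

For instance, `list(systematic_count_based(range(10), 5, 2))` selects packets 0, 1, 5 and 6.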

4.3 Input Sequence Numbers for Primitive Selectors.

  Each instance of a primitive selector MUST maintain a count of
  packets presented at its input. The counter value is to be included
  as a sequence number for selected packets. This enables
  applications to determine the attained frequency at which packets
  are selected, and hence correctly normalize network usage estimates
  regardless of loss of information, whether this occurs because of
  discard of packet reports in the measurement or reporting process
  (e.g. due to resource contention), or loss of measurement packets
  in transmission or collection; see [PPM01].  The sequence numbers
  are considered as part of the packet's selection state.
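
The following Python sketch illustrates how a collector might use input sequence numbers to normalize a usage estimate even when some reports are lost. It is one simple estimator under stated assumptions, not a method given in the draft: the class `CountingSampler`, the report layout `(sequence number, packet length)` and both helper functions are hypothetical, and the estimate covers only the span of sequence numbers seen in the received reports.

```python
from typing import List, Optional, Tuple

Report = Tuple[int, int]  # (input sequence number, packet length in bytes)


class CountingSampler:
    """Systematic 1-in-`period` sampler that counts packets at its input."""

    def __init__(self, period: int) -> None:
        self.period = period
        self.input_count = 0  # kept as part of the selection state

    def offer(self, packet_length: int) -> Optional[Report]:
        """Return a report if the packet is selected, otherwise None."""
        seq = self.input_count
        self.input_count += 1
        if seq % self.period == 0:
            return (seq, packet_length)
        return None


def attained_frequency(reports: List[Report]) -> float:
    """Received reports divided by the pool size implied by sequence numbers."""
    seqs = [seq for seq, _ in reports]
    pool_size = max(seqs) - min(seqs) + 1
    return len(reports) / pool_size


def estimate_bytes(reports: List[Report]) -> float:
    """Scale the reported bytes by the attained frequency."""
    return sum(length for _, length in reports) / attained_frequency(reports)
```

If 1000 packets of 100 bytes are offered to `CountingSampler(10)` and every second report is then lost, the surviving reports span sequence numbers 0 to 980, and `estimate_bytes` recovers the byte count for those 981 packets without knowing how many reports were dropped.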

4.4 Composite Selectors

  The ability to compose selectors in a selection process SHOULD be
  provided. The following combinations appear to be most useful for
  applications:

  * filtering followed by sampling
  * sampling followed by filtering

  Composite selectors are useful for drill down applications. The
  first component of a composite selector can be used to reduce the
  load on the second component. In this setting, the advantage to be
  gained from a given ordering can depend on the composition of the
  packet stream.
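
As a rough illustration of composition, the Python sketch below chains a match/mask filter with a count-based sampler so that the second selector only sees packets selected by the first. The names `match_mask`, `every_nth` and `compose` are hypothetical, and the single-byte match is a simplification chosen for brevity.

```python
from typing import Callable

Selector = Callable[[bytes], bool]


def match_mask(offset: int, mask: int, value: int) -> Selector:
    """Filter: select if (byte at `offset` AND `mask`) equals `value`."""
    def select(packet: bytes) -> bool:
        return len(packet) > offset and (packet[offset] & mask) == value
    return select


def every_nth(n: int) -> Selector:
    """Sampler: select every n-th packet presented at its input."""
    count = 0  # input count, kept as selection state

    def select(packet: bytes) -> bool:
        nonlocal count
        selected = count % n == 0
        count += 1
        return selected
    return select


def compose(*selectors: Selector) -> Selector:
    """Composite selector: the output of one selector feeds the next."""
    def select(packet: bytes) -> bool:
        # all() stops at the first selector that rejects the packet, so
        # later selectors never see (or count) packets dropped earlier.
        return all(s(packet) for s in selectors)
    return select
```

For example, `compose(match_mask(0, 0xF0, 0x40), every_nth(3))` keeps every third packet whose first nibble is 4 (an IPv4 version field, assuming the packet starts at the IP header). Swapping the arguments gives sampling followed by filtering.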

4.5 Constraints on the Sampling Frequency

  Sampling at full line rate, i.e. with probability 1, is not
  excluded in principle, although resource constraints may not
  support it in practice.

4.6 Criteria for Choice of Selection Operations

  In current practice, sampling has been performed using particular
  algorithms, including:

  - pseudorandom independent sampling with probability 1/N;
  - systematic sampling of every Nth packet.


  The question arises as to whether both of these should be
  standardized as distinct selection operations, or whether they can
  be regarded as different implementations of a single selection
  operation.

  To determine the answer to this question, we need to consider

     (a) measured or assumed statistical properties of the packet
     stream, e.g., one or more of the following:
         - contents of different packets are statistically independent
         - correlations between contents of different packets decay
           at a specified rate
         - contents of certain fields within the same packet are

Duffield (Ed.)       draft-ietf-psamp-framework-03.txt         [Page 12]

Internet-Draft          Passive Packet Measurement             June 2003

           significantly variable and exhibit small cross correlation
     (b) the desired reference sampling model, e.g., one of:
         - sample packets with long term probability 1/N
         - sample packets independently with probability 1/N
     (c) the set of possible alternatives and implementations, e.g.,
     one of:
         - pseudorandom independent sampling with probability 1/N
         - systematic sampling with period N
         - hash-based sampling with target probability 1/N
     (d) the tolerance for error in the applications that use the
         measurements.

  We can say that a given alternative from (c) reproduces a reference
  model (b) for the applications if the results obtained using it
  are sufficiently accurate in the sense of (d) for traffic satisfying
  the assumed statistical properties in (a). Clearly, evaluating the
  methods in (c) requires developing agreement on the relevant
  properties in (a), (b) and (d).

  Example: systematic sampling with period N will not count the
  occurrence of closely spaced packets (less than N counts apart) from
  the same flow. Thus for applications that are concerned with the
  joint statistics of multiple packets within flows, systematic
  sampling may not reproduce the results obtained with random
  sampling sufficiently accurately.
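
The example can be checked numerically with a small Python sketch. It assumes, for illustration only, that back-to-back packets (positions i and i+1) belong to the same flow, and it counts how many such adjacent pairs are captured in full by each method. The function names are hypothetical.

```python
import random
from typing import Set


def systematic(num_packets: int, period: int) -> Set[int]:
    """Positions selected by systematic sampling with the given period."""
    return {i for i in range(num_packets) if i % period == 0}


def independent(num_packets: int, period: int, rng: random.Random) -> Set[int]:
    """Positions selected independently with probability 1/period."""
    return {i for i in range(num_packets) if rng.random() < 1.0 / period}


def sampled_adjacent_pairs(selected: Set[int]) -> int:
    """Count pairs of back-to-back packets (i, i+1) that were both selected."""
    return sum(1 for i in selected if i + 1 in selected)
```

With a period of 10, systematic sampling never captures both packets of an adjacent pair, whereas independent sampling captures a given pair with probability 1/100, so an application estimating joint statistics of nearby packets would see nothing under the systematic scheme.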

5 Reporting Process

5.1 Mandatory Contents of Packet Reports (MUST)

  The reporting process MUST include the following in each packet
  report:

      (i) the input sequence number(s) of any sampling operation
      that acted on the packet in the instance of a measurement
      process of which the reporting process is a component.

  The reporting process MUST be able to include the following
  in each packet report, as a configurable option:

      (ii) some number of contiguous bytes from the start of the
      packet.

  Some devices may not have the resource capacity or functionality to
  provide more detailed reports than those in (i) and (ii)
  above. Using this minimum required reporting functionality, the
  reporting process places the burden of interpretation on the collector,
  or on applications that it supplies.
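
A minimal packet report carrying only items (i) and (ii) might look like the Python sketch below. The draft does not define a report format here, so the `PacketReport` class, its field names and the `snap_length` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class PacketReport:
    # (i) mandatory: input sequence number(s) of the sampling operation(s)
    input_sequence_numbers: Tuple[int, ...]
    # (ii) configurable option: contiguous bytes from the start of the packet
    leading_bytes: Optional[bytes] = None


def make_report(packet: bytes, sequence_numbers: Sequence[int],
                snap_length: Optional[int] = None) -> PacketReport:
    """Build a report; include leading bytes only when snap_length is set."""
    leading = packet[:snap_length] if snap_length is not None else None
    return PacketReport(tuple(sequence_numbers), leading)
```

Interpreting the leading bytes (for example, locating the IP header) is then left to the collector, as the paragraph above notes.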

5.2 Recommended Contents for Packet Reports (SHOULD)

G: Isn't SHOULD a hard requirement? I would think the only hard
requirements are (i) and (ii). The rest is up to the capacity of the
reporting process.

  The reporting process SHOULD provide for the inclusion in packet
  reports of the following information, inclusion of any or all being
  configurable as an option.

   (iii) fields relating to the following protocols
   used in the packet, specifically: IPv4, IPv6, transport
   protocols, MPLS, ATOM.
G: This is valid only if it is not already covered by ii)

   (iv) packet treatment, including:

      - identifiers for any input and output interfaces of the
        observation point that were traversed by the packet
      - source and destination AS

   (v) selection state associated with the packet, including:

      - timestamps
      - hashes, where calculated.

  The specific fields will include those set out as requirements for
  IPFIX [QZCZ03], with modifications appropriate to reporting on
  single packets rather than flows.

  When an entity hosts a reporting process and, in its usual
  capacity other than performing PSAMP functions, identifies or
  processes one or more of the above fields, then the contents of each
  such field SHOULD be made available for optional reporting. For
  example, when a device routes based on destination IP address, that
  field should be made available.  Conversely, an entity that does
  not route is not expected to be able to locate an IP address within
  a packet, or make it available for reporting, although it MAY do
  so.

5.3 Report Interpretation

  Information for use in report interpretation MUST include (i)
  configuration parameters of the selectors of the packets reported
  on; (ii) format of the packet reports; (iii) configuration
  parameters and state information of the network element; (iv)

G: What is "network element" ? Isn't this the observation point(s) or
PSAMP device?

  indication of the inherent accuracy of the reported quantities,
  e.g., of timestamps; (v) identifiers for observation point,
  measurement process, and export process.

  The requirements for robustness and transparency are motivations
  for including report interpretation in the report stream. Inclusion
  makes the report stream self-defining.  The PSAMP framework
  excludes reliance on an alternative model in which interpretation
  is recovered out of band. This latter approach is not robust with
  respect to undocumented changes in selector configuration, and may
  give rise to future architectural problems for network management
  systems to coherently manage both configuration and data collection.

  It is not envisaged that all report interpretation be included
  in every packet report. Many of the quantities listed above are
  expected to be relatively static; they could be communicated
  periodically, and upon change.

  To conserve network bandwidth and resources at the collector, the
  measurement packets may be compressed before export.  Compression
  is expected to be quite effective since the sampled packets may
  share many fields in common, e.g. if a filter focuses on packets
  with certain values in particular header fields. Using compression,
  however, could impact the timeliness of reports. Any consequent
  delay MUST NOT violate the timeliness requirement for availability
  of packet reports at the collector.

6 Parallel Measurement Processes

  Because of the increasing number of distinct measurement
  applications, with varying requirements, it is desirable to set up
  parallel measurement processes on a stream of packets.  A PSAMP
  device SHOULD support more than one independently configurable

G: Definition of PSAMP device is missing from terminology.

  measurement process. The measurement process may have an exclusive
  export process, or may share it with other measurement processes.
G: exclusive reporting process and/or export process.

  Each of the parallel measurement processes SHOULD be
  independent. However, resource constraints may prevent complete
  reporting on a packet selected by multiple selection processes. In
  this case, reporting for the packet MUST be complete for at least
  one measurement process; other measurement processes need only
  report that they selected the packet. The priority amongst
  measurement processes to report packets MUST be configurable.

  It is not proposed to standardize the number of parallel
  measurement processes.
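
One way to read the priority rule above is sketched below in Python: for a packet selected by several measurement processes, the highest-priority ones get a complete report up to a resource budget, at least one report is always complete, and the rest only record that they selected the packet. The function `plan_reports`, the `budget` parameter and the two labels are hypothetical, not terms from the draft.

```python
from typing import Dict, List


def plan_reports(selected_by: List[str], priority: List[str],
                 budget: int) -> Dict[str, str]:
    """Decide, for one packet, which measurement processes report it in full.

    `priority` lists all measurement processes, highest priority first.
    `budget` is how many complete reports resources currently allow.
    """
    ordered = sorted(selected_by, key=priority.index)
    # At least one complete report whenever any process selected the packet.
    n_complete = max(1, min(budget, len(ordered))) if ordered else 0
    return {
        mp: ("complete" if rank < n_complete else "selected-only")
        for rank, mp in enumerate(ordered)
    }
```

For example, with priority order `["ids", "billing", "debug"]` and a budget of 1, a packet selected by all three is reported completely for `ids` only.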

