Framework for Metric
CompositionAT&T Labs200 Laurel Avenue SouthMiddletown,NJ07748USA+1 732 420 1571+1 732 368 1192acmorton@att.comhttp://home.comcast.net/~acmacm/Alcatel-LucentCopernicuslaan 50Antwerp2018Belgium+32 3 240 3983steven.van_den_berghe@alcatel-lucent.comThis memo describes a detailed framework for composing and
aggregating metrics (both in time and in space) originally defined by
the IP Performance Metrics (IPPM) RFC 2330 and developed by the IETF.
This new framework memo describes the generic composition and
aggregation mechanisms. The memo provides a basis for additional
documents that implement the framework to define detailed compositions
and aggregations of metrics which are useful in practice.The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.The IPPM framework describes two forms
of metric composition, spatial and temporal. The text also suggests that
the concepts of the analytical framework (or A-frame) would help to
develop useful relationships to derive the composed metrics from real
metrics. The effectiveness of composed metrics is dependent on their
usefulness in analysis and applicability to practical measurement
circumstances.This memo expands on the notion of composition, and provides a
detailed framework for several classes of metrics that were described in
the original IPPM framework. The classes include temporal aggregation,
spatial aggregation, and spatial composition.Network operators have deployed measurement systems to serve many
purposes, including performance monitoring, maintenance support,
network engineering, and reporting performance to customers. The
collection of elementary measurements alone is not enough to
understand a network's behaviour. In general, measurements need to be
post-processed to present the most relevant information for each
purpose. The first step is often a process of "composition" of single
measurements or measurement sets into other forms. Composition and
aggregation present several more post-processing opportunities to the
network operator, and we describe the key motivations below.A network's measurement possibilities scale upward with the
square of the number of nodes. But each measurement implies
overhead, in terms of the storage for the results, the traffic on
the network (assuming active methods), and the operation and
administration of the measurement system itself. In a large network,
it is impossible to perform measurements from each node to all
others.An individual network operator should be able to organize their
measurement paths along the lines of physical topology, or routing
areas/Autonomous Systems, and thus minimize dependencies and overlap
between different measurement paths. This way, the sheer number of
measurements can be reduced, as long as the operator has a set of
methods to estimate performance between any particular pair of nodes
when needed.Composition and aggregation play a key role when the path of
interest spans multiple networks, and where each operator conducts
their own measurements. Here, the complete path performance may be
estimated from measurements on the component parts.Operators that take advantage of the composition and aggregation
methods recognize that the estimates may exhibit some additional
error beyond that inherent in the measurements themselves, and so
they are making a trade-off to achieve reasonable measurement system
overhead.There are many different measurement users, each bringing
specific requirements for the reporting timescale. Network managers
and maintenance forces prefer to see results presented very rapidly,
to detect problems quickly or see if their action has corrected a
problem. On the other hand, network capacity planners and even
network users sometimes prefer a long-term view of performance, for
example to check trends. How can one set of measurements serve both
needs?The answer lies in temporal aggregation, where the short-term
measurements needed by the operations community are combined to
estimate a longer-term result for others. Also, problems with the
measurement system itself may be isolated to one or more of the
short-term measurements, rather than possibly invalidating an entire
long-term measurement if the problem was undetected.Another motivation is data reduction. Assume there is a network
in which delay measurements are performed among a subset of its
nodes. A network manager might ask whether there is a problem with
the network delay in general. It would be desirable to obtain a
single value that gives an indication of the overall network delay.
Spatial aggregation methods would address this need, and can produce
the desired "single figure of merit" asked for, one that may also be
useful in trend analysis.The overall value would be calculated from the elementary delay
measurements, but it not obvious how: for example, it may not to be
reasonable to average all delay measurements, as some paths (e.g.
having a higher bandwidth or more important customers) might be
considered more critical than others.Metric composition can help to provide, from raw measurement
data, some tangible, well-understood and agreed upon information
about the service guarantees provided by a network. Such information
can be used in the Service Level Agreement/Service Level
Specification (SLA/SLS) contracts between a service provider and its
customers.If a network measurement system operator anticipates needing to
produce overall metrics by composition, then it is prudent to keep
that requirement in mind when considering the measurement design and
sampling plan. Also, certain summary statistics are more conducive
to composition than others, and this figures prominently in the
design of measurements and when reporting the results.The purpose of this memo is provide a common framework for the
various classes of metrics that are composed from primary metrics. The
scope is limited to the definitions of metrics that are composed from
primary metrics using a deterministic function. Key information about
each composed metric, such as the assumptions under which the
relationship holds and possible sources of error/circumstances where the
composition may fail, are included.At this time, the scope of effort is limited to composed metrics for
packet loss, delay, and delay variation, as defined in , , , , , and the comparable metrics in . Composition of packet reordering metrics and duplication metrics are considered research topics at the time this
memo was prepared, and beyond its scope.This memo will retain the terminology of the IPPM Framework as much as possible, but will extend the
terminology when necessary. It is assumed that the reader is familiar
with the concepts introduced in , as they
will not be repeated here.This section defines the terminology applicable to the processes of
Metric Composition and Aggregation.The logical or physical location where packet observations are
made. The term Measurement Point is synonymous with the term
"observation position" used in when
describing the notion of wire time. A measurement point may be at the
boundary between a host and an adjacent link (physical), or it may be
within a host (logical) that performs measurements where the
difference between host time and wire time is understood.The complete path is the actual path that a packet would follow as
it travels from the packet’s Source to its Destination. A
Complete path may span the administrative boundaries of one or more
networks.The complete path metric is the Source to Destination metric that a
composed metric attempts to estimate. A complete path metric
represents the ground-truth for a composed metric.The complete time interval is comprised of two or more contiguous
sub-intervals, and is the interval whose performance will be estimated
through temporal aggregation.A composed metric is an estimate of an actual metric describing the
performance of a path over some time interval. A composed metric is
derived from other metrics by applying a deterministic process or
function (e.g., a composition function). The process may use metrics
that are identical to the metric being composed, or metrics that are
dissimilar, or some combination of both types.A composition function is a deterministic process applied to
individual metrics to derive another metric (such as a Composed
metric).As applied here, the notion of ground truth is defined as the
actual performance of a network path over some time interval. The
ground truth is a metric on the (unavailable) packet transfer
information for the desired path and time interval that a composed
metric seeks to estimate.A span of time.A Sub-interval is a time interval that is included in another
interval.A Sub-path is a portion of the complete path where at least the
Sub-path Source and Destination hosts are constituents of the complete
path. We say that such a sub-path is “involved” in the
complete path.Since sub-paths terminate on hosts, it is important to describe how
sub-paths are considered to be joined. In practice, the Source and
Destination hosts may perform the function of measurement points.If the Destination and Source hosts of two adjoining paths are
co-located and the link between them would contribute negligible
performance, then these hosts can be considered equivalent (even if
there is no physical link between them, this is a practical
concession).If the Destination and Source hosts of two adjoining paths have a
link between them that contributes to the complete path performance,
then the link and hosts constitutes another sub-path that is involved
in the complete path, and should be characterized and included in the
composed metric.A sub-path path metric is an element of the process to derive a
Composite metric, quantifying some aspect of the performance a
particular sub-path from its Source to Destination.This section defines the various classes of Composition. There are
two classes more accurately described as aggregation over time and
space, and the third involves concatenation in space.Aggregation in time is defined as the composition of metrics with
the same type and scope obtained in different time instants or time
windows. For example, starting from a time series of the measurements
of maximum and minimum One-Way Delay on a certain network path
obtained over 5-minute intervals, we obtain a time series measurement
with a coarser resolution (60 minutes) by taking the maximum of 12
consecutive 5-minute maxima and the minimum of 12 consecutive 5-minute
minima.The main reason for doing time aggregation is to reduce the amount
of data that has to be stored, and make the visualization/spotting of
regular cycles and/or growing or decreasing trends easier. Another
useful application is to detect anomalies or abnormal changes in the
network characteristics.In RFC 2330, the term "temporal composition" is introduced and
differs from temporal aggregation in that it refers to methodologies
to predict future metrics on the basis of past observations (of the
same metrics), exploiting the time correlation that certain metrics
can exhibit. We do not consider this type of composition here.Aggregation in space is defined as the combination of metrics of
the same type and different scope, in order to estimate the overall
performance of a larger network. This combination may involve weighing
the contributions of the input metrics.Suppose we want to compose the average One-Way-Delay (OWD)
experienced by flows traversing all the Origin-Destination (OD) pairs
of a network (where the inputs are already metric "statistics"). Since
we wish to include the effect of the traffic matrix on the result, it
makes sense to weight each metric according to the traffic carried on
the corresponding OD pair:OWD_sum=f1*OWD_1+f2*OWD_2+...+fn*OWD_nwhere fi=load_OD_i/total_load.A simple average OWD across all network OD pairs would not use the
traffic weighting.Another example metric that is "aggregated in space", is the
maximum edge-to-edge delay across a single network. Assume that a
Service Provider wants to advertise the maximum delay that transit
traffic will experience while passing through his/her network. There
can be multiple edge-to-edge paths across a network, and the Service
Provider chooses either to publish a list of delays (each
corresponding to a specific edge-to-edge path), or publish a single
maximum value. The latter approach simplifies the publication of
measurement information, and may be sufficient for some purposes.
Similar operations can be provided to other metrics, e.g. "maximum
edge-to-edge packet loss", etc.We suggest that space aggregation is generally useful to obtain a
summary view of the behaviour of large network portions, or in general
of coarser aggregates. The metric collection time instant, i.e. the
metric collection time window of measured metrics is not considered in
space aggregation. We assume that either it is consistent for all the
composed metrics, e.g. compose a set of average delays all referred to
the same time window, or the time window of each composed metric does
not affect aggregated metric.Concatenation in space is defined as the composition of metrics of
same type and (ideally) different spatial scope, so that the resulting
metric is representative of what the metric would be if obtained with
a direct measurement over the sequence of the several spatial scopes.
An example is the sum of OWDs of different edge-to-edge network's
delays, where the intermediate edge points are close to each other or
happen to be the same. In this way, we can for example estimate OWD_AC
starting from the knowledge of OWD_AB and OWD_BC. Note that there may
be small gaps in measurement coverage, likewise there may be small
overlaps (e.g., the link where test equipment connects to the
network).One key difference from examples of aggregation in space is that
all sub-paths contribute equally to the composed metric, independent
of the traffic load present.In practice there is often the need to compute a new metric using
one or more metrics with the same spatial and time scope. For example,
the metric rtt_sample_variance may be computed from two different
metrics: the help metrics rtt_square_sum and the rtt_sum. The process
of using help metrics is a simple calculation and not an aggregation
or a concatenation, and will not be investigated further in this
memo.Composed metrics might themselves be subject to further steps of
composition or aggregation. An example would be the delay of a maximal
path obtained through the spatial composition of several composed
delays for each Complete Path in the maximal path (obtained through
spatial composition). All requirements for first order composition
metrics apply to higher order composition.An example using temporal aggregation: twelve 5-minute metrics are
aggregated to estimate the performance over an hour. The second step
of aggregation would take 24 hourly metrics and estimate the
performance over a day.The definitions for all composed metrics MUST include sections to
treat the following topics.The description of each metric will clearly state: the definition (and statistic, where appropriate);the composition or aggregation relationship;the specific conjecture on which the relationship is based and
assumptions of the statistical model of the process being measured,
if any (see section 12);a justification of practical utility or usefulness for analysis
using the A-frame concepts;one or more examples of how the conjecture could be incorrect and
lead to inaccuracy;the information to be reported.For each metric, the applicable circumstances will be defined, in
terms of whether the composition or aggregation: Requires homogeneity of measurement methodologies, or can allow a
degree of flexibility (e.g., active or passive methods produce the
"same" metric). Also, the applicable sending streams will be
specified, such as Poisson, Periodic, or both.Needs information or access that will only be available within an
operator's network, or is applicable to Inter-network
composition.Requires precisely synchronized measurement time intervals in all
component metrics, or loosely synchronized, or no timing
requirements.Requires assumption of component metric independence w.r.t. the
metric being defined/composed, or other assumptions.Has known sources of inaccuracy/error, and identifies the
sources.If one or more components of the composition process are encumbered
by Intellectual Property Rights (IPR), then the resulting Composed
Metrics may be encumbered as well. See BCP 79 for IETF policies on IPR disclosure.Figure 1 illustrates the process to derive a metric using spatial
composition, and compares the composed metric to other IPPM
metrics.Metrics <M1, M2, M3> describe the performance of sub-paths
between the Source and Destination of interest during time interval
<T, Tf>. These metrics are the inputs for a Composition Function
that produces a Composed Metric.The Composed Metric is an estimate of an actual metric collected
over the complete Source to Destination path. We say that the Complete
Path Metric represents the "Ground Truth" for the Composed Metric. In
other words, Composed Metrics seek to minimize error w.r.t. the
Complete Path Metric.Further, we observe that a Spatial Metric collected for packets traveling over the same
set of sub-paths provide a basis for the Ground Truth of the
individual Sub-Path metrics. We note that mathematical operations may
be necessary to isolate the performance of each sub-path.Next, we consider multiparty metrics as defined in , and their spatial composition. Measurements
to each of the Receivers produce an element of the one-to-group
metric. These elements can be composed from sub-path metrics and the
composed metrics can be combined to create a composed one-to-group
metric. Figure 2 illustrates this process.Here, Sub-path Metrics M1, M2, and M3 are combined using a
relationship to compose the metric applicable to the Src-Rcvr1 path.
Similarly, M1, M4, and M5 are used to compose the Src-Rcvr2 metric and
M1, M4, and M6 compose the Src-Rcvr3 metric.The Composed One-to-Group Metric would list the Src-Rcvr metrics
for each Receiver in the Group:(Composed-Rcvr1, Composed-Rcvr2, Composed-Rcvr3)The "Ground Truth" for this composed metric is of course an actual
One-to-Group metric, where a single source packet has been measured
after traversing the Complete Paths to the various receivers.Temporal Aggregation involves measurements made over
sub-intervals of the complete time interval between the same Source
and Destination. Therefore, the "Ground Truth" is the metric
measured over the desired interval.Spatial Aggregation combines many measurements using a weighting
function to provide the same emphasis as though the measurements
were based on actual traffic, with inherent weights. Therefore, the
"Ground Truth" is the metric measured on the actual traffic instead
of the active streams that sample the performance.A metric composition can deviate from the ground truth for several
reasons. Two main aspects are:The propagation of the inaccuracies of the underlying
measurements when composing the metric. As part of the composition
function, errors of measurements might propagate. Where possible,
this analysis should be made and included with the description of
each metric.A difference in scope. When concatenating many active
measurement results (from two or more sub-paths) to obtain the
complete path metric, the actual measured path will not be
identical to the complete path. It is in general difficult to
quantify this deviation with exactness, but a metric definition
might identify guidelines for keeping the deviation as small as
possible. The description of the metric composition MUST include an
section identifying the deviation from the ground truth.In practice, when measurements cannot be initiated on a sub-path or
during a particular measurement interval (and perhaps the measurement
system gives up during the test interval), then there will not be a
value for the sub-path reported, and the result SHOULD be recorded as
"undefined".The measured values of many metrics depend on time-variant factors,
such as the level of network traffic on the source to destination
path. Traffic levels often exhibit diurnal (or daily) variation, but a
24 hour measurement interval would obscure it. Temporal Aggregation of
hourly results to estimate the daily metric would have the same
effect, and so the same cautions are warranted.Some metrics are predominantly* time-invariant, such as the actual
minimum one-way delay of fixed path, and therefore temporal
aggregation does not obscure the results as long as the path is
stable. However, paths do vary, and sometimes on less predictable time
intervals than traffic variations. (* Note - It is recognized that
propagation delay on transmission facilities may have diurnal,
seasonal, and even longer-term variations.)This document makes no request of IANA.Note to RFC Editor: this section may be removed on publication as an
RFC.The security considerations that apply to any active measurement of
live networks are relevant here as well. See , and .The exchange of sub-path measurements among network providers may be
a source of concern, and the information should be sufficiently
anonymized to avoid revealing unnecessary operational details (e.g., the
network addresses of measurement devices). However, the schema for
measurement exchange is beyond the scope of this memo, and likely to be
covered by bilateral agreements for some time to come.The authors would like to thank Maurizio Molina, Andy Van Maele,
Andreas Haneman, Igor Velimirovic, Andreas Solberg, Athanassios
Liakopulos, David Schitz, Nicolas Simar and the Geant2 Project. We also
acknowledge comments and suggestions from Phil Chimento, Emile Stephan,
Lei Liang, Stephen Wolff, Reza Fardid, Loki Jorgenson, and Alan
Clark.Internet protocol data communication service - IP packet
transfer and availability performance parameters