Every Packet Counts: Fine-Grained Delay and Loss Measurement with Reordering

Delay is an important metric for understanding and improving system performance. Existing approaches focus on aggregate delay statistics at a pre-programmed granularity and report only statistical results such as averages and deviations. They cannot provide fine-grained delay measurement at a flexible level and thus may miss important delay characteristics. For example, delay anomalies, which are critical indicators of system performance, may not be captured by existing coarse-grained approaches. In this work, we propose a fine-grained delay measurement approach based on a new measurement structure called the order preserving aggregator (OPA). OPA efficiently encodes ordering and loss information by exploiting inherent data characteristics. Based on OPA, we propose a two-layer design that conveys both ordering and timestamp information, and we then derive per-packet delay and loss measurements with a small overhead. We evaluate our approach both analytically and experimentally with widely used real-world data sets. The results show that our approach achieves accurate per-packet delay measurement, with an average per-packet relative error of 2% and an average aggregated relative error of 10⁻⁵, while introducing less than 4 × 10⁻⁴ additional overhead.
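To make the two reported accuracy figures concrete, the following minimal sketch computes a per-packet relative error and an aggregated relative error for a toy trace. The definitions used here are assumptions made for illustration (the abstract does not spell them out): the per-packet error is taken as the mean of |estimate − truth| / truth over packets, and the aggregated error as the relative error of the mean delay. The function names and the sample delays are hypothetical.

```python
from statistics import mean


def per_packet_relative_error(true_delays, est_delays):
    """Mean of |est - true| / true over all packets (assumed definition)."""
    if len(true_delays) != len(est_delays):
        raise ValueError("traces must have the same number of packets")
    return mean(abs(e - t) / t for t, e in zip(true_delays, est_delays))


def aggregated_relative_error(true_delays, est_delays):
    """Relative error of the mean delay (assumed definition)."""
    true_mean = mean(true_delays)
    return abs(mean(est_delays) - true_mean) / true_mean


# Hypothetical delays in microseconds: ground truth and per-packet estimates.
true_us = [100.0, 200.0, 400.0, 800.0]
est_us = [102.0, 196.0, 404.0, 798.0]

per_packet = per_packet_relative_error(true_us, est_us)
aggregated = aggregated_relative_error(true_us, est_us)
print(per_packet, aggregated)
```

In this toy trace the individual estimation errors have opposite signs and cancel in the mean, so the aggregated error is zero while the per-packet error is not. Under the assumed definitions, this is one way an aggregated error can be orders of magnitude smaller than the per-packet error.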
