Wrinkles in Time: Detecting Internet-wide Events via NTP

Understanding the nature and characteristics of Internet events such as route changes and outages can serve as the starting point for improvements in network configuration, management, and monitoring practices. However, the scale, diversity, and dynamics of network infrastructure make event detection and analysis challenging. In this paper, we describe a new approach to Internet event measurement, identification, and analysis that provides a broad and detailed perspective without the need for new or dedicated infrastructure or additional network traffic. Our approach is based on analyzing data that is readily available from Network Time Protocol (NTP) servers. Because NTP is one of the few services enabled by default on clients, NTP servers have a broad perspective on Internet behavior. We develop a tool for analyzing NTP traces, called Tezzeract, which applies Robust Principal Components Analysis to detect Internet events. We demonstrate Tezzeract's efficacy by conducting controlled experiments and by applying it to data collected over a period of three months from 19 NTP servers. We also compare Tezzeract's perspective with reported outages and with events identified through active probing. We find that while the methods detect some events in common, NTP-based monitoring provides a unique perspective that complements prior methods.
