A machine learning approach to edge type inference in Internet AS graphs

The Internet's autonomous system (AS) topology can be represented by AS graphs, in which nodes represent ASes and edges represent business relationships between ASes. AS relationships can be broadly classified into two types: provider-to-customer (p2c) and peer-to-peer (p2p). In this paper we present a machine learning approach to edge type inference in AS graphs. Given an AS graph derived from a publicly available data source, we use the Gentle AdaBoost machine learning algorithm to train a classifier that labels each edge as p2c or p2p based on a set of node features. We use our method to train classifiers for three AS graphs derived from different data sources: a BGP graph, a traceroute graph, and an IRR graph. The three classifiers achieve 93.97% to 97.73% accuracy when validated against ground truth, and 81.76% to 95.66% accuracy when validated against CAIDA's AS relationship inference dataset. We merge the three individual graphs into a combined graph and propose a method to compute edge types in the combined graph. We analyze the characteristics of the three individual graphs and the combined graph, and show that combining the three individual graphs gives a significantly more complete view of both the p2p and p2c ecosystems in the Internet.
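To make the classification step concrete, the following is a minimal sketch of Gentle AdaBoost with regression stumps, in the standard form where each round fits a weighted least-squares stump and reweights samples by exp(-y f(x)). It is not the authors' implementation. The abstract does not list the node features used, so the two features here (the log-degree gap between an edge's endpoints and the smaller endpoint degree) are hypothetical, and the six labeled edges are synthetic toy data, with +1 standing for p2c and -1 for p2p.

```python
import math

def fit_stump(X, y, w):
    """Fit a weighted least-squares regression stump.

    Returns (feature index, threshold, left value, right value), where the
    left/right values are the weighted means of y on each side of the split.
    """
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left_w = sum(wi for row, wi in zip(X, w) if row[j] <= t)
            right_w = sum(wi for row, wi in zip(X, w) if row[j] > t)
            left_s = sum(wi * yi for row, yi, wi in zip(X, y, w) if row[j] <= t)
            right_s = sum(wi * yi for row, yi, wi in zip(X, y, w) if row[j] > t)
            left, right = left_s / left_w, right_s / right_w
            err = sum(
                wi * (yi - (left if row[j] <= t else right)) ** 2
                for row, yi, wi in zip(X, y, w)
            )
            if best is None or err < best[0]:
                best = (err, j, t, left, right)
    if best is None:
        raise ValueError("every feature is constant; no split is possible")
    return best[1:]

def stump_output(stump, x):
    j, t, left, right = stump
    return left if x[j] <= t else right

def gentle_adaboost(X, y, rounds=10):
    """Train an additive model F(x) = sum of stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    stumps = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        stumps.append(stump)
        # Gentle AdaBoost update: w_i <- w_i * exp(-y_i * f_m(x_i)), then renormalize.
        w = [wi * math.exp(-yi * stump_output(stump, xi)) for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return stumps

def predict(stumps, x):
    score = sum(stump_output(s, x) for s in stumps)
    return 1 if score >= 0 else -1

# Synthetic toy edges: [log-degree gap between endpoints, smaller endpoint degree].
# Both features are hypothetical stand-ins, not the paper's feature set.
X = [[2.5, 3], [1.8, 1], [2.1, 2], [0.2, 40], [0.4, 150], [0.1, 12]]
y = [1, 1, 1, -1, -1, -1]  # +1 = p2c, -1 = p2p

model = gentle_adaboost(X, y, rounds=5)
predictions = [predict(model, x) for x in X]
```

On this toy set the classes are separable by a single threshold, so the trained model reproduces the training labels. Real AS graphs would need a richer feature set and held-out validation data, as in the accuracy figures reported above.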
