Acceleration of decision tree searching for IP traffic classification

Traffic classification remains an active research problem, particularly as new traffic trends and new hardware architectures emerge. We propose a classification tree search method called explicit range search, motivated by the characteristics of machine-learning-based classification approaches. Our method differs from earlier algorithms such as HiCuts and HyperCuts in how it cuts the ranges within a dimension and in how it searches within those ranges. By storing explicit marks and performing hardware-supported parallel comparisons, explicit range search reduces the worst-case number of memory accesses from 26 to 5 on a number of realistic rule sets generated by a well-known machine learning algorithm (C4.5). We also describe the proposed design, which is based on FPGA devices.
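The idea of storing explicit marks and comparing against all of them at once can be illustrated with a small software sketch. This is not the authors' design: the node layout, the names RangeNode and classify, the feature values, and the class labels are all assumptions made for illustration. Each node is assumed to hold the sorted range boundaries (marks) for one feature, so a single node fetch followed by one round of comparisons selects the child, where a binary decision tree would need one fetch per boundary tested.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class RangeNode:
    """Hypothetical multi-way node: one feature, explicit sorted boundaries."""
    feature: int                              # index of the flow feature tested here
    marks: List[float]                        # sorted explicit range boundaries
    children: List[Union["RangeNode", str]]   # len(marks) + 1 subtrees or class labels


def classify(node: Union[RangeNode, str], features: List[float]):
    """Walk the tree; return (label, number of node fetches)."""
    accesses = 0
    while isinstance(node, RangeNode):
        accesses += 1                         # one memory access per node visited
        value = features[node.feature]
        # In hardware, every comparison below could run in its own comparator
        # at the same time; the sum emulates counting the comparators that fire.
        index = sum(value >= mark for mark in node.marks)
        node = node.children[index]
    return node, accesses


# Illustrative tree: feature 0 might be a packet length, feature 1 a ratio.
inner = RangeNode(feature=1, marks=[0.5], children=["voip", "p2p"])
root = RangeNode(feature=0, marks=[100, 500, 1200],
                 children=["dns", inner, "web", "bulk"])

label, accesses = classify(root, [300, 0.2])
print(label, accesses)
```

In this sketch the root resolves a four-way split with a single fetch, so the depth of the tree, and with it the worst-case number of memory accesses, shrinks as more marks are packed into each node. How many marks fit in one node would depend on the memory word width and the number of comparators available, neither of which is specified here.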
