Big data and semantics management system for computer networks

We define Big Networks as networks that generate big data and can benefit from big data management in their operations. Examples include the current Internet, the emerging Internet of Things, and social networks. The ever-increasing scale, complexity, and heterogeneity of the Internet make it harder to discover emergent and anomalous behavior in network traffic. We hypothesize that endowing the otherwise semantically oblivious Internet with memory management that mimics the functionalities of human memory would advance its capability to learn, conceptualize, and store traffic data and behavior effectively and efficiently, and to predict future events more accurately. Inspired by human memory, we propose a distributed network memory management system, termed NetMem, that efficiently stores Internet data and extracts and uses traffic semantics in matching and prediction processes. In particular, we explore Hidden Markov Models (HMM), Latent Dirichlet Allocation (LDA), and simple techniques based on statistical analysis for semantic reasoning in NetMem. Additionally, we propose a hybrid intelligence technique for semantic reasoning that integrates LDA and HMM to extract network semantics by learning patterns and features with syntactic and semantic dependencies. We also use locality-sensitive hashing to reduce dimensionality. Our simulation study using real network traffic demonstrates the benefits of NetMem and highlights the advantages and limitations of these techniques.
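
As a rough illustration of how locality-sensitive hashing can reduce dimensionality, the sketch below uses random-hyperplane hashing to map a numeric feature vector to a short bit signature. It is not NetMem's implementation: the abstract does not say which LSH family is used, and the function names, the six-feature flow vectors, and the 16-bit signature length are all assumptions made for this example.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (Gaussian normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_signature(vector, hyperplanes):
    """Map a vector to one bit per hyperplane: 1 if it lies on the non-negative side."""
    bits = []
    for plane in hyperplanes:
        dot = sum(p * x for p, x in zip(plane, vector))
        bits.append(1 if dot >= 0.0 else 0)
    return tuple(bits)


def hamming(a, b):
    """Count the positions at which two signatures differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical traffic feature vectors (values are illustrative only).
planes = make_hyperplanes(num_bits=16, dim=6)
flow_a = [1500.0, 0.2, 80.0, 6.0, 12.0, 0.9]
flow_b = [2.0 * x for x in flow_a]   # same direction as flow_a, larger magnitude
flow_c = [-x for x in flow_a]        # opposite direction to flow_a

sig_a = lsh_signature(flow_a, planes)
sig_b = lsh_signature(flow_b, planes)
sig_c = lsh_signature(flow_c, planes)

print(hamming(sig_a, sig_b))  # vectors pointing the same way share a signature
print(hamming(sig_a, sig_c))  # opposite vectors differ in every bit
```

Vectors separated by a small angle tend to agree on most bits, so the Hamming distance between signatures can stand in for a similarity comparison in the original feature space at a fraction of the storage cost.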
