An Intelligent Anomaly Detection Scheme for Micro-Services Architectures With Temporal and Spatial Data Analysis

Service-oriented 5G mobile systems are widely expected to reshape the landscape of the Internet with ubiquitous services and infrastructures. The micro-services architecture has attracted significant interest from both academia and industry because it offers agile development and scalable capacity. Emerging mobile edge computing, which micro-services can empower, helps maintain efficient resource utilization in 5G systems. However, these capabilities pose significant challenges for the management of micro-services systems. Although substantial data are produced for system maintenance, the interleaved temporal-spatial information they contain has not been fully exploited. Additionally, the flood of data puts heavy pressure on automated analysis tools, so automated digestion of these data is urgently needed for system maintenance. In this paper, we propose a new learning-based anomaly detection framework for service-provision systems with micro-services architectures that uses service execution logs (the temporal view) and query traces (the spatial view). The framework has two major parts: a representation of logs and traces, and a two-stage identification procedure based on a sequential model and temporal-spatial analysis. The experimental results show that the temporal-spatial features accurately capture the nature of operational data. The proposed framework performs well on anomaly detection and helps gain in-depth insight into large-scale systems.
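To make the two-stage idea concrete, the following is a minimal sketch and not the framework's actual implementation. It assumes that logs have already been parsed into sequences of event identifiers and that each trace is an ordered list of service names. A simple transition-count model stands in for the sequential model, and a check for previously unseen caller-to-callee edges stands in for the temporal-spatial analysis. All function names, the `top_k` and `threshold` parameters, and the rule that stage 2 only confirms candidates flagged by stage 1 are illustrative assumptions.

```python
from collections import Counter, defaultdict


def fit_transitions(normal_log_seqs):
    """Count event-to-event transitions observed in normal log sequences."""
    counts = defaultdict(Counter)
    for seq in normal_log_seqs:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    return counts


def temporal_flag(counts, seq, top_k=2):
    """Stage 1 (assumed stand-in for a sequential model): flag a sequence
    if any event is not among the top_k most frequent successors."""
    for cur, nxt in zip(seq, seq[1:]):
        likely = {e for e, _ in counts.get(cur, Counter()).most_common(top_k)}
        if nxt not in likely:
            return True
    return False


def fit_edges(normal_traces):
    """Collect the caller-to-callee edges seen in normal traces."""
    return {edge for trace in normal_traces for edge in zip(trace, trace[1:])}


def spatial_score(known_edges, trace):
    """Fraction of edges in a trace that never appeared in normal traces."""
    edges = list(zip(trace, trace[1:]))
    if not edges:
        return 0.0
    return sum(e not in known_edges for e in edges) / len(edges)


def detect(counts, known_edges, log_seq, trace, threshold=0.0):
    """Two-stage decision (assumed combination rule): stage 1 screens the
    log sequence, stage 2 confirms candidates using the trace."""
    if not temporal_flag(counts, log_seq):
        return False
    return spatial_score(known_edges, trace) > threshold


# Hypothetical training data: parsed log events and service call paths.
normal_logs = [
    ["recv", "auth", "query", "reply"],
    ["recv", "auth", "query", "reply"],
    ["recv", "auth", "cache", "reply"],
]
normal_traces = [["gateway", "auth", "db"], ["gateway", "auth", "cache"]]

counts = fit_transitions(normal_logs)
known_edges = fit_edges(normal_traces)

# A request that skips authentication in both its log and its trace.
print(detect(counts, known_edges, ["recv", "query", "reply"], ["gateway", "db"]))
```

In this sketch, a request is reported only when its log sequence deviates from learned transitions and its trace also contains an unseen service edge. A learned sequence model and richer joint features would replace both stand-ins in practice.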
