Online anomaly detection using non-parametric technique for big data streams in cloud collaborative environment

Big data and cloud computing are complementary technological paradigms with a core focus on scalability, agility, and on-demand availability. The rise of cloud computing and cloud data stores has been a precursor to, and facilitator of, the emergence of big data. Cloud computing turns traditional siloed computing assets into shared pools of resources built on an underlying internet foundation. As a result, a number of enterprises are building efficient and agile cloud environments, and cloud providers continue to expand their service offerings. Many cloud providers offer online collaboration services, which are loosely coupled in nature. Online anomaly detection aims to detect anomalies in data that arrives as a stream. Such stream data is commonplace in today's cloud-centric collaborations, which enable participating domains to interoperate dynamically by sharing and accessing information. Accordingly, to forestall unauthorized disclosure and possible misappropriation of the shared resources, there is a need to identify anomalous access requests. To the best of our knowledge, the detection of anomalous access requests in cloud-based collaborations using non-parametric statistical techniques has not been studied in earlier work. This paper proposes an online anomaly detection algorithm, based on a non-parametric statistical technique, that detects anomalous access requests in a cloud environment at runtime.
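To make the idea of non-parametric online detection concrete, the following is a minimal sketch and not the algorithm proposed in this paper, whose details the abstract does not give. It assumes each access request is summarized by a single numeric feature (for example, requests per minute from a domain) and scores each new value by its empirical rank within a sliding window of recent values, so no distribution is assumed. The class name `SlidingWindowDetector`, its parameters, and the choice to keep flagged values out of the window are all hypothetical.

```python
from collections import deque
from typing import Deque


class SlidingWindowDetector:
    """Hypothetical rank-based detector over a sliding window of recent values.

    Illustrative only: the scoring rule and parameters are assumptions,
    not the method described in the paper.
    """

    def __init__(self, window_size: int = 100, alpha: float = 0.05,
                 min_samples: int = 40) -> None:
        # deque(maxlen=...) discards the oldest value once the window is full.
        self.window: Deque[float] = deque(maxlen=window_size)
        self.alpha = alpha              # significance level for flagging
        self.min_samples = min_samples  # warm-up length before any flagging

    def tail_probability(self, x: float) -> float:
        """Two-sided empirical tail probability of x relative to the window."""
        n = len(self.window)
        at_or_above = sum(1 for v in self.window if v >= x)
        at_or_below = sum(1 for v in self.window if v <= x)
        # Add-one smoothing keeps the estimate strictly positive.
        return min(1.0, 2.0 * (min(at_or_above, at_or_below) + 1) / (n + 1))

    def update(self, x: float) -> bool:
        """Score x, then add it to the window unless it was flagged."""
        is_anomaly = False
        if len(self.window) >= self.min_samples:
            is_anomaly = self.tail_probability(x) < self.alpha
        if not is_anomaly:
            # Assumption: flagged values are excluded from the reference window.
            self.window.append(x)
        return is_anomaly


if __name__ == "__main__":
    detector = SlidingWindowDetector()
    stream = [10.0 + (i % 5) for i in range(50)] + [1000.0, 12.0]
    for value in stream:
        if detector.update(value):
            print("anomalous value:", value)
```

Because the smallest attainable tail probability with n reference values is 2/(n + 1), this sketch cannot flag anything at `alpha = 0.05` until the window holds at least 40 values, which is why `min_samples` defaults to 40.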
