Anomalous behaviour detection based on heterogeneous data and data fusion

In this paper, we propose a new approach to identifying anomalous behaviour based on heterogeneous data and a data fusion technique. Four types of data are used in this study: credit card, loyalty card, GPS, and image data. The first step of the proposed framework is to identify the best features for each dataset. Then, a recently introduced anomaly detection technique known as empirical data analytics (EDA) is applied to detect abnormal behaviour in the datasets. Standardised eccentricity (a measure newly introduced within EDA that offers a simplified form of the well-known Chebyshev inequality) can be applied to any data distribution. Image data are processed using a pre-trained deep learning network, and classification is performed using a support vector machine. Most of the other data used in our previous work are of the “signal”/real-number type (e.g. credit card, loyalty card, and GPS data). However, very often a clear conclusion that misuse has occurred cannot be reached from these data alone. When the gender or age recognised from the image differs from what is expected, misuse is obvious. The final stage of the proposed method combines the anomaly detection results and the image recognition results using a data fusion technique. The experimental results suggest that the proposed technique may simplify the tedious work involved in real, complex cases of forensic investigation. The proposed technique uses heterogeneous data, combining all the data from the VAST Challenge with image data by means of the introduced data fusion technique. It can assist human experts in processing huge amounts of heterogeneous data to detect anomalies. In future research, text data could also be included in the heterogeneous data mixture, and the data fusion technique may be applied to other datasets.
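To make the role of standardised eccentricity more concrete, the following is a minimal sketch rather than the implementation used in this study. It assumes the Euclidean form commonly given in the EDA literature, eps(x) = 1 + ||x - mu||^2 / sigma^2, and flags a sample as anomalous when eps(x) > n^2 + 1, which is the Chebyshev-style condition for an "n sigma" threshold. The function names, the choice n = 3, and the toy transaction amounts are all illustrative assumptions.

```python
from typing import List, Sequence


def standardised_eccentricity(points: Sequence[Sequence[float]]) -> List[float]:
    """Return eps(x) = 1 + ||x - mean||^2 / variance for every sample.

    Assumption: Euclidean distance, with the variance taken as the mean
    squared distance of the samples to their mean.
    """
    n_points = len(points)
    if n_points == 0:
        return []
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / n_points for d in range(dim)]
    sq_dist = [sum((p[d] - mean[d]) ** 2 for d in range(dim)) for p in points]
    variance = sum(sq_dist) / n_points
    if variance == 0.0:
        # All samples coincide, so none is more eccentric than another.
        return [1.0] * n_points
    return [1.0 + s / variance for s in sq_dist]


def flag_anomalies(points: Sequence[Sequence[float]], n_sigma: float = 3.0) -> List[bool]:
    """Flag samples whose standardised eccentricity exceeds n_sigma^2 + 1."""
    threshold = n_sigma ** 2 + 1.0
    return [e > threshold for e in standardised_eccentricity(points)]


# Hypothetical one-dimensional example: twenty typical amounts and one large one.
amounts = [(10.0,)] * 20 + [(100.0,)]
ecc = standardised_eccentricity(amounts)
flags = flag_anomalies(amounts, n_sigma=3.0)
```

In this toy example only the last sample is flagged. Because the condition follows from the Chebyshev inequality, no assumption about the underlying data distribution is needed, which is the property the abstract highlights.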
