Model-Based Outlier Detection System with Statistical Preprocessing

Reliability, freedom from error, and security are key contributors to quality of service. Outlier detection is the process of identifying erroneous or abnormal objects in a defined population, and it can contribute to secure, error-free services. Outlier detection approaches fall into four categories: statistical, unsupervised, supervised, and semi-supervised. A model-based outlier detection system with statistical preprocessing is proposed: a statistical approach preprocesses the training data, and unsupervised learning then constructs the model. The robustness of the proposed system is evaluated using two performance metrics, sum of squared error (SSE) and time to build model (TBM). The evaluation indicates that the proposed system performs better at detecting outliers regardless of the application domain.
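The pipeline described above (statistical preprocessing of the training data, unsupervised model construction, then evaluation by SSE and TBM) can be sketched as follows. The abstract does not name the statistical test or the learning algorithm, so this sketch assumes a per-feature z-score filter and a plain k-means model. The function names, the threshold of 3 standard deviations, and the synthetic data are all illustrative assumptions, not the authors' implementation.

```python
import random
import statistics
import time


def zscore_filter(points, threshold=3.0):
    """Assumed preprocessing step: drop points with any feature whose
    z-score exceeds `threshold` in absolute value."""
    dims = len(points[0])
    means = [statistics.fmean(p[d] for p in points) for d in range(dims)]
    # Fall back to 1.0 when a feature is constant, to avoid division by zero.
    stdevs = [statistics.pstdev([p[d] for p in points]) or 1.0 for d in range(dims)]
    return [
        p for p in points
        if all(abs(p[d] - means[d]) / stdevs[d] <= threshold for d in range(dims))
    ]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Assumed unsupervised model: Lloyd's k-means. Returns (centroids, SSE)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as its cluster mean; keep the old one if empty.
        centroids = [
            tuple(statistics.fmean(c) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, sse


def build_model(points, k, preprocess=True):
    """Build the model and report SSE and time to build model (TBM, seconds)."""
    start = time.perf_counter()
    training = zscore_filter(points) if preprocess else list(points)
    centroids, sse = kmeans(training, k)
    tbm = time.perf_counter() - start
    return centroids, sse, tbm


# Synthetic training data (illustrative only): two tight clusters plus
# three injected outliers far from both.
rng = random.Random(42)
data = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(100)]
data += [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(100)]
data += [(100.0, 100.0), (-80.0, 90.0), (120.0, -70.0)]

_, sse_raw, tbm_raw = build_model(data, k=2, preprocess=False)
_, sse_clean, tbm_clean = build_model(data, k=2, preprocess=True)
print(f"without preprocessing: SSE={sse_raw:.1f}, TBM={tbm_raw:.4f}s")
print(f"with preprocessing:    SSE={sse_clean:.1f}, TBM={tbm_clean:.4f}s")
```

In this toy setting the filter removes the injected points before clustering, so the model built on the cleaned data has a much lower SSE. The timings depend on the machine and on the chosen algorithms, so the sketch prints them without asserting any particular ordering.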
