A Review of Outlier Prediction Techniques in Data Mining

The main objective of this review is to survey techniques for predicting outliers in data mining. Data mining is the process of applying various techniques to extract useful patterns or models from available data, and it plays a vital role in selecting, exploring, and modeling high-dimensional data. Outlier detection is a substantial research problem in data mining that aims to uncover objects that are significantly different from, exceptional relative to, or inconsistent with the rest of the data. Potential sources of outliers include noise and errors, events, and malicious attacks on the network. The main challenge in outlier detection, given the high complexity, size, and variety of datasets, is how to capture similar outliers as a group using a clustering-based approach. Accurately removing the outliers or noise present in clustered data yields cleaner high-dimensional data. Classification and clustering techniques for outlier prediction are now applied in fields such as bioinformatics, natural language processing, military applications, and geographical domains. This study surveys various data classification and data clustering techniques in order to identify those that provide the best outlier detection, and it compares the classification and clustering techniques used for outlier prediction.
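As a rough illustration of the clustering-based approach mentioned above, the following minimal Python sketch clusters points with a plain k-means and then flags points that lie unusually far from their nearest centroid. It is not taken from the review: the function names `kmeans` and `flag_outliers`, the sample points, the caller-supplied starting centroids, and the threshold rule (median distance plus three times the median absolute deviation) are all assumptions chosen for illustration.

```python
import math
import statistics


def kmeans(points, centroids, iterations=10):
    """Lloyd's k-means, started from caller-supplied centroids (an assumption)."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        # Assignment step: group each point with its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else old
            for members, old in zip(clusters, centroids)
        ]
    return centroids


def flag_outliers(points, centroids, k_mad=3.0):
    """Flag points whose distance to the nearest centroid exceeds
    median + k_mad * MAD (an illustrative threshold, not the review's)."""
    dists = [min(math.dist(p, c) for c in centroids) for p in points]
    med = statistics.median(dists)
    mad = statistics.median(abs(d - med) for d in dists)
    threshold = med + k_mad * mad
    return [p for p, d in zip(points, dists) if d > threshold]


# Hypothetical data: two tight groups plus one point far from both.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (1.1, 1.3), (0.9, 0.9),
    (8.0, 8.0), (8.2, 7.9), (7.9, 8.3), (8.1, 8.1), (7.8, 7.7),
    (15.0, 1.0),
]
centroids = kmeans(points, centroids=[(1.0, 1.0), (8.0, 8.0)])
outliers = flag_outliers(points, centroids)
print(outliers)
```

Note that the far point pulls its cluster's centroid toward itself, which is one reason clustering-based detectors are sensitive to the choice of threshold and the number of clusters.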
