Outliers in biometrical data: What's old, What's new

Huge amounts of data are now gathered in the biometric area, e.g., sequences of DNA code, graphical images for recognition or authorisation of subjects, video monitoring, and data from clinical trials or health care. Outliers are observations that are discordant with the model describing the data. An outlier may be caused by a gross error; alternatively, an outlier (or a group of outliers) may reflect phenomena not accounted for in the assumed model. This paper presents a subjective survey of some methods for detecting outliers or anomalies in multivariate data. The methods are viewed from a historical perspective.
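
To make "discordant with the model" concrete, the sketch below flags multivariate observations by their squared Mahalanobis distance from the sample mean. It is an illustration only and is not claimed to be one of the methods surveyed in the paper. The assumed model is a bivariate normal, the data are synthetic, and the function name `squared_mahalanobis`, the planted point at (8, 8), and the 0.975 cutoff level are all assumptions made for this example. Note that the classical mean and covariance used here are themselves sensitive to outliers, so groups of outliers can mask each other.

```python
import numpy as np


def squared_mahalanobis(X):
    """Squared Mahalanobis distance of each row from the sample mean.

    Uses the classical (non-robust) sample mean and covariance.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # Solve cov @ z = centered.T instead of forming the inverse explicitly.
    solved = np.linalg.solve(cov, centered.T).T
    return np.einsum("ij,ij->i", centered, solved)


# Synthetic data (assumption): 200 bivariate standard-normal points
# plus one planted gross error at (8, 8).
rng = np.random.default_rng(0)
clean = rng.standard_normal((200, 2))
X = np.vstack([clean, [[8.0, 8.0]]])

d2 = squared_mahalanobis(X)

# For 2 degrees of freedom the chi-square CDF is 1 - exp(-x / 2),
# so the 0.975 quantile has the closed form -2 * log(1 - 0.975).
cutoff = -2.0 * np.log(1.0 - 0.975)
flagged = d2 > cutoff

print("indices flagged as outliers:", np.flatnonzero(flagged))
```

With a cutoff at the 0.975 quantile, a few clean points are expected to be flagged as well, so a flag is a prompt for inspection rather than proof of a gross error.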
