Automatic Outlier Detection: A Bayesian Approach

Reliable autonomous control in advanced robotic systems such as entertainment robots, assistive robots, humanoid robots, and autonomous vehicles requires sensory data that is either highly reliable or accompanied by a measure of its reliability. Bayesian statistics offers a principled route to such robust sensory pre-processing. In this paper, we introduce a Bayesian treatment of outlier-contaminated sensory data and develop a "black box" approach that removes outliers in real time and expresses confidence in the estimated data. We develop our approach in the framework of Bayesian linear regression with heteroscedastic noise: every measured data point is assumed to have its own variance, and the final estimate is obtained by a weighted regression over the observed data. An expectation-maximization (EM) algorithm allows us to estimate the variance of each data point incrementally. Apart from a time horizon (window size) over which the estimation process is averaged, no open parameters need to be tuned, and no special assumptions about the generative structure of the data are required. The algorithm runs efficiently in real time. We evaluate our method on synthetic data and on a pose-estimation problem for a quadruped robot, demonstrating its ease of use, its competitiveness with well-tuned alternative algorithms, and its advantages in robust outlier removal.
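The core idea of heteroscedastic weighted regression with per-point variances can be illustrated with a minimal batch sketch. This is not the paper's exact incremental update; the EM schedule, the variance update rule (squared residual plus a small prior variance), and all parameter names below are simplifying assumptions for illustration.

```python
import numpy as np

def robust_weighted_fit(X, y, n_iter=20, prior_var=1e-2):
    """EM-style weighted linear regression in which each observation
    carries its own noise variance (heteroscedastic model).
    Illustrative sketch only: the update rules are simplified, not a
    reproduction of the paper's incremental derivation."""
    n, d = X.shape
    var = np.ones(n)                       # per-point variance estimates
    beta = np.zeros(d)
    for _ in range(n_iter):
        w = 1.0 / var                      # precision weights for each point
        W = np.diag(w)
        # Weighted least squares for the regression coefficients;
        # a tiny ridge term keeps the normal equations well conditioned.
        beta = np.linalg.solve(X.T @ W @ X + 1e-8 * np.eye(d), X.T @ W @ y)
        resid = y - X @ beta
        # Each point's variance tracks its squared residual, regularized
        # by a small prior so that well-fit points keep finite weight.
        var = resid ** 2 + prior_var
    return beta, var
```

Points with large residuals acquire large variance estimates and therefore small weights, so outliers are automatically down-weighted rather than removed by a hard threshold; the final `var` can also serve as a per-point confidence measure.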
