Efficient and fast estimation of the geometric median in Hilbert spaces with an averaged stochastic gradient algorithm

With the progress of measurement apparatus and the development of automatic sensors, it is no longer unusual to obtain thousands of observations taking values in high-dimensional spaces such as functional spaces. In such large samples of high-dimensional data, outlying curves may not be uncommon, and even a few individuals may corrupt simple statistical indicators such as the mean trajectory. We focus here on the estimation of the geometric median, which is a direct generalization of the real median and has nice robustness properties. Since the geometric median is defined as the minimizer of a simple convex functional that is differentiable everywhere when the distribution has no atoms, it can be estimated with online gradient algorithms. Such algorithms are very fast and can deal with large samples. Furthermore, they can be updated simply when the data arrive sequentially. We state the almost sure consistency and the L2 rates of convergence of the stochastic gradient estimator, as well as the asymptotic normality of its averaged version. The asymptotic distribution of the averaged version of the algorithm is the same as that of the classic estimators, which are based on the minimization of the empirical loss function. The performance of our averaged sequential estimator, in terms of both computation speed and accuracy of the estimates, is evaluated with a small simulation study. Our approach is also illustrated on a sample of more than 5000 individual television audiences measured every second over a period of 24 hours.
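As an illustration of the kind of online procedure described above, the following is a minimal sketch of a stochastic gradient recursion for the geometric median together with a running average of its iterates. It is not taken from the paper: the function name `averaged_sgd_median`, the step sizes `c / n**alpha` with the defaults `c=1.0` and `alpha=0.75`, and the initialisation at the first observation are all assumptions made for this example. Observations are represented as finite-dimensional vectors (for instance, discretised curves).

```python
import math
import random


def averaged_sgd_median(samples, c=1.0, alpha=0.75):
    """One pass over `samples`, an iterable of equal-length float sequences.

    Returns (z, z_bar): the last stochastic gradient iterate and the
    running average of all iterates. The step size c / n**alpha is an
    assumed choice for this sketch, not a value taken from the source.
    """
    it = iter(samples)
    z = list(next(it))      # assumed initialisation: the first observation
    z_bar = list(z)         # average of the iterates seen so far
    for n, x in enumerate(it, start=1):
        diff = [xi - zi for xi, zi in zip(x, z)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0.0:
            # Move a distance c / n**alpha along the unit vector from z to x,
            # i.e. along minus the gradient of z -> ||x - z||.
            step = c / n ** alpha
            z = [zi + step * d / norm for zi, d in zip(z, diff)]
        # Recursive update of the mean of the n + 1 iterates z_0, ..., z_n.
        z_bar = [b + (zi - b) / (n + 1) for b, zi in zip(z_bar, z)]
    return z, z_bar
```

Each observation is used once and then discarded, so the cost per update is linear in the dimension and the estimate can be refreshed as new data arrive. The function can be called on any iterable of vectors, for example `z, z_bar = averaged_sgd_median(curves)`, where `curves` is a hypothetical list of discretised trajectories.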
