Estimating Noisy Order Statistics

This paper proposes an estimation framework for assessing the performance of sorting over noisy (perturbed) data. Recovery accuracy is measured by the minimum mean square error (MMSE) between the sorting function evaluated on the unperturbed data and an estimator that operates on the sorted noisy data. It is first shown that, under certain symmetry conditions, which are satisfied for example by the practically relevant Gaussian noise perturbation, the optimal estimator can be expressed as a linear combination of estimators on the unsorted data. Two suboptimal estimators are then proposed, and performance guarantees for them are derived relative to the optimal estimator. Finally, some surprising properties of this MMSE are established. For instance, it is shown that the MMSE grows sublinearly with the data size, and that commonly used MMSE lower bounds, such as the Bayesian Cramér-Rao bound and the maximum entropy bound, either cannot be applied or are not suitable.
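The setting can be illustrated with a small Monte Carlo sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the data are i.i.d. standard Gaussian, the noise is independent Gaussian with standard deviation `sigma`, and the two estimators (the sorted noisy data as is, and the sorted noisy data scaled by `1 / (1 + sigma**2)`) are simple stand-ins, not the optimal or suboptimal estimators the paper analyzes. The function name `simulate_sorted_mse` is likewise hypothetical.

```python
import random


def simulate_sorted_mse(n=20, sigma=1.0, trials=2000, seed=0):
    """Estimate total squared error of simple estimators of sorted clean data.

    Assumptions (illustrative only): X_i ~ N(0, 1) i.i.d., Y_i = X_i + sigma * Z_i
    with Z_i ~ N(0, 1) independent of X. The target is sorted(X).
    """
    rng = random.Random(seed)
    # Linear MMSE coefficient for a single unsorted Gaussian sample,
    # reused here as a heuristic on the sorted data.
    shrink = 1.0 / (1.0 + sigma ** 2)
    totals = {"unsorted_naive": 0.0, "sorted_naive": 0.0, "sorted_shrunk": 0.0}
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [xi + rng.gauss(0.0, sigma) for xi in x]
        xs, ys = sorted(x), sorted(y)
        # Baseline: error of Y_i as an estimate of X_i, without sorting.
        totals["unsorted_naive"] += sum((a - b) ** 2 for a, b in zip(x, y))
        # Use the i-th smallest noisy value to estimate the i-th smallest clean value.
        totals["sorted_naive"] += sum((a - b) ** 2 for a, b in zip(xs, ys))
        # Same, after shrinking the sorted noisy values toward zero.
        totals["sorted_shrunk"] += sum((a - shrink * b) ** 2 for a, b in zip(xs, ys))
    # Average total squared error (summed over all n order statistics) per trial.
    return {name: total / trials for name, total in totals.items()}


result = simulate_sorted_mse()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

By the rearrangement inequality, the sorted naive error never exceeds the unsorted naive error on any sample, so the first two printed values are always ordered that way. Rerunning with different `n` gives a rough empirical view of how the total error scales with the data size under these assumptions.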
