Sharp bounds for population recovery

The population recovery problem is a basic problem in noisy unsupervised learning that has attracted significant research attention in recent years [WY12, DRWY12, MS13, BIMP13, LZ15, DST16]. A number of different variants of this problem have been studied, often under assumptions on the unknown distribution (such as that it has restricted support size). In this work we study the sample complexity and algorithmic complexity of the most general version of the problem, under both the bit-flip noise and erasure noise models. We give essentially matching upper and lower sample complexity bounds for both noise models, and efficient algorithms matching these sample complexity bounds up to polynomial factors.
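To make the two noise models concrete, here is a minimal Python sketch of how noisy samples could be generated. It assumes the usual formulation, which the abstract does not spell out: a string is drawn from an unknown distribution over {0,1}^n, and then each coordinate is corrupted independently. The function names, the parameters `flip_prob` and `erase_prob`, and the use of `'?'` as the erasure symbol are illustrative choices, not notation from the paper.

```python
import random

def bit_flip(x, flip_prob, rng):
    """Bit-flip noise: flip each bit independently with probability flip_prob."""
    return [b ^ 1 if rng.random() < flip_prob else b for b in x]

def erase(x, erase_prob, rng):
    """Erasure noise: replace each bit independently by '?' with probability erase_prob."""
    return ['?' if rng.random() < erase_prob else b for b in x]

def noisy_samples(support, weights, noise, noise_prob, m, seed=0):
    """Draw m strings from the (hypothetical) distribution given by
    support/weights, then pass each one through the chosen noise channel."""
    rng = random.Random(seed)
    draws = rng.choices(support, weights=weights, k=m)
    return [noise(x, noise_prob, rng) for x in draws]

# Hypothetical distribution over {0,1}^3 with two support points.
support = [[0, 0, 0], [1, 1, 0]]
weights = [0.7, 0.3]
flipped = noisy_samples(support, weights, bit_flip, 0.1, m=5)
erased = noisy_samples(support, weights, erase, 0.5, m=5)
```

The learner sees only the corrupted strings (`flipped` or `erased`) and must estimate the weights of the underlying distribution from them; the sample complexity bounds concern how large `m` must be for this to be possible.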

[1]  Ananda Theertha Suresh,et al.  Sample complexity of population recovery , 2017, COLT.

[2]  Tamás Erdélyi,et al.  Littlewood-Type Problems on Subarcs of the Unit Circle , 1997 .

[3]  Avi Wigderson,et al.  Population recovery and partial identification , 2012, 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science.

[4]  Michael E. Saks,et al.  A Polynomial Time Algorithm for Lossy Population Recovery , 2013, 2013 IEEE 54th Annual Symposium on Foundations of Computer Science.

[5]  A. Paul,et al.  Pacific Journal of Mathematics , 1999 .

[6]  Shachar Lovett,et al.  Improved Noisy Population Recovery, and Reverse Bonami-Beckner Inequality for Sparse Functions , 2014, Electron. Colloquium Comput. Complex..

[7]  B. M. Fulk MATH , 1992 .

[8]  Leonid A. Levin,et al.  A hard-core predicate for all one-way functions , 1989, STOC '89.

[9]  Michael E. Saks,et al.  Noisy Population Recovery in Polynomial Time , 2016, 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS).

[10]  V. Rich Personal communication , 1989, Nature.

[11]  Tamás Erdélyi,et al.  Littlewood‐Type Problems on [0,1] , 1999 .

[12]  T. Erdélyi,et al.  Coppersmith–Rivlin type inequalities and the order of vanishing of polynomials at 1 , 2014, 1406.2560.

[13]  Richard Bellman,et al.  Recurrence times for the Ehrenfest model. , 1951 .

[14]  Yuval Peres,et al.  Trace reconstruction with exp(O(n^{1/3})) samples , 2017, STOC.

[15]  Ryan O'Donnell,et al.  Optimal mean-based algorithms for trace reconstruction , 2017, STOC.

[16]  Russell Impagliazzo,et al.  Finding Heavy Hitters from Lossy or Noisy Data , 2013, APPROX-RANDOM.

[17]  T. Sanders,et al.  Analysis of Boolean Functions , 2012, ArXiv.

[18]  Avi Wigderson,et al.  Restriction access , 2012, ITCS '12.