On Low-Risk Heavy Hitters and Sparse Recovery Schemes

We study the heavy hitters problem and related sparse recovery problems in the low-failure-probability regime. This regime is not well understood and has previously been studied only for non-adaptive schemes. The main prior work is that of Gilbert et al. (ICALP'13) on sparse recovery. We identify an error in their analysis, improve their results, and contribute new non-adaptive and adaptive sparse recovery algorithms. We also provide upper and lower bounds for the heavy hitters problem with low failure probability.

[1]  Ely Porat,et al.  For-All Sparse Recovery in Near-Optimal Time , 2014, ACM Trans. Algorithms.

[2]  Atri Rudra,et al.  Efficiently Decodable Error-Correcting List Disjunct Matrices and Applications - (Extended Abstract) , 2011, ICALP.

[3]  Graham Cormode,et al.  An improved data stream summary: the count-min sketch and its applications , 2004, J. Algorithms.

[4]  Daniel A. Spielman,et al.  Linear-time encodable and decodable error-correcting codes , 1995, STOC '95.

[5]  Devdatt P. Dubhashi,et al.  Balls and bins: A study in negative dependence , 1996, Random Struct. Algorithms.

[6]  David P. Woodruff,et al.  (1 + eps)-Approximate Sparse Recovery , 2011, 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science.

[7]  Vladimir Braverman,et al.  Clustering High Dimensional Dynamic Data Streams , 2017, ICML.

[8]  David P. Woodruff,et al.  On Deterministic Sketching and Streaming for Sparse Recovery and Norm Estimation , 2012, APPROX-RANDOM.

[9]  Subhankar Ghosh,et al.  Concentration of measures via size-biased couplings , 2009, 0906.3886.

[10]  Moses Charikar,et al.  Finding frequent items in data streams , 2002, Theor. Comput. Sci.

[11]  Ely Porat,et al.  Sublinear time, measurement-optimal, sparse recovery for all , 2012, SODA.

[12]  Mikkel Thorup,et al.  Heavy Hitters via Cluster-Preserving Clustering , 2016, 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS).

[13]  David P. Woodruff,et al.  (1 + eps)-Approximate Sparse Recovery , 2011 .

[14]  Atri Rudra,et al.  ℓ2/ℓ2-Foreach Sparse Recovery with Low Risk , 2013, ICALP.

[15]  Christian Sohler,et al.  Coresets in dynamic geometric data streams , 2005, STOC '05.

[16]  Joel A. Tropp,et al.  Algorithmic linear dimension reduction in the l_1 norm for sparse vectors , 2006, ArXiv.

[17]  David P. Woodruff,et al.  On deterministic sketching and streaming for sparse recovery and norm estimation , 2014 .

[18]  Piotr Indyk,et al.  Compressive sensing with local geometric features , 2011, SoCG '11.

[19]  Zeyuan Allen Zhu,et al.  Restricted Isometry Property for General p-Norms , 2016, IEEE Trans. Inf. Theory.

[20]  Ely Porat,et al.  Approximate sparse recovery: optimizing time and measurements , 2009, STOC '10.

[21]  T. S. Jayram,et al.  Open Problems in Data Streams and Related Topics: IITK Workshop on Algorithms for Data Streams '06 , 2007 .

[22]  Hossein Jowhari,et al.  Tight bounds for Lp samplers, finding duplicates in streams, and related problems , 2010, PODS.

[23]  Eric Price,et al.  Efficient sketches for the set query problem , 2010, SODA '11.

[24]  Piotr Indyk,et al.  Algorithms for dynamic geometric problems over data streams , 2004, STOC '04.

[25]  R. D. Gordon.  Values of Mills' Ratio of Area to Bounding Ordinate and of the Normal Probability Integral for Large Values of the Argument , 1941 .

[26]  David P. Woodruff,et al.  On the Power of Adaptivity in Sparse Recovery , 2011, 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science.

[27]  Venkatesan Guruswami,et al.  Unbalanced expanders and randomness extractors from Parvaresh-Vardy codes , 2008 .

[28]  Pablo A. Parrilo,et al.  The Convex Geometry of Linear Inverse Problems , 2010, Foundations of Computational Mathematics.

[29]  Zeyuan Allen Zhu,et al.  Restricted Isometry Property for General p-Norms , 2014, IEEE Transactions on Information Theory.

[30]  M. Talagrand,et al.  Probability in Banach spaces , 1991 .

[31]  David P. Woodruff,et al.  Fast moment estimation in data streams in optimal space , 2010, STOC '11.

[32]  Sumit Ganguly,et al.  CR-precis: A Deterministic Summary Structure for Update Data Streams , 2006, ESCAPE.

[33]  David P. Woodruff,et al.  Lower bounds for sparse recovery , 2010, SODA '10.

[34]  Sumit Ganguly,et al.  Data Stream Algorithms via Expander Graphs , 2008, ISAAC.

[35]  David P. Woodruff,et al.  Improved Algorithms for Adaptive Compressed Sensing , 2018, ICALP.