Shift Finding in Sub-Linear Time

We study the following basic pattern matching problem. Consider a "code" sequence c consisting of n bits chosen uniformly at random, and a "signal" sequence x obtained by shifting c (modulo n) and adding noise. The goal is to efficiently recover the shift with high probability. The problem models tasks of interest in several applications, including GPS synchronization and motion estimation. We present an algorithm that solves the problem in time O(n^(f/(1+f))), where O(N^f) is the running time of the best algorithm for finding the closest pair among N "random" sequences of length O(log N). The trivial bound of f = 2 leads to a simple algorithm with a running time of O(n^(2/3)). The asymptotic running time can be further improved by plugging in recent, more efficient algorithms for the closest pair problem. Our results also yield a sub-linear time algorithm for approximate pattern matching on a random signal (text), even when the error between the signal and the code (pattern) is asymptotically as large as the code size. This is the first sub-linear time algorithm for such error rates.
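
To make the problem statement concrete, the following minimal Python sketch builds one instance and recovers the shift with an exhaustive O(n^2) search. It is only a baseline for illustration, not the sub-linear algorithm of the paper. The function names, the convention x[i] = c[(i + shift) mod n], and the independent bit-flip noise model are assumptions chosen for this example.

```python
import random


def make_instance(n, shift, noise_rate, seed=0):
    """Build a hypothetical instance: a random n-bit code and a noisy shifted copy.

    Assumed convention: signal[i] = code[(i + shift) % n], with each bit
    flipped independently with probability noise_rate.
    """
    rng = random.Random(seed)
    code = [rng.randrange(2) for _ in range(n)]
    signal = [
        code[(i + shift) % n] ^ int(rng.random() < noise_rate)
        for i in range(n)
    ]
    return code, signal


def find_shift_bruteforce(code, signal):
    """Return the cyclic shift that minimizes Hamming distance to the signal.

    Baseline only: tries all n shifts and compares all n positions,
    so it runs in O(n^2) time.
    """
    n = len(code)
    best_shift, best_dist = 0, n + 1
    for t in range(n):
        dist = sum(code[(i + t) % n] != signal[i] for i in range(n))
        if dist < best_dist:
            best_shift, best_dist = t, dist
    return best_shift


if __name__ == "__main__":
    code, signal = make_instance(n=256, shift=37, noise_rate=0.1)
    print(find_shift_bruteforce(code, signal))
```

For a random code, an incorrect shift disagrees with the signal on roughly half of the positions, while the correct shift disagrees only on the noisy positions. That gap is what allows the correct shift to be identified, and a sub-linear algorithm has to detect it without reading all of c and x.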
