Minimum Data Length for Integer Period Estimation

Detecting periodicity in a short sequence is an important problem with many applications across science and engineering. Several efficient algorithms have been proposed for it over the years, and a wide choice is available today in terms of the tradeoff between algorithmic complexity and estimation accuracy. In spite of this rich history, one aspect of period estimation has received very little attention from a fundamental perspective: given a discrete-time periodic signal and a list of candidate integer periods, what is the absolute minimum data length required to estimate its integer period? The answer we seek must be a fundamental bound, i.e., independent of any particular period estimation technique. Common intuition suggests that the minimum data length is twice the largest expected period. However, this is true only in certain special contexts. This paper derives the exact necessary and sufficient bounds for this problem. The question is also extended to mixtures of periodic signals. First, a careful mathematical formulation of the unique identifiability of the component periods (hidden integer periods) is presented. Here too, a rigorous theoretical framework is missing in the existing literature, yet it is a necessary platform for deriving precise bounds on the minimum data length. The bounds given here are generic, that is, independent of the algorithms used. Specific algorithm-dependent bounds are also presented at the end for the dictionary-based integer period estimation methods reported in recent years.
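
As a toy illustration of the question posed above (and not the paper's derivation), the following Python sketch searches by brute force for short binary segments that are consistent with two candidate periods at once, without already repeating at their greatest common divisor. Such a segment cannot tell the two candidates apart. The function names, the binary alphabet, and the search limit are assumptions made for this example only. The search is guaranteed to stop finding such segments once the length reaches P1 + P2 - gcd(P1, P2), by the classical periodicity theorem of Fine and Wilf.

```python
from itertools import product
from math import gcd


def has_period(x, p):
    """True if x[n] == x[n + p] for every n where both samples exist."""
    return all(x[n] == x[n + p] for n in range(len(x) - p))


def ambiguous_segment_exists(p1, p2, length, alphabet=(0, 1)):
    """Look for a segment consistent with both p1 and p2 but not with gcd(p1, p2)."""
    g = gcd(p1, p2)
    for x in product(alphabet, repeat=length):
        if has_period(x, p1) and has_period(x, p2) and not has_period(x, g):
            return True
    return False


def longest_ambiguous_length(p1, p2, max_length=None):
    """Largest length (up to max_length) at which an ambiguous segment exists.

    max_length defaults to p1 + p2, an illustrative search limit that is
    large enough for the two-period case examined here.
    """
    if max_length is None:
        max_length = p1 + p2
    longest = 0
    for length in range(1, max_length + 1):
        if ambiguous_segment_exists(p1, p2, length):
            longest = length
    return longest


if __name__ == "__main__":
    for p1, p2 in [(3, 5), (4, 6)]:
        print(p1, p2, longest_ambiguous_length(p1, p2))
```

For candidate periods 3 and 5 the longest ambiguous segment found has 6 samples, for example (0, 1, 0, 0, 1, 0), so 7 samples are enough to separate this pair, fewer than twice the largest candidate period. The search is exponential in the segment length and is meant only for small examples.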
