Paper information - Minimum-latency Time-frequency Analysis Using Asymmetric Window Functions

Minimum-latency Time-frequency Analysis Using Asymmetric Window Functions

We study the real-time retrieval of dynamics from a time series via time-frequency (TF) analysis with a minimal-latency guarantee. We provide a rigorous definition of intrinsic latency, distinct from the well-known definition used in filter design, for several time-frequency representations (TFRs), including the short-time Fourier transform (STFT), the synchrosqueezing transform (SST), and the reassignment method (RM). To achieve minimal latency, we propose a systematic method that constructs an asymmetric window from a well-designed symmetric one, based on the concept of minimum phase, provided the window satisfies some weak conditions. We show theoretically that the TFR determined by SST with the constructed asymmetric window has a smaller intrinsic latency. Finally, we study the music onset detection problem to demonstrate the strength of the proposed algorithm.
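The abstract does not spell out the window construction, so the following is only a minimal sketch of one standard way to obtain a minimum-phase counterpart of a symmetric window: the homomorphic (cepstral) method, which keeps the magnitude spectrum and replaces the linear phase with a minimum phase. The function names `minimum_phase_window` and `energy_centroid`, the FFT padding factor, and the `eps` floor are assumptions made for illustration and are not taken from the paper; the energy centroid is used here only as a rough proxy for how early the window's energy arrives, not as the paper's definition of intrinsic latency.

```python
import numpy as np


def minimum_phase_window(w, n_fft=None, eps=1e-12):
    """Return a minimum-phase sequence with (approximately) the same
    magnitude spectrum as the symmetric window `w`.

    Hypothetical helper using the standard homomorphic (cepstral) method.
    `n_fft` must be even and much larger than len(w); `eps` floors the
    magnitude so the logarithm stays finite near spectral nulls.
    """
    w = np.asarray(w, dtype=float)
    n = len(w)
    if n_fft is None:
        # Heavy zero-padding reduces aliasing of the cepstrum.
        n_fft = 2 ** int(np.ceil(np.log2(8 * n)))

    # Log-magnitude spectrum of the original window.
    log_mag = np.log(np.maximum(np.abs(np.fft.fft(w, n_fft)), eps))

    # Real cepstrum (real and even, because log_mag is real and even).
    cep = np.fft.ifft(log_mag).real

    # Fold the cepstrum onto non-negative quefrencies: a causal cepstrum
    # corresponds to a minimum-phase sequence.
    fold = np.zeros(n_fft)
    fold[0] = 1.0
    fold[1:n_fft // 2] = 2.0
    fold[n_fft // 2] = 1.0

    # Back to the time domain and truncate to the original length.
    min_phase_spec = np.exp(np.fft.fft(cep * fold))
    return np.fft.ifft(min_phase_spec).real[:n]


def energy_centroid(w):
    """Energy-weighted mean sample index of a window."""
    w = np.asarray(w, dtype=float)
    idx = np.arange(len(w))
    return float(np.sum(idx * w ** 2) / np.sum(w ** 2))


if __name__ == "__main__":
    w_sym = np.hamming(65)                 # symmetric window, peak at the center
    w_asym = minimum_phase_window(w_sym)   # energy moved toward the start
    print("symmetric centroid: ", energy_centroid(w_sym))
    print("asymmetric centroid:", energy_centroid(w_asym))
```

Because a minimum-phase sequence concentrates its energy as early as possible among all sequences sharing its magnitude spectrum, the resulting window is asymmetric with its weight shifted toward the first samples. Windows whose spectra have exact nulls on the unit circle are only handled approximately by the `eps` floor in this sketch.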

Li Su | Hau-Tieng Wu
