Improved Real-Time Monophonic Pitch Tracking with the Extended Complex Kalman Filter

This paper proposes a real-time, sample-by-sample pitch tracker for monophonic audio signals based on the Extended Kalman Filter in the complex domain, the Extended Complex Kalman Filter (ECKF). It improves upon the algorithm proposed by the same authors in a previous paper [1], which was slow to track rapid note changes. The improved tracker detects harmonic change in the signal and resets the filter whenever the change is significant. Along with the fundamental frequency, the ECKF tracks the amplitude envelope and instantaneous phase of the input audio signal. The pitch tracker is well suited to detecting ornaments in solo instrument music, such as slides and vibratos. The improved algorithm is evaluated on bowed string (double bass), plucked string (guitar), and singing voice samples.
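
The abstract does not spell out the state vector, so the following is only a minimal sketch of how a single complex state can carry frequency, amplitude, and phase together. It assumes a three-state sinusoid parameterization often used with complex Kalman frequency estimators, for a signal y_k = a cos(2πfk/fs + φ). All function names are hypothetical. The sketch covers the noiseless signal model and the readout only, not the Kalman update or the harmonic-change reset.

```python
import cmath
import math

def initial_state(freq_hz, amp, phase, fs):
    """Assumed state [x1, x2, x3] at k = 0 for y_k = amp*cos(2*pi*freq_hz*k/fs + phase)."""
    x1 = cmath.exp(1j * 2 * math.pi * freq_hz / fs)  # one-sample phase rotation
    x2 = amp * cmath.exp(1j * phase)                 # positive-frequency phasor
    x3 = amp * cmath.exp(-1j * phase)                # its complex conjugate
    return [x1, x2, x3]

def step(state):
    """Nonlinear state transition: rotate x2 forward and x3 backward by one sample."""
    x1, x2, x3 = state
    return [x1, x1 * x2, x3 / x1]

def observe(state):
    """Real-valued sample implied by the state: the mean of the two phasors."""
    _, x2, x3 = state
    return (0.5 * (x2 + x3)).real

def readout(state, fs):
    """Recover (frequency in Hz, amplitude, instantaneous phase in radians)."""
    x1, x2, _ = state
    return cmath.phase(x1) * fs / (2 * math.pi), abs(x2), cmath.phase(x2)

fs = 44100.0
state = initial_state(220.0, 0.5, 0.0, fs)
samples = []
for k in range(100):
    samples.append(observe(state))
    state = step(state)
freq, amp, phase = readout(state, fs)
```

In a full filter, the state would be corrected at every sample from the prediction error between `observe(state)` and the incoming audio sample, which is what makes a per-sample readout of pitch, envelope, and phase possible.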

[1] A. Gray, et al. A spectral-flatness measure for studying the autocorrelation method of linear prediction of speech analysis, 1974.

[2] M. Boutayeb, et al. Convergence analysis of the extended Kalman filter used as an observer for nonlinear deterministic discrete-time systems, 1997, IEEE Trans. Autom. Control.

[3] Julius O. Smith, et al. Real-Time Pitch Tracking in Audio Signals with the Extended Complex Kalman Filter, 2017.

[4] P. Welch. The use of fast Fourier transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms, 1967.

[5] Ronald A. Cole, et al. Pitch detection with a neural-net classifier, 1991, IEEE Trans. Signal Process.

[6] Hideki Kawahara, et al. YIN, a fundamental frequency estimator for speech and music, 2002, The Journal of the Acoustical Society of America.

[7] Max A. Little, et al. A Kalman-based fundamental frequency estimation algorithm, 2017, 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).

[8] David Gerhard, et al. Pitch Extraction and Fundamental Frequency: History and Current Techniques, 2003.

[9] Gabriel A. Terejanu, et al. Extended Kalman Filter Tutorial, 2009.

[10] Michael I. Jordan, et al. Discriminative training of hidden Markov models for multiple pitch tracking [speech processing examples], 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.

[11] Jong Wook Kim, et al. Crepe: A Convolutional Representation for Pitch Estimation, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[12] Yi-Hsuan Yang, et al. Analysis of Expressive Musical Terms in Violin Using Score-Informed and Expression-Based Audio Features, 2015, ISMIR.

[13] J. Licklider, et al. A duplex theory of pitch perception, 1951, Experientia.

[14] T. Parks, et al. Maximum likelihood pitch estimation, 1976, 1977 IEEE Conference on Decision and Control including the 16th Symposium on Adaptive Processes and A Special Symposium on Fuzzy Set Theory and Applications.

[15] James A. Moorer, et al. On the Transcription of Musical Sound by Computer, 2016.

[16] Bryan Pardo, et al. VocalSet: A Singing Voice Dataset, 2018, ISMIR.

[17] Ganapati Panda, et al. An extended complex Kalman filter for frequency measurement of distorted signals, 2000, 2000 IEEE Power Engineering Society Winter Meeting. Conference Proceedings (Cat. No.00CH37077).

[18] Xavier Serra, et al. A sound analysis/synthesis system based on a deterministic plus stochastic decomposition, 1990.

[19] Craig Stuart Sapp, et al. Efficient Pitch Detection Techniques for Interactive Music, 2001, ICMC.

[20] Simon Dixon, et al. PYIN: A fundamental frequency estimator using probabilistic threshold distributions, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[21] Matti Karjalainen, et al. A computationally efficient multipitch analysis model, 2000, IEEE Trans. Speech Audio Process.