Auditory-motivated Gammatone wavelet transform

The ability of the continuous wavelet transform (CWT) to provide good time and frequency localization has made it a popular tool in time-frequency analysis of signals. Wavelets exhibit the constant-Q property, which is also possessed by the basilar membrane filters in the peripheral auditory system. The basilar membrane filters, or auditory filters, are often modeled by a Gammatone function, which provides a good approximation to experimentally determined responses. The filterbank derived from these filters is referred to as a Gammatone filterbank. In general, wavelet analysis can be likened to a filterbank analysis, which suggests a natural link between standard wavelet analysis and the Gammatone filterbank. However, the Gammatone function does not strictly qualify as a wavelet because its time average is not zero. We show how bona fide wavelets can be constructed out of Gammatone functions. We analyze properties such as admissibility, time-bandwidth product, and vanishing moments, which are particularly relevant in the context of wavelets. We also show how the proposed auditory wavelets are produced as the impulse response of a linear, shift-invariant system governed by a linear differential equation with constant coefficients. We propose analog circuit implementations of the resulting CWT. We also show how the Gammatone-derived wavelets can be used for singularity detection and time-frequency analysis of transient signals.
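The zero-mean obstacle mentioned above can be checked numerically. The following is a minimal sketch, not the construction used in the paper: it assumes the common Gammatone form t^(n-1) exp(-2πbt) cos(2πf_c t) with illustrative parameter values (order 4, b = 125 Hz, f_c = 1000 Hz), and it uses the time derivative as one standard way of obtaining a zero-mean function. The function names and the `relative_mean` measure are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def gammatone(t, order=4, b=125.0, fc=1000.0):
    """Gammatone function t^(n-1) exp(-2*pi*b*t) cos(2*pi*fc*t) for t >= 0.

    The parameter values are illustrative assumptions, not taken from the paper.
    """
    a, w = 2 * np.pi * b, 2 * np.pi * fc
    return t ** (order - 1) * np.exp(-a * t) * np.cos(w * t)

def gammatone_derivative(t, order=4, b=125.0, fc=1000.0):
    """Analytic time derivative of the Gammatone function (valid for order >= 2).

    Its integral over [0, inf) equals g(inf) - g(0) = 0, so it has zero mean.
    Differentiation is an assumed example of a zero-mean construction.
    """
    a, w = 2 * np.pi * b, 2 * np.pi * fc
    env = np.exp(-a * t)
    poly = (order - 1) * t ** (order - 2) - a * t ** (order - 1)
    return poly * env * np.cos(w * t) - w * t ** (order - 1) * env * np.sin(w * t)

def relative_mean(x):
    """Sum of the samples divided by the sum of their magnitudes (scale-free)."""
    return np.sum(x) / np.sum(np.abs(x))

if __name__ == "__main__":
    fs = 48000.0                       # sampling rate in Hz (assumption)
    t = np.arange(0.0, 0.1, 1.0 / fs)  # 100 ms covers the decaying envelope
    print("Gammatone relative mean: ", relative_mean(gammatone(t)))
    print("Derivative relative mean:", relative_mean(gammatone_derivative(t)))
```

With these settings the Gammatone function should show a small but clearly nonzero relative mean, while the derivative's relative mean should be zero up to discretization and rounding error, which is the property a wavelet requires.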
