Frame Theory for Signal Processing in Psychoacoustics

This review chapter aims to strengthen the link between frame theory and signal processing tasks in psychoacoustics. On the one hand, the basic concepts of frame theory are presented, and some proofs are provided to explain those concepts in some detail. The goal is to show hearing scientists how this mathematical theory could be relevant to their research. In particular, we focus on frame theory in a filter bank approach, which is probably the most relevant viewpoint for audio signal processing. On the other hand, basic psychoacoustic concepts are presented to encourage mathematicians to apply their knowledge in this field.
