Neural network adaptive wavelets for signal representation and classification

Methods are presented for adaptively generating wavelet templates for signal representation and classification using neural networks. Representation and classification require different network structures and energy functions, and both are given. The idea of a "super-wavelet" is introduced: a linear combination of wavelets that is itself treated as a wavelet. The super-wavelet allows the shape of the wavelet to adapt to a particular problem, which goes beyond adapting the parameters of a fixed-shape wavelet. Simulations are given for 1-D signals, and the concepts are extendable to imagery. Ideas are discussed for applying these concepts to phoneme and speaker recognition.
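The super-wavelet idea can be illustrated with a minimal sketch. The abstract does not specify a mother wavelet, a normalization, or parameter values, so everything named here is an assumption made for illustration: the Mexican-hat (Ricker) mother wavelet, the 1/sqrt(a) scaling, and the hand-picked `weights`, `shifts`, and `scales`. In the paper these quantities would be adapted by a neural network; here they are simply fixed.

```python
import math


def mexican_hat(t):
    """Unnormalized Mexican-hat (Ricker) mother wavelet (an assumed choice)."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def super_wavelet(t, weights, shifts, scales):
    """Evaluate a weighted sum of shifted, dilated copies of the mother wavelet.

    Each term is w * h((t - b) / a) / sqrt(a), with weight w, shift b, scale a.
    """
    return sum(
        w * mexican_hat((t - b) / a) / math.sqrt(a)
        for w, b, a in zip(weights, shifts, scales)
    )


# Hypothetical, hand-picked parameters (not taken from the paper).
weights = [1.0, -0.5, 0.25]
shifts = [0.0, 1.0, -2.0]
scales = [1.0, 0.5, 2.0]

# Sample the super-wavelet on a grid wide enough that the tails are negligible.
step = 0.01
ts = [-20.0 + step * i for i in range(4001)]
samples = [super_wavelet(t, weights, shifts, scales) for t in ts]

# Riemann-sum estimate of the integral over the grid.
area = sum(samples) * step
```

Because each Mexican-hat term integrates to zero, so does any linear combination of them, and `area` should come out numerically close to zero. That zero-mean property is one reason a combination like this can plausibly be treated as a wavelet in its own right, while changing `weights` changes its shape rather than only its position and width.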
