Neural network adaptive wavelets for signal representation and classification

Methods are presented for adaptively generating wavelet templates for signal representation and classification using neural networks. Representation and classification require different network structures and energy functions, and both are given. The idea of a "super-wavelet" is introduced: a linear combination of wavelets that is itself treated as a wavelet. The super-wavelet allows the shape of the wavelet to adapt to a particular problem, which goes beyond adapting the parameters of a fixed-shape wavelet. Simulations are given for 1-D signals, and the concepts are extendable to imagery. Ideas are discussed for applying these concepts to phoneme and speaker recognition.
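The super-wavelet idea can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the Mexican-hat mother wavelet, the fixed scales and shifts, the mean-squared-error energy function, and the function names `mother_wavelet`, `super_wavelet`, and `fit_weights`. Only the combination weights are trained here, by plain gradient descent, so the shape of the combined wavelet adapts to a 1-D target signal.

```python
import math

def mother_wavelet(t):
    """Mexican-hat (Ricker) wavelet; an assumed choice, not the paper's."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)

def super_wavelet(t, weights, scales, shifts):
    """Linear combination of dilated, shifted copies of the mother wavelet."""
    return sum(w * mother_wavelet((t - b) / a)
               for w, a, b in zip(weights, scales, shifts))

def mse(ts, signal, weights, scales, shifts):
    """Assumed representation energy: mean squared reconstruction error."""
    return sum((super_wavelet(t, weights, scales, shifts) - s) ** 2
               for t, s in zip(ts, signal)) / len(ts)

def fit_weights(ts, signal, scales, shifts, lr=0.5, epochs=2000):
    """Gradient descent on the MSE with respect to the weights only."""
    n = len(ts)
    # Precompute each component wavelet sampled on the time grid.
    basis = [[mother_wavelet((t - b) / a) for t in ts]
             for a, b in zip(scales, shifts)]
    weights = [0.0] * len(scales)
    for _ in range(epochs):
        residual = [sum(w * row[i] for w, row in zip(weights, basis)) - signal[i]
                    for i in range(n)]
        grads = [2.0 / n * sum(r * phi for r, phi in zip(residual, row))
                 for row in basis]
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights

# Synthetic 1-D example: the target is itself a super-wavelet with known weights.
scales = [0.5, 1.0, 2.0]
shifts = [-2.0, 0.0, 2.0]
true_weights = [0.8, -1.2, 0.5]
ts = [-5.0 + 10.0 * i / 200 for i in range(201)]
target = [super_wavelet(t, true_weights, scales, shifts) for t in ts]

fitted = fit_weights(ts, target, scales, shifts)
print("fitted weights:", [round(w, 4) for w in fitted])
print("final energy:", mse(ts, target, fitted, scales, shifts))
```

Because the energy is quadratic in the weights, a least-squares solve would find them directly; gradient descent is used in this sketch only because it mirrors how a network would be trained. Adapting the scales and shifts as well, or replacing the energy with one suited to classification, would need additional gradient terms that are not shown.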
