The successive mean quantization transform

This paper presents the successive mean quantization transform (SMQT). The transform reveals the organization or structure of the data and removes properties such as gain and bias. The paper describes the transform and applies it to speech processing and image processing. In speech recognition, the SMQT is considered as an additional processing step for the commonly used mel-frequency cepstral coefficients. In image processing, the transform is applied to automatic image enhancement and dynamic range compression.
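The abstract does not spell out the algorithm, so the following is only a minimal sketch, assuming the commonly described formulation: at each level the data in a subset are compared with the subset mean, values above the mean receive a 1 bit, and both halves are split again at the next level. The function name `smqt` and the parameter `levels` are hypothetical names chosen for illustration, not taken from the paper.

```python
import numpy as np

def smqt(values, levels=8):
    """Illustrative successive mean quantization with `levels` levels.

    Assumption: bit l of the output is set when the value lies above
    the mean of the subset it belongs to at level l.
    """
    x = np.asarray(values, dtype=float).ravel()
    out = np.zeros(x.shape, dtype=np.int64)

    def split(idx, level):
        # Stop past the last level or when the subset is empty.
        if level > levels or idx.size == 0:
            return
        mean = x[idx].mean()
        above = x[idx] > mean
        high, low = idx[above], idx[~above]
        # Values above the subset mean get a 1 bit at this level,
        # with the first level contributing the most significant bit.
        out[high] += 1 << (levels - level)
        split(low, level + 1)
        split(high, level + 1)

    split(np.arange(x.size), 1)
    return out.reshape(np.shape(values))

x = np.array([3, 1, 4, 1, 5, 9, 2, 6])
print(smqt(x, levels=2))
# A positive gain and a bias leave the result unchanged in this sketch.
print(np.array_equal(smqt(x), smqt(2 * x + 10)))
```

Because each decision depends only on whether a value lies above or below a subset mean, scaling the input by a positive gain or adding a bias does not change the output, which is the property the abstract refers to.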
