A study and experimental results for sound recognition in real-world robot interaction

This paper proposes a technique for improving sound recognition in real-world environments. The main issue is the change in the sound spectrum caused by acoustic devices. Analysis and controlled experiments lead to two solutions: using a mono speaker, and modeling the acoustic path by re-recording.
