Advanced computational models and learning theories for spoken language processing

Recent research on humanoid robots and interactive agents has highlighted the importance of, and raised expectations for, automatic speech recognition (ASR) as a means of endowing such agents with the ability to communicate via speech. This article describes some of the approaches pursued at NTT Communication Science Laboratories (NTT-CSL) to meet the challenges this poses for ASR. In particular, we focus on methods for fast search through finite-state machines, Bayesian solutions for modeling and classification of speech, and a discriminative training approach for minimizing errors in large-vocabulary continuous speech recognition.
