A survey of hybrid ANN/HMM models for automatic speech recognition

[1]  J. Wade Davis,et al.  Statistical Pattern Recognition , 2003, Technometrics.

[2]  Kuldip K. Paliwal,et al.  Automatic Speech and Speaker Recognition: Advanced Topics , 1999 .

[3]  Alain Hillion,et al.  Toward the border between neural and Markovian paradigms , 1998, IEEE Trans. Syst. Man Cybern. Part B.

[4]  Renato De Mori,et al.  Spoken Dialogues with Computers , 1998 .

[5]  Jean-Marc Boite,et al.  Context independent and context dependent hybrid HMM/ANN systems for vocabulary independent tasks , 1997, EUROSPEECH.

[6]  Christoph Neukirchen,et al.  Large vocabulary speech recognition with context dependent MMI-connectionist / HMM systems using the WSJ database , 1997, EUROSPEECH.

[7]  Hervé Bourlard,et al.  Estimation of global posteriors and forward-backward training of hybrid HMM/ANN systems , 1997, EUROSPEECH.

[8]  Roberto Gemello,et al.  Continuous speech recognition with neural networks and stationary-transitional acoustic units , 1997, Proceedings of International Conference on Neural Networks (ICNN'97).

[9]  Yonghong Yan,et al.  Speech recognition using neural networks with forward-backward probability generated targets , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[10]  Anders Krogh,et al.  Hidden neural networks: a framework for HMM/NN hybrids , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[11]  Christoph Neukirchen,et al.  Advanced training methods and new network topologies for hybrid MMI-connectionist/HMM speech recognition systems , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[12]  Jenq-Neng Hwang,et al.  Robust speech recognition based on joint model and feature space optimization of hidden Markov models , 1997, IEEE Trans. Neural Networks.

[13]  Diego Giuliani,et al.  Speaker normalization with a mixture of recurrent networks , 1997, ESANN.

[14]  Michael I. Jordan.  Serial Order: A Parallel Distributed Processing Approach , 1997 .

[15]  Jean-François Mari,et al.  HMMs and OWE neural network for continuous speech recognition , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[16]  Finn Tore Johansen,et al.  A comparison of hybrid HMM architecture using global discriminating training , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[17]  Thierry Moudenc,et al.  Segmental phonetic features recognition by means of neural-fuzzy networks and integration in an N-best solutions post-processing , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[18]  C. K. Un,et al.  A new parameter smoothing method in the hybrid TDNN/HMM architecture for speech recognition , 1996, Speech Commun..

[19]  Chong Kwan Un,et al.  An MLP/HMM hybrid model using nonlinear predictors , 1996, Speech Commun..

[20]  Yoshua Bengio,et al.  Input-output HMMs for sequence processing , 1996, IEEE Trans. Neural Networks.

[21]  Sin-Horng Chen,et al.  A speech recognition method based on the sequential multi-layer perceptrons , 1996, Neural Networks.

[22]  Yoshua Bengio,et al.  Neural networks for speech and sequence recognition , 1996 .

[23]  Kuldip K. Paliwal,et al.  Automatic Speech and Speaker Recognition , 1996 .

[24]  Yong Joo Chung,et al.  Multilayer perceptrons for state-dependent weightings of HMM likelihoods , 1996, Speech Commun..

[25]  Biing-Hwang Juang,et al.  An Overview of Automatic Speech Recognition , 1996 .

[26]  Samy Bengio,et al.  An EM Algorithm for Asynchronous Input/Output Hidden Markov Models , 1996 .

[27]  Terrence J. Sejnowski,et al.  An Information-Maximization Approach to Blind Separation and Blind Deconvolution , 1995, Neural Computation.

[28]  Jean-Claude Junqua,et al.  Robustness in Automatic Speech Recognition: Fundamentals and Applications , 1995 .

[29]  Vassilios Digalakis,et al.  Temporal correlation modeling in a hybrid neural network/hidden Markov model speech recognizer , 1995, EUROSPEECH.

[30]  Javier Ferreiros,et al.  Incorporating fuzzy modelling in a hybrid HMM-ANNs system for CSR tasks , 1995, EUROSPEECH.

[31]  Ciro Martins,et al.  Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system , 1995, EUROSPEECH.

[32]  Christoph Neukirchen,et al.  Large vocabulary speaker-independent continuous speech recognition with a new hybrid system based on MMI-neural networks , 1995, EUROSPEECH.

[33]  George Zavaliagkos,et al.  Batch, incremental and instantaneous adaptation techniques for speech recognition , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[34]  Günther Ruske,et al.  A hybrid RBF-HMM system for continuous speech recognition , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[35]  Steve Renals,et al.  Recent improvements to the ABBOT large vocabulary CSR system , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[36]  Hans-Peter Hutter,et al.  Comparison of a new hybrid connectionist-SCHMM approach with other hybrid approaches for speech recognition , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[37]  Javier Ferreiros,et al.  Recent work in hybrid neural networks and HMM systems in CSR tasks , 1994, ICSLP.

[38]  Dong Yu,et al.  A multi-state NN/HMM hybrid method for high performance speech recognition , 1994, ICSLP.

[39]  Steve Renals,et al.  Large vocabulary continuous speech recognition using a hybrid connectionist-HMM system , 1994, ICSLP.

[40]  Jun-ichi Takahashi,et al.  Telephone line characteristic adaptation using vector field smoothing technique , 1994, ICSLP.

[41]  R. W. King,et al.  A continuous HMM based preprocessor for modular speech recognition neural networks , 1994, ICSLP.

[42]  Finn Tore Johansen,et al.  Global optimisation of HMM input transformations , 1994, ICSLP.

[43]  Jean-François Mari,et al.  Hidden Markov models and selectively trained neural networks for connected confusable word recognition , 1994, ICSLP.

[44]  Horacio Franco,et al.  Context-dependent connectionist probability estimation in a hybrid hidden Markov model-neural net speech recognition system , 1994, Comput. Speech Lang..

[45]  Magne Hallstein Johnsen,et al.  Non-linear input transformations for discriminative HMMs , 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.

[46]  Michael Picheny,et al.  Adaptation techniques for ambience and microphone compensation in the IBM Tangora speech recognition system , 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.

[47]  James L. Flanagan,et al.  Microphone Arrays and Neural Networks for Robust Speech Recognition , 1994, HLT.

[48]  Yoshua Bengio,et al.  Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.

[49]  Anthony J. Robinson,et al.  An application of recurrent nets to phone probability estimation , 1994, IEEE Trans. Neural Networks.

[50]  Gerhard Rigoll,et al.  Maximum mutual information neural networks for hybrid connectionist-HMM speech recognition systems , 1994, IEEE Trans. Speech Audio Process..

[51]  George Zavaliagkos,et al.  A hybrid segmental neural net/hidden Markov model system for continuous speech recognition , 1994, IEEE Trans. Speech Audio Process..

[52]  Dirk Van Compernolle,et al.  Multilayer perceptrons as labelers for hidden Markov models , 1994, IEEE Trans. Speech Audio Process..

[53]  Xavier L. Aubert,et al.  Combining TDNN and HMM in a hybrid system for improved continuous-speech recognition , 1994, IEEE Trans. Speech Audio Process..

[54]  Hervé Bourlard,et al.  Connectionist probability estimators in HMM speech recognition , 1994, IEEE Trans. Speech Audio Process..

[55]  Hervé Bourlard,et al.  Continuous speech recognition by connectionist statistical methods , 1993, IEEE Trans. Neural Networks.

[56]  Hervé Bourlard,et al.  Connectionist speech recognition , 1993 .

[57]  Hervé Bourlard,et al.  Connectionist Speech Recognition: A Hybrid Approach , 1993 .

[58]  Yoshua Bengio.  A Connectionist Approach to Speech Recognition , 1993, Int. J. Pattern Recognit. Artif. Intell..

[59]  Richard Lippmann,et al.  Hybrid neural-network/HMM approaches to wordspotting , 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[60]  Biing-Hwang Juang,et al.  Discriminative training of dynamic programming based speech recognizers , 1993, IEEE Trans. Speech Audio Process..

[61]  John H. L. Hansen,et al.  Discrete-Time Processing of Speech Signals , 1993 .

[62]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[63]  Biing-Hwang Juang,et al.  Discriminative learning for minimum error classification [pattern recognition] , 1992, IEEE Trans. Signal Process..

[64]  Piero Cosi,et al.  Phonetic recognition experiments with recurrent neural networks , 1992, ICSLP.

[65]  Yoshua Bengio,et al.  Learning the dynamic nature of speech with back-propagation for sequences , 1992, Pattern Recognit. Lett..

[66]  Steve Austin,et al.  Speech recognition using segmental neural nets , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[67]  Tony Robinson,et al.  A real-time recurrent error propagation network word recognition system , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[68]  Xuedong Huang.  Speaker normalization for speech recognition , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[69]  Elliot Singer,et al.  A speech recognizer using radial basis function neural networks in an HMM framework , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[70]  M. L. Rossen,et al.  A whole word recurrent neural network for keyword spotting , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[71]  Yoshua Bengio,et al.  Phonetically motivated acoustic parameters for continuous speech recognition using artificial neural networks , 1991, Speech Commun..

[72]  Yoshua Bengio.  Radial Basis Functions for Speech Recognition , 1992 .

[73]  George Cybenko,et al.  Approximation by superpositions of a sigmoidal function , 1992, Math. Control. Signals Syst..

[74]  Anders Krogh,et al.  Introduction to the theory of neural computation , 1994, The advanced book program.

[75]  Roberto Pieraccini,et al.  Time-Warping Network: A Hybrid Framework for Speech Recognition , 1991, NIPS.

[76]  Biing-Hwang Juang,et al.  New discriminative training algorithms based on the generalized probabilistic descent method , 1991, Neural Networks for Signal Processing Proceedings of the 1991 IEEE Workshop.

[77]  J. Makhoul,et al.  A hybrid continuous speech recognition system using segmental neural nets with hidden Markov models , 1991, Neural Networks for Signal Processing Proceedings of the 1991 IEEE Workshop.

[78]  R. Kompe,et al.  Global optimization of a neural network-hidden Markov model hybrid , 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.

[79]  Frank Fallside,et al.  A recurrent error propagation network speech recognition system , 1991 .

[80]  Chin-Hui Lee,et al.  A Minimax Classification Approach With Application To Robust Speech Recognition , 1991, Proceedings. 1991 IEEE International Symposium on Information Theory.

[81]  Shigeru Katagiri,et al.  LVQ-based shift-tolerant phoneme recognition , 1991, IEEE Trans. Signal Process..

[82]  Alex Waibel,et al.  Integrating time alignment and neural networks for high performance continuous speech recognition , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[83]  D. P. Morgan,et al.  Multiple neural network topologies applied to keyword spotting , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[84]  Shigeru Katagiri,et al.  Speaker-independent large vocabulary word recognition using an LVQ/HMM hybrid algorithm , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[85]  Alex Waibel,et al.  Continuous speech recognition using linked predictive neural networks , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[86]  Biing-Hwang Juang,et al.  Hidden Markov Models for Speech Recognition , 1991 .

[87]  Biing-Hwang Juang,et al.  A study on speaker adaptation of the parameters of continuous density hidden Markov models , 1991, IEEE Trans. Signal Process..

[88]  Christopher L. Scofield,et al.  Neural networks and speech processing , 1991, The Kluwer international series in engineering and computer science.

[89]  James L. Flanagan,et al.  Autodirective Microphone Systems , 1991 .

[90]  H. Bourlard,et al.  Links Between Markov Models and Multilayer Perceptrons , 1990, IEEE Trans. Pattern Anal. Mach. Intell..

[91]  Ken-ichi Iso,et al.  Speaker-independent word recognition using a neural prediction model , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[92]  Hervé Bourlard,et al.  Continuous speech recognition using multilayer perceptrons with hidden Markov models , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[93]  Esther Levin,et al.  Word recognition using hidden control neural architecture , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[94]  Harvey F. Silverman,et al.  Combining hidden Markov model and neural network classifiers , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[95]  M. A. Bush,et al.  Speaker-independent vowel classification using hidden Markov models and LVQ2 , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[96]  A. Waibel,et al.  Connectionist Viterbi training: a new hybrid method for continuous speech recognition , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[97]  D. Van Compernolle,et al.  TDNN labeling for a HMM recognizer , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[98]  Jeffrey L. Elman,et al.  Finding Structure in Time , 1990, Cogn. Sci..

[99]  Piero Cosi,et al.  Phonetically-based multi-layered neural networks for vowel classification , 1990, Speech Commun..

[100]  John S. Bridle,et al.  Alpha-nets: A recurrent 'neural' network architecture with a hidden Markov model interpretation , 1990, Speech Commun..

[101]  Alexander H. Waibel,et al.  A novel objective function for improved phoneme recognition using time delay neural networks , 1990, International 1989 Joint Conference on Neural Networks.

[102]  George Cybenko,et al.  Approximation by superpositions of a sigmoidal function , 1989, Math. Control. Signals Syst..

[103]  Kiyohiro Shikano,et al.  Modularity and scaling in large phonemic neural networks , 1989, IEEE Trans. Acoust. Speech Signal Process..

[104]  Kiyohiro Shikano,et al.  Fast back-propagation learning methods for large phonemic neural networks , 1989, EUROSPEECH.

[105]  Fernando J. Pineda,et al.  Recurrent Backpropagation and the Dynamical Approach to Adaptive Neural Computation , 1989, Neural Computation.

[106]  Ronald J. Williams,et al.  A Learning Algorithm for Continually Running Fully Recurrent Neural Networks , 1989, Neural Computation.

[107]  Alex Waibel,et al.  Consonant recognition by modular construction of large phonemic time-delay neural networks , 1989, International Conference on Acoustics, Speech, and Signal Processing.

[108]  Richard Lippmann,et al.  Review of Neural Networks for Speech Recognition , 1989, Neural Computation.

[109]  Geoffrey E. Hinton,et al.  Phoneme recognition using time-delay neural networks , 1989, IEEE Trans. Acoust. Speech Signal Process..

[110]  Alexander H. Waibel,et al.  Modular Construction of Time-Delay Neural Networks for Speech Recognition , 1989, Neural Computation.

[111]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[112]  Barak A. Pearlmutter.  Learning State Space Trajectories in Recurrent Neural Networks , 1989, Neural Computation.

[113]  Ronald J. Williams,et al.  Experimental Analysis of the Real-time Recurrent Learning Algorithm , 1989 .

[114]  Yoh-Han Pao,et al.  Adaptive pattern recognition and neural networks , 1989 .

[115]  M. Gori,et al.  BPS: a learning algorithm for capturing the dynamic nature of speech , 1989, International 1989 Joint Conference on Neural Networks.

[116]  A. Linden,et al.  Inversion of multilayer nets , 1989, International 1989 Joint Conference on Neural Networks.

[117]  Paul J. Werbos,et al.  Generalization of backpropagation with application to a recurrent gas market model , 1988, Neural Networks.

[118]  M. J. D. Powell,et al.  Radial basis functions for multivariable interpolation: a review , 1987 .

[119]  Anthony J. Robinson,et al.  Static and Dynamic Error Propagation Networks with Application to Speech Coding , 1987, NIPS.

[120]  Geoffrey E. Hinton,et al.  Learning internal representations by error propagation , 1986 .

[121]  Yann LeCun,et al.  Learning processes in an asymmetric threshold network , 1986 .

[122]  Françoise Fogelman-Soulié,et al.  Disordered Systems and Biological Organization , 1986, NATO ASI Series.

[123]  Stan Davis,et al.  Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences , 1980 .

[124]  P. Werbos,et al.  Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences , 1974 .

[125]  Richard O. Duda,et al.  Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.