Extraction and Representation of Prosody for Speaker, Speech and Language Recognition

Extraction and Representation of Prosodic Features for Speech Processing Applications deals with prosody from a speech processing point of view, with topics including: the significance of prosody for speech processing applications; why prosody needs to be incorporated in speech processing applications; and different methods for extraction and representation of prosody for applications such as speech synthesis, speaker recognition, language recognition and speech recognition. This book is for researchers and students at the graduate level.
