A review of paralinguistic information processing for natural speech communication
