A review of paralinguistic information processing for natural speech communication
