ProsoTool, a method for automatic annotation of fundamental frequency

ProsoTool is an algorithm, implemented as a Praat script, for the automatic annotation of certain prosodic features in recorded dialogs. The tool was developed in the framework of the HuComTech project. The current version aims to make raw F0 data more expressive and easier to process by smoothing the pitch curve and segmenting it into larger tonal movements, with the calculation parameters adjusted to the individual vocal range of the speaker. This paper gives a complete description of the modified annotation method and presents first sample annotations from the HuComTech Corpus.
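The general idea of smoothing an F0 contour and labeling larger tonal movements relative to a speaker's range can be illustrated with a minimal Python sketch. This is not the ProsoTool Praat script: the running-median smoother, the 5th/95th percentile range estimate, the fixed 10-frame stretches, the 10% threshold, and all function names are assumptions chosen for illustration only. The input is assumed to be a list of F0 values in Hz for voiced frames only.

```python
import math
from statistics import median


def hz_to_semitones(f0_hz, ref_hz):
    """Distance from ref_hz to f0_hz in semitones."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def smooth(values, window=5):
    """Running median over a centered window (the window shrinks at the edges)."""
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]


def speaker_range(f0_hz):
    """Assumed range estimate: roughly the 5th and 95th percentile of F0."""
    ordered = sorted(f0_hz)
    last = len(ordered) - 1
    return ordered[int(0.05 * last)], ordered[int(0.95 * last)]


def label_movements(f0_hz, step=10, rel_threshold=0.1):
    """Label stretches of the smoothed contour as rise, fall, or level.

    The threshold is a fraction of the speaker's own range in semitones,
    so the same relative change counts equally for any speaker.
    Returns (start_frame, end_frame, label) tuples with neighbors merged.
    """
    low, high = speaker_range(f0_hz)
    threshold = rel_threshold * hz_to_semitones(high, low)
    smoothed = smooth(f0_hz)
    segments = []
    for start in range(0, len(smoothed) - 1, step):
        end = min(start + step, len(smoothed) - 1)
        delta = hz_to_semitones(smoothed[end], smoothed[start])
        if delta > threshold:
            label = "rise"
        elif delta < -threshold:
            label = "fall"
        else:
            label = "level"
        if segments and segments[-1][2] == label:
            # Extend the previous segment instead of starting a new one.
            segments[-1] = (segments[-1][0], end, label)
        else:
            segments.append((start, end, label))
    return segments


if __name__ == "__main__":
    # Synthetic contour: a rise, a plateau, then a fall.
    contour = ([100 + 5 * i for i in range(20)]
               + [200] * 20
               + [200 - 5 * i for i in range(1, 21)])
    for segment in label_movements(contour):
        print(segment)
```

Expressing the threshold as a fraction of the speaker's range, rather than as a fixed value in Hz, is one simple way to reflect the speaker-dependent parameter adjustment described above.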
