Automatic phonetic transcription: An overview

Phonetic transcriptions (PTs) are often needed in both linguistics and speech technology. Because manual transcription has many drawbacks, researchers have been looking for ways to obtain PTs automatically. This paper presents an overview of automatic phonetic transcription (APT) and discusses three aspects of it: evaluation, generation and usability. Evaluation is needed to determine the quality of APTs, and it is usually done by comparing them with human reference transcriptions. APTs can be generated in several ways, e.g. by means of phone recognition or forced recognition. Their quality can be enhanced by optimizing the automatic speech recognition (ASR) systems used to produce them. In spite of the current limitations of ASR technology, APTs already offer some important advantages for phonetic research, and the paper explains how.
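
To make the evaluation step concrete, the following is a minimal sketch of one common way to compare an automatic transcription with a human reference: align the two phone sequences by minimum edit distance and count substitutions, deletions and insertions. It is an illustration under stated assumptions, not necessarily the measure used in the paper. The function names, the phone labels and the disagreement formula (errors divided by the number of reference phones) are all assumptions made for this example.

```python
def align_counts(reference, hypothesis):
    """Count (substitutions, deletions, insertions) along a
    minimum-edit-distance alignment of two phone sequences."""
    n, m = len(reference), len(hypothesis)
    # cost[i][j] = (total, subs, dels, ins) for reference[:i] vs hypothesis[:j]
    cost = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        cost[i][0] = (i, 0, i, 0)      # only deletions
    for j in range(1, m + 1):
        cost[0][j] = (j, 0, 0, j)      # only insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            total, subs, dels, ins = cost[i - 1][j - 1]
            if reference[i - 1] == hypothesis[j - 1]:
                diagonal = (total, subs, dels, ins)          # match, no cost
            else:
                diagonal = (total + 1, subs + 1, dels, ins)  # substitution
            total, subs, dels, ins = cost[i - 1][j]
            deletion = (total + 1, subs, dels + 1, ins)
            total, subs, dels, ins = cost[i][j - 1]
            insertion = (total + 1, subs, dels, ins + 1)
            # Tuples compare on the total cost first, so min() picks a cheapest path.
            cost[i][j] = min(diagonal, deletion, insertion)
    return cost[n][m][1:]


def percent_disagreement(reference, hypothesis):
    """Assumed measure: 100 * (S + D + I) / number of reference phones."""
    subs, dels, ins = align_counts(reference, hypothesis)
    return 100.0 * (subs + dels + ins) / len(reference)


# Hypothetical phone labels, used purely for illustration.
human = ["k", "a", "t"]
automatic = ["k", "@", "t"]
print(align_counts(human, automatic))
print(percent_disagreement(human, automatic))
```

Splitting the error count by type is a deliberate choice in this sketch: a single disagreement percentage hides whether the automatic transcription tends to drop, add or replace phones, which matters when deciding how to improve the recognizer.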
