Sequential dependency analysis for online spontaneous speech processing

A dependency structure represents modification relationships between words and is widely recognized as an important element of semantic analysis. Conventional approaches for extracting this dependency structure assume that the complete sentence is known before the analysis starts. For spontaneous speech data, however, this assumption does not necessarily hold, since sentence boundaries are not marked in the data and are not easy to detect reliably. Although sentence boundaries could be detected before dependency analysis, such a cascaded implementation is not suitable for online processing because it delays the application's responses. In this paper, we propose a sequential dependency analysis method for online spontaneous speech processing. The proposed method analyzes incomplete sentences sequentially while detecting sentence boundaries simultaneously. The analyzer is trained on parsed data based on the maximum entropy principle. Experimental results on spontaneous lecture speech from the Corpus of Spontaneous Japanese show that the proposed method achieves online processing with accuracy equivalent to that of offline processing in which boundary detection and dependency analysis are cascaded.
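The abstract only sketches the approach at a high level. As an illustration, the following is a minimal Python sketch of how a sequential (incremental) dependency analyzer with joint sentence-boundary detection could look, assuming a shift-reduce style transition system and a log-linear (maximum entropy) action model p(a | context) proportional to exp(sum_i lambda_i f_i(context, a)). The class names, feature templates, three-way action set, and toy weights below are hypothetical illustrations, not the method described in the paper.

import math
from dataclasses import dataclass

# Hypothetical three-way action set: shift the incoming word, attach the stack
# top to the incoming word (Japanese dependencies point forward), or close the
# current sentence at this point.
ACTIONS = ("SHIFT", "ATTACH", "SENTENCE_BOUNDARY")

@dataclass
class Word:
    surface: str
    pos: str

class MaxEntModel:
    """Log-linear action model: p(a | context) = exp(w . f(context, a)) / Z."""

    def __init__(self, weights=None):
        # weights maps (feature, action) pairs to real-valued parameters; in a
        # real system they would be estimated from parsed data (e.g. by L-BFGS).
        self.weights = weights or {}

    def features(self, stack, word):
        top_pos = stack[-1].pos if stack else "EMPTY"
        return [f"top_pos={top_pos}", f"next_pos={word.pos}", f"next_surf={word.surface}"]

    def probs(self, stack, word):
        scores = {a: sum(self.weights.get((f, a), 0.0) for f in self.features(stack, word))
                  for a in ACTIONS}
        z = sum(math.exp(s) for s in scores.values())
        return {a: math.exp(s) / z for a, s in scores.items()}

class SequentialAnalyzer:
    """Consumes words one at a time and emits dependencies and boundaries."""

    def __init__(self, model):
        self.model = model
        self.stack = []

    def feed(self, word):
        events = []
        # Greedily apply the most probable reduce-type actions before shifting,
        # so analysis proceeds without waiting for the end of the sentence.
        while self.stack:
            probs = self.model.probs(self.stack, word)
            action = max(probs, key=probs.get)
            if action == "ATTACH":
                # Event format: ("DEP", (dependent surface, head surface)).
                events.append(("DEP", (self.stack.pop().surface, word.surface)))
            elif action == "SENTENCE_BOUNDARY":
                events.append(("EOS", None))
                self.stack.clear()
            else:  # SHIFT
                break
        self.stack.append(word)
        return events

# Usage: feed recognizer output word by word; the weights here are toy values.
model = MaxEntModel({("next_pos=AUX", "ATTACH"): 2.0})
analyzer = SequentialAnalyzer(model)
for w in (Word("kyou", "NOUN"), Word("hanashi", "NOUN"), Word("masu", "AUX")):
    print(analyzer.feed(w))

In a real system the action inventory and feature templates would be richer, and the analyzer would likely maintain multiple hypotheses (e.g. a beam) rather than committing greedily after each word.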
