Paper information - Spoken document understanding and organization

Spoken document understanding and organization

Spoken documents (or the associated multimedia content) are easier to retrieve and browse when they are better understood and reorganized. For example, they can be divided into short paragraphs and organized in a hierarchical visual presentation, with titles, summaries, and topic labels serving as references for retrieval and browsing. Retrieval can then be based on the full content, on the summaries, titles, and topic labels, or on both. In this article, this is referred to as spoken document understanding and organization for efficient retrieval/browsing applications. The article presents a concise, comprehensive, and integrated overview of the related areas within this unified context. In addition, we present an initial prototype system developed at National Taiwan University as a new example of integrating the various technologies and functionalities.

Lin-Shan Lee | Berlin Chen
