Paper Information - Bootstrapping Named Entity Extraction for the Creation of Mobile Services


As users grow accustomed to organizing and scheduling their lives on mobile devices, demand is increasing for applications that make that process easier. Automatic speech recognition technology has already been developed to enable essentially unlimited vocabulary in a mobile setting. Understanding the words that are spoken is the next challenge. In this paper, we describe efforts to develop a dataset and classifier to recognize named entities in speech. Using sets of both real and simulated data, in conjunction with a very large set of real named entities, we created a challenging corpus of training and test data. We use these data to develop a classifier that identifies names and locations on a word-by-word basis. We describe the process of creating the data and determining a set of features to use for named entity recognition. We report our classification performance on these data and point to future work on improving all aspects of the system.

Joseph Polifroni | Imre Kiss | Mark Adler
