Paper information - Methods for automatic term recognition in domain-specific text collections: A survey

Applications for domain-specific text processing often rely on glossaries and ontologies, and the main step in constructing such resources is term recognition. This paper surveys existing definitions of the term and its linguistic features, formulates the task definition for term recognition, and analyzes currently available methods for automatic term recognition, including methods for candidate collection, methods based on statistics and on the contexts of term occurrences, methods using topic models, and methods based on external resources (such as text collections from other domains, ontologies, and Wikipedia). It also provides an overview of standard methodologies and datasets for experimental research.
