METHODS OF AUTOMATIC TERM RECOGNITION: A REVIEW

Following the growing interest in "corpus-based" approaches to computational linguistics, a number of studies on automatic term recognition, or term extraction, have recently appeared. Because a successful term-recognition method has to be based on proper insights into the nature of terms, studies of automatic term recognition contribute not only to the applications of computational linguistics but also to the theoretical foundation of terminology. Many studies on automatic term recognition treat interesting aspects of terms, but most of them are neither well founded nor well described. This paper tries to give an overview of the principles and methods of automatic term recognition. For that purpose, two major trends are examined: studies in the automatic recognition of significant elements for indexing, mainly carried out in information-retrieval circles, and current research in automatic term recognition in the field of computational linguistics.
