Paper Information - Extracting Semantic Relationships between Terms: Supervised vs. Unsupervised Methods

Extracting Semantic Relationships between Terms: Supervised vs. Unsupervised Methods

As the amount of electronic documents (corpora, dictionaries, newspapers, newswires, etc.) becomes more and more important and diversified, there is a need to extract information automatically from these texts.

In order to extract terms and relations between terms, two methods can be used. The first method is the unsupervised approach, which requires a term extraction module and few predefined types, especially term types, in order to find relationships between terms and to assign appropriate types to the relationships.

Works on automatic term recognition usually involve predefinition of a set of term patterns, extraction procedure and a scoring mechanism to filter out non-relevant candidates. Smadja (1993) describes a set of techniques based on statistical methods for retrieving collocations from large text collections. Daille (1996) presents a combination of linguistic filters and statistical methods to extract two-word terms. This work implements finite automata for each term pattern, then various statistical scores for ranking the extracted terms are compared.

Unsupervised identification of term relationships is a more complicated task, reported in works from various fields including Computational Linguistics and Knowledge Discovery in Texts. A keyword-based model for text mining is described in Feldman and Dagan (1995). The work suggests to use a wide range of KDD (Knowledge Discovery in Databases) operations on collections of textual documents, including association discovery among keywords within the documents. Cooper and Byrd (1997) reports the T
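The term recognition pipeline described above (a predefined set of term patterns, an extraction procedure, and a scoring mechanism that filters out non-relevant candidates) can be illustrated with a minimal sketch. This is not the method of Smadja (1993) or Daille (1996). The part-of-speech patterns, the tag names, the function names, the `min_count` threshold, and the choice of pointwise mutual information as the score are all assumptions made for illustration, and the input is assumed to be already tokenized and tagged. The probabilities are rough estimates that use the total word count as the denominator.

```python
import math
from collections import Counter

# Hypothetical two-word term patterns over coarse part-of-speech tags.
TERM_PATTERNS = {("ADJ", "NOUN"), ("NOUN", "NOUN")}


def extract_candidates(tagged_sentences):
    """Count adjacent word pairs whose tag sequence matches a term pattern."""
    pair_counts = Counter()
    word_counts = Counter()
    for sentence in tagged_sentences:
        for word, _tag in sentence:
            word_counts[word.lower()] += 1
        for (w1, t1), (w2, t2) in zip(sentence, sentence[1:]):
            if (t1, t2) in TERM_PATTERNS:
                pair_counts[(w1.lower(), w2.lower())] += 1
    return pair_counts, word_counts


def score_candidates(pair_counts, word_counts, min_count=2):
    """Rank candidates by pointwise mutual information, dropping rare pairs."""
    total_words = sum(word_counts.values())
    scored = []
    for (w1, w2), n12 in pair_counts.items():
        if n12 < min_count:
            continue  # frequency filter: discard candidates seen too rarely
        p12 = n12 / total_words
        p1 = word_counts[w1] / total_words
        p2 = word_counts[w2] / total_words
        scored.append(((w1, w2), math.log2(p12 / (p1 * p2))))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy, hand-tagged corpus (illustrative data only).
corpus = [
    [("the", "DET"), ("neural", "ADJ"), ("network", "NOUN"), ("learns", "VERB")],
    [("a", "DET"), ("neural", "ADJ"), ("network", "NOUN"), ("is", "VERB"), ("large", "ADJ")],
    [("the", "DET"), ("large", "ADJ"), ("house", "NOUN")],
]
pairs, words = extract_candidates(corpus)
ranked = score_candidates(pairs, words)
```

On this toy corpus, only the pair that matches a pattern at least `min_count` times survives the filter. A real system would swap in the patterns and the statistical score appropriate to its domain.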

Emmanuel Morin | Michal Finkelstein-Landau

[1] Scott B. Huffman, et al. Learning information extraction patterns from examples, 1995, Learning for Natural Language Processing.

[2] Béatrice Daille, et al. Study and Implementation of Combined Techniques for Automatic Extraction of Terminology, 1994.

[3] James W. Cooper, et al. Lexical navigation: visually prompted query expansion and refinement, 1997, DL '97.

[4] Marti A. Hearst. Automatic Acquisition of Hyponyms from Large Text Corpora, 1992, COLING.

[5] Frank Smadja, et al. Retrieving Collocations from Text: Xtract, 1993, CL.

[6] Ellen Riloff, et al. Automatically Constructing a Dictionary for Information Extraction Tasks, 1993, AAAI.

[7] Atro Voutilainen, et al. NPtool, a Detector of English Noun Phrases, 1995, VLC@ACL.

[8] Ido Dagan, et al. Knowledge Discovery in Textual Databases (KDT), 1995, KDD.