As the number of electronic documents (corpora, dictionaries, newspapers, newswires, etc.) becomes more and more important and diversified, there is a need to extract information automatically from these texts.

In order to extract terms and relations between terms, two methods can be used. The first method is the unsupervised approach, which requires a term extraction module and a few predefined types, especially term types, in order to find relationships between terms and to assign appropriate types to the relationships.

Work on automatic term recognition usually involves the predefinition of a set of term patterns, an extraction procedure, and a scoring mechanism to filter out non-relevant candidates. Smadja (1993) describes a set of techniques based on statistical methods for retrieving collocations from large text collections. Daille (1996) presents a combination of linguistic filters and statistical methods to extract two-word terms. This work implements a finite automaton for each term pattern, and then compares various statistical scores for ranking the extracted terms.

Unsupervised identification of term relationships is a more complicated task, reported in works from various fields including Computational Linguistics and Knowledge Discovery in Texts. A keyword-based model for text mining is described in Feldman and Dagan (1995). The work suggests using a wide range of KDD (Knowledge Discovery in Databases) operations on collections of textual documents, including association discovery among keywords within the documents. Cooper and Byrd (1997) reports the T
[1] Scott B. Huffman, et al. Learning information extraction patterns from examples, 1995, Learning for Natural Language Processing.
[2] Béatrice Daille, et al. Study and Implementation of Combined Techniques for Automatic Extraction of Terminology, 1994.
[3] James W. Cooper, et al. Lexical navigation: visually prompted query expansion and refinement, 1997, DL '97.
[4] Marti A. Hearst. Automatic Acquisition of Hyponyms from Large Text Corpora, 1992, COLING.
[5] Frank Smadja, et al. Retrieving Collocations from Text: Xtract, 1993, CL.
[6] Ellen Riloff, et al. Automatically Constructing a Dictionary for Information Extraction Tasks, 1993, AAAI.
[7] Atro Voutilainen, et al. NPtool, a Detector of English Noun Phrases, 1995, VLC@ACL.
[8] Ido Dagan, et al. Knowledge Discovery in Textual Databases (KDT), 1995, KDD.