An Overview of Document Mining Technology

Living through the Information Revolution is becoming a difficult task: humans were not designed to process massive quantities of information. The computer first found its use in speeding up our number crunching, performing large numbers of calculations blindingly fast. We are now beginning to turn to computers to address another human inadequacy: mining through our masses of information to find items of interest. Document mining has many uses in our information era, such as looking for patterns in commonly available texts like news feeds. How many terrorist attacks were there in 1995? Is there a strong relationship between the IRA and car bombs? Do frequent changes of company management lead to better profits? Document mining has the potential to identify patterns such as these hidden inside vast collections of text data, possibly giving companies the competitive edge they need to survive.
