Paper information - Knowledge Discovery using Text Mining: A Programmable Implementation on Information Extraction and Categorization

Textual data in electronic documents around the world now covers nearly all the information one could need. As data banks grow worldwide and technology makes access easier, it has also become easier to overlook vital facts and figures that could lead to groundbreaking discoveries. This paper discusses in detail the implementation of Information Extraction and Categorization in a text mining application that we have built. To extract terms from a document, we use a modified version of Porter's algorithm for inflectional stemming. To calculate term frequencies for categorization, we use a domain dictionary for the 'Computer Science' domain.
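The two steps named in the abstract, inflectional stemming followed by term-frequency counting against a domain dictionary, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the suffix rules are a crude stand-in rather than Porter's algorithm or the authors' modified version, and `DOMAIN_TERMS`, `inflectional_stem`, `domain_term_frequencies`, and `domain_score` are hypothetical names, not taken from the paper.

```python
import re
from collections import Counter

# Hypothetical domain dictionary: a few illustrative Computer Science terms,
# not the dictionary used by the authors.
DOMAIN_TERMS = {"algorithm", "compiler", "network", "database", "query"}


def inflectional_stem(word: str) -> str:
    """Strip a few common English inflectional endings (plural, -ed, -ing).

    A deliberately crude stand-in; it is NOT Porter's algorithm and NOT
    the modified version described in the paper.
    """
    w = word.lower()
    if w.endswith("sses"):
        return w[:-2]                      # classes -> class
    if w.endswith("ies") and len(w) > 4:
        return w[:-3] + "y"                # queries -> query
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        w = w[:-1]                         # algorithms -> algorithm
    for suffix in ("ing", "ed"):
        # Only strip when at least three characters remain.
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]       # networking -> network
    return w


def domain_term_frequencies(text: str) -> Counter:
    """Count how often each stemmed token matches a domain dictionary term."""
    tokens = re.findall(r"[A-Za-z]+", text)
    stems = (inflectional_stem(t) for t in tokens)
    return Counter(s for s in stems if s in DOMAIN_TERMS)


def domain_score(text: str) -> float:
    """Fraction of tokens whose stem is in the domain dictionary."""
    tokens = re.findall(r"[A-Za-z]+", text)
    if not tokens:
        return 0.0
    return sum(domain_term_frequencies(text).values()) / len(tokens)
```

A document could then be assigned to the domain when `domain_score` exceeds some threshold; the threshold and the scoring rule are design choices of this sketch, since the abstract does not specify how the frequencies are turned into a category decision.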
