A Novel Data Mining Approach for Multi Variant Text Classification

Text classification, which assigns a document to one or more categories based on its content, is a fundamental task in Web and document data mining applications. It is becoming an important part of natural language processing and information extraction, where it can be used to discover useful information in large databases. These approaches allow individuals to construct classifiers that are relevant to a variety of domains. Existing tools such as SVMlight offer limited GUI support and take more time to perform the classification task. In the presented work, multi-domain documents are classified using the LibSVM classifier in Weka. The vector space model is used to transform the collected training-set and test-set documents into a term-document matrix (TDM), which the classifier then uses to generate predicted results. Using the TDM with Weka's GUI support, the documents are classified with a quick response time.
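
The vector space step described above can be illustrated with a minimal Python sketch. It is not the Weka workflow used in this work: the toy documents, the labels, and every function name are assumptions made for illustration, the matrix holds raw term counts with one row per document, and a cosine nearest-neighbour rule stands in for the LibSVM classifier.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents):
    """Map each distinct training term to a column index (sorted for stability)."""
    terms = sorted({term for doc in documents for term in tokenize(doc)})
    return {term: index for index, term in enumerate(terms)}


def to_tdm(documents, vocabulary):
    """Build a matrix of raw term counts: one row per document, one column per term.

    Terms that are not in the vocabulary (unseen during training) are ignored.
    """
    matrix = []
    for doc in documents:
        counts = Counter(tokenize(doc))
        matrix.append([counts.get(term, 0) for term in vocabulary])
    return matrix


def cosine(u, v):
    """Cosine similarity between two count vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict(train_matrix, train_labels, row):
    """Stand-in classifier: return the label of the most similar training row."""
    scores = [cosine(row, train_row) for train_row in train_matrix]
    return train_labels[scores.index(max(scores))]


# Hypothetical two-domain training set and test set.
train_docs = [
    "the team won the football match",
    "the striker scored a late goal",
    "the processor runs the new software",
    "the software update fixed a memory bug",
]
train_labels = ["sports", "sports", "technology", "technology"]
test_docs = ["a late goal won the match", "new processor and memory update"]

vocabulary = build_vocabulary(train_docs)
train_tdm = to_tdm(train_docs, vocabulary)
test_tdm = to_tdm(test_docs, vocabulary)  # same columns as the training matrix

predictions = [predict(train_tdm, train_labels, row) for row in test_tdm]
print(predictions)
```

The test documents are projected onto the vocabulary learned from the training set, so both matrices share the same columns; any classifier that accepts fixed-length numeric vectors, including an SVM, could replace the nearest-neighbour rule.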
