Comparison of Novel Semi-supervised Text Classification using BPNN by Active Search with KNN Algorithm

With the huge amount of text now available from the internet, news outlets, institutes, organizations, and similar sources, the need for automatic text classification is also increasing. The proposed work addresses a major challenge in training a classifier: obtaining labeled data, which is expensive and time consuming to produce and requires the involvement of an annotator. A novel semi-supervised text classification algorithm based on a Back Propagation Neural Network (BPNN) is proposed, which makes use of web-assisted unlabeled data obtained by Active search. This algorithm is compared with the standard KNN algorithm on test data and on the standard Mini Newsgroup data. Experimental results indicate that the proposed algorithm outperforms KNN in terms of micro-averaged F1 measure.
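For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how a micro-averaged F1 score can be computed, assuming each document carries exactly one category label. The function name `micro_f1` and the example labels are illustrative assumptions and are not taken from the paper. Micro-averaging pools the true positives, false positives, and false negatives over all categories before computing precision and recall.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for single-label, multi-class predictions.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1   # correct: true positive for that category
        else:
            fp[guess] += 1   # predicted category gains a false positive
            fn[truth] += 1   # actual category gains a false negative

    # Pool the counts over all categories (the "micro" part).
    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())

    precision = total_tp / (total_tp + total_fp) if total_tp + total_fp else 0.0
    recall = total_tp / (total_tp + total_fn) if total_tp + total_fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up labels, loosely styled after newsgroup categories.
y_true = ["sci", "sci", "rec", "talk", "rec"]
y_pred = ["sci", "rec", "rec", "talk", "talk"]
score = micro_f1(y_true, y_pred)
print(round(score, 3))
```

In this single-label setting every misclassification counts as one false positive and one false negative, so the micro-averaged precision, recall, and F1 all coincide with plain accuracy. The measure becomes more informative when documents may belong to several categories at once.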
