Text Classification through Statistical and Machine Learning Methods: A Survey

With the rapid growth of information, text classification has become a vital technique for handling and organizing text data. Text classification plays an important role in information extraction, text summarization, text retrieval, medical diagnosis, newsgroup filtering, spam filtering, and sentiment analysis. This paper illustrates the text classification process using statistical and machine learning techniques such as k-nearest neighbors, support vector machines, and the naive Bayes method.
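As a rough illustration of one of the methods named above, the following is a minimal sketch of a multinomial naive Bayes text classifier with add-one (Laplace) smoothing. It is not taken from the paper: the function names (`tokenize`, `train_naive_bayes`, `predict`), the whitespace tokenizer, and the four-document toy corpus are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real systems usually
    # also remove stop words and apply stemming or other normalization.
    return text.lower().split()


def train_naive_bayes(docs):
    """Collect per-class document counts and per-class word counts.

    docs is a list of (text, label) pairs.
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log posterior score."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        # Log prior: fraction of training documents with this label.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing keeps unseen (word, class) pairs non-zero.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus, used only to exercise the sketch.
TRAIN = [
    ("win cash prize now", "spam"),
    ("cheap prize offer click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project report attached for review", "ham"),
]

model = train_naive_bayes(TRAIN)
print(predict(model, "claim your cash prize"))
print(predict(model, "agenda for review"))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied. A k-nearest-neighbors or support vector machine classifier would replace only the training and prediction steps; the tokenization and bag-of-words counting stay the same.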
