A neuro-SVM model for text classification using latent semantic indexing

This paper presents a new model, abbreviated neuro-SVM, that integrates a recurrent neural network (RNN) and a least squares support vector machine (LS-SVM) to classify document titles into predetermined categories. A system based on the neuro-SVM model is implemented; it uses latent semantic indexing (LSI) to generate probabilistic coefficients from document titles, and these coefficients serve as the system's input. The system's performance is demonstrated on a corpus of 96,956 words from the University of Denver's Penrose Library catalogue, on which the proposed system achieves an accuracy rate of 99.66%.
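To make the LSI-to-classifier pipeline concrete, the following is a minimal sketch, not the paper's implementation. It assumes a raw term-count matrix, a truncated SVD for the LSI step, and a linear-kernel LS-SVM in the standard linear-system formulation; the RNN stage of neuro-SVM is omitted entirely. The function names (`lsi_features`, `lssvm_train`, `lssvm_predict`), the regularization value `gamma=10.0`, and the four toy titles are all illustrative assumptions.

```python
import numpy as np

def lsi_features(titles, k=2):
    """Project each title into a k-dimensional latent space (assumed LSI step)."""
    vocab = sorted({w for t in titles for w in t.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    # Term-document matrix of raw counts: rows are terms, columns are titles.
    A = np.zeros((len(vocab), len(titles)))
    for j, t in enumerate(titles):
        for w in t.lower().split():
            A[index[w], j] += 1.0
    # Truncated SVD: keep the k largest singular values.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Document coordinates are the rows of V_k scaled by the singular values.
    return Vt[:k].T * s[:k]

def lssvm_train(X, y, gamma=10.0):
    """Train a linear-kernel LS-SVM by solving one linear system.

    [ 0   y^T               ] [ b     ]   [ 0 ]
    [ y   Omega + I / gamma ] [ alpha ] = [ 1 ]
    with Omega_ij = y_i * y_j * K(x_i, x_j).
    """
    n = len(y)
    K = X @ X.T                      # linear kernel (an assumption)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = y
    M[1:, 0] = y
    M[1:, 1:] = np.outer(y, y) * K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(M, rhs)
    return sol[0], sol[1:]           # bias b, support values alpha

def lssvm_predict(X_train, y, b, alpha, X_new):
    """Decision rule: sign(sum_i alpha_i * y_i * K(x, x_i) + b)."""
    return np.sign((X_new @ X_train.T) @ (alpha * y) + b)

# Toy data, invented for illustration only: two categories of titles.
titles = [
    "neural network learning",
    "neural network training",
    "medieval history europe",
    "medieval history france",
]
y = np.array([1.0, 1.0, -1.0, -1.0])

X = lsi_features(titles, k=2)
b, alpha = lssvm_train(X, y)
pred = lssvm_predict(X, y, b, alpha, X)
print(pred)
```

Unlike a standard SVM, which requires a quadratic program, the LS-SVM is trained here with a single call to `np.linalg.solve`. Classifying an unseen title would additionally require folding it into the latent space, which this sketch does not cover.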
