Multi-Label Text Categorization Using a Probabilistic Neural Network

Techniques for categorization and clustering range from support vector machines and neural networks to Bayesian inference and algebraic methods. The k-Nearest Neighbor algorithm (kNN) is a popular example of the latter class of these algorithms. Recently, slightly modified versions of support vector machines, kNN, and decision trees have been proposed to deal better with multi-label classification problems. In this paper, we propose a new version of a Probabilistic Neural Network (PNN) to tackle this kind of problem. This PNN was designed for the automatic classification of economic activities, which is the focus of this article. We also compared the PNN algorithm against other classifiers and, in addition to the economic activities database, applied our algorithm to several other databases found in the literature. In general, our approach surpassed the other algorithms on many metrics commonly used in the literature for multi-label categorization.
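
To make the idea of a PNN used for multi-label categorization concrete, the following is a minimal sketch of a generic textbook-style PNN (a Gaussian Parzen-window sum per label), not the modified version proposed in the paper, whose details are not given in this abstract. The function names, the `sigma` smoothing parameter, the averaging per label, and the toy data are all assumptions made for illustration.

```python
import math


def pnn_label_scores(train_vectors, train_labels, query, sigma=0.5):
    """Score each label by averaging Gaussian kernel activations.

    Assumption: every training example contributes its activation to each
    of the labels it carries; the paper's PNN variant may differ.
    """
    sums = {}
    counts = {}
    for vec, labels in zip(train_vectors, train_labels):
        # Squared Euclidean distance between the stored pattern and the query.
        sq_dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        # Pattern-layer activation: Gaussian Parzen window of width sigma.
        activation = math.exp(-sq_dist / (2.0 * sigma ** 2))
        for label in labels:
            sums[label] = sums.get(label, 0.0) + activation
            counts[label] = counts.get(label, 0) + 1
    # Summation layer: mean activation per label.
    return {label: sums[label] / counts[label] for label in sums}


def rank_labels(scores):
    """Return labels ordered from highest to lowest score (ties by name)."""
    return sorted(scores, key=lambda label: (-scores[label], label))


if __name__ == "__main__":
    # Hypothetical toy data: three documents as 3-dimensional term vectors.
    vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
    labels = [["retail"], ["transport"], ["retail", "wholesale"]]
    scores = pnn_label_scores(vectors, labels, [1.0, 0.0, 0.0])
    print(rank_labels(scores))
```

A multi-label decision can then be taken by keeping every label whose score exceeds a threshold, or by evaluating the full ranking with ranking-based metrics; which of these the paper uses is not stated here.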
