Organizing Books and Authors by Multilayer SOM

This paper introduces a new framework for organizing electronic books (e-books) and their authors using a multilayer self-organizing map (MLSOM). Each author is modeled by a rich tree-structured representation that arranges author features in a hierarchy of author biography, books, pages, and paragraphs. To handle this structured data efficiently, we use an MLSOM algorithm as a clustering technique for e-books and their corresponding authors. A book and author recommender system is then implemented on top of the proposed framework. The effectiveness of our approach was examined on a large-scale data set containing 3868 authors and the 10500 e-books that they wrote. We also provide MLSOM visualization results that reveal the relevance patterns hidden in the author clusters. The experimental results show that the proposed method outperforms other content-based models (e.g., rate adapting Poisson, latent Dirichlet allocation, and probabilistic latent semantic indexing) and offers a promising solution to book recommendation, author recommendation, and visualization.
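To make the tree-structured author representation concrete, the following is a minimal Python sketch rather than the paper's implementation. The abstract gives only the hierarchy (author biography, books, pages, paragraphs), so several details are assumptions made for illustration: the `Node` class, its `level` and `features` fields, the toy two-dimensional feature vectors, the choice to store biography features on the root node, and the `best_matching_unit` helper. The sketch groups nodes by level, as one might before feeding each level to its own SOM layer, and finds the closest neuron for each paragraph.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One node of an author tree (hypothetical structure, not from the paper)."""
    level: str                      # "author", "book", "page", or "paragraph"
    features: List[float]           # illustrative feature vector, e.g. term weights
    children: List["Node"] = field(default_factory=list)


def nodes_by_level(root: Node) -> Dict[str, List[Node]]:
    """Group every node in the tree by its level name."""
    groups: Dict[str, List[Node]] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        groups.setdefault(node.level, []).append(node)
        stack.extend(node.children)
    return groups


def best_matching_unit(vector: List[float], codebook: List[List[float]]) -> int:
    """Return the index of the codebook vector closest to `vector` (squared Euclidean)."""
    def sq_dist(a: List[float], b: List[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


# Toy author: biography features on the root, one book, one page, two paragraphs.
paragraphs = [Node("paragraph", [0.9, 0.1]), Node("paragraph", [0.2, 0.8])]
page = Node("page", [0.5, 0.5], list(paragraphs))
book = Node("book", [0.6, 0.4], [page])
author = Node("author", [0.7, 0.3], [book])

groups = nodes_by_level(author)

# Two hypothetical neurons of a paragraph-level SOM layer.
codebook = [[1.0, 0.0], [0.0, 1.0]]
paragraph_bmus = [best_matching_unit(p.features, codebook) for p in paragraphs]
```

In this toy setup each paragraph maps to a different neuron. How winning-neuron positions from a lower layer are passed up to the layer above is not described in the abstract and is left out of the sketch.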
