Determining the Similarity of Web Pages based on Learning Automata and Probabilistic Grammar

As the number of web pages grows, helping users find useful information on web sites becomes increasingly important. Determining the similarity of web pages can improve search quality, so that users find relevant information more easily. In this paper, distributed learning automata and probabilistic grammar are combined in a new hybrid algorithm that determines the similarity of web pages from web usage data. In the proposed algorithm, a learning automaton (LA) is assigned to each web page; its function is to evaluate the association rules extracted by the hypertext system. This learning process continues until the similarity of the web pages is determined. Experimental results indicate that the proposed algorithm is more efficient than existing techniques.
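
To make the idea of one learning automaton per page concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a linear reward-inaction update, assumes that usage data arrive as sessions (ordered lists of visited pages), and assumes a simple symmetric similarity score. The names `LearningAutomaton`, `learn_from_sessions`, `similarity`, and `learning_rate` are hypothetical, and the probabilistic grammar component is omitted entirely.

```python
class LearningAutomaton:
    """Variable-structure automaton with a linear reward-inaction update (assumed scheme)."""

    def __init__(self, actions, learning_rate=0.1):
        self.actions = list(actions)
        self.a = learning_rate
        # Start from a uniform action probability vector.
        n = len(self.actions)
        self.p = {act: 1.0 / n for act in self.actions}

    def reward(self, chosen):
        # Move probability mass toward the rewarded action; the vector still sums to 1.
        for act in self.actions:
            if act == chosen:
                self.p[act] += self.a * (1.0 - self.p[act])
            else:
                self.p[act] *= 1.0 - self.a


def learn_from_sessions(pages, sessions, learning_rate=0.1):
    """Assign one automaton to each page; its actions are all the other pages."""
    automata = {
        pg: LearningAutomaton([q for q in pages if q != pg], learning_rate)
        for pg in pages
    }
    for session in sessions:
        # Each observed transition src -> dst rewards action dst in the automaton of src.
        for src, dst in zip(session, session[1:]):
            if src != dst:
                automata[src].reward(dst)
    return automata


def similarity(automata, u, v):
    """Hypothetical symmetric score: mean of the two directed action probabilities."""
    return 0.5 * (automata[u].p[v] + automata[v].p[u])


pages = ["A", "B", "C"]
sessions = [["A", "B"], ["A", "B"], ["B", "A"]]
automata = learn_from_sessions(pages, sessions)
```

In this toy data set, pages A and B are visited together in every session, so `similarity(automata, "A", "B")` ends up higher than `similarity(automata, "A", "C")`.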
