Application of Genetic Algorithms to the Identification of Website Link Structure

This paper explores website link structure by treating websites as interconnected graphs and analyzing their features as those of a social network. Factor Analysis provides the statistical methodology for extracting the main website profiles in terms of their internal structure. However, because the number of candidate indicators is large, a genetic search for their optimum number is proposed and applied to a case study of 80 Spanish university websites. The results yield coherent and relevant website profiles and highlight the potential of Genetic Algorithms as a tool for discovering new knowledge about website link structures.
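To make the idea of a genetic search over indicators concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state the chromosome encoding, fitness criterion, or GA parameters, so everything here is an assumption: each candidate is a bit mask (1 = indicator kept), selection is by binary tournament with elitism, and `toy_fitness` with its `INFORMATIVE` set is a hypothetical stand-in for a score that, in a real study, would measure the quality of the factor-analysis solution obtained from the selected indicators.

```python
import random

def genetic_indicator_search(n_indicators, fitness, pop_size=30, generations=40,
                             crossover_rate=0.8, mutation_rate=0.02, seed=0):
    """Return the bit mask (1 = indicator kept) with the best fitness found."""
    rng = random.Random(seed)
    # Random initial population of indicator subsets.
    population = [[rng.randint(0, 1) for _ in range(n_indicators)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def tournament():
        # Binary tournament: the fitter of two random individuals is selected.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        next_population = [best[:]]  # elitism: the best mask always survives
        while len(next_population) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_indicators - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation: occasionally add or drop an indicator.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_population.append(child)
        population = next_population
        best = max(population, key=fitness)
    return best

# Hypothetical fitness: pretend indicators 0, 3, 5 and 8 carry the structure,
# and penalize every additional indicator that is kept.
INFORMATIVE = {0, 3, 5, 8}

def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    extras = sum(mask) - hits
    return hits - 0.5 * extras

best_mask = genetic_indicator_search(12, toy_fitness)
selected = [i for i, bit in enumerate(best_mask) if bit]
```

Because the fitness function is passed in as an argument, the same search loop could be reused with a criterion derived from Factor Analysis (for example, explained variance penalized by the number of indicators), which is again an assumption rather than the criterion used in the paper.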
