Solving hanging relevancy using genetic algorithm

The continuous growth of hanging pages on the Web poses a significant problem for ranking in information retrieval (IR). Excluding these pages from the ranking calculation can give biased or inconsistent results; including them, on the other hand, slows the computation significantly. Most IR ranking algorithms therefore exclude hanging pages. However, there are relevant and important hanging pages on the Web, and they should not be ignored merely because of the computational complexity and time involved. In our proposed method, we include the relevant hanging pages in the ranking. The relevancy or non-relevancy of each hanging page is determined by applying a Genetic Algorithm (GA).
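As a rough illustration of the idea, and not the method of the paper itself, the sketch below first finds the hanging pages of a small link graph (pages with no outgoing links, often also called dangling pages) and then uses a toy genetic algorithm to choose which of them to keep. The abstract does not specify the chromosome encoding, fitness function, or GA operators, so everything here is an assumption made for illustration: the bit-mask encoding, the per-page relevance scores, the fixed inclusion `cost`, and the function names `hanging_pages`, `fitness`, and `evolve` are all hypothetical.

```python
import random


def hanging_pages(links):
    """Return the pages with no outgoing links.

    Assumes every page appears as a key of `links`, mapped to the list
    of pages it links to (an empty list for a hanging page).
    """
    return sorted(page for page, outs in links.items() if not outs)


def fitness(mask, scores, cost):
    # Hypothetical fitness: each included page contributes its relevance
    # score minus a fixed cost that stands in for the extra computation.
    return sum(s - cost for bit, s in zip(mask, scores) if bit)


def evolve(scores, cost=0.5, pop_size=20, generations=40,
           mutation_rate=0.1, seed=0):
    """Toy GA over bit masks: bit i == 1 means 'include hanging page i'."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    n = len(scores)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the fitter half survives unchanged.
        pop.sort(key=lambda m: fitness(m, scores, cost), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randint(1, n - 1) if n > 1 else 0
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, scores, cost))


# Toy link graph: C, E and F have no outgoing links.
links = {"A": ["B", "C"], "B": ["C"], "C": [], "D": ["A", "E"],
         "E": [], "F": []}
hanging = hanging_pages(links)

# Made-up relevance scores for the hanging pages, in the same order.
scores = [0.9, 0.2, 0.7]
best = evolve(scores)
selected = [page for page, bit in zip(hanging, best) if bit]
```

In this sketch, `selected` holds the hanging pages that the evolved mask marks as relevant; only those would be passed on to the ranking computation, while the rest would be excluded as in conventional approaches.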
