Web cartography for online site promotion: an algorithm for clustering Web resources

This paper presents a Web cartography approach for online site promotion. The overall objective is to give users convenient maps that identify candidate sites for creating hyperlinks likely to bring a large flow of targeted visitors. Two main types of data must be considered: texts and hyperlinks. We propose to exploit the latter to construct a relevant corpus to which both semantic and graph analyses can be applied. The emphasis is on clustering Web resources based on the link network, which highlights groups of strongly connected sites that are of the utmost interest for our application. To tackle the site graph partitioning problem, we turn to a promising iterative approach initially developed in the context of computer-aided design. It uses the spectral decomposition of the graph's Laplacian matrix to embed the graph in a geometric space where efficient methods can be applied. The method is implemented by an algorithm adapted from an existing one. Experiments were conducted on a real application case: the promotion of a site dealing with Cognac. We present the resulting map as well as leads for exploiting it.
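
The spectral embedding step mentioned in the abstract can be illustrated with a minimal sketch. It is not the algorithm of the paper: the function name `spectral_embedding`, the toy link graph, the symmetrization of directed hyperlinks, and the final sign-based split are all assumptions made here for illustration. The sketch builds the Laplacian L = D - A of a small site graph, takes the eigenvectors attached to its smallest nonzero eigenvalues as vertex coordinates, and shows that two densely linked groups end up on opposite sides of the first coordinate.

```python
import numpy as np

def spectral_embedding(adjacency, dim):
    """Return an (n, dim) array of vertex coordinates from Laplacian eigenvectors.

    Hypothetical helper: hyperlinks are treated as undirected edges,
    which is a simplifying assumption of this sketch.
    """
    A = np.asarray(adjacency, dtype=float)
    A = np.maximum(A, A.T)           # ignore link direction
    np.fill_diagonal(A, 0.0)         # drop self-links
    L = np.diag(A.sum(axis=1)) - A   # graph Laplacian L = D - A
    # eigh handles symmetric matrices and returns eigenvalues in ascending order
    _, eigenvectors = np.linalg.eigh(L)
    # Skip the first eigenvector (constant when the graph is connected)
    return eigenvectors[:, 1:dim + 1]

# Toy site graph: two triangles of sites joined by a single link (2 -> 3)
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = 1.0

coords = spectral_embedding(A, dim=1)
# Simplest possible use of the embedding: split on the sign of the first coordinate
labels = (coords[:, 0] > 0).astype(int)
print(labels)
```

The sign of an eigenvector is arbitrary, so which group receives label 0 or 1 may differ between runs or platforms; only the grouping itself is meaningful. Using more than one coordinate (`dim > 1`) gives a higher-dimensional embedding to which geometric partitioning methods can then be applied.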
