EXPLORATORY ANALYSIS OF WEB DATA: METHODS, TOOLS AND GEOGRAPHICAL DISTRIBUTION

We propose a methodology for exploring Web data, built on the principles of exploratory data analysis and analytical visualisation of information. Combining the two lets us draw on the strengths of each: we can explore heterogeneous, complex, dynamic systems such as the Web and construct emergent structures and indicators without getting lost. By studying the geographical dimension of a specific Web locality, which is exemplary in many ways, we were able to test our methodology and various visualisation tools, thus validating our theoretical proposals.
