Analyzing broken links on the web of data: An experiment with DBpedia

Linked open data allow interlinking and integrating any kind of data on the web. Links between data sources play a key role because they allow software applications (e.g., browsers, search engines) to operate over the aggregated data space as if it were a single local database. In this data space, DBpedia, a data set containing structured information from Wikipedia, appears to be the central hub; we analyzed the outgoing links from this hub in an effort to discover broken links. The paper reports on an experiment that examines the causes of broken links and proposes some remedies for the problem.
