Agent for Mining of Significant Concepts in DBpedia

DBpedia.org is a community effort to extract structured information from Wikipedia so that it can be queried like a database. This information is made publicly available as RDF triples, which conform to Semantic Web standards. Various applications have been developed to make use of the structured data in DBpedia. This paper applies PageRank analysis to the link structure of DBpedia, using a mining agent to identify significant concepts. The results show that popular concepts tend to be ranked higher than less popular ones. This paper also proposes an alternative view of how PageRank analysis can be applied to the DBpedia link structure, based on special characteristics of Wikipedia. The results show that even concepts with a low PageRank value can serve as a valuable resource for recommending pages in Wikipedia.
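To make the idea of PageRank over a link structure concrete, the following is a minimal sketch of the standard power-iteration form of PageRank, not the mining agent described in this paper. The function name `pagerank`, the toy graph `toy_links`, and the `dbr:` resource names are hypothetical, and the damping factor of 0.85 is the commonly used default rather than a value taken from this work.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             tol: float = 1e-10, max_iter: int = 200) -> Dict[str, float]:
    """Standard PageRank by power iteration over an adjacency mapping."""
    # Collect every node that appears as a source or as a link target.
    node_set = set(links)
    for targets in links.values():
        node_set.update(targets)
    nodes = sorted(node_set)
    n = len(nodes)

    # Start from a uniform distribution.
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        # Each node passes an equal share of its rank along its out-links.
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new[t] += share
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical page-link graph between a few DBpedia-style resources.
toy_links = {
    "dbr:Malaysia": ["dbr:Kuala_Lumpur", "dbr:Asia"],
    "dbr:Kuala_Lumpur": ["dbr:Malaysia"],
    "dbr:Asia": ["dbr:Malaysia"],
    "dbr:Sabah": ["dbr:Malaysia", "dbr:Asia"],
}

ranks = pagerank(toy_links)
for name, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.4f}")
```

In this toy graph the resource with the most incoming links ranks highest, while the one with no incoming links receives only the baseline score, which illustrates the tendency for heavily linked concepts to rank above rarely linked ones.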
