Tagpedia: a Semantic Reference to Describe and Search for Web Resources

The Web is a vast and growing collection of content, and finding and organizing the available data has become a fundamental problem in dealing with information overload. Keyword-based Web searches are currently the preferred means of finding content related to a specific topic. Search engines and collaborative tagging systems make it possible to search for information by associating descriptive keywords with Web resources. Both suffer from inconsistency, which reduces the recall and precision of searches, because of polysemy, synonymy and, more generally, the different lexical forms that can be used to refer to a particular meaning. One way to address, or at least reduce, these problems is to introduce semantics to characterize the contents of Web resources: each resource is described by one or more concepts instead of simple and often ambiguous keywords. To support this task, a global semantic reference resource is fundamental. Building on our past experience with the semantic tagging of Web resources and the SemKey project, we are developing Tagpedia, a general-domain "encyclopedia" of tags, created by mining Wikipedia and semantically structured for generating semantic descriptions of Web content. In this paper, starting from an analysis of the weak points of non-semantic keyword-based Web searches, we introduce our idea of the semantic characterization of Web resources and describe the structure and organization of Tagpedia. We then present our first implementation of Tagpedia and suggest improvements that could be made to exploit its full potential.
