The Snoopy Concept: Fighting heterogeneity in semistructured and collaborative information systems by using recommendations

The collaborative creation and manipulation of semistructured data poses a major problem: structure heterogeneity. The more users enter information, the more heterogeneous its structure becomes. This schema proliferation significantly degrades the performance of querying facilities, because structured, unified access to the data is no longer possible. In this paper we present the Snoopy Concept, a novel approach for collaborative, semistructured information systems in an online environment. It addresses structure heterogeneity by involving the user in the alignment of data at insertion time. This is accomplished by giving users recommendations on how to structure the information they enter. Furthermore, the system encourages users to enter more information by pointing them to missing pieces of information.
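To make the idea of structure recommendations concrete, the following is a minimal sketch that assumes a simple co-occurrence heuristic: properties that frequently appear together with the ones a user has already entered are suggested next. This is an illustrative assumption, not the algorithm of the Snoopy Concept itself; the function names `build_cooccurrence` and `recommend` and the sample property names are hypothetical.

```python
from collections import Counter
from itertools import permutations


def build_cooccurrence(entries):
    """Count, for each property, how often every other property appears with it."""
    cooc = {}
    for props in entries:
        for a, b in permutations(set(props), 2):
            cooc.setdefault(a, Counter())[b] += 1
    return cooc


def recommend(cooc, entered, k=3):
    """Rank properties not yet entered by co-occurrence with the entered ones."""
    scores = Counter()
    for prop in entered:
        for other, count in cooc.get(prop, {}).items():
            if other not in entered:
                scores[other] += count
    # Sort by score (descending), then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


# Hypothetical existing entries, each reduced to its set of property names.
entries = [
    {"name", "population", "country"},
    {"name", "population", "country", "mayor"},
    {"name", "country"},
    {"name", "birthDate"},
]
cooc = build_cooccurrence(entries)
suggestions = recommend(cooc, {"name", "population"})
print(suggestions)
```

Under this heuristic, suggesting property names that other users have already chosen nudges new entries toward a shared schema, and the same ranking doubles as a hint about information that is likely still missing from the entry.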
