Structured data clouding across multiple webs

The variety of web resources available to users for business or personal needs is growing, ranging from fast, short, ready-to-consume news and posts to well-structured, formal ontology instances of the Semantic Web. In this context, users need to quickly retrieve all available prominent information about target entities such as events, people, and situations. In this paper, we introduce the notion of inCloud (information Cloud) and propose an approach to web resource clouding for the construction of inClouds. An inCloud is built for a target entity of interest by distinguishing, also visually, how prominent each retrieved web resource is with respect to the target entity, and by organizing web resources according to their reciprocal levels of closeness. An application of the proposed approach to a collection of real web resources about movies is presented. Applicability and evaluation issues are also discussed.
