A Framework for Conjunctive Query Answering over Distributed Deep Web Information Resources

Deep Web Information Resources (DWIRs) are data that are accessible through web forms but cannot be indexed by search engines. We propose a novel framework for conjunctive query (CQ) answering over a mediated schema whose local resources are DWIRs. To this end, we use techniques from the field of Distributed Information Retrieval (DIR). We discuss a novel approach to automated DWIR sampling, size estimation, and selection, as well as an approach to result list merging.
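
To make the result list merging step concrete, the following is a minimal sketch of one generic DIR-style merging scheme, not the approach proposed in this work. It assumes each resource returns (document id, score) pairs on its own score scale, rescales each list to [0, 1] by min-max normalization, and multiplies by a per-resource weight such as a resource selection score. The names `min_max_normalize`, `merge_result_lists`, and `resource_weights` are hypothetical and introduced only for illustration.

```python
from typing import Dict, List, Tuple

Result = Tuple[str, float]  # (document id, resource-specific score)


def min_max_normalize(results: List[Result]) -> List[Result]:
    """Rescale the scores of one result list to the interval [0, 1]."""
    if not results:
        return []
    scores = [score for _, score in results]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores are equal; assumption: treat them all as top-scored.
        return [(doc, 1.0) for doc, _ in results]
    return [(doc, (score - lo) / (hi - lo)) for doc, score in results]


def merge_result_lists(
    result_lists: Dict[str, List[Result]],
    resource_weights: Dict[str, float],
) -> List[Result]:
    """Merge per-resource result lists into a single ranked list.

    Each normalized score is multiplied by the weight of its resource
    (assumed to default to 1.0 when no weight is given).
    """
    merged: List[Result] = []
    for resource, results in result_lists.items():
        weight = resource_weights.get(resource, 1.0)
        for doc, score in min_max_normalize(results):
            merged.append((doc, weight * score))
    # Sort by descending score; break ties by document id for determinism.
    merged.sort(key=lambda pair: (-pair[1], pair[0]))
    return merged


if __name__ == "__main__":
    lists = {
        "A": [("a1", 10.0), ("a2", 5.0)],
        "B": [("b1", 0.9), ("b2", 0.1)],
    }
    weights = {"A": 1.0, "B": 0.5}
    for doc, score in merge_result_lists(lists, weights):
        print(doc, score)
```

Normalizing before weighting keeps a resource that reports large raw scores from dominating the merged list purely because of its score scale.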
