Answering queries using views: A survey

Abstract. The problem of answering queries using views is to find efficient methods of answering a query using a set of previously defined materialized views over the database, rather than accessing the database relations. The problem has recently received significant attention because of its relevance to a wide variety of data management problems. In query optimization, finding a rewriting of a query using a set of materialized views can yield a more efficient query execution plan. To support the separation of the logical and physical views of data, a storage schema can be described using views over the logical schema. As a result, finding a query execution plan that accesses the storage amounts to solving the problem of answering queries using views. Finally, the problem arises in data integration systems, where data sources can be described as precomputed views over a mediated schema. This article surveys the state of the art on the problem of answering queries using views, and synthesizes the disparate works into a coherent framework. We describe the different applications of the problem, the algorithms proposed to solve it and the relevant theoretical results.
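As a concrete illustration of the query-optimization case described above, the following minimal sketch shows a query and an equivalent rewriting that reads only a materialized view. Everything in it is an assumption made for illustration and is not taken from the survey: the `emp` and `dept` tables, the view name `v_emp_city`, the sample rows, and the use of SQLite, where the materialized view is simulated by storing the view's result as an ordinary table.

```python
import sqlite3

# Hypothetical schema and data, used only to illustrate the idea.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE emp(name TEXT, dept TEXT, salary INTEGER);
CREATE TABLE dept(dept TEXT, city TEXT);
INSERT INTO emp VALUES ('ann', 'db', 120), ('bob', 'db', 90), ('cy', 'os', 110);
INSERT INTO dept VALUES ('db', 'Seattle'), ('os', 'Boston');

-- Simulated materialized view: the join result is stored as a table.
CREATE TABLE v_emp_city AS
  SELECT e.name, e.salary, d.city
  FROM emp e JOIN dept d ON e.dept = d.dept;
""")


def run(sql):
    """Execute a query and return its rows in sorted order."""
    return sorted(cur.execute(sql).fetchall())


# Query posed over the base relations.
original = """
SELECT e.name
FROM emp e JOIN dept d ON e.dept = d.dept
WHERE d.city = 'Seattle' AND e.salary > 100
"""

# Rewriting that accesses only the view, not the base relations.
rewritten = """
SELECT name
FROM v_emp_city
WHERE city = 'Seattle' AND salary > 100
"""

print(run(original))
print(run(rewritten))
```

Both queries return the same rows on this instance because the view retains every column the query filters on or projects. A view that dropped `salary`, for example, could not be used for an equivalent rewriting of this query.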
