Algebras for object-oriented query languages

Algebraic query processing and optimization for relational databases is a proven and reasonably well-understood technology. This thesis presents a data model and query language that were developed in part to facilitate the study of algebraic query processing and optimization for advanced data models. It also continues the evolution of the algebraic paradigm by presenting operators and transformation rules that correspond to this data model and thus extend the paradigm in several directions. Additional research into algebraic constructs is also presented; this work was done in the context of a newer system that provides even more features than the previous one. A comprehensive survey of database algebras is included to help place the work in perspective. Many of the results were obtained in the context of the EXTRA/EXCESS system, which was intended as a test vehicle for the EXODUS extensible database system toolkit. The EXTRA data model includes support for complex objects (via orthogonal type constructors), sharing, and object and value semantics. The EXCESS query language provides facilities for querying and updating EXTRA data, and it can be extended through the addition of procedures and functions for manipulating EXTRA schema types and generic set functions. The algebraic constructs developed for EXTRA/EXCESS include operators and transformations supporting grouping, arrays, references, and multisets. I also propose a new approach to processing and optimizing overridden methods in the presence of multiple inheritance. The thesis presents an intuitive set-theoretic semantics for the domains of object identifiers in the presence of multiple inheritance. I constructively prove that the algebra is equipollent to EXCESS, giving a complete semantics and a translation algorithm for the EXCESS language. 
In the context of a more powerful data model, the thesis describes techniques for handling abstraction in an algebraic setting, differing notions of equality in an algebra and their relation to non-determinism, and algebraic techniques for processing and optimizing some queries over tree structures. Taken as a whole, the techniques and results presented in this thesis indicate that the algebraic approach to query processing and optimization is both feasible and beneficial in object-oriented database systems and beyond.
