Data Base System Performance Prediction Using an Analytical Model (Invited Paper)

Much progress has been made recently in developing strategies for data base design at both the logical and physical levels. Various approaches, some built into automated design aids, produce designs that are known to be "good" (or even "optimal" in some sense). The criteria by which these designs are judged, however, are difficult to relate to the performance measures that matter to computer system managers and data base system users. Such performance measures include device utilizations, transaction throughputs, and the distribution of response times. In this paper, we suggest an overall framework for assessing and predicting the effect of a variety of physical and logical data base design decisions on resource consumption, throughputs, and response times. We use an analytical model based, at the lowest level, on queueing network models. Queueing network models have already proven useful in understanding and predicting performance in many actual computer systems (with and without data base components). At higher levels of the analytical model, we establish a sequence of data base system workload descriptions, each dependent on more performance-related design decisions than the level above it. By analytical techniques, the workload description at one level and a set of design choices are transformed into the workload description at the next lower (more fully specified) level. With this approach, many data base design alternatives can be represented by changes at a single level of the layered model. The design alternatives can be assessed with respect to their effect on a variety of performance measures, including record accesses, block accesses, physical disk transfers, throughputs, and mean response times. The presence of other workload components running concurrently on the same hardware configuration can also be taken into account.
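The layered idea described above can be illustrated with a small sketch. It is not the model developed in the paper; it only shows the general shape under assumed simplifications. A workload stated in record accesses is transformed into block accesses (here with the well-known uniform-access approximation m(1 - (1 - 1/m)^k)), then into physical disk transfers (here with a single assumed buffer hit ratio), and finally into service demands for a closed, single-class queueing network solved by exact mean value analysis. All function names, parameters, and the two-device (CPU plus one disk) configuration are hypothetical.

```python
def expected_block_accesses(records_requested: int, file_blocks: int) -> float:
    """Expected number of distinct blocks touched when `records_requested`
    records are chosen uniformly at random from a file of `file_blocks` blocks.
    Assumption: the approximation m * (1 - (1 - 1/m)**k)."""
    return file_blocks * (1.0 - (1.0 - 1.0 / file_blocks) ** records_requested)


def physical_transfers(block_accesses: float, buffer_hit_ratio: float) -> float:
    """Block accesses that miss the buffer and become disk transfers.
    Assumption: one fixed hit ratio summarizes buffer behavior."""
    return block_accesses * (1.0 - buffer_hit_ratio)


def mva(demands, customers, think_time=0.0):
    """Exact mean value analysis for a closed, single-class network of
    load-independent queueing stations plus an optional terminal think time.
    `demands` holds the total service demand (seconds) per transaction at
    each station. Returns (throughput, response_time, utilizations)."""
    queue = [0.0] * len(demands)  # mean queue lengths with zero customers
    throughput = 0.0
    response = 0.0
    for n in range(1, customers + 1):
        # An arriving customer sees the queue left by n - 1 customers.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        throughput = n / (think_time + response)
        queue = [throughput * r for r in residence]
    utilizations = [throughput * d for d in demands]
    return throughput, response, utilizations


def predict(records_per_txn, file_blocks, buffer_hit_ratio,
            cpu_sec_per_record, disk_sec_per_transfer,
            terminals, think_time_sec):
    """Push one transaction class down through the (hypothetical) layers."""
    blocks = expected_block_accesses(records_per_txn, file_blocks)
    transfers = physical_transfers(blocks, buffer_hit_ratio)
    demands = [records_per_txn * cpu_sec_per_record,   # CPU demand
               transfers * disk_sec_per_transfer]      # disk demand
    throughput, response, util = mva(demands, terminals, think_time_sec)
    return {
        "block_accesses": blocks,
        "disk_transfers": transfers,
        "throughput": throughput,
        "response_time": response,
        "cpu_utilization": util[0],
        "disk_utilization": util[1],
    }
```

A design alternative then corresponds to a change at one layer only. For example, calling `predict` twice with different values of `buffer_hit_ratio` compares two buffering choices while leaving the record-level workload and the hardware parameters untouched.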
