Projecting the performance of decision support workloads on systems with smart storage (SmartSTOR)

Recent developments in both hardware and software have made it worthwhile to consider embedding intelligence in storage to handle general-purpose processing that can be off-loaded from the hosts. In particular, low-cost processing power is now widely available and software can be made robust, secure and mobile. In this paper, we propose a general smart storage (SmartSTOR) architecture in which a processing unit that is coupled to one or more disks can be used to perform such off-loaded processing. A major part of the paper is devoted to understanding the performance potential of the SmartSTOR architecture for decision support workloads. Our analysis suggests that there is a definite performance advantage in using fewer but more powerful processors, a result that bolsters the case for sharing a powerful processor among multiple disks. As for software architecture, we find that the off-loading of database operations that involve only a single relation is not very promising. In order to achieve significant speed-up, we have to consider the off-loading of multiple-relation operations. In general, if embedding intelligence in storage is an inevitable architectural trend, we have to focus on developing parallel software systems that can effectively take advantage of the large number of processing units that will be in the system.
