Benchmarking Data Management Systems: From Traditional Database to Emergent Big Data

The arrival of the big data era has brought novel techniques, systems, and products. How to compare and evaluate different database systems objectively has become a hot research area, much as it was thirty years ago, when database systems were first flourishing. Database benchmarking plays an important role in the development of database systems and has greatly promoted the advancement of database technology. A database benchmark is a set of specifications for evaluating and comparing different database systems; it is intended to reflect the performance gaps between systems objectively and comprehensively, so as to promote technological progress and guide the healthy development of the industry. Database benchmarks are closely tied to application development: applications raise new data management needs, which spark innovative data management theory and give birth to new data management systems, which in turn require appropriate benchmarks for evaluation. Various kinds of database benchmarks exist, including those for relational databases, for non-relational databases (semi-structured data, object-oriented data, streaming data, and spatial data), and, most recently, for big data. Research on big data benchmarking is now gaining momentum. Research on big data is strongly related to application requirements, yet existing work so far cannot fully reflect the distinctive characteristics of big data applications. From a technical point of view, the development of database benchmarks over the past thirty years offers considerable help in developing big data benchmarks, which is the main motivation of this paper. This paper systematically reviews the progress of database benchmarks and points out future directions.