Querying the Internet with PIER

Scalability is one of the current goals of the database research community. The Internet comprises millions of nodes, yet the largest database systems in the world scale to at most a few hundred nodes. Supporting large databases remains a challenge because these systems are limited in their degree of distribution. The main goal is for databases to scale across the Internet, making it easier to develop applications (e.g., e-commerce) on top of them. A scalable database scales in data size, speed, workload, and transaction cost. Three primary factors for operational databases are a huge number of concurrent users, the need for continuous availability, and extremely large volumes of stored data. In what follows, we define some requirements for (linear) scalability:
