Following the 'no one size fits all' philosophy, active research in big data platforms focuses on creating an environment in which multiple 'one-size' systems co-exist and cooperate in the same cluster. Consequently, it has become imperative to provide an integrated management solution that offers a database-centric view of the underlying multi-system environment. We outline the proposal of DBMS+, a database management platform over multiple 'one-size' systems. Our prototype implementation of DBMS+, called Thoth, adaptively chooses a best-fit system based on application requirements. In this demonstration, we propose to showcase Thoth DM, a data management framework for Thoth that consists of a data collection pipeline utility, a data consolidation and dispatcher module, and a warehouse for storing this data. We further introduce the notion of apps: an app is a utility that registers with Thoth DM and interfaces with its warehouse to provide core database management functionalities such as dynamic provisioning of resources, multi-system-aware optimization, tuning of configuration parameters on each system, and selection of data storage and layout schemes.
We will demonstrate Thoth DM in action over Hive, Hadoop, Shark, Spark, and the Hadoop Distributed File System. This demonstration will focus on the following apps: (i) a Dashboard for administration and control, which will let the audience monitor and visualize a database-centric view of the multi-system cluster, and (ii) a Data Layout Recommender, which will allow searching for the optimal data layout in the multi-system setting.
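To make the app model concrete, the following is a minimal sketch of how an app might register with a data management layer and read from its warehouse. It is an illustration under stated assumptions, not Thoth DM's actual interface: the class names `Warehouse` and `ThothDM`, the methods `register`, `collect`, and `run`, the record fields, and the metric values are all hypothetical, and the warehouse is reduced to an in-memory list.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Warehouse:
    """Hypothetical in-memory stand-in for the warehouse."""
    records: List[dict] = field(default_factory=list)

    def store(self, record: dict) -> None:
        self.records.append(record)

    def query(self, system: str) -> List[dict]:
        # Return every record collected from the given system.
        return [r for r in self.records if r["system"] == system]


class ThothDM:
    """Hypothetical registry: apps register and are run against the warehouse."""

    def __init__(self) -> None:
        self.warehouse = Warehouse()
        self.apps: Dict[str, Callable[[Warehouse], object]] = {}

    def register(self, name: str, app: Callable[[Warehouse], object]) -> None:
        self.apps[name] = app

    def collect(self, system: str, metric: str, value: float) -> None:
        # Consolidate a raw measurement into a uniform record and store it.
        self.warehouse.store({"system": system, "metric": metric, "value": value})

    def run(self, name: str) -> object:
        return self.apps[name](self.warehouse)


def dashboard(warehouse: Warehouse) -> Dict[str, int]:
    """Toy dashboard app: count stored records per system."""
    counts: Dict[str, int] = {}
    for record in warehouse.records:
        counts[record["system"]] = counts.get(record["system"], 0) + 1
    return counts


dm = ThothDM()
dm.register("dashboard", dashboard)
dm.collect("Hive", "query_latency_s", 12.5)   # made-up sample values
dm.collect("Spark", "query_latency_s", 3.1)
dm.collect("Hive", "query_latency_s", 9.8)
summary = dm.run("dashboard")
```

In this sketch an app is simply a callable that receives the warehouse, so a layout recommender or a configuration tuner could be registered the same way without changing the collection path.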