The amount of data that we produce and consume is growing exponentially. Increasing use of social media and innovations such as smartphones generate large amounts of data that can yield valuable information if properly managed. These large datasets, popularly known as Big Data, are difficult to manage using traditional computing technologies. New technologies are emerging to address the problem of managing and analyzing Big Data and extracting insights from it. Organizations, however, find it difficult to implement these Big Data technologies effectively because of problems such as a lack of available expertise. Some of the latest innovations in the industry relate to cloud computing and Big Data, and there is significant interest in academia and industry in combining the two to create new technologies that can solve the Big Data problem. Big Data based on cloud computing is an emerging area in computer science, and many vendors are offering their own ideas on the topic. The combination of Big Data technologies and cloud computing platforms has led to the emergence of a new category of technology called Big Data as a Service, or BDaaS. This thesis aims to define the BDaaS service stack and to use it to evaluate a few technologies in the cloud computing ecosystem. The BDaaS service stack provides an effective way to classify Big Data technologies, enabling technology users to evaluate and choose the technology that best meets their requirements. Technology vendors can use the same BDaaS stack to communicate their product offerings more clearly to consumers.

Thesis Advisor: Stuart Madnick
Title: John Norris Maguire Professor of Information Technologies, MIT Sloan School of Management & Professor of Engineering Systems, MIT School of Engineering