Chapter 7 – Big Data Tools and Techniques
This chapter provides a high-level overview of the big data tool ecosystem, using Hadoop as the detailed example. It begins with an overview of high-performance architecture and then discusses how the components of Hadoop and its associated tools address application development and deployment needs. The chapter covers the Hadoop Distributed File System (HDFS), introduces the programming model provided by MapReduce and YARN, and then walks through the associated tools: ZooKeeper (used for synchronization and control), HBase (a table-based data management scheme), Hive (an alternate data management scheme that can be used for data warehousing), Pig (which simplifies application development), and the Mahout libraries (which provide analytic algorithms).