Big Data Analytics with R and Hadoop

Set up an integrated infrastructure of R and Hadoop to turn your data analytics into Big Data analytics.

Overview

- Write Hadoop MapReduce within R
- Learn data analytics with R and the Hadoop platform
- Handle HDFS data within R
- Understand Hadoop streaming with R
- Encode and enrich datasets into R

In Detail

Big data analytics is the process of examining large amounts of data of a variety of types to uncover hidden patterns, unknown correlations, and other useful information. Such information can provide competitive advantages over rival organizations and result in business benefits, such as more effective marketing and increased revenue. New methods of working with big data, such as Hadoop and MapReduce, offer alternatives to traditional data warehousing.

Big Data Analytics with R and Hadoop focuses on techniques for integrating R and Hadoop using tools such as RHIPE and RHadoop. With these tools you can build a data analytics engine that runs analytics algorithms over large datasets in a scalable manner, combining the data analytics operations of R with the MapReduce and HDFS components of Hadoop.

You will start with the installation and configuration of R and Hadoop. Next, you will work through various practical data analytics examples with R and Hadoop. Finally, you will learn how to import and export data between R and various data sources. The book also gives you an accessible introduction to the R and Hadoop connectors: RHIPE, RHadoop, and Hadoop streaming.
What you will learn from this book

- Integrate R and Hadoop via RHIPE, RHadoop, and Hadoop streaming
- Develop and run a MapReduce application with R and Hadoop
- Handle HDFS data from within R using RHIPE and RHadoop
- Run Hadoop streaming and MapReduce with R
- Import and export data between R and various data sources

Approach

Big Data Analytics with R and Hadoop is a tutorial-style book that focuses on the big data tasks that can be achieved by integrating R and Hadoop.

Who this book is written for

This book is ideal for R developers who are looking for a way to perform big data analytics with Hadoop. It is also aimed at those who know Hadoop and want to build intelligent applications over big data with R packages. Basic knowledge of R is helpful.
