Development of Multiple Big Data Processing Platforms for Business Intelligence

The crucial problem in integrating different platforms is how to reconcile their distinct computing features while assigning each query command to the platform best suited to execute it. For business intelligence (BI), this paper introduces the integration of the RHadoop and SparkR platforms into a high-performance multiple big data processing platform that carries out rapid data retrieval and data analysis with R programming. The goal is to design an optimized job scheduling scheme and to implement optimized platform selection in order to greatly improve the response time of data analysis. In addition, we propose a simple and straightforward way for users to issue R commands, rather than writing Java or Scala programs, to perform data retrieval or data analysis on the platforms. The results show that optimized platform selection significantly reduces the execution time of data retrieval and data analysis, and that scheduling optimization further improves system efficiency.
