Big data predictive analysis: Using R analytical tool

Data that is large in volume and variety is generally termed big data. Because big data is difficult to analyze with traditional data processing techniques, many new tools and techniques have emerged to support result-oriented big data analysis. In this paper, big data is analyzed using RStudio, an advanced and effective data processing tool, to build a predictive model from the results of the analysis. Two algorithms, Random Forest (RF) and Latent Dirichlet Allocation (LDA), are applied through R packages to obtain more concrete results. To demonstrate the model in operation, the author performs a case study that analyzes fertility-related big data and derives a predictive model that helps foretell certain possibilities well in advance.
