Big Data Analytics: A Comparison of Tools and Applications

With the ever-increasing volume and variety of data, traditional data processing tools have become unsuitable for the big data context. This has driven the creation of dedicated processing tools that are well aligned with emerging needs. However, it is often hard to choose an adequate solution, as the wide list of available tools is continuously changing. To address this, we present in this paper both a literature review and a technical comparison of the best-known analytics tools, in order to help map them to different needs. Moreover, we underline how important choosing the appropriate tool is for different kinds of applications, especially in smart city environments.
