Big data analytics meets social media: A systematic review of techniques, open issues, and future directions

Social Networking Services (SNSs) connect people worldwide, who communicate by sharing content, photos, and videos, posting first-hand opinions and comments, and following their friends. Social network data are characterized by velocity, volume, value, variety, and veracity, the 5V’s of big data. Hence, big data analytic techniques and frameworks are commonly exploited in Social Network Analysis (SNA). With the ever-increasing growth of social networks, the analysis of social data to describe and find communication patterns among users and to understand their behaviors has attracted much attention. In this paper, we demonstrate how big data analytics meets SNA and provide a comprehensive review of big data analytic approaches in social networks, covering studies published between 2013 and August 2020 and identifying 74 papers. The findings are presented in terms of main journals/conferences, yearly distributions, and the distribution of studies among publishers. Furthermore, the big data analytic approaches are classified into two main categories: content-oriented approaches and network-oriented approaches. The main ideas, evaluation parameters, tools, evaluation methods, advantages, and disadvantages of these approaches are also discussed in detail. Finally, open challenges and future directions worth further investigation are discussed.
