Big data and social media: A scientometrics analysis

Article history: Received: October 29, 2018; Received in revised format: January 21, 2019; Accepted: February 8, 2019; Available online: February 9, 2019

The purpose of this research is to investigate the status and evolution of scientific studies on the effect of social networks on big data and on the use of big data for modeling the behavior of social network users. This paper presents a comprehensive review of the studies associated with big data in social media. The study uses the Scopus database as its primary search engine and covers 2000 highly cited articles published over the period 2012-2019. The records are statistically analyzed and categorized according to different criteria. The findings show that research output has grown exponentially since 2014 and that the trend has continued at a relatively stable rate. Based on the survey, “decision support systems” is the keyword with the highest density, followed by “heuristic methods”. Among the most cited articles, papers published by researchers in the United States have received the most citations (7548), followed by the United Kingdom (588) and China (543). Thematic analysis shows that the subject has largely remained an important and well-developed research field; for better results, future research could be combined with “big data analytics” and “Twitter”, which are important topics in this field but are not yet well developed.

© 2019 by the authors; licensee Growing Science, Canada.


[73]  Opher Etzion,et al.  Event processing under uncertainty , 2012, DEBS.

[74]  Zizi Papacharissi Affective publics and structures of storytelling: sentiment, events and mediality , 2016 .

[75]  Daniel Fried,et al.  Analyzing the language of food on social media , 2014, 2014 IEEE International Conference on Big Data (Big Data).

[76]  A. Bruns,et al.  The Arab Spring and Social Media Audiences , 2013 .

[77]  Matthew Leighton Williams,et al.  Cyber Hate Speech on Twitter: An Application of Machine Classification and Statistical Modeling for Policy and Decision Making , 2015 .

[78]  Stefan Stieglitz,et al.  Social media analytics - Challenges in topic discovery, data collection, and data preparation , 2018, Int. J. Inf. Manag..

[79]  H. Christensen,et al.  Detecting suicidality on Twitter , 2015 .

[80]  Athanasios V. Vasilakos,et al.  Big data: From beginning to future , 2016, Int. J. Inf. Manag..

[81]  Christopher M. Danforth,et al.  Twitter reciprocal reply networks exhibit assortativity with respect to happiness , 2011, J. Comput. Sci..

[82]  Gianmarco De Francisci Morales SAMOA: a platform for mining big data streams , 2013, WWW '13 Companion.

[83]  Hyun Suk Kim Attracting Views and Going Viral: How Message Features and News-Sharing Channels Affect Health News Diffusion. , 2015, The Journal of communication.

[84]  Susan E. Sheridan,et al.  Patient-powered research networks aim to improve patient care and health research. , 2014, Health affairs.

[85]  Pekka Pääkkönen,et al.  Evaluating the Quality of Social Media Data in Big Data Architecture , 2015, IEEE Access.

[86]  Ramesh C. Jain,et al.  Situation recognition: an evolving problem for heterogeneous dynamic big multimedia data , 2012, ACM Multimedia.

[87]  Ming Yang,et al.  Filtering big data from social media - Building an early warning system for adverse drug reactions , 2015, J. Biomed. Informatics.

[88]  Ravikiran Vatrapu,et al.  Social Data Analytics Tool (SODATO) , 2014, DESRIST.

[89]  Wei Jiang,et al.  Using Social Media to Detect Outdoor Air Pollution and Monitor Air Quality Index (AQI): A Geo-Targeted Spatiotemporal Analysis Framework with Sina Weibo (Chinese Twitter) , 2015, PloS one.

[90]  M. J. O’Brien,et al.  Mapping collective behavior in the big-data era , 2014, Behavioral and Brain Sciences.

[91]  Shah Jahan Miah,et al.  A Big Data Analytics Method for Tourist Behaviour Analysis , 2017, Inf. Manag..

[92]  Jinjun Chen,et al.  Authorized Public Auditing of Dynamic Big Data Storage on Cloud with Efficient Verifiable Fine-Grained Updates , 2014, IEEE Transactions on Parallel and Distributed Systems.

[93]  Shaowen Wang,et al.  A scalable framework for spatiotemporal analysis of location-based social media data , 2014, Comput. Environ. Urban Syst..

[94]  Massimo Aria,et al.  bibliometrix: An R-tool for comprehensive science mapping analysis , 2017, J. Informetrics.

[95]  Gang Hua,et al.  Multimedia Big Data Computing , 2015, IEEE Multim..

[96]  Zeynep Tufekci,et al.  Big Questions for Social Media Big Data: Representativeness, Validity and Other Methodological Pitfalls , 2014, ICWSM.

[97]  M M Hansen,et al.  Big Data in Science and Healthcare: A Review of Recent Literature and Perspectives , 2014, Yearbook of Medical Informatics.

[98]  W. R. Neuman,et al.  The Dynamics of Public Attention: Agenda‐Setting Theory Meets Big Data , 2014 .

[99]  Robert K. Cunningham,et al.  Computing on masked data: a high performance method for improving big data veracity , 2014, 2014 IEEE High Performance Extreme Computing Conference (HPEC).

[100]  R. Procter,et al.  Reading the riots on Twitter: methodological innovation for the analysis of big data , 2013 .

[101]  Haoran Xie,et al.  Exploring personalized searches using tag-based user profiles and resource profiles in folksonomy , 2014, Neural Networks.

[102]  N Peek,et al.  Technical Challenges for Big Data in Biomedicine and Health: Data Sources, Infrastructure, and Analytics , 2014, Yearbook of Medical Informatics.

[103]  Diansheng Guo,et al.  Understanding U.S. regional linguistic variation with Twitter data analysis , 2016, Comput. Environ. Urban Syst..

[104]  Harvey J. Miller,et al.  Beyond sharing: cultivating cooperative transportation systems through geographic information science , 2013 .

[105]  Alessandro Vespignani,et al.  The Twitter of Babel: Mapping World Languages through Microblogging Platforms , 2012, PloS one.

[106]  Estela Marine-Roig,et al.  Tourism analytics with massive user-generated content: a case study of Barcelona. , 2015 .

[107]  Jeffrey T. Hancock,et al.  Experimental evidence of massive-scale emotional contagion through social networks , 2014, Proceedings of the National Academy of Sciences.

[108]  Bin Jiang,et al.  The Evolution of Natural Cities from the Perspective of Location-Based Social Media , 2014, Digital Social Networks and Travel Behaviour in Urban Environments.

[109]  Mimmo Parente,et al.  Time Aware Knowledge Extraction for microblog summarization on Twitter , 2015, Inf. Fusion.

[110]  Ruha Benjamin,et al.  The rise of the platform economy , 2016 .

[111]  Ranga Raju Vatsavai,et al.  Spatiotemporal data mining in the era of big spatial data: algorithms and applications , 2012, BigSpatial '12.

[112]  Bartosz Krawczyk,et al.  Learning from imbalanced data: open challenges and future directions , 2016, Progress in Artificial Intelligence.

[113]  Marc Lipsitch,et al.  Inference of seasonal and pandemic influenza transmission dynamics , 2015, Proceedings of the National Academy of Sciences.

[114]  A.P.J. van den Bosch,et al.  Dealing with big data: The case of Twitter , 2013, CLIN 2013.

[115]  J. Dijck Datafication, dataism and dataveillance: Big Data between scientific paradigm and ideology , 2014 .

[116]  Dylan B. George,et al.  Big Data Opportunities for Global Infectious Disease Surveillance , 2013, PLoS medicine.

[117]  Julie Uldam,et al.  Corporate management of visibility and the fantasy of the post-political: Social media and surveillance , 2016, New Media Soc..

[118]  Matthew Smith,et al.  Big data privacy issues in public social media , 2012, 2012 6th IEEE International Conference on Digital Ecosystems and Technologies (DEST).

[119]  Luis Fernández-Luque,et al.  Health and Social Media: Perfect Storm of Information , 2015, Healthcare informatics research.

[120]  Axel Bruns,et al.  Faster than the speed of print: Reconciling 'big data' social media analysis and academic scholarship , 2013, First Monday.

[121]  Dimitrios Buhalis,et al.  SoCoMo marketing for travel and tourism: Empowering co-creation of value , 2015 .

[122]  M. Kosinski,et al.  Computer-based personality judgments are more accurate than those made by humans , 2015, Proceedings of the National Academy of Sciences.

[123]  Gonzalo Mateos,et al.  Stochastic Approximation vis-a-vis Online Learning for Big Data Analytics [Lecture Notes] , 2014, IEEE Signal Processing Magazine.

[124]  M. Tsou,et al.  Research challenges and opportunities in mapping social media and Big Data , 2015 .

[125]  Z. Schwartz,et al.  What can big data and text analytics tell us about hotel guest experience and satisfaction , 2015 .

[126]  Zhiyong Lu,et al.  Crowdsourcing in biomedicine: challenges and opportunities , 2016, Briefings Bioinform..

[127]  Kristopher Welsh,et al.  The danger of big data: Social media as computational social science , 2012, First Monday.

[128]  Gregory J. Park,et al.  From "Sooo excited!!!" to "So proud": using language to study development. , 2014, Developmental psychology.

[129]  Hamid Bagheri,et al.  Big Data: Challenges, Opportunities and Cloud Based Solutions , 2015 .

[130]  Kevin Driscoll,et al.  Big Data, Big Questions| Working Within a Black Box: Transparency in the Collection and Production of Big Twitter Data , 2014 .

[131]  Satish V. Ukkusuri,et al.  Urban activity pattern classification using topic models from online geo-location data , 2014 .

[132]  Wenli Zhang,et al.  Predicting Asthma-Related Emergency Department Visits Using Big Data , 2015, IEEE Journal of Biomedical and Health Informatics.

[133]  Nick Couldry,et al.  Understanding micro-processes of community building and mutual learning on Twitter: a ‘small data’ approach , 2014 .

[134]  M. Williams,et al.  Cyber-hate on social media in the aftermath of Woolwich , 2015 .

[135]  Keyuan Jiang,et al.  Mining Twitter Data for Potential Drug Effects , 2013, ADMA.

[136]  Karin M. Verspoor,et al.  Big Data in Medicine Is Driving Big Changes , 2014, Yearbook of Medical Informatics.

[137]  Scott A. Golder,et al.  Digital Footprints: Opportunities and Challenges for Online Social Research , 2014 .

[138]  Hernán A. Makse,et al.  Collective Influence Algorithm to find influencers via optimal percolation in massively large social media , 2016, Scientific Reports.

[139]  L. Ohno-Machado,et al.  “Big Data” and the Electronic Health Record , 2014, Yearbook of Medical Informatics.

[140]  Gerardo Chowell,et al.  Big Data for Infectious Disease Surveillance and Modeling. , 2016, The Journal of infectious diseases.

[141]  David Allen,et al.  Geotagging one hundred million Twitter accounts with total variation minimization , 2014, 2014 IEEE International Conference on Big Data (Big Data).

[142]  Johnny S. Wong,et al.  A Brief Review on Leading Big Data Models , 2014, Data Sci. J..

[143]  B. Lewis,et al.  Methods of using real-time social media technologies for detection and remote monitoring of HIV outcomes. , 2014, Preventive medicine.

[144]  P. Leeflang,et al.  Challenges and solutions for marketing in a digital era , 2014 .

[145]  M. Tseng,et al.  Toward Sustainability : Using Big Data to Explore Decisive Attributes of Supply Chain Risks and Uncertainties , 2017 .

[146]  Paulo B. Góes,et al.  Business Intelligence and Analytics Education, and Program Development: A Unique Opportunity for the Information Systems Discipline , 2012, TMIS.

[147]  Jeremy Kepner,et al.  D4M 2.0 schema: A general purpose high performance schema for the Accumulo database , 2013, 2013 IEEE High Performance Extreme Computing Conference (HPEC).