Big Data for Development: A Review of Promises and Challenges

The article uses a conceptual framework to review empirical evidence and some 180 articles related to the opportunities and threats of Big Data Analytics for international development. The advent of Big Data delivers a cost-effective prospect for improved decision-making in critical development areas such as healthcare, economic productivity and security. At the same time, the well-known caveats of the Big Data debate, such as privacy concerns and human resource scarcity, are aggravated in developing countries by long-standing structural shortages in the areas of infrastructure, economic resources and institutions. The result is a new kind of digital divide: a divide in the use of data-based knowledge to inform intelligent decision-making. The article systematically reviews several available policy options in terms of fostering opportunities and minimising risks.
