AAPOR Report on Big Data

In recent years we have seen an increase in the amount of statistics in society that describe different phenomena based on so-called Big Data. The term Big Data is used for a variety of data, as explained in the report, many of them characterized not just by their large volume, but also by their variety and velocity, the organic way in which they are created, and the new types of processes needed to analyze them and make inferences from them. The changes in the nature of the new types of data, their availability, and the way in which they are collected and disseminated are fundamental. These changes constitute a paradigm shift for survey research.
