Big Data Methods

Advances in data science, such as data mining, data visualization, and machine learning, are extremely well suited to addressing numerous questions in the organizational sciences, given the explosion of available data. Despite these opportunities, few scholars in our field have discussed the specific ways in which the lens of our science should be brought to bear on the topic of big data, or big data's reciprocal impact on our science. The purpose of this paper is to provide an overview of the big data phenomenon and its potential to affect organizational science in both positive and negative ways. We identify the biggest opportunities afforded by big data along with the biggest obstacles, and we discuss specifically how we think our methods will be most affected by the data analytics movement. We also provide a list of resources to help interested readers incorporate big data methods into their existing research. Our hope is that we stimulate interest in big data, motivate future research using big data sources, and encourage the application of associated data science techniques more broadly in the organizational sciences.
