Big data, agents, and machine learning: towards a data-driven agent-based modeling approach

We have recently witnessed the proliferation of large-scale behavioral data that can be used to empirically develop agent-based models (ABMs). Despite this opportunity, the literature has not offered a structured agent-based modeling approach for producing agents, or their parts, directly from data. In this paper, we present initial steps towards an agent-based modeling approach that uses individual-level data to generate agent behavioral rules and initialize agent attribute values. We present a structured way to integrate Big Data and machine learning techniques at the level of the individual agent. We also describe a conceptual use-case study of an urban mobility simulation driven by millions of geo-tagged Twitter messages. We believe our approach will advance the state of the art in developing and validating empirical ABMs. Further work is needed to assess data suitability, to compare with other approaches, to standardize data collection, and to serve all these features in near real time.
