A deep learning framework to generate realistic population and mobility data

Census and Household Travel Survey datasets are regularly collected from households and individuals and provide information on their daily travel behavior along with their demographic and economic characteristics. These datasets have important applications ranging from travel demand estimation to agent-based modeling. However, they often cover only a limited sample of the population because of privacy concerns, or are released only in aggregated form. Synthetic data augmentation is a promising avenue for addressing these challenges. In this paper, we propose a framework to generate a synthetic population that includes both socioeconomic features (e.g., age, sex, industry) and trip chains (i.e., activity locations). Our model is tested and compared with other recently proposed models on multiple assessment metrics.
