Population Synthesis Based on Joint Distribution Inference Without Disaggregate Samples

A synthetic population is a fundamental input to dynamic micro-simulation in social applications. Building on a review of the current major approaches, this paper presents a new sample-free synthesis method that infers the joint distribution of the total target population. Convergence of the multivariate Iterative Proportional Fitting used in our method is also proved theoretically. The method, together with existing ones, is applied to generate a nationwide synthetic population database of China using its overall cross-classification tables as well as a census sample. The marginal and partial joint distribution consistencies of each database are compared and evaluated quantitatively. The results show that sample-based methods perform better on marginal indicators, while sample-free methods match partial distributions more precisely. Among the five methods compared, our proposed method significantly reduces the computational cost of generating large-scale synthetic populations. An open-source C# implementation of the population synthesizer used in this research is available at https://github.com/PeijunYe/PopulationSynthesis.git.
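To make the idea of fitting a joint distribution to cross-classification tables concrete, the following is a minimal sketch of multivariate Iterative Proportional Fitting in Python. It is not the authors' C# implementation and does not reproduce the paper's full inference procedure. The function name `ipf_3d`, the table names `target_ab` and `target_bc`, and the toy counts are all hypothetical, and the uniform seed is an assumption standing in for the absence of a disaggregate sample.

```python
def ipf_3d(target_ab, target_bc, n_iter=50, tol=1e-9):
    """Fit a 3-way table t[a][b][c] to two partial 2-way margins.

    target_ab[a][b] is the known A x B cross-classification table,
    target_bc[b][c] is the known B x C table. Both must agree on the
    totals of the shared attribute B. The seed is uniform (assumption).
    """
    A, B, C = len(target_ab), len(target_ab[0]), len(target_bc[0])
    t = [[[1.0] * C for _ in range(B)] for _ in range(A)]
    for _ in range(n_iter):
        # Step 1: rescale so that summing over c reproduces target_ab.
        for a in range(A):
            for b in range(B):
                s = sum(t[a][b])
                f = target_ab[a][b] / s if s > 0 else 0.0
                for c in range(C):
                    t[a][b][c] *= f
        # Step 2: rescale so that summing over a reproduces target_bc.
        for b in range(B):
            for c in range(C):
                s = sum(t[a][b][c] for a in range(A))
                f = target_bc[b][c] / s if s > 0 else 0.0
                for a in range(A):
                    t[a][b][c] *= f
        # Stop once the A x B margin is still matched after step 2.
        err = max(abs(sum(t[a][b]) - target_ab[a][b])
                  for a in range(A) for b in range(B))
        if err < tol:
            break
    return t


# Toy, made-up tables; both give B totals of 40 and 60.
target_ab = [[30.0, 20.0], [10.0, 40.0]]
target_bc = [[25.0, 15.0], [20.0, 40.0]]
fitted = ipf_3d(target_ab, target_bc)
```

Each cell of `fitted` can then be read as an expected count for one attribute combination, from which individuals could be drawn. With a uniform seed and only these two tables, the result treats A and C as conditionally independent given B, which is why additional tables or a sample matter in practice.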
