Advances in population synthesis: fitting many attributes per agent and fitting to household and person margins simultaneously

Agent-based microsimulation models of transportation, land use or other socioeconomic processes require an initial synthetic population derived from census data, conventionally created using the iterative proportional fitting (IPF) procedure. This paper introduces a novel computational method that allows the synthesis of many more attributes and finer attribute categories than previous approaches; limits on both the number of attributes and the fineness of categories are long-standing problems discussed in the literature. Additionally, a new approach is used to fit household and person zonal attribute distributions simultaneously. This technique was first adopted to address limitations specific to Canadian census data, but could also be useful in U.S. and other applications. The results of each new method are evaluated empirically in terms of goodness-of-fit.
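For readers unfamiliar with the conventional IPF procedure mentioned above, the following is a minimal two-dimensional sketch in Python. It is not the method introduced in this paper: it illustrates only the standard alternating row and column scaling of a seed table toward known marginal totals. The function name `ipf_2d`, its parameters, and the example numbers are hypothetical and chosen purely for illustration.

```python
def ipf_2d(seed, row_targets, col_targets, max_iter=100, tol=1e-9):
    """Fit a 2-D seed table to target row and column totals by IPF.

    Illustrative sketch only: assumes the row and column targets share
    the same grand total and the seed has no all-zero row or column
    that carries a positive target.
    """
    table = [[float(v) for v in row] for row in seed]
    for _ in range(max_iter):
        # Step 1: scale each row so that it sums to its target.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                factor = target / total
                table[i] = [v * factor for v in table[i]]
        # Step 2: scale each column so that it sums to its target.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                factor = target / total
                for row in table:
                    row[j] *= factor
        # Columns now match exactly; stop once the rows also match.
        max_err = max(abs(sum(table[i]) - t)
                      for i, t in enumerate(row_targets))
        if max_err < tol:
            break
    return table


# Hypothetical example: a 2x2 sample cross-tabulation (seed) fitted to
# zonal margins for two attributes.
fitted = ipf_2d([[1, 2], [3, 4]], row_targets=[40, 60], col_targets=[35, 65])
```

The fitted table reproduces both sets of margins while keeping the association structure (the cross-product ratio) of the seed. A dense table like this one has one cell per combination of categories, so it grows multiplicatively as attributes or categories are added, which is one way to see why the number of attributes and the fineness of categories have been limiting factors.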
