Neighborhoods are defined as geographical areas containing similar populations and roughly homogeneous housing markets. Research defining boundaries for neighborhoods and submarkets (groups of homogeneous neighborhoods) is motivated largely by automated valuation models (AVMs). With AVMs, an accurate estimate of house value requires hedonic regressions using transactions from neighborhoods or submarkets. Most previous work on this topic focuses on aggregating predefined geographical units into larger homogeneous markets. For example, Bourassa, Hamelink, Hoesli and MacGregor (1999) use Australian local government areas (LGAs), the analogue of Census tracts in the United States, as the basic unit of observation. They then apply principal components analysis and cluster analysis to individual transactions and to socioeconomic characteristics of the LGAs in order to aggregate them into larger neighborhoods. Similarly, Goodman and Thibodeau (1998) developed methods for aggregating contiguous school districts into larger areas. We attack a complementary problem: how to define the boundaries of the basic geographic units that can then be aggregated into submarkets. There are several reasons to question predefined boundaries for LGAs, Census tracts or school districts:
1. The boundaries of the tracts were not drawn using consistent and well-measured criteria.
2. The government changes the boundaries of tracts infrequently, whereas population and socioeconomic change occurs continuously.
3. School district boundaries change over time in ways that are difficult to quantify.
4. School district boundaries are made less relevant by magnet schools and inclusionary busing.
This paper uses individual transactions and a new statistical technique known as CART (classification and regression trees) to define the optimal number of neighborhoods and the boundaries of these neighborhoods.
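To make the idea of tree-based boundaries concrete, the following is a minimal sketch of recursive partitioning on property coordinates, written in plain Python. It is an illustration under stated assumptions, not the paper's implementation: it omits the cost-complexity pruning that CART (Breiman et al., 1984) uses to choose tree size and substitutes a hypothetical `min_gain` stopping threshold. The function names (`sse`, `best_split`, `partition`) and the toy grid of "transactions" are invented for this example.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(points, min_leaf):
    """Return (gain, axis, threshold) for the best axis-aligned split, or None.

    points: list of (x, y, value) tuples; axis 0 is x, axis 1 is y.
    gain is the reduction in within-group sum of squares.
    """
    parent = sse([p[2] for p in points])
    best = None
    for axis in (0, 1):
        ordered = sorted(points, key=lambda p: p[axis])
        for i in range(min_leaf, len(ordered) - min_leaf + 1):
            lo, hi = ordered[i - 1][axis], ordered[i][axis]
            if lo == hi:
                continue  # identical coordinates cannot be separated
            left = sse([p[2] for p in ordered[:i]])
            right = sse([p[2] for p in ordered[i:]])
            gain = parent - left - right
            if best is None or gain > best[0]:
                best = (gain, axis, (lo + hi) / 2.0)
    return best


def partition(points, min_leaf=20, min_gain=1.0):
    """Recursively split the plane; return a list of leaves (lists of points).

    Each leaf is a rectangle in coordinate space and plays the role of one
    candidate neighborhood. min_gain is an assumed stopping rule, not CART's
    pruning procedure.
    """
    split = best_split(points, min_leaf) if len(points) >= 2 * min_leaf else None
    if split is None or split[0] < min_gain:
        return [points]
    _, axis, threshold = split
    left = [p for p in points if p[axis] <= threshold]
    right = [p for p in points if p[axis] > threshold]
    return partition(left, min_leaf, min_gain) + partition(right, min_leaf, min_gain)


# Toy data (an assumption): a 10 x 10 grid of sales in the unit square, with
# values near 100 west of x = 0.5 and near 200 east of it.
toy = []
for i in range(10):
    for j in range(10):
        x, y = (i + 0.5) / 10, (j + 0.5) / 10
        base = 100.0 if x < 0.5 else 200.0
        toy.append((x, y, base + ((7 * i + 3 * j) % 5) - 2))

leaves = partition(toy, min_leaf=10, min_gain=5000.0)
for leaf in leaves:
    xs = [p[0] for p in leaf]
    print(len(leaf), min(xs), max(xs))
```

In this sketch the number of leaves, and therefore the number of neighborhoods, is determined by the data and the stopping rule rather than fixed in advance. A full CART analysis would grow a large tree and prune it back instead of using a fixed threshold.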
The number of neighborhoods defined by CART, and the locations of their boundaries, can be compared to Census tracts. In the town tested (West Hartford, Connecticut), there are substantially fewer CART boundaries than Census tract boundaries. Moreover, the boundaries defined by CART make more sense: for example, the boundary lines run behind the houses rather than down the middle of the street. Furthermore, these boundaries can be shown to reduce residual variation when compared to Census tracts. Finally, GIS greatly simplifies the task of allocating Census tract demographic characteristics to the newly defined neighborhoods.

Defining Neighborhood Boundaries: The Use of Transactions Data
November 17, 2003

The ordinary language definition of a neighborhood can be approximated by the dictionary definition: “a community, district or area ... (an old neighborhood). The people living near one another” (Webster’s New World College Dictionary, third edition, 1988). Census geography (tracts, blocks and/or block groups) may be viewed as being based on several criteria designed to implement the ordinary language definition of neighborhood. These criteria are homogeneity (similarity in housing and in demographic characteristics); simplicity (fewer neighborhoods in a given geographic area are preferred to more); and contiguity (a neighborhood is a closed geographical area). This concept of neighborhood is somewhat at odds with a large part of urban economic theory. The theory emphasizes continuous distributions over space for accessibility to important points of interest such as the central business district (CBD) or subcenters. In general, the value of urban land is given by its accessibility to other land uses. There may be a large number of accessibility factors, such as recreation, social activities, open space, important public buildings or an odoriferous factory.
The time and money cost of transportation from the subject parcel to all of these land uses defines its “location value.” Thus, location value is best viewed as a continuously changing surface rather than as a set of areas separated by known boundaries.

1 More generally, ordinary language philosophy holds that the meaning of a word is given by its use and context in a language (see J.L. Austin, 1962). A good overview of this concept is in J.L. Austin, 1964, chapter X, especially pp. 117-124.
2 Goodman (1981, p. 117) has an excellent discussion of these criteria and of their roots in earlier academic literature. However, Census definitions are less precise. See the discussion in Section 2.
3 Details on this point are provided in the literature review.

Recently, street addresses for transactions data have been geocoded; that is, latitude and longitude coordinates are available for each transaction. This has spawned several econometric techniques designed to estimate the value surface predicted by theory, notably the spatial-temporal autoregressive (STAR) model, spatial autoregressive techniques (SAR and CAR), and the local regression model (LRM) (see Pace et al., 2000; Clapp et al., 1998; and Clapp, 2003).

The concept of neighborhood remains viable because it is necessary for the purposes of data collection, notably by the Bureau of the Census. Confidentiality prevents release of individual household records, so neighborhood boundaries are used as a unit of spatial aggregation. Moreover, a considerable hedonic pricing literature has emphasized the need for concepts of neighborhood and submarket in order to allow implicit prices of property characteristics to vary over space (see Straszheim, 1975; Schnare and Struyk, 1976; Dale-Johnson, 1982; Goodman and Dubin, 1990; Bourassa et al., 1997; and Goodman and Thibodeau, 1998).
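One conventional way to write a hedonic specification whose implicit prices vary over space is sketched below. The notation and the log-linear form are assumptions introduced here for illustration and are not taken from the paper: $P_i$ is the sale price of property $i$, $x_{ik}$ is its $k$-th characteristic, $s(i)$ is the submarket containing property $i$, and $\beta_{k,s}$ is the implicit price of characteristic $k$ in submarket $s$.

```latex
\ln P_i = \alpha_{s(i)} + \sum_{k=1}^{K} \beta_{k,s(i)}\, x_{ik} + \varepsilon_i
```

Under this sketch, pooling two areas into one submarket amounts to imposing $\beta_{k,s} = \beta_{k,s'}$ for every characteristic $k$, which is why the choice of boundaries affects the estimated implicit prices.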
A submarket is defined here as a group of neighborhoods, not necessarily contiguous, where the implicit prices of property characteristics are roughly constant. A submarket differs from a neighborhood because it is a larger geographic area composed of groups of neighborhoods. The purpose of this study is to address a fundamental issue in the literature on submarkets: how are the boundaries of the neighborhoods that make up the submarkets defined? Thus, this study is complementary to the submarket literature because it develops a method for estimating the boundaries of a fundamental building block for
[1] J. L. Goodman. Aggregation of Local Housing Markets, 1998.
[2] Martin Hoesli, et al. Defining Housing Submarkets, 1999.
[3] J. O. Urmson, et al. How to Do Things with Words: The William James Lectures, 1963.
[4] Sandra E. Black. Do Better Schools Matter? Parental Valuation of Elementary Education, 1999.
[5] Brian A. Cromwell, et al. How Much Is a Neighborhood School Worth, 2000.
[6] Burton H. Singer, et al. Recursive Partitioning in the Health Sciences, 1999.
[7] P. Robinson. Root-N-Consistent Semiparametric Regression, 1988.
[8] Ronald P. Barry, et al. Spatiotemporal Autoregressive Models of Neighborhood Effects, 1998.
[9] Hyon-Jung Kim, et al. Spatial Prediction of House Prices Using LPR and Bayesian Smoothing, 2001.
[10] D. Weimer, et al. School Performance and Housing Values: Using Non-Contiguous District and Incorporation Boundaries to Identify School Effects, 2001, National Tax Journal.
[11] J. Stock. Nonparametric Policy Analysis, 1989.
[12] Bradford Case, et al. Modeling Spatial and Temporal House Price Patterns: A Comparison of Four Models, 2004.
[13] Leo Breiman, et al. Classification and Regression Trees, 1984.
[14] Alan E. Gelfand, et al. Predicting Spatial Patterns of House Prices Using LPR and Bayesian Smoothing, 2002.
[15] A. Goodman. Hedonic Prices, Price Indices and Housing Markets, 1978.
[16] Allen C. Goodman, et al. Housing Submarkets Within Urban Areas: Definitions and Evidence, 1981.
[17] R. Struyk, et al. Segmentation in Urban Housing Markets, 1976.
[18] Robert M. Kozelka. Elements of Statistical Inference, 1961.
[19] T. Thibodeau, et al. Housing Market Segmentation, 1998.
[20] Donald R. Haurin, et al. School Quality and Real House Prices: Inter- and Intrametropolitan Effects, 1996.
[21] M. Straszheim, et al. An Econometric Analysis of the Urban Housing Market, 1975.
[22] Dennis Epple, et al. Estimating Equilibrium Models of Local Jurisdictions, 1998, Journal of Political Economy.
[23] Ronald P. Barry, et al. A Method for Spatial-Temporal Forecasting with an Application to Real Estate Prices, 2000.
[24] Peter M. Mieszkowski, et al. Racial Discrimination, Segregation, and the Price of Housing, 1973, Journal of Political Economy.
[25] J. Clapp. A Semiparametric Method for Valuing Residential Location, 2001.