Identifying spatial concentrations of surnames

Surnames (family names) have been overlooked as a valuable source of spatially referenced population data. This paper presents a methodology, based on kernel density estimation, for identifying the areas of Great Britain where any given surname is most concentrated. The results indicate not only a surname's geographic origin within the country but also its current spatial extent and its spatial relationship with other surnames and place names. We argue that such analysis can provide baseline and change measures, and an empirical basis for forecasting change. It also offers valuable insights into national, regional and local changes in population structure, and attests to the relevance of GIScience to population genetics, historical geography and genealogy.
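The idea of using kernel density estimation to delimit where a surname is most concentrated can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian kernel, the fixed bandwidth, the 50% mass threshold, the toy coordinates and the function names gaussian_kde_grid and core_area are all assumptions made for illustration. Each input point stands for a location (for example a zone centroid) weighted by the number of bearers of one surname.

```python
import numpy as np

def gaussian_kde_grid(points, weights, grid_x, grid_y, bandwidth):
    """Weighted Gaussian kernel surface on a regular grid, normalised to sum to 1."""
    gx, gy = np.meshgrid(grid_x, grid_y)  # both have shape (ny, nx)
    density = np.zeros_like(gx, dtype=float)
    for (px, py), w in zip(points, weights):
        d2 = (gx - px) ** 2 + (gy - py) ** 2
        density += w * np.exp(-d2 / (2.0 * bandwidth ** 2))
    total = density.sum()
    return density / total if total > 0 else density

def core_area(density, share=0.5):
    """Mask of the highest-density cells that together hold at least `share` of the mass."""
    flat = density.ravel()
    order = np.argsort(flat)[::-1]             # cell indices, densest first
    cum = np.cumsum(flat[order])
    n_cells = np.searchsorted(cum, share) + 1  # smallest count reaching `share`
    mask = np.zeros(flat.shape, dtype=bool)
    mask[order[:n_cells]] = True
    return mask.reshape(density.shape)

# Hypothetical data: four nearby locations with many bearers, one distant outlier.
points = np.array([[20.0, 30.0], [21.0, 30.0], [20.0, 31.0], [19.0, 29.0], [80.0, 80.0]])
weights = np.array([10, 8, 9, 7, 1])          # assumed counts of surname bearers
grid_x = np.arange(0, 101, 1.0)               # arbitrary units, 1-unit cells
grid_y = np.arange(0, 101, 1.0)

density = gaussian_kde_grid(points, weights, grid_x, grid_y, bandwidth=5.0)
mask = core_area(density, share=0.5)

iy, ix = np.unravel_index(np.argmax(density), density.shape)
print("peak cell:", grid_x[ix], grid_y[iy])
print("cells in core area:", int(mask.sum()))
```

In this sketch the core area is the smallest set of grid cells containing half of the estimated surface, so a single distant bearer does not enlarge it. The bandwidth controls how local the resulting concentration is, and a real analysis would need to choose it, and possibly adjust for the underlying population distribution, with more care than is shown here.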
