The Use of Exhaustive Micro-Data Firm Databases for Economic Geography: The Issues of Geocoding and Usability in the Case of the Amadeus Database

Economic geography has begun to explore the potential of micro-data. New databases have become available, and new techniques and increased computing power make it possible to process them. However, two major issues impede the use of these datasets: the lack of geocoded locations and the lack of exhaustive coverage. In this article, I explore the possibilities of using large micro-scale firm databases for economic geography in Europe. I show that current developments in the dissemination of official European spatial data allow such databases to be geocoded using means that are accessible to researchers with minimal programming knowledge. For the specific case of the Amadeus database of Bureau van Dijk, I show that its limitations in terms of coverage have to be taken into account but do not hinder its use for analysis. The resulting maps show how the data allow analysis to go further than classic databases such as the Eurostat Structural Business Statistics.
