Singling Out Individual Inventors from Patent Data

A growing number of recent studies have sought to identify individual inventors from patent data. A variety of heuristics have been proposed for using the names and other information disclosed in patent documents to establish “who is who” in patents. This paper contributes to this literature by describing a methodology for identifying inventors from patent applications filed with the European Patent Office (EPO). As in much of this literature, we follow a three-step procedure: (1) the parsing stage, which reduces the noise in the inventor’s name and in other fields of the patent; (2) the matching stage, in which name-matching algorithms group similar names; and (3) the filtering stage, in which additional information and various scoring schemes are used to decide which of these similarly named inventors are in fact the same person. The paper presents the results of applying these algorithms to the set of European inventors who filed applications with the EPO over a long period.
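The three-stage procedure can be illustrated with a minimal Python sketch. It is not the paper's implementation: the similarity measure (the standard library's `difflib.SequenceMatcher`), the 0.9 threshold, the record fields (`city`, `ipc`, `applicant`), the one-point-per-shared-field score, and the sample records are all assumptions chosen for illustration.

```python
import unicodedata
from difflib import SequenceMatcher
from itertools import combinations


def parse_name(raw):
    """Parsing stage: strip accents, punctuation, case, and token order."""
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = "".join(ch if ch.isalnum() else " " for ch in text)
    return " ".join(sorted(text.upper().split()))


def names_match(a, b, threshold=0.9):
    """Matching stage: treat two parsed names as similar above a threshold.

    The threshold value is an illustrative assumption.
    """
    return SequenceMatcher(None, a, b).ratio() >= threshold


def same_person_score(rec_a, rec_b):
    """Filtering stage: one point per shared attribute (hypothetical scheme)."""
    return sum(rec_a[key] == rec_b[key] for key in ("city", "ipc", "applicant"))


def link_inventors(records, min_score=2):
    """Return id pairs whose names are similar AND whose score is high enough."""
    linked = []
    for rec_a, rec_b in combinations(records, 2):
        if not names_match(parse_name(rec_a["name"]), parse_name(rec_b["name"])):
            continue
        if same_person_score(rec_a, rec_b) >= min_score:
            linked.append((rec_a["id"], rec_b["id"]))
    return linked


# Invented sample records, for illustration only.
RECORDS = [
    {"id": 1, "name": "Müller, Hans", "city": "Munich", "ipc": "H04L", "applicant": "ACME"},
    {"id": 2, "name": "Hans Mueller", "city": "Munich", "ipc": "H04L", "applicant": "OTHER"},
    {"id": 3, "name": "Hans Muller", "city": "Hamburg", "ipc": "A61K", "applicant": "PHARMA"},
]
```

In this sketch all three sample names pass the matching stage, but only records 1 and 2 share enough additional attributes to be linked. That is the role the filtering stage plays: similar names alone do not establish that two records refer to the same inventor.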
