Identifying artificial intelligence (AI) invention: a novel AI patent dataset

Artificial intelligence (AI) is an area of increasing scholarly and policy interest. To help researchers, policymakers, and the public, this paper describes a novel dataset identifying AI in over 13.2 million patents and pre-grant publications (PGPubs). The dataset, called the Artificial Intelligence Patent Dataset (AIPD), was constructed using machine learning models for each of eight AI component technologies, covering areas such as natural language processing, AI hardware, and machine learning. The AIPD contains two data files: one identifying the patents and PGPubs predicted to contain AI, and a second containing the patent documents used to train the machine learning classification models. We also present several evaluation metrics based on manual review by patent examiners with focused expertise in AI, and show that our machine learning approach achieves state-of-the-art performance relative to existing alternatives in the literature. We believe releasing this dataset will strengthen policy formulation, encourage additional empirical work, and provide researchers with a common base for building empirical knowledge on the determinants and impacts of AI invention.
