DeepPatent: patent classification with convolutional neural networks and word embedding

Patent classification is an essential task in patent information management and patent knowledge mining. However, this task is still largely done manually because current algorithms perform unsatisfactorily. Recently, deep learning methods such as convolutional neural networks (CNNs) have led to great progress in image processing, voice recognition, and speech recognition, but they have yet to be applied to patent classification. We propose DeepPatent, a deep learning algorithm for patent classification based on a CNN and word vector embedding. We evaluated the algorithm on the standard patent classification benchmark dataset CLEF-IP and compared it with other algorithms in the CLEF-IP competition. Experiments showed that DeepPatent, with automatic feature extraction, achieved a classification precision of 83.98%, outperforming all existing algorithms that used the same information for training. It also outperforms the state-of-the-art patent classifier, which reaches a precision of 83.50% but relies on 4000 characters from the description section and extensive feature engineering, whereas DeepPatent uses only the title and abstract. We further tested DeepPatent on USPTO-2M, a patent classification benchmark data set that we contribute, with 2,000,147 records after data cleaning of 2,679,443 raw US utility patent documents in 637 categories at the subclass level. On this data set, DeepPatent achieved a precision of 73.88%.
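As a rough illustration of the general technique named above (word embedding followed by convolution, max-over-time pooling, and a softmax layer), the following minimal sketch runs one forward pass over a tokenized title. It is not the DeepPatent architecture: the vocabulary, embedding size, filter widths, number of filters, and class count are hypothetical, biases are omitted, and the weights are random rather than trained.

```python
import math
import random

def init_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def conv_max_pool(embedded, filt, width):
    """Slide one filter over every window of `width` tokens, apply ReLU, keep the max."""
    best = 0.0  # ReLU output is never below zero
    for start in range(len(embedded) - width + 1):
        # Concatenate the embeddings of the tokens in this window.
        window = [x for vec in embedded[start:start + width] for x in vec]
        activation = sum(w * x for w, x in zip(filt, window))
        best = max(best, activation)
    return best

def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(tokens, vocab, embeddings, filters, out_weights):
    """Embed tokens, pool each filter's response, and score every class."""
    ids = [vocab.get(t, 0) for t in tokens]  # unknown tokens map to index 0
    embedded = [embeddings[i] for i in ids]
    features = [conv_max_pool(embedded, f, width) for width, f in filters]
    scores = [sum(w * x for w, x in zip(row, features)) for row in out_weights]
    return softmax(scores)

# Hypothetical sizes and vocabulary, chosen only to keep the example small.
rng = random.Random(0)
EMBED_DIM = 8
NUM_CLASSES = 3
vocab = {"<unk>": 0, "neural": 1, "network": 2, "patent": 3, "engine": 4, "valve": 5}
embeddings = init_matrix(len(vocab), EMBED_DIM, rng)
# Four filters for each assumed window width (2 and 3 tokens).
filters = [(width, [rng.uniform(-0.1, 0.1) for _ in range(width * EMBED_DIM)])
           for width in (2, 3) for _ in range(4)]
out_weights = init_matrix(NUM_CLASSES, len(filters), rng)

tokens = "patent for a neural network engine valve".split()
probs = classify(tokens, vocab, embeddings, filters, out_weights)
print(probs)  # one probability per hypothetical class
```

Max-over-time pooling reduces each filter's responses to a single feature, so titles and abstracts of different lengths yield a fixed-size vector for the output layer.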
