Extracting diverse attribute-value information from product catalog text via transfer learning

E-commerce sites are becoming the norm for how consumers search for, purchase, and review products. Such sites list millions of products, creating a torrent of options that can overwhelm a browsing consumer. To facilitate search, it helps to annotate each product with a table of attributes describing general features such as color and size. However, these tables must be provided by the merchant, so there is a business incentive to automate the task by extracting attribute-value information directly from product titles and descriptions. While past methods have performed extraction for only a handful of attributes, in practice there exist hundreds of diverse attributes. In this thesis, we present a single model for extracting information on all attributes. In addition, we show that incorporating extra information about intra-attribute similarity improves performance for data-poor attributes.

Thesis Supervisor: Regina Barzilay
Title: Delta Electronics Professor of Electrical Engineering and Computer Science
