Improving Product Classification Using Images

Product classification in commerce search (e.g., Google Product Search, Bing Shopping) involves assigning categories to product offers from a large number of merchants. The categorized offers are used in many tasks, including product taxonomy browsing and matching merchant offers to products in the catalog. Hence, learning a product classifier with high precision and recall is of fundamental importance for providing a high-quality shopping experience. A product offer typically consists of a short textual description and an image depicting the product. The traditional approach to this classification task is to learn a classifier using only the textual descriptions of the products. In this paper, we show that using images, a weaker signal in our setting, in conjunction with the textual descriptions, a more discriminative signal, can considerably improve the precision of the classification task, irrespective of the type of classifier being used. We present a novel classification approach, Cross Adapt, that is cognizant of the disparity in the discriminative power of different types of signals and hence uses the confusion matrix of the dominant signal (text in our setting) to prudently leverage the weaker signal (image) for improved performance. Our evaluation, performed on data from a major commerce search engine's catalog, shows a 12% (absolute) improvement in precision at 100% coverage and a 16% (absolute) improvement in recall at 90% precision, compared to classifiers that use only the textual descriptions of products. In addition, Cross Adapt provides a more accurate classifier based only on the dominant signal (text), which can be used when only the dominant signal is available at application time.
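The idea of letting the text classifier's confusion matrix decide when the image signal is consulted can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names, the `min_rate` threshold, and the product-of-probabilities re-ranking rule are all assumptions made for illustration. The sketch trusts the text prediction unless held-out data shows that the predicted category is frequently confused with others, and only then uses image scores to choose among the confusable categories.

```python
from collections import defaultdict


def confusion_counts(true_labels, predicted_labels):
    """For each category predicted by the text classifier, count the true categories."""
    counts = defaultdict(lambda: defaultdict(int))
    for truth, pred in zip(true_labels, predicted_labels):
        counts[pred][truth] += 1
    return counts


def confusable_classes(counts, predicted, min_rate=0.1):
    """Categories that a given text prediction often turns out to be.

    min_rate is a hypothetical threshold, not a value from the paper.
    """
    row = counts.get(predicted, {})
    total = sum(row.values())
    candidates = {predicted}
    for truth, n in row.items():
        if total and n / total >= min_rate:
            candidates.add(truth)
    return candidates


def combine(text_probs, image_probs, counts, min_rate=0.1):
    """Pick a category, consulting the image only where text is known to be confused."""
    text_pred = max(text_probs, key=text_probs.get)
    candidates = confusable_classes(counts, text_pred, min_rate)
    if len(candidates) == 1:
        return text_pred  # text is reliable here, so the image is ignored
    # Assumed re-ranking rule: product of text and image scores over the candidates.
    return max(
        candidates,
        key=lambda c: text_probs.get(c, 0.0) * image_probs.get(c, 0.0),
    )


# Held-out predictions of a hypothetical text classifier.
held_out_true = ["laptop", "laptop", "laptop bag", "laptop", "camera", "camera"]
held_out_pred = ["laptop", "laptop", "laptop", "laptop", "camera", "camera"]
counts = confusion_counts(held_out_true, held_out_pred)

text_probs = {"laptop": 0.55, "laptop bag": 0.40, "camera": 0.05}
image_probs = {"laptop": 0.20, "laptop bag": 0.70, "camera": 0.10}
print(combine(text_probs, image_probs, counts))
```

In this toy data the text classifier sometimes labels laptop bags as laptops, so a "laptop" prediction triggers a look at the image scores, while a "camera" prediction, which was never confused on the held-out set, is accepted without consulting the image.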
