Pattern Recognition Applications and Methods

Visual search, in particular the street-to-shop task of matching fashion items displayed in everyday images with similar articles, is a challenging and commercially important problem in computer vision. Building on our successful Studio2Shop model [20], we report results on Street2Fashion2Shop, a pipeline architecture that stacks Street2Fashion, a segmentation model responsible for eliminating the background in a street image, with Fashion2Shop, an improved model that matches the remaining foreground image with “title images”, front views of fashion articles on a white background. Both segmentation and product matching rely on deep convolutional neural networks. The pipeline allows us to circumvent the lack of high-quality annotated street images by leveraging a dedicated data set at each step. We show that fashion-specific training data leads to superior performance of the segmentation model. Studio2Shop built its performance on FashionDNA, an in-house product representation trained on the rich, professionally curated Zalando catalogue. Our study presents a substantially improved version of FashionDNA that boosts the accuracy of the matching model. Results on external datasets confirm the viability of our approach.
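
The two-stage structure described above can be illustrated with a minimal sketch. It only shows how a segmentation stage and a matching stage might be stacked and is not the authors' implementation: the `segment` and `embed` callables, the 0.5 mask threshold, the white fill, and the cosine-similarity ranking are all assumptions introduced for illustration (the abstract states only that both stages use deep convolutional networks).

```python
import numpy as np

def remove_background(image, mask):
    """Stage 1 stand-in: keep foreground pixels and paint the rest white.

    image: (H, W, 3) floats in [0, 1]; mask: (H, W) foreground scores.
    The 0.5 threshold is an assumption, not a value from the paper.
    """
    keep = (mask > 0.5)[..., None]
    return np.where(keep, image, 1.0)

def rank_articles(query_vec, article_vecs):
    """Stage 2 stand-in: rank catalogue articles by cosine similarity.

    Cosine similarity is a placeholder for the learned matching model.
    """
    q = query_vec / np.linalg.norm(query_vec)
    a = article_vecs / np.linalg.norm(article_vecs, axis=1, keepdims=True)
    return np.argsort(-(a @ q))

def street_to_shop(image, segment, embed, article_vecs):
    """Stack the two stages: segment, blank the background, then match."""
    foreground = remove_background(image, segment(image))
    return rank_articles(embed(foreground), article_vecs)

# Toy usage with hypothetical stand-ins for the two networks.
image = np.zeros((2, 2, 3))
segment = lambda img: np.array([[1.0, 0.0], [0.0, 0.0]])  # one foreground pixel
embed = lambda img: img.mean(axis=(0, 1))                 # mean colour as "embedding"
articles = np.array([[1.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.0, -1.0, 0.0]])
ranking = street_to_shop(image, segment, embed, articles)
```

Keeping the stages behind separate callables mirrors the point made in the abstract that each step can be trained on its own data set.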

[1]  John F. Jarvis,et al.  A survey of techniques for the display of continuous tone pictures on bilevel displays , 1976 .

[2]  Erik D. Demaine,et al.  An optimal decomposition algorithm for tree edit distance , 2006, TALG.

[3]  Fei-Fei Li,et al.  Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Limin Wang,et al.  Action recognition with trajectory-pooled deep-convolutional descriptors , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  N. Fisher,et al.  A correlation coefficient for circular data , 1983 .

[6]  Luc Van Gool,et al.  Temporal Segment Networks: Towards Good Practices for Deep Action Recognition , 2016, ECCV.

[7]  Kouichi Hirata,et al.  Tai Mapping Hierarchy for Rooted Labeled Trees Through Common Subforest , 2016, Theory of Computing Systems.

[8]  Sabine Van Huffel,et al.  Development of an Interhemispheric Symmetry Measurement in the Neonatal Brain , 2014, ICPRAM.

[9]  C. Chandra Sekhar,et al.  Dynamic Kernels based Approaches to Analysis of Varying Length Patterns in Speech and Image Processing Tasks , 2017 .

[10]  Jian Dong,et al.  Deep Human Parsing with Active Template Regression , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Mubarak Shah,et al.  Recognizing 50 human action categories of web videos , 2012, Machine Vision and Applications.

[13]  Alexander C. Berg,et al.  Hipster Wars: Discovering Elements of Fashion Styles , 2014, ECCV.

[14]  Masami Takata,et al.  A Multi-fonts Kanji Character Recognition Method for Early-modern Japanese Printed Books with Ruby Characters , 2014, ICPRAM.

[15]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[16]  Min Xu,et al.  Efficient Clothing Retrieval with Semantic-Preserving Visual Phrases , 2012, ACCV.

[17]  Tao Jiang,et al.  Some MAX SNP-Hard Results Concerning Unordered Labeled Trees , 1994, Inf. Process. Lett.

[18]  Kaizhong Zhang,et al.  A constrained edit distance between unordered labeled trees , 1996, Algorithmica.

[19]  François Goulette,et al.  Paris-rue-Madame Database - A 3D Mobile Laser Scanner Dataset for Benchmarking Urban Detection, Segmentation and Classification Methods , 2014, ICPRAM.

[20]  Kouichi Hirata,et al.  Tractable and Intractable variations of Unordered Tree Edit Distance , 2014, Int. J. Found. Comput. Sci.

[21]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[22]  Tobias Malach,et al.  Face Templates Creation for Surveillance Face Recognition System , 2014, ICPRAM.

[23]  Gérard Chollet,et al.  Semi-Automated Identification of Leopard Frogs , 2014, ICPRAM.

[24]  Sara Colantonio,et al.  SuperResolution-aided Recognition of Cytoskeletons in Scanning Probe Microscopy Images , 2014, ICPRAM.

[25]  Michael Isard,et al.  Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  T. Kuboyama Matching and Learning in Trees , 2007 .

[27]  Kuo-Chung Tai,et al.  The Tree-to-Tree Correction Problem , 1979, JACM.

[28]  Cordelia Schmid,et al.  Actions in context , 2009, CVPR.

[29]  Kaizhong Zhang,et al.  Algorithms for the constrained editing distance between ordered labeled trees and related problems , 1995, Pattern Recognit.

[30]  Lusheng Wang,et al.  Alignment of trees: an alternative to tree edit , 1995 .

[31]  Atsuhiro Takasu,et al.  Approximation and Parameterized Algorithms for Common Subtrees and Edit Distance between Unordered Trees , 2022 .

[32]  Eiichi Tanaka,et al.  The Tree-to-Tree Editing Problem , 1988, Int. J. Pattern Recognit. Artif. Intell.

[33]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[34]  Luc Van Gool,et al.  Apparel Classification with Style , 2012, ACCV.

[35]  Kouichi Hirata,et al.  Improved MAX SNP-Hard Results for Finding an Edit Distance between Unordered Trees , 2011, CPM.

[36]  Andrew Zisserman,et al.  Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.

[37]  Leonidas J. Guibas,et al.  The Earth Mover's Distance as a Metric for Image Retrieval , 2000, International Journal of Computer Vision.

[38]  Mubarak Shah,et al.  Action MACH a spatio-temporal Maximum Average Correlation Height filter for action recognition , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[39]  Kouichi Hirata,et al.  Earth Mover's Distances for Rooted Labeled Unordered Trees based on Tai Mapping Hierarchy , 2018, ICPRAM.

[40]  Luigi Cinque,et al.  A keypoint-based method for background modeling and foreground detection using a PTZ camera , 2017, Pattern Recognit. Lett.

[41]  Alexander Ferrein,et al.  CRVM: Circular Random Variable-based Matcher - A Novel Hashing Method for Fast NN Search in High-dimensional Spaces , 2018, ICPRAM.

[42]  Shikha Gupta,et al.  Segment-Level Probabilistic Sequence Kernel Based Support Vector Machines for Classification of Varying Length Patterns of Speech , 2016, ICONIP.

[43]  Luigi Cinque,et al.  Adaptive bootstrapping management by keypoint clustering for background initialization , 2017, Pattern Recognit. Lett.

[44]  Gabriel Valiente,et al.  An efficient bottom-up distance between trees , 2001, Proceedings Eighth Symposium on String Processing and Information Retrieval.

[45]  Sudarshan S. Chawathe,et al.  Comparing Hierarchical Data in External Memory , 1999, VLDB.

[46]  Zhe Wang,et al.  Towards Good Practices for Very Deep Two-Stream ConvNets , 2015, ArXiv.

[47]  Sreenivas Gollapudi,et al.  The power of two min-hashes for similarity search among hierarchical data objects , 2008, PODS.