Paper information - Exchangeable Deep Neural Networks for Set-to-Set Matching and Learning

Matching two different sets of items, known as the heterogeneous set-to-set matching problem, has recently received attention as a promising research problem. The difficulties are twofold: extracting features that match the correct pair of different sets, and preserving the two types of exchangeability that set-to-set matching requires, namely that the pair of sets, as well as the items within each set, should be exchangeable. In this study, we propose a novel deep learning architecture that addresses these difficulties, together with an efficient training framework for set-to-set matching. We evaluate the methods in experiments based on two industrial applications: fashion set recommendation and group re-identification. In these experiments, the proposed method provides significant improvements over state-of-the-art methods, thereby validating our architecture for the heterogeneous set matching problem.
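The two types of exchangeability named in the abstract can be illustrated with a toy scoring function. The sketch below is not the architecture proposed in the paper: mean pooling and a dot product are stand-ins chosen only because they are easy to check, and the names `mean_pool` and `match_score` are hypothetical. It shows a score that is unchanged when the items inside a set are reordered and when the two sets are swapped.

```python
from itertools import permutations


def mean_pool(items):
    """Average a collection of equal-length feature vectors.

    The average does not depend on item order, so the pooled vector is
    invariant to permutations of the items (illustrative stand-in only).
    """
    n = len(items)
    dim = len(items[0])
    return [sum(v[d] for v in items) / n for d in range(dim)]


def match_score(set_x, set_y):
    """Toy matching score: dot product of the two mean-pooled vectors.

    The dot product is symmetric, so swapping the two sets leaves the
    score unchanged (assumption made for illustration, not the paper's model).
    """
    px, py = mean_pool(set_x), mean_pool(set_y)
    return sum(a * b for a, b in zip(px, py))


# Two small sets of 2-dimensional item features (made-up values).
X = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
Y = [[0.5, 0.5], [2.0, 1.0]]

base = match_score(X, Y)

# Exchangeability of items: every reordering of X gives the same score.
items_ok = all(
    abs(match_score(list(p), Y) - base) < 1e-9 for p in permutations(X)
)

# Exchangeability of the pair: swapping the two sets gives the same score.
pair_ok = abs(match_score(Y, X) - base) < 1e-9

print(items_ok, pair_ok)
```

Pooling before comparison discards interactions between individual items of the two sets, which is one reason a learned architecture would typically be richer than this sketch while still satisfying the same two symmetry checks.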
