Paper Information - The Limited Multi-Label Projection Layer

The Limited Multi-Label Projection Layer

We propose the Limited Multi-Label (LML) projection layer as a new primitive operation for end-to-end learning systems. The LML layer provides a probabilistic way of modeling multi-label predictions limited to having exactly k labels. We derive efficient forward and backward passes for this layer and show how the layer can be used to optimize the top-k recall for multi-label tasks with incomplete label information. We evaluate LML layers on top-k CIFAR-100 classification and scene graph generation. We show that LML layers add a negligible amount of computational overhead, strictly improve the model's representational capacity, and improve accuracy. We also revisit the truncated top-k entropy method as a competitive baseline for top-k classification.
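The abstract does not spell out the projection itself, so the following is only a minimal sketch under stated assumptions. It assumes the layer can be written as an entropy-regularized problem over vectors in (0, 1)^n whose entries sum to k, whose solution has the form y = sigmoid(x + nu) for a scalar nu chosen so that the entries sum to k. The function name `lml_forward`, the plain bisection search, and the bracket width of 30 are illustrative choices, not the authors' forward pass.

```python
import numpy as np


def _sigmoid(z):
    # Numerically stable logistic function via the tanh identity.
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def lml_forward(x, k, n_iter=100):
    """Sketch of a forward pass: map logits x to y in (0, 1)^n with sum(y) = k.

    Assumption (not stated in the abstract): y = sigmoid(x + nu), where the
    scalar nu is the root of sum(sigmoid(x + nu)) - k. Plain bisection is used
    here as a simple stand-in for whatever root finder the paper derives.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < len(x)")

    # sum(sigmoid(x + nu)) increases monotonically in nu, so bracket the root:
    lo = -np.max(x) - 30.0  # every entry is close to 0, so the sum is below k
    hi = -np.min(x) + 30.0  # every entry is close to 1, so the sum is above k
    for _ in range(n_iter):
        nu = 0.5 * (lo + hi)
        if _sigmoid(x + nu).sum() < k:
            lo = nu
        else:
            hi = nu
    return _sigmoid(x + 0.5 * (lo + hi))


if __name__ == "__main__":
    logits = np.array([2.0, -1.0, 0.5, 3.0, -2.0])
    y = lml_forward(logits, k=2)
    print(y, y.sum())
```

Because the same shift nu is added to every logit, the output keeps the ranking of the inputs, and the k largest logits receive the largest share of the total mass k. A training setup in the spirit of the abstract would then apply a loss to `y` for the observed labels, which this sketch does not attempt to reproduce.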
