Paper information - A Walkthrough for the Principle of Logit Separation

A Walkthrough for the Principle of Logit Separation

We consider neural network training in applications where there are many possible classes, but where the test-time task is binary: determining whether a given example belongs to a specific class. We define the Single Logit Classification (SLC) task: training the network so that, at test time, it can identify accurately and with little computation whether an example belongs to a given class, based only on the output logit for that class. We propose a natural principle, the Principle of Logit Separation, as a guideline for choosing and designing loss functions that are suitable for SLC. We show that the Principle of Logit Separation is a crucial ingredient for success in the SLC task, and that SLC results in considerable speedups when the number of classes is large.
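To make the test-time setting concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a linear output layer with one weight row per class; the names `single_logit`, `softmax_probability`, `belongs_to_class` and the threshold value `0.0` are hypothetical choices for illustration. It contrasts a decision based on one logit, which touches a single weight row, with a softmax probability, which needs the logits of all classes.

```python
import math


def single_logit(weights, biases, features, class_index):
    """Logit of one class only: a single dot product, O(d) work."""
    row = weights[class_index]
    return sum(w * x for w, x in zip(row, features)) + biases[class_index]


def softmax_probability(weights, biases, features, class_index):
    """Softmax probability of one class: needs all k logits, O(k * d) work."""
    logits = [
        single_logit(weights, biases, features, c) for c in range(len(weights))
    ]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    return exps[class_index] / sum(exps)


def belongs_to_class(weights, biases, features, class_index, threshold=0.0):
    """SLC-style decision: compare a single logit to a fixed threshold.

    The threshold is an assumed value; in practice it would be chosen on
    held-out data for a network trained with a suitable loss.
    """
    return single_logit(weights, biases, features, class_index) > threshold


# Toy output layer with k = 3 classes and d = 2 features (made-up numbers).
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
biases = [0.0, 0.0, 0.0]
features = [2.0, -1.0]

print(belongs_to_class(weights, biases, features, class_index=0))
print(belongs_to_class(weights, biases, features, class_index=1))
print(softmax_probability(weights, biases, features, class_index=0))
```

The single-logit decision is only meaningful if logit values are comparable across examples, which is the property the loss function has to provide; the cost saving grows with the number of classes because the other weight rows are never evaluated.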

Björn W. Schuller | Gil Keren | Sivan Sabato
