On Neyman-Pearson optimality of binary neural net classifiers

In classical binary statistical pattern recognition, optimality in the Neyman-Pearson sense, achieved by a (log-)likelihood-ratio-based classifier, is often desirable. A drawback of a Neyman-Pearson-optimal classifier is that it requires full knowledge of the (quotient of the) class-conditional probability densities of the input data, which is often unrealistic. The design of neural net classifiers, by contrast, is data-driven: no explicit use is made of the class-conditional probability densities of the input data. In this paper a proof is presented that a neural net can nevertheless be trained to approximate a log-likelihood ratio and thus be used as a Neyman-Pearson-optimal, prior-independent classifier. Properties of the approximation of the log-likelihood ratio are discussed, and examples of neural nets trained on synthetic data with known log-likelihood ratios as ground truth illustrate the results.
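The abstract's central claim rests on a standard identity, sketched here in LaTeX for orientation (the notation Λ(x), π₀, π₁ is introduced here, not taken from the paper): a sigmoid-output net trained with binary cross-entropy approximates the class posterior at the population optimum, so its logit decomposes into the log-likelihood ratio plus the log prior odds.

```latex
% Population-optimal sigmoid output under binary cross-entropy:
%   f^*(x) = P(\omega_1 \mid x).
% Its logit therefore decomposes as
\[
  \log \frac{f^*(x)}{1 - f^*(x)}
  = \log \frac{P(\omega_1 \mid x)}{P(\omega_0 \mid x)}
  = \underbrace{\log \frac{p(x \mid \omega_1)}{p(x \mid \omega_0)}}_{\Lambda(x)}
  + \log \frac{\pi_1}{\pi_0}.
\]
% Subtracting the log prior odds \log(\pi_1/\pi_0) recovers \Lambda(x); by the
% Neyman--Pearson lemma, thresholding \Lambda(x) at a level chosen for the
% desired false-alarm rate is optimal, independently of the training priors.
```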
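A minimal, hypothetical PyTorch sketch of the kind of synthetic-data experiment the abstract describes (the network size, optimizer, and Gaussian parameters are illustrative assumptions, not the paper's setup): a net is trained on two 1-D Gaussians with known log-likelihood ratio Λ(x) = 2x, and its logit, minus the log prior odds, should approach the analytic LLR.

```python
# Hypothetical sketch, not the paper's code: train a small sigmoid-output net on
# a two-Gaussian problem with known log-likelihood ratio (LLR), then compare the
# net's logit (minus the log prior odds) against the analytic LLR.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Two 1-D Gaussian classes with known densities: N(-1, 1) vs N(+1, 1).
n0, n1 = 4000, 4000                       # balanced training set: pi_0 = pi_1
x0 = torch.randn(n0, 1) - 1.0             # class 0 samples
x1 = torch.randn(n1, 1) + 1.0             # class 1 samples
x = torch.cat([x0, x1])
y = torch.cat([torch.zeros(n0, 1), torch.ones(n1, 1)])

def analytic_llr(x):
    # For equal-variance unit Gaussians at -1 and +1: log p(x|1) - log p(x|0) = 2x.
    return 2.0 * x

net = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
bce = nn.BCEWithLogitsLoss()              # sigmoid + cross-entropy on the logit

for _ in range(2000):
    opt.zero_grad()
    loss = bce(net(x), y)                 # optimum: logit = LLR + log(pi_1/pi_0)
    loss.backward()
    opt.step()

log_prior_odds = torch.log(torch.tensor(n1 / n0))   # 0 for balanced data
with torch.no_grad():
    grid = torch.linspace(-3, 3, 7).unsqueeze(1)
    learned_llr = net(grid) - log_prior_odds        # prior-independent statistic
    print(torch.cat([grid, learned_llr, analytic_llr(grid)], dim=1))

# Neyman-Pearson use: choose the threshold on the learned LLR from the class-0
# score distribution so the false-alarm rate matches the design value alpha.
with torch.no_grad():
    scores0 = (net(x0) - log_prior_odds).squeeze()
    t = torch.quantile(scores0, 1 - 0.05)           # ~5% false-alarm threshold
    print("threshold for alpha=0.05:", t.item())
```

With balanced training data the log prior odds term is zero; with imbalanced data, subtracting it is what makes the resulting statistic prior-independent, which is the property the abstract emphasizes.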
