Compressing arrays of classifiers using Volterra-neural network: application to face recognition

Model compression is required when a large model is used, for example for a classification task, but transmission, storage, time, or computing constraints must be met. Multilayer perceptron (MLP) models have traditionally been used as classifiers. Depending on the problem, they may need a large number of parameters (neuron functions, weights, and biases) to reach acceptable performance. This work proposes a technique to compress an array of MLPs through the weights of a Volterra-neural network (Volterra-NN) while maintaining its classification performance. It is shown that several MLP topologies can be compressed well into the first-, second-, and third-order Volterra-NN outputs. The results show that these outputs can be used to build an array of Volterra-NNs that needs significantly fewer parameters than the original array of MLPs while retaining the same high accuracy. The compression capabilities of the Volterra-NN were tested on a face recognition problem. Experimental results are presented on two well-known face databases: ORL and FERET.
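
The abstract does not spell out how the first-, second-, and third-order outputs are obtained from network weights. The sketch below illustrates one standard way to do it and is not necessarily the procedure used in the paper. It assumes a single-hidden-layer MLP with tanh activations and a scalar output, y(x) = v0 + v · tanh(W x + b), and Taylor-expands it around x = 0. The function names and the symbols W, b, v, v0 are hypothetical.

```python
import numpy as np

def mlp_output(x, W, b, v, v0):
    """Scalar output of the assumed MLP: y = v0 + v . tanh(W x + b)."""
    return v0 + v @ np.tanh(W @ x + b)

def volterra_kernels(W, b, v, v0):
    """Kernels of order 0..3 from a Taylor expansion of the MLP around x = 0."""
    t = np.tanh(b)
    d1 = 1.0 - t**2                        # tanh'(b)
    d2 = -2.0 * t * d1                     # tanh''(b)
    d3 = -2.0 * d1 * (1.0 - 3.0 * t**2)    # tanh'''(b)
    h0 = v0 + v @ t
    h1 = np.einsum("j,ji->i", v * d1, W)
    h2 = np.einsum("j,ji,jk->ik", v * d2 / 2.0, W, W)
    h3 = np.einsum("j,ji,jk,jl->ikl", v * d3 / 6.0, W, W, W)
    return h0, h1, h2, h3

def volterra_output(x, h0, h1, h2, h3):
    """Sum of the zeroth- to third-order Volterra terms for input x."""
    return (h0
            + h1 @ x
            + np.einsum("ik,i,k->", h2, x, x)
            + np.einsum("ikl,i,k,l->", h3, x, x, x))
```

Under these assumptions the truncated series matches the MLP only near the expansion point; the error grows with the fourth power of the hidden-layer pre-activation shift W x. Because the kernels are symmetric, only their distinct entries would need to be stored.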
