2-D Stochastic Configuration Networks for Image Data Analytics

Stochastic configuration networks (SCNs), a class of randomized learner models, have been successfully employed in data analytics owing to their universal approximation capability and fast modeling property. Their technical essence lies in stochastically configuring the hidden nodes (or basis functions) under a supervisory mechanism, rather than through the data-independent randomization usually adopted for building randomized neural networks. For image data modeling tasks, 1-D SCNs must flatten each image into a vector, which discards the spatial information of the image and may result in undesirable performance. This paper extends the original SCNs to a 2-D version, called 2DSCNs, for quickly building randomized learners with matrix inputs. Theoretical analyses of the advantages of 2DSCNs over SCNs, including the complexity of the random parameter space and the superiority in generalization, are presented. Empirical results on one regression example, four benchmark handwritten digit classification tasks, two human face recognition datasets, and one natural image database demonstrate that the proposed 2DSCNs perform favorably and show good potential for image data analytics.
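To make the supervisory mechanism and the matrix-input hidden nodes concrete, below is a minimal NumPy sketch of how a 2DSCN could be built incrementally. It is an illustration under assumptions, not the authors' reference implementation: the function name train_2dscn and its hyper-parameters are hypothetical, a fixed contraction factor r stands in for the adaptive (1 - r - mu_L) sequence of the SCN algorithms, and the scope lam of the random parameters is kept fixed rather than adaptively enlarged. Each hidden node is taken in the rank-one form sigmoid(u^T X v + b), so a d1 x d2 image contributes only d1 + d2 + 1 random parameters per node instead of the d1*d2 + 1 needed after flattening, which is the parameter-space saving alluded to in the abstract.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_2dscn(X, T, L_max=50, n_cand=100, lam=1.0, r=0.99, tol=1e-3, seed=0):
    # X: (N, d1, d2) stack of matrix (image) inputs; T: (N, m) targets.
    # Hidden node k computes sigmoid(u_k^T X v_k + b_k), i.e. a rank-one
    # weight matrix u_k v_k^T applied to each input image.
    rng = np.random.default_rng(seed)
    N, d1, d2 = X.shape
    H = np.empty((N, 0))                 # hidden-layer output matrix
    beta = np.zeros((0, T.shape[1]))     # output weights (empty model)
    E = T.copy()                         # current residual error
    nodes = []
    for _ in range(L_max):
        best, best_xi = None, 0.0
        for _ in range(n_cand):
            u = rng.uniform(-lam, lam, d1)
            v = rng.uniform(-lam, lam, d2)
            b = rng.uniform(-lam, lam)
            h = sigmoid(np.einsum('i,nij,j->n', u, X, v) + b)
            # supervisory criterion (simplified: fixed r in place of the
            # adaptive 1 - r - mu_L sequence); xi > 0 means adding this
            # node is guaranteed to reduce the residual error
            xi = np.sum((E.T @ h) ** 2) / (h @ h) - (1.0 - r) * np.sum(E ** 2)
            if xi > best_xi:
                best_xi, best = xi, (u, v, b, h)
        if best is None:                 # no admissible candidate found
            break
        u, v, b, h = best
        nodes.append((u, v, b))
        H = np.column_stack([H, h])
        beta, *_ = np.linalg.lstsq(H, T, rcond=None)   # global least squares
        E = T - H @ beta
        if np.linalg.norm(E) < tol:
            break
    return nodes, beta

A candidate node is admitted only when the supervisory criterion xi is positive, after which the output weights are recomputed globally by least squares; this accept-or-resample loop, rather than any gradient descent, is what distinguishes stochastic configuration from purely data-independent randomization.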
