Unlabeled PCA-shuffling initialization for convolutional neural networks

To achieve high recognition accuracy, convolutional neural networks (CNNs) need large amounts of labeled data to initialize network parameters. However, two problems remain open: the effect of initialization is uncertain, and labeled data are limited. To address these problems, we propose a novel method named UPSCNNs, which uses unlabeled data to perform Principal Component Analysis (PCA) and shuffling initialization for CNNs. The method consists of four steps: sampling the input images, applying PCA to the sampled sets, initializing the convolutional kernels, and shuffling the convolutional kernels. Using the same network architecture and activation function (Rectified Linear Units) throughout, we conduct comparative experiments on three image datasets: STL-10, CIFAR-10(I) and CIFAR-10(II). In terms of accuracy, we find that (1) the proposed method improves on other weight initialization methods, e.g., MSRA initialization, Xavier initialization and random initialization, by 4-20 percent, and (2) initializing with unlabeled data yields an increase of 1-3 percent over initializing with labeled data only. The results indicate that our method can make full use of unlabeled data for initializing CNNs to achieve good recognition performance.
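The four steps named in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify the patch sampling scheme, the number of patches, or what exactly is shuffled, so the function names, the `n_patches` and `seed` parameters, and the choice to shuffle by randomly permuting the kernel order are all assumptions made here.

```python
import numpy as np


def sample_patches(images, k, n_patches, rng):
    """Step 1 (assumed scheme): draw random k x k patches from (N, H, W, C) images."""
    n, h, w, _ = images.shape
    idx = rng.integers(0, n, size=n_patches)
    ys = rng.integers(0, h - k + 1, size=n_patches)
    xs = rng.integers(0, w - k + 1, size=n_patches)
    patches = np.stack(
        [images[i, y:y + k, x:x + k, :] for i, y, x in zip(idx, ys, xs)]
    )
    # Flatten each patch into one row vector of length k * k * C.
    return patches.reshape(n_patches, -1)


def pca_components(patches, n_components):
    """Step 2: PCA via SVD of the mean-centered patch matrix."""
    centered = patches - patches.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_components]


def pca_shuffle_init(images, k, n_kernels, n_patches=2000, seed=0):
    """Steps 3 and 4: use the leading components as kernels, then shuffle them.

    Requires n_kernels <= min(n_patches, k * k * C). No labels are used anywhere.
    """
    rng = np.random.default_rng(seed)
    c = images.shape[-1]
    patches = sample_patches(images, k, n_patches, rng)
    comps = pca_components(patches, n_kernels)
    kernels = comps.reshape(n_kernels, k, k, c)
    # Assumption: "shuffling" is taken to mean a random permutation of kernel order.
    return kernels[rng.permutation(n_kernels)]


# Usage with random stand-in data in place of real unlabeled images.
unlabeled = np.random.default_rng(1).random((20, 32, 32, 3))
kernels = pca_shuffle_init(unlabeled, k=5, n_kernels=16)
```

The resulting array has one kernel per leading principal component, laid out as (kernels, height, width, channels); a framework that expects a different weight layout would need a transpose before the values are copied into the first convolutional layer.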
