An Initial Study on the Relationship Between Meta Features of Dataset and the Initialization of NNRW

The initialization of neural networks with random weights (NNRW) has a significant impact on model performance, yet no principled method for choosing the initialization has been established so far. In this paper, the relationship between the meta features of a dataset and the initialization of NNRW is studied. Specifically, we construct seven regression datasets with known attribute distributions, initialize NNRW with different weight distributions, and train the models on these datasets. The relationship between the attribute distributions of the datasets and the initialization of NNRW is then analyzed through the performance of the resulting models. Several interesting phenomena are observed. First, initializing NNRW with a Gaussian distribution leads to a faster convergence rate than initializing with a Gamma or Uniform distribution. Second, if one or more attributes in a dataset follow a Gamma distribution, initializing NNRW with a Gamma distribution may result in a slower convergence rate and a tendency to overfit. Third, for a given distribution, initializing NNRW with a smaller variance consistently achieves a faster convergence rate and better generalization performance than initializing with a larger variance. These experimental results are not sensitive to the choice of activation function or the type of NNRW. Theoretical analyses of these observations are also provided.
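To make the experimental protocol concrete, below is a minimal sketch of one such trial, assuming an ELM-style NNRW with a sigmoid activation and output weights solved by least squares. The function name `elm_fit_predict`, the hidden-layer size, the Gamma shape parameter, and the toy dataset are illustrative assumptions, not the authors' actual setup.

```python
import numpy as np

def elm_fit_predict(X_train, y_train, X_test, n_hidden=100,
                    dist="gaussian", scale=1.0, seed=0):
    """ELM-style NNRW for regression: the hidden-layer weights are
    drawn at random and frozen; only the output weights are learned,
    by solving a linear least-squares problem."""
    rng = np.random.default_rng(seed)
    d = X_train.shape[1]
    # Draw the random input weights and biases from the chosen distribution.
    if dist == "gaussian":
        W = rng.normal(0.0, scale, size=(d, n_hidden))
        b = rng.normal(0.0, scale, size=n_hidden)
    elif dist == "uniform":
        W = rng.uniform(-scale, scale, size=(d, n_hidden))
        b = rng.uniform(-scale, scale, size=n_hidden)
    elif dist == "gamma":
        # Gamma samples are strictly positive; shape = 2.0 is an
        # illustrative choice, not a value taken from the paper.
        W = rng.gamma(2.0, scale, size=(d, n_hidden))
        b = rng.gamma(2.0, scale, size=n_hidden)
    else:
        raise ValueError(f"unknown distribution: {dist}")

    def hidden(X):
        # Sigmoid activation; the paper reports that its findings are
        # not sensitive to this choice.
        return 1.0 / (1.0 + np.exp(-(X @ W + b)))

    # Output weights via the Moore-Penrose pseudo-inverse.
    beta = np.linalg.pinv(hidden(X_train)) @ y_train
    return hidden(X_test) @ beta

# Toy comparison of the three initialization distributions.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(500, 5))
y = np.sin(X).sum(axis=1)
for dist in ("gaussian", "uniform", "gamma"):
    pred = elm_fit_predict(X[:400], y[:400], X[400:], dist=dist)
    rmse = np.sqrt(np.mean((pred - y[400:]) ** 2))
    print(f"{dist:>8s}: test RMSE = {rmse:.4f}")
```

In this setup only the sampling distribution, and its spread via `scale`, changes between trials, which mirrors the paper's comparison of convergence and generalization across initializations.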
