Paper Information - Neural Information Processing

Neural Information Processing

In this paper, instead of modifying the framework of the extreme learning machine (ELM), we propose a learning algorithm, ELM with Synthetic Instances Generation (SIGELM), that improves the generalization ability of ELM. We focus on optimizing the output-layer weights by adding informative synthetic instances to the training dataset at each learning step. To obtain the required synthetic instances, a neighborhood is determined for each high-uncertainty training sample, and the synthetic instances that enhance the training performance of ELM are then selected from that neighborhood. Experimental results on 4 representative regression datasets from KEEL demonstrate that the proposed SIGELM clearly improves the generalization capability of ELM and effectively reduces over-fitting.
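The procedure outlined in the abstract can be illustrated with a minimal sketch. The abstract does not specify how uncertainty is measured, how the neighborhood is defined, how synthetic instances are labeled, or how "training performance" is scored, so everything below is an assumption made for illustration only: uncertainty is proxied by the absolute training residual, the neighborhood is a uniform box of half-width `radius`, each synthetic instance inherits the target of its source sample, and a candidate is kept only if it lowers the RMSE on the original training set. The function names (`elm_fit`, `elm_predict`, `sig_elm_sketch`) and all parameters are hypothetical, not the authors' implementation.

```python
import numpy as np

def elm_fit(X, y, W, b, ridge=1e-3):
    """Solve for the ELM output-layer weights; the hidden layer (W, b) stays fixed."""
    H = np.tanh(X @ W + b)
    # Small ridge term added for numerical stability (an assumption, not from the abstract).
    A = H.T @ H + ridge * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ y)

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def sig_elm_sketch(X, y, n_hidden=20, n_steps=5, top_k=5,
                   n_candidates=10, radius=0.05, seed=0):
    rng = np.random.default_rng(seed)
    # Random, untrained hidden layer, as in a standard ELM.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    X_aug, y_aug = X.copy(), y.copy()
    beta = elm_fit(X_aug, y_aug, W, b)
    best = rmse(y, elm_predict(X, W, b, beta))
    history = [best]
    for _ in range(n_steps):
        # Assumption: "high uncertainty" = largest absolute residual on the original samples.
        resid = np.abs(y - elm_predict(X, W, b, beta))
        for i in np.argsort(resid)[-top_k:]:
            # Assumption: the neighborhood is a uniform box around the sample.
            cands = X[i] + rng.uniform(-radius, radius,
                                       size=(n_candidates, X.shape[1]))
            for x_new in cands:
                # Assumption: the synthetic instance inherits its source sample's target.
                X_try = np.vstack([X_aug, x_new])
                y_try = np.append(y_aug, y[i])
                beta_try = elm_fit(X_try, y_try, W, b)
                err = rmse(y, elm_predict(X, W, b, beta_try))
                # Keep the candidate only if it lowers the error on the original training set.
                if err < best:
                    X_aug, y_aug, beta, best = X_try, y_try, beta_try, err
        history.append(best)
    return W, b, beta, history

# Toy usage on synthetic regression data (not one of the KEEL datasets).
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.uniform(-1, 1, size=(60, 2))
y_demo = np.sin(3 * X_demo[:, 0]) + 0.1 * demo_rng.normal(size=60)
W, b, beta, history = sig_elm_sketch(X_demo, y_demo)
```

Because a candidate is accepted only when it reduces the tracked error, `history` is non-increasing by construction. That property says nothing about generalization, which would have to be checked on held-out data.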

Derong Liu | Yuanqing Li | Shengli Xie | Dongbin Zhao | El-Sayed M. El-Alfy
