Smoothing Supervised Learning of Neural Networks for Function Approximation

Two common hazards in supervised learning of neural networks are local minima and overfitting. The momentum technique has proved efficient for dealing with local optima, but it is vulnerable to overfitting. In contrast, the early stopping technique can overcome overfitting, but it sometimes terminates in a local minimum. This paper proposes a hybrid approach that combines two processing neurons, momentum and early stopping, to tackle both hazards, with the aim of improving the performance of neural networks in function approximation in terms of both accuracy and processing time. Experimental results on various kinds of non-linear functions demonstrate that the proposed approach outperforms conventional learning approaches.
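The abstract does not spell out how the two techniques are combined, so the following is only a minimal sketch of one generic way to pair them, not the paper's method. It assumes a one-hidden-layer tanh network, full-batch gradient descent with classical momentum (v ← βv − η∇E, w ← w + v), and patience-based early stopping on a held-out validation set. The function name, the hyperparameters (`hidden`, `lr`, `beta`, `patience`) and the sin(x) target are all illustrative assumptions.

```python
import math
import random


def train_momentum_early_stopping(train, val, hidden=8, lr=0.01, beta=0.9,
                                  patience=20, max_epochs=500, seed=0):
    """Fit y = f(x) with a 1-input, 1-output tanh network (illustrative only)."""
    rng = random.Random(seed)
    # Flat parameter vector: [hidden weights | hidden biases | output weights | output bias].
    params = [rng.uniform(-1.0, 1.0) for _ in range(3 * hidden + 1)]
    velocity = [0.0] * len(params)

    def forward(p, x):
        h = [math.tanh(p[j] * x + p[hidden + j]) for j in range(hidden)]
        y = sum(p[2 * hidden + j] * h[j] for j in range(hidden)) + p[3 * hidden]
        return h, y

    def mse(p, data):
        total = 0.0
        for x, t in data:
            d = forward(p, x)[1] - t
            total += d * d
        return total / len(data)

    best_val, best_params, wait, epochs = mse(params, val), params[:], 0, 0
    for _ in range(max_epochs):
        epochs += 1
        # Full-batch gradient of the mean squared error via backpropagation.
        grad = [0.0] * len(params)
        for x, t in train:
            h, y = forward(params, x)
            err = 2.0 * (y - t) / len(train)
            for j in range(hidden):
                dh = err * params[2 * hidden + j] * (1.0 - h[j] * h[j])
                grad[j] += dh * x
                grad[hidden + j] += dh
                grad[2 * hidden + j] += err * h[j]
            grad[3 * hidden] += err
        # Momentum: the velocity accumulates past gradients, which can help
        # the search move through shallow local minima and flat regions.
        velocity = [beta * v - lr * g for v, g in zip(velocity, grad)]
        params = [p + v for p, v in zip(params, velocity)]
        # Early stopping: keep the best validation weights and stop once the
        # validation error has not improved for `patience` epochs.
        val_loss = mse(params, val)
        if val_loss < best_val:
            best_val, best_params, wait = val_loss, params[:], 0
        else:
            wait += 1
            if wait >= patience:
                break

    def predict(x):
        return forward(best_params, x)[1]

    return predict, best_val, epochs


# Illustrative data: noisy samples of sin(x) on [-pi, pi].
_rng = random.Random(1)
def _sample(n):
    xs = [_rng.uniform(-math.pi, math.pi) for _ in range(n)]
    return [(x, math.sin(x) + _rng.gauss(0.0, 0.05)) for x in xs]

train_set, val_set = _sample(40), _sample(20)
predict, best_val, epochs = train_momentum_early_stopping(train_set, val_set)
```

The returned `predict` uses the weights with the lowest validation error seen during training rather than the final weights, which is the usual way early stopping is applied; whether the paper restores weights in the same way is not stated in the abstract.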
