A theoretical framework for deep locally connected ReLU network

Understanding the theoretical properties of deep, locally connected nonlinear networks, such as deep convolutional neural networks (DCNNs), remains a hard problem despite their empirical success. In this paper, we propose a novel theoretical framework for such networks with ReLU nonlinearity. The framework explicitly formulates the data distribution, favors disentangled representations, and is compatible with common regularization techniques such as Batch Normalization. It is built on a teacher-student setting, expanding the student's forward/backward propagation onto the teacher's computational graph. The resulting model does not impose unrealistic assumptions (e.g., Gaussian inputs or independence of activations). Our framework could help facilitate theoretical analysis of many practical issues in deep networks, e.g., overfitting, generalization, and disentangled representations.
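As a minimal sketch of the kind of setup described above, the following illustrates a teacher-student pair of locally connected ReLU networks: a fixed teacher generates labels, and an independently initialized student with the same architecture produces the predictions that a framework of this kind would analyze. The layer sizes, window width, and helper names here are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical teacher-student setup with locally connected ReLU layers.
# All sizes and names are illustrative; this is not the paper's construction.
import numpy as np

rng = np.random.default_rng(0)

def locally_connected_relu(x, weights, window):
    """One locally connected ReLU layer: each output unit sees only a
    local window of the input, with its own weights (no weight sharing,
    unlike a convolutional layer)."""
    n_out = weights.shape[0]
    out = np.empty((x.shape[0], n_out))
    for j in range(n_out):
        patch = x[:, j:j + window]          # local receptive field
        out[:, j] = patch @ weights[j]      # per-unit weight vector
    return np.maximum(out, 0.0)             # ReLU nonlinearity

def make_net(d_in, window, depth):
    """Stack of locally connected layers; width shrinks by (window - 1) per layer."""
    layers, d = [], d_in
    for _ in range(depth):
        d_out = d - window + 1
        layers.append(rng.normal(size=(d_out, window)) / np.sqrt(window))
        d = d_out
    return layers

def forward(x, layers, window):
    for w in layers:
        x = locally_connected_relu(x, w, window)
    return x

d_in, window, depth = 16, 3, 2
teacher = make_net(d_in, window, depth)      # fixed "ground-truth" network
student = make_net(d_in, window, depth)      # independently initialized student

x = rng.normal(size=(8, d_in))               # a small batch of inputs
y_teacher = forward(x, teacher, window)      # labels produced by the teacher
y_student = forward(x, student, window)      # student predictions to be trained
print(y_teacher.shape, y_student.shape)      # (8, 12) (8, 12)
```

In this setting, the student would be trained to match the teacher's outputs, and the analysis expands the student's forward and backward passes onto the teacher's computational graph rather than assuming Gaussian inputs or independent activations.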
