Paper information - A pruning based method to learn both weights and connections for LSTM

This project is one of the research topics in Professor William Dally’s group. We developed a pruning-based method to learn both weights and connections for Long Short-Term Memory (LSTM) networks. The method discards the unimportant connections in a pretrained LSTM, which makes the weight matrices sparse, and then retrains the remaining model. Once the remaining model converges, we prune and retrain it again, repeating until the model reaches the desired size and performance. This reduces the size of the LSTM and also helps prevent overfitting. Our results from retraining NeuralTalk show that nearly 90% of the weights can be discarded without hurting performance too much. Part of the results of this project will be presented at NIPS 2015.
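The prune-and-retrain loop described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the project's implementation: the abstract does not say how "unimportant" connections are chosen, so the sketch assumes weight magnitude as the importance criterion; a toy linear model trained by gradient descent stands in for the LSTM; and the function names (`magnitude_mask`, `retrain`, `iterative_prune`) and the sparsity schedule are hypothetical.

```python
import random


def magnitude_mask(weights, mask, sparsity):
    """Return a 0/1 mask that removes the smallest-magnitude connections.

    ASSUMPTION: "unimportant" means "small absolute weight".
    `sparsity` is the target fraction of all weights to remove.
    Connections that are already pruned stay pruned.
    """
    n_prune = int(round(sparsity * len(weights)))
    # Already-pruned connections (mask 0) sort first, then live ones by |w|.
    order = sorted(range(len(weights)), key=lambda i: (mask[i], abs(weights[i])))
    new_mask = list(mask)
    for i in order[:n_prune]:
        new_mask[i] = 0
    return new_mask


def retrain(weights, mask, data, lr=0.05, epochs=50):
    """Toy stand-in for retraining: SGD on the squared error of a linear
    model, with pruned connections held at exactly zero."""
    w = [wi * mi for wi, mi in zip(weights, mask)]
    for _ in range(epochs):
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            w = [(wi - lr * err * xi) * mi for wi, xi, mi in zip(w, x, mask)]
    return w


def iterative_prune(weights, data, schedule=(0.5, 0.75, 0.9)):
    """Alternate pruning and retraining, raising the sparsity each round."""
    mask = [1] * len(weights)
    for sparsity in schedule:
        mask = magnitude_mask(weights, mask, sparsity)
        weights = retrain(weights, mask, data)
    return weights, mask


# Synthetic data: only inputs 3 and 11 actually influence the target.
random.seed(0)
n = 20
true_w = [0.0] * n
true_w[3], true_w[11] = 2.0, -1.5
data = []
for _ in range(50):
    x = [random.uniform(-1.0, 1.0) for _ in range(n)]
    data.append((x, sum(t * xi for t, xi in zip(true_w, x))))

# "Pretrain" a dense model, then prune it down to 10% of its weights.
dense_w = retrain([0.0] * n, [1] * n, data)
final_w, final_mask = iterative_prune(dense_w, data)
```

Raising the sparsity in stages, rather than removing 90% of the weights in one step, gives the surviving weights a chance to compensate between rounds. In a real LSTM the same mask would be applied to each gate's weight matrix after every optimizer step.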

Shijian Tang | Jiang Han | Jianglei Han
