Neural Transplant Surgery: An Approach to Pre-training Recurrent Networks

Partially recurrent networks have advantages over strictly feed-forward networks for certain spatiotemporal pattern classification or prediction tasks. However, networks with recurrent links are generally more difficult to train than their non-recurrent counterparts. In this paper we demonstrate that the cost of training a recurrent network can be greatly reduced by initialising the network, prior to training, with weights 'transplanted' from a non-recurrent architecture.
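
The abstract does not spell out the transplant procedure, so the following is only a minimal sketch of one plausible reading, not the method used in the paper. It assumes an Elman-style network with a single hidden layer whose input-to-hidden and hidden-to-output weights are copied from a feed-forward network, and whose new hidden-to-hidden weights start at zero. All function and key names (`make_feedforward`, `transplant`, `W_rec`, and so on) are hypothetical, and the "trained" feed-forward weights are random stand-ins.

```python
import copy
import math
import random


def random_matrix(rng, rows, cols):
    """Return a rows x cols list-of-lists with entries drawn from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def make_feedforward(n_in, n_hidden, n_out, seed=0):
    """Stand-in for an already trained feed-forward net (weights are random here)."""
    rng = random.Random(seed)
    return {
        "W_in": random_matrix(rng, n_hidden, n_in),    # input -> hidden
        "b_h": [0.0] * n_hidden,                       # hidden biases
        "W_out": random_matrix(rng, n_out, n_hidden),  # hidden -> output
        "b_o": [0.0] * n_out,                          # output biases
    }


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def feedforward_step(net, x):
    """One forward pass of the non-recurrent network."""
    h = [math.tanh(a + b) for a, b in zip(matvec(net["W_in"], x), net["b_h"])]
    return [a + b for a, b in zip(matvec(net["W_out"], h), net["b_o"])]


def transplant(ff_net):
    """Copy the feed-forward weights into a recurrent net.

    Assumption: the added hidden-to-hidden weights start at zero, so the
    recurrent net initially computes the same mapping as its donor.
    """
    rnn = copy.deepcopy(ff_net)
    n_hidden = len(ff_net["b_h"])
    rnn["W_rec"] = [[0.0] * n_hidden for _ in range(n_hidden)]
    return rnn


def recurrent_run(rnn, sequence):
    """Run the recurrent net over a sequence of input vectors."""
    h = [0.0] * len(rnn["b_h"])
    outputs = []
    for x in sequence:
        drive = matvec(rnn["W_in"], x)
        feedback = matvec(rnn["W_rec"], h)  # previous hidden state fed back
        h = [math.tanh(d + f + b) for d, f, b in zip(drive, feedback, rnn["b_h"])]
        outputs.append([a + b for a, b in zip(matvec(rnn["W_out"], h), rnn["b_o"])])
    return outputs
```

Under these assumptions the transplanted network reproduces the feed-forward outputs before any recurrent training takes place, so subsequent training only has to adjust the recurrent links and fine-tune the copied weights rather than start from random values.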