Why Does Unsupervised Pre-training Help Deep Learning?