Modeling rapid language learning by distilling Bayesian priors into artificial neural networks