A Formal Model of Learning and Policy Diffusion