Data Scaling Laws in NMT: The Effect of Noise and Architecture