An Empirical Study on Distribution Shift Robustness From the Perspective of Pre-Training and Data Augmentation