Towards Understanding the Condensation of Neural Networks at Initial Training