Mask-Guided Divergence Loss Improves the Generalization and Robustness of Deep Neural Network