Perturbed Model Validation: A New Framework to Validate Model Relevance

This paper introduces Perturbed Model Validation (PMV), a new technique for validating model relevance and detecting overfitting or underfitting. PMV injects noise into the training data, retrains the model on the perturbed data, and then uses the rate at which training accuracy decreases as more noise is injected to assess model relevance. A larger decrease rate indicates a better concept-hypothesis fit. We realise PMV by using label flipping to inject noise, and evaluate it on four real-world datasets (breast cancer, adult, connect-4, and MNIST) and three synthetic datasets in the binary classification setting. The results show that PMV selects models more precisely and more stably than cross-validation, and that it effectively detects both overfitting and underfitting.
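The procedure described above can be illustrated with a minimal sketch in plain Python. It is not the implementation evaluated in this paper: the toy 1-D dataset, the two learners (`fit_stump`, `fit_memorizer`), the noise ratios, and the use of a least-squares slope to estimate the decrease rate are all assumptions made for illustration. The sketch flips a growing fraction of binary labels, retrains on each perturbed copy, and reports the negated slope of training accuracy against noise ratio.

```python
import random

def flip_labels(y, ratio, rng):
    """Return a copy of binary labels (0/1) with a fraction `ratio` flipped."""
    y = list(y)
    k = int(round(ratio * len(y)))
    for i in rng.sample(range(len(y)), k):
        y[i] = 1 - y[i]
    return y

def fit_stump(x, y):
    """Hypothetical simple learner: best single threshold on 1-D inputs."""
    best = (-1, None, None)
    for t in sorted(set(x)):
        for sign in (0, 1):  # sign=1 inverts the prediction
            correct = sum((int(v >= t) ^ sign) == label for v, label in zip(x, y))
            if correct > best[0]:
                best = (correct, t, sign)
    _, t, sign = best
    return lambda v: int(v >= t) ^ sign

def fit_memorizer(x, y):
    """Hypothetical overfitting learner: memorises every training label."""
    table = dict(zip(x, y))
    return lambda v: table[v]

def train_accuracy(fit, x, y):
    """Train on (x, y) and measure accuracy on the same (perturbed) data."""
    model = fit(x, y)
    return sum(model(v) == label for v, label in zip(x, y)) / len(y)

def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    den = sum((a - mx) ** 2 for a in xs)
    return num / den

def decrease_rate(fit, x, y, ratios, rng):
    """Negated slope of training accuracy versus label-flip ratio."""
    accs = [train_accuracy(fit, x, flip_labels(y, r, rng)) for r in ratios]
    return -slope(ratios, accs)

# Toy data (assumption): the true concept is a single threshold at 0.5.
n = 200
x = [i / n for i in range(n)]
y = [int(v >= 0.5) for v in x]
ratios = [0.0, 0.1, 0.2, 0.3, 0.4]

stump_rate = decrease_rate(fit_stump, x, y, ratios, random.Random(0))
memo_rate = decrease_rate(fit_memorizer, x, y, ratios, random.Random(0))
print("stump decrease rate:", stump_rate)
print("memorizer decrease rate:", memo_rate)
```

In this toy setting the memorising learner fits any labelling perfectly, so its training accuracy does not fall as labels are flipped and its decrease rate is zero, which is the signature of overfitting. The threshold learner matches the true concept and cannot absorb the flipped labels, so its training accuracy drops roughly in proportion to the noise ratio.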