Information Visualizations Used to Avoid the Problem of Overfitting in Supervised Machine Learning

This paper examines which types of information graphics and visualizations can support supervised machine learning tasks, specifically model validation and the detection of overfitting. In particular, I plot model performance as a function of model complexity. With an appropriate information graphic, one can see the point at which a model becomes too complex and its performance on unseen data begins to deteriorate because of overfitting. I present two case studies: the first a regression task using polynomial regression, the second a classification problem using neural networks. For each, I create information graphics, in particular fitting graphs, that help the end user see which model is the best choice.
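
As a minimal sketch of the data behind a fitting graph, the following Python computes training and validation error for polynomial regression over a range of degrees. The synthetic sine data, the noise level, the degree range, and the helper names `fitting_graph_data` and `make_data` are assumptions chosen for illustration; they are not the case-study data or code from this paper.

```python
import numpy as np


def fitting_graph_data(x_train, y_train, x_val, y_val, degrees):
    """Return (train_mse, val_mse): one mean squared error per polynomial degree."""
    train_mse, val_mse = [], []
    for d in degrees:
        # Ordinary least-squares polynomial fit of degree d on the training set.
        coeffs = np.polyfit(x_train, y_train, deg=d)
        train_pred = np.polyval(coeffs, x_train)
        val_pred = np.polyval(coeffs, x_val)
        train_mse.append(float(np.mean((train_pred - y_train) ** 2)))
        val_mse.append(float(np.mean((val_pred - y_val) ** 2)))
    return train_mse, val_mse


# Hypothetical synthetic data (an assumption): a noisy sine curve on [-1, 1].
rng = np.random.default_rng(0)


def make_data(n):
    x = rng.uniform(-1.0, 1.0, size=n)
    y = np.sin(np.pi * x) + rng.normal(scale=0.3, size=n)
    return x, y


x_train, y_train = make_data(30)
x_val, y_val = make_data(30)

# Model complexity is measured here by polynomial degree.
degrees = list(range(1, 10))
train_mse, val_mse = fitting_graph_data(x_train, y_train, x_val, y_val, degrees)

# The degree with the lowest validation error is the candidate "best" model.
best_degree = degrees[int(np.argmin(val_mse))]

for d, tr, va in zip(degrees, train_mse, val_mse):
    marker = "  <- lowest validation error" if d == best_degree else ""
    print(f"degree {d}: train MSE {tr:.4f}, validation MSE {va:.4f}{marker}")
```

Plotting `train_mse` and `val_mse` against `degrees` with any plotting library gives the fitting graph. Training error cannot increase as the degree grows, because each higher-degree model contains the lower-degree ones; validation error typically falls and then rises, and the rise marks the onset of overfitting.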