Automatic problem-specific hyperparameter optimization and model selection for supervised machine learning

The use of machine learning techniques has become increasingly widespread in commercial applications and academic research. Machine learning algorithms learn a model from data, allowing computers to make and improve predictions or behaviors. Despite their popularity and usefulness, most machine learning techniques require expert knowledge to guide decisions about the most appropriate model and settings for a particular problem. In many cases, expert knowledge is not readily available. Even when it is, the complexity of the problem and the subjectivity of the expert can lead to sub-optimal choices in the machine learning strategy. Because different machine learning techniques suit different problems, choosing the right technique and fine-tuning its settings are crucial tasks that directly affect the quality of the predictions. However, deciding which technique is best suited to a specific data set is not easy, as the number of choices is usually very large. In this work, we present a method that automatically selects the best machine learning algorithm for a particular data set and optimizes its parameter settings. Our approach is flexible and customizable, enabling users to specify their needs in terms of predictive power, sensitivity, specificity, consistency of the predictions, and speed, among other criteria. Our results show that the technique and configuration suggested by our automated approach yield predictions of much higher quality than those obtained by selecting the technique that performs best under its default settings. We also present a method that efficiently guides the search for optimal parameter settings by identifying, for each setting, ranges of values that produce good results for most problems. Transferring this knowledge to new problems makes it possible to find the optimal configuration of an algorithm more quickly.
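To make the idea of joint algorithm selection and parameter tuning concrete, the following is a minimal sketch and not the method presented in this work. It assumes two toy classifiers on one-dimensional data, plain random search as the optimizer, and a user-supplied weighting of accuracy, sensitivity, and specificity as the selection criterion. All names (`SEARCH_SPACE`, `select_model`, `evaluate`, the weights, and the sampling ranges) are hypothetical and chosen only for illustration.

```python
import random

def knn_predict(train, x, k):
    # Toy k-nearest-neighbors vote on 1-D features; train is a list of (x, label).
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def threshold_predict(train, x, threshold):
    # Toy rule that ignores the training data: predict 1 above the threshold.
    return 1 if x > threshold else 0

def evaluate(predict, params, train, valid, weights):
    # Score one configuration as a weighted sum of user-chosen criteria.
    tp = tn = fp = fn = 0
    for x, y in valid:
        pred = predict(train, x, **params)
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 0 and y == 0:
            tn += 1
        elif pred == 1 and y == 0:
            fp += 1
        else:
            fn += 1
    metrics = {
        "accuracy": (tp + tn) / len(valid),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }
    return sum(weights[m] * metrics[m] for m in weights)

# Hypothetical search space: each algorithm comes with a sampler for its settings.
SEARCH_SPACE = {
    "knn": (knn_predict, lambda rng: {"k": rng.choice([1, 3, 5, 7, 9])}),
    "threshold": (threshold_predict, lambda rng: {"threshold": rng.uniform(0.0, 1.0)}),
}

def select_model(train, valid, weights, n_trials=30, seed=0):
    # Random search over every algorithm and its settings; keep the best score.
    rng = random.Random(seed)
    best = None
    for name, (predict, sample) in SEARCH_SPACE.items():
        for _ in range(n_trials):
            params = sample(rng)
            score = evaluate(predict, params, train, valid, weights)
            if best is None or score > best[0]:
                best = (score, name, params)
    return best

def make_data(rng, n):
    # Synthetic, noise-free data: the label is 1 when the feature exceeds 0.6.
    data = []
    for _ in range(n):
        x = rng.random()
        data.append((x, 1 if x > 0.6 else 0))
    return data

data_rng = random.Random(42)
train, valid = make_data(data_rng, 60), make_data(data_rng, 40)
weights = {"accuracy": 0.5, "sensitivity": 0.25, "specificity": 0.25}
score, name, params = select_model(train, valid, weights)
print(name, params, round(score, 3))
```

Changing `weights` shifts which configuration wins, which is one simple way a user could express a preference for, say, sensitivity over specificity. Restricting each sampler to value ranges that worked well on earlier problems would be a comparably simple way to reuse prior knowledge, though the sketch does not attempt this.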