Alloys selection based on the supervised learning technique for design of biocompatible medical materials

Purpose: The aim of this paper is the development, software implementation and application of an alloy selection method for the design of biocompatible materials in medical production. The method is based on the Ito decomposition and Logistic Regression.

Design/methodology/approach: Machine learning is used to solve the task. The developed classification method is based on multiclass Logistic Regression. To reduce the probability of incorrect alloy identification, the input features are expanded using the second-order Ito decomposition. This enlarges the input feature space and therefore lengthens the training procedure, but it also improves the accuracy of alloy selection. The accuracy of the method was evaluated using several criteria. Overall accuracy was estimated as the ratio of correctly classified titanium alloy samples to the size of the test sample. This measure was used to assess classification accuracy in both the training and test modes. For a more detailed analysis of the classification results, two additional measures, Precision and Recall, were calculated from the constructed confusion matrix. This made it possible to assess the ability of the method to find all instances of each individual alloy, as well as its ability to distinguish instances of one class from those of the others. Together, these indicators made it possible to evaluate classification accuracy for each class of the investigated material separately under the conditions of an imbalanced dataset.

Findings: The simulation results confirmed the effectiveness of machine learning tools for this task. The experiments showed high accuracy: the overall accuracy of the method is 96.875%, and the average values of Precision and Recall over all four classes are 94% and 98%, respectively. Expanding the features of each vector in the training dataset with the Ito decomposition increased the accuracy of the method by more than 33% compared to basic Logistic Regression.

Research limitations/implications: The training procedure of Logistic Regression, together with the enlargement of the alloy's input feature space by the Ito decomposition, limits the application of the method in tasks that are sensitive to running time.

Practical implications: The proposed machine learning approach to alloy selection reduces the time, material and human resources needed to investigate the properties of titanium alloys. The developed method increases the accuracy of the alloy selection task by an average of 14.5% compared to four known methods. It can be used to select materials with appropriate properties for the design of biocompatible medical products.

Originality/value: A method and a software product for the titanium alloy classification task using a supervised learning technique have been developed. To this end, Logistic Regression with inputs expanded by the second-order Ito decomposition is used.
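
The two computational steps described above, the second-order feature expansion and the evaluation from a confusion matrix, can be illustrated with a minimal Python sketch. It is not the authors' software. It assumes that the second-order expansion means appending all pairwise products x_i * x_j (i <= j) to the original features; the function names, the sample vector and the toy confusion matrix are hypothetical and chosen only for illustration.

```python
from itertools import combinations_with_replacement


def expand_second_order(features):
    """Append all pairwise products x_i * x_j (i <= j) to the original features.

    Assumption: this is what the second-order expansion amounts to.
    """
    products = [a * b for a, b in combinations_with_replacement(features, 2)]
    return list(features) + products


def accuracy(confusion):
    """Share of correctly classified samples.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def precision_recall(confusion):
    """Per-class (precision, recall) pairs from a square confusion matrix."""
    n = len(confusion)
    scores = []
    for k in range(n):
        true_positive = confusion[k][k]
        predicted_as_k = sum(confusion[i][k] for i in range(n))  # column sum
        actually_k = sum(confusion[k])  # row sum
        precision = true_positive / predicted_as_k if predicted_as_k else 0.0
        recall = true_positive / actually_k if actually_k else 0.0
        scores.append((precision, recall))
    return scores


# Hypothetical 3-feature sample and made-up 2-class confusion matrix
# (illustration only, not data from the paper).
sample = [1.0, 2.0, 3.0]
expanded = expand_second_order(sample)
toy_confusion = [[8, 2],
                 [1, 9]]
```

Under this assumption, a vector of n features grows to n + n(n + 1)/2 features, which is consistent with the longer training time noted above. The expanded vectors would then be passed to a multiclass Logistic Regression classifier, and per-class Precision and Recall would be averaged over the classes.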