Gene Function Classification Using Fuzzy K-Nearest Neighbor Approach

Prediction of gene function is a classification problem. Given its simplicity and relatively high accuracy, K-Nearest Neighbor (KNN) classification has become a popular choice for many real-world applications. However, the traditional KNN approach has two drawbacks. First, it cannot identify classes that do not exist in the training data sets. Second, it weights all K neighbors equally, ignoring how far each neighbor lies from the test instance. In this paper, exploiting the potential of fuzzy set theory to handle uncertainty in data sets, we develop a fuzzy KNN approach for gene function classification. Experiments show that integrating fuzzy set theory into the original KNN approach improves the overall performance of the classification model.
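
To illustrate the second drawback and how a fuzzy variant can address it, the following is a minimal sketch of one common fuzzy KNN formulation, in which each of the K neighbors contributes to a class membership in proportion to an inverse power of its distance. It is not necessarily the exact method developed in this paper: the function name `fuzzy_knn`, the fuzzifier parameter `m`, the Euclidean distance, and the toy data are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def fuzzy_knn(train_points, train_labels, query, k=3, m=2.0, eps=1e-9):
    """Return a dict mapping each class label to a membership in [0, 1].

    Assumed formulation: neighbor weight w = 1 / d^(2 / (m - 1)), with m > 1.
    """
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep only the k nearest neighbors.
    neighbors = sorted(distances, key=lambda pair: pair[0])[:k]

    exponent = 2.0 / (m - 1.0)
    weights = defaultdict(float)
    total = 0.0
    for distance, label in neighbors:
        # Closer neighbors receive larger weights; eps avoids division by zero.
        w = 1.0 / (distance ** exponent + eps)
        weights[label] += w
        total += w

    # Normalize so that the memberships sum to 1.
    return {label: w / total for label, w in weights.items()}


# Hypothetical toy data: two points of class "A" near the query, three of "B" far away.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
labels = ["A", "A", "B", "B", "B"]

memberships = fuzzy_knn(points, labels, query=(1.0, 1.0), k=3)
predicted = max(memberships, key=memberships.get)
print(memberships, predicted)
```

With K = 3, plain majority voting would count two votes for "A" and one for "B". The distance-weighted memberships instead reflect that the single "B" neighbor is much farther away, so "A" receives a membership close to 1. Graded memberships of this kind also expose cases where no class is strongly supported, which a hard vote hides.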