Editorial: Machine Learning Techniques on Gene Function Prediction

The function of genes, both coding and noncoding, can be difficult to identify in molecular wet laboratories. Computational methods, often including machine learning, are therefore a useful tool for guiding and predicting function. Although machine learning has been considered a “black box” in the past, it can be more accurate than simple statistical testing methods. In recent years, deep learning and big data machine learning techniques have developed rapidly and achieved remarkable performance in many areas, including image classification and speech recognition. This Research Topic explores the potential of machine learning applied to gene function prediction.

We are pleased that the authors applied the latest machine learning techniques to gene function prediction. Submissions came from an open call for papers and were accepted for publication with the assistance of professional referees. After rigorous review, 46 papers were selected from a total of 72 submissions. The contributions came from many countries and regions, including China, the USA, Poland, Taiwan, Korea, Saudi Arabia, and India. We group the papers in this special issue into three subtopics.

The first part of this special issue discusses the relationship between genes and diseases. Six papers in this part focus on general diseases and propose novel methods to predict associations between diseases and genes, miRNAs, or long noncoding RNAs (lncRNAs). Su et al. proposed a novel method called GPSim to effectively deduce the semantic similarity of diseases. Yu et al. constructed a weighted four-layer disease–disease similarity network to characterize the associations between diseases at different levels. Three papers addressed the relationship between miRNAs and diseases. Qu et al. proposed a novel method to predict miRNA–disease associations based on Locality-constrained Linear Coding. Zhao et al.
proposed a novel computational model, SNMFMDA (Symmetric Nonnegative Matrix Factorization for MiRNA-Disease Association prediction), to reveal relationships between miRNA–disease pairs. He et al. proposed NRLMFMDA (a neighborhood regularized logistic matrix factorization method for miRNA–disease association prediction), which integrates miRNA functional similarity, disease semantic similarity, Gaussian interaction profile kernel similarity, and experimentally validated disease–miRNA associations. Besides miRNA, one paper addresses the prediction of lncRNA–disease associations: a method based on dual convolutional neural networks with an attention mechanism is presented for predicting candidate disease lncRNAs (Xuan et al.). Seven papers address cancer and oncogenes. Two of them focus on cancer subtypes. Liu et al. classified muscle-invasive bladder cancer into two conservative subtypes using miRNA, mRNA, and lncRNA expression data; investigated subtype-related biological pathways;