Identification of Essential Proteins Using A Novel Multi-Objective Optimization Method

Identifying essential proteins with graph theory is an active research topic, and the resulting approaches are known as network-based methods. However, most network-based methods generalize poorly. In this paper, we therefore treat the identification of essential proteins as a multi-objective optimization problem and solve it with a novel multi-objective optimization method. The optimization result is a set of Pareto solutions. Each solution in this set is a vector containing a certain number of essential protein candidates and is treated as an independent predictor, or voter. We use a voting strategy to combine the results of these predictors. To validate our method, we apply it to the protein-protein interaction (PPI) datasets of two species (Yeast and Escherichia coli). The experimental results show that our method outperforms state-of-the-art methods in terms of sensitivity, specificity, F-measure, accuracy, and generalization ability.
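
The voting step and the evaluation measures named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the voting rule, so simple vote counting with a top-k cutoff is assumed here, and the function names `vote` and `evaluate`, the parameter `top_k`, and the toy protein labels are all hypothetical. The metric formulas are the standard confusion-matrix definitions.

```python
from collections import Counter


def vote(pareto_solutions, top_k):
    """Rank proteins by how many Pareto solutions nominate them.

    Assumption: each solution is an iterable of candidate protein IDs and
    casts one vote per protein it contains. Returns the top_k proteins.
    """
    counts = Counter()
    for solution in pareto_solutions:
        counts.update(set(solution))  # one vote per protein per solution
    # Sort by descending vote count; break ties by name for determinism.
    ranked = sorted(counts, key=lambda p: (-counts[p], p))
    return ranked[:top_k]


def evaluate(predicted, essential, all_proteins):
    """Compute standard confusion-matrix metrics for a predicted set."""
    predicted, essential = set(predicted), set(essential)
    universe = set(all_proteins)
    tp = len(predicted & essential)
    fp = len(predicted - essential)
    fn = len(essential - predicted)
    tn = len(universe - predicted - essential)

    def ratio(num, den):
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    f_measure = ratio(2 * precision * sensitivity, precision + sensitivity)
    accuracy = ratio(tp + tn, tp + tn + fp + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


if __name__ == "__main__":
    # Toy data (hypothetical): three Pareto solutions over proteins A..E.
    solutions = [["A", "B", "C"], ["A", "C", "D"], ["A", "B", "E"]]
    chosen = vote(solutions, top_k=2)
    print(chosen)
    print(evaluate(chosen, {"A", "C"}, ["A", "B", "C", "D", "E"]))
```

Other aggregation rules, such as a minimum-vote threshold instead of a fixed `top_k`, would fit the same structure. The cutoff is only one possible design choice.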
