
Algorithms in Bioinformatics

Knowing the location of a protein within the cell is important for understanding its function, its role in biological processes, and its potential use as a drug target. Much progress has been made in developing computational methods that predict a single location per protein, on the assumption that each protein localizes to only one location. However, it has been shown that proteins can localize to multiple locations. While a few recent systems have attempted to predict multiple locations of proteins, they typically either treat locations as independent or capture inter-dependencies by treating each combination of locations present in the training set as an individual location class. We present a new method, and a preliminary system we have developed, that directly incorporates inter-dependencies among locations into the multiple-location-prediction process, using a collection of Bayesian network classifiers. We evaluate our system on a dataset of single- and multi-localized proteins. The results obtained by incorporating inter-dependencies are significantly higher than those obtained by classifiers that do not use inter-dependencies. The performance of our system on multi-localized proteins is comparable to that of a top-performing system (YLoc), without restricting predictions to the location combinations present in the training set.
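The restriction mentioned in the last sentence can be made concrete with a small sketch. It is not the authors' system and contains no Bayesian network; it only compares the output space of a classifier that treats each training-set combination as one class with the output space of one yes/no decision per location. The location names and the toy annotations are hypothetical.

```python
from itertools import product

# Hypothetical toy annotations: each protein is labeled with a set of locations.
TRAIN_LABELS = [
    {"nucleus"},
    {"cytoplasm"},
    {"nucleus", "cytoplasm"},
    {"mitochondrion"},
]
LOCATIONS = sorted(set().union(*TRAIN_LABELS))


def combination_classes(train_labels):
    """Combination-as-class view: only combinations seen in training can be predicted."""
    return {frozenset(labels) for labels in train_labels}


def per_location_outputs(locations):
    """Per-location view: one yes/no decision per location, so every
    non-empty subset of locations is a possible prediction."""
    combos = set()
    for bits in product([False, True], repeat=len(locations)):
        chosen = frozenset(loc for loc, on in zip(locations, bits) if on)
        if chosen:
            combos.add(chosen)
    return combos


seen = combination_classes(TRAIN_LABELS)
reachable = per_location_outputs(LOCATIONS)
unseen = reachable - seen  # combinations a combination-as-class model can never output

print(len(seen), len(reachable), len(unseen))
```

In this toy setting a protein found in both the cytoplasm and the mitochondrion could never be predicted correctly by the combination-as-class approach, because that pair does not occur in the training annotations.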

Jens Stoye | Aaron E. Darling
