GRF: A Greedy Rank Fusion Algorithm for Combining MicroRNA Target Orderings

MicroRNAs are a family of short (ca. 20–23 nt), endogenous, noncoding RNAs that negatively regulate gene expression at the post-transcriptional level. MicroRNAs participate in almost all biological pathways and have been implicated in various diseases. Unveiling the microRNA-induced gene silencing pathways therefore has considerable public health significance. In silico prediction of microRNA targets is necessary to guide and accelerate experimental validation. Numerous target prediction algorithms have been proposed in the past decade. These differ considerably in how they prioritize targets, which confuses experimental biologists, and poor prioritization leads to futile experiments. It is, therefore, reasonable to combine or fuse the putative gene orderings produced by the various algorithms to obtain a consensus ordering in which the more confident targets are ranked higher. Kemeny Optimal Aggregation (KOA) is one of the most popular principles underlying rank fusion. However, obtaining a KOA is computationally intractable in real-life scenarios. Evolutionary and stochastic algorithms have already been proposed to address this, and each has its own merits and demerits. In the current article we propose a simple Greedy Rank Fusion (GRF) technique for KOA. In all the tested cases, GRF is found to provide improved target prioritization. Statistical simulation and experimental validations are used to verify the usefulness of the algorithm. Supplementary materials can be found at: http://www.isical.ac.in/~bioinfo_miu/molinfo.rar

Despite a significant amount of effort since the first discovery of microRNA (miRNA) in the early 1990s, its regulatory principles are not completely understood. The expression profiles of miRNAs have been found to be more discriminative of altered pathologies than those of genes. A large number of miRNA-disease associations have been reported in the last few years.
These discoveries underline the need for a sound understanding of the miRNA-mediated silencing mechanism. Experimental validation of miRNA targets incurs a great deal of time and cost. To overcome this, a number of computational algorithms have been developed in the past few years. TargetScan, miRanda, PITA, RNAHybrid, PicTar, TargetMiner and MultiMiTar are notable among these. The primary concerns of these algorithms are the sequence complementarity between a particular miRNA and its mRNA target, the binding energy of the miRNA-mRNA duplex, conservation of target sites across species and the secondary structure of the miRNA-target duplex. Unfortunately, these algorithms disagree considerably both in their predicted targets and in how they rank them (based on the scores each algorithm provides). Moreover, biologists are often interested in only the highly ranked targets for a particular miRNA. Since relying on any single algorithm involves a greater risk of incorrect predictions, aggregating the target orderings produced by various algorithms appears to be a more appealing alternative. The rich history of rank aggregation dates back to the latter half of the eighteenth century, when Borda proposed election by order of merit, followed by Condorcet’s proposal of pairwise majority voting. In Borda's consensus ranking strategy, the position of a particular element is determined by taking a simple average of its ranks. Condorcet’s criterion, on the other hand, ranks A higher than B if the majority of the lists do so. The latter gives rise to Condorcet’s paradox. Suppose a majority prefers A to B, B to C and C to A. These preferences form a cycle and cannot be arranged in a linear order. Nevertheless, Condorcet’s criterion is among the most widely used in consensus ordering. Aggregation techniques that minimize the pairwise disagreement between a candidate consensus list and the given lists are known as Kemeny Optimal Aggregations (KOA).
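The three notions above (Borda's averaging, Condorcet's pairwise majority, and the Kemeny objective) can be illustrated with a minimal Python sketch. This is not the GRF algorithm proposed in this article. The function names and the gene labels `g1`–`g4` are hypothetical, every input list is assumed to rank the same set of items, and pairwise disagreement is assumed to be measured by the Kendall tau distance. The exhaustive search is shown only to make the objective concrete; it enumerates all permutations and is feasible only for a handful of items.

```python
from itertools import combinations, permutations


def borda(lists):
    """Order items by their average position across the lists (Borda-style)."""
    items = lists[0]
    avg = {x: sum(l.index(x) for l in lists) / len(lists) for x in items}
    return sorted(items, key=lambda x: avg[x])


def majority_prefers(lists, a, b):
    """Condorcet criterion: True if a strict majority of lists rank a above b."""
    wins = sum(l.index(a) < l.index(b) for l in lists)
    return wins > len(lists) / 2


def kendall_distance(p, q):
    """Number of item pairs that p and q place in opposite order."""
    pos_p = {x: i for i, x in enumerate(p)}
    pos_q = {x: i for i, x in enumerate(q)}
    return sum(
        (pos_p[a] - pos_p[b]) * (pos_q[a] - pos_q[b]) < 0
        for a, b in combinations(p, 2)
    )


def kemeny_brute_force(lists):
    """Exhaustive search for an ordering that minimizes the total pairwise
    disagreement with the input lists (illustration only: factorial time)."""
    items = lists[0]
    best = min(
        permutations(items),
        key=lambda c: sum(kendall_distance(c, l) for l in lists),
    )
    return list(best)


# Condorcet's paradox: each pairwise majority holds, yet together they form a cycle.
cycle = [["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"]]
print(majority_prefers(cycle, "A", "B"),
      majority_prefers(cycle, "B", "C"),
      majority_prefers(cycle, "C", "A"))

# Three hypothetical target orderings for one miRNA.
lists = [["g1", "g2", "g3", "g4"],
         ["g1", "g3", "g2", "g4"],
         ["g2", "g1", "g3", "g4"]]
print(borda(lists))
print(kemeny_brute_force(lists))
```

In the second example the pairwise majorities contain no cycle, so the Borda ordering and the exhaustive Kemeny search agree; with cyclic preferences such as `cycle`, several orderings can tie for the minimum total disagreement.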
KOA has a maximum likelihood interpretation. Moreover, it satisfies the fundamental properties of rank aggregation, namely neutrality and consistency in Social Choice Theory and the so-called Condorcet property. The computational intractability of KOA has motivated the formulation of various heuristics, among which Markov chains and evolutionary programs are of note. Rank aggregation has been used successfully in metasearch, where results of multiple search engines are taken
