Probabilistic inference and ranking of gene regulatory pathways as a shortest-path problem

Background: Since the advent of microarray technology, numerous methods have been devised to infer gene regulatory relationships from gene expression data. Many approaches infer entire regulatory networks, which produces results that are rich in information yet so complex that they are often of limited usefulness to researchers. One alternative unit of regulatory interaction is a linear path between genes. Linear paths are more comprehensible than networks and still contain important information. Such paths can be extracted from inferred regulatory networks or inferred directly. Since the criteria for inferring networks generally differ from the criteria for inferring paths, indirect and direct inference of paths may yield different results.

Results: This paper explores a strategy to infer linear pathways by converting the path inference problem into a shortest-path problem. The edge weights used are the negative log-transformed probabilities of directness derived from the posterior joint distributions of pairwise mutual information between gene expression levels. Directness is inferred using the data processing inequality. The method was designed with two goals. One is to achieve better accuracy in path inference than extraction of paths from inferred networks. The other is to facilitate prioritization of interactions for laboratory validation. A method is proposed for achieving this by ranking paths according to the joint probability of directness of each path's edges. The algorithm is evaluated using simulated expression data and is compared to extraction of shortest paths from networks inferred by two alternative methods, ARACNe and a minimum spanning tree algorithm.

Conclusions: Direct path inference appears to achieve accuracy competitive with that obtained by extracting paths from networks inferred by the other methods. Preliminary exploration of the use of joint edge probabilities to rank paths is largely inconclusive. Suggestions for a better framework for such comparisons are discussed.
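
The conversion of path inference into a shortest-path problem can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the per-edge probabilities of directness have already been estimated, treats the graph as undirected (mutual information is symmetric), and approximates the joint probability of a path as the product of its edge probabilities, which holds only if the edges are independent. The function name `most_probable_path` and the toy probabilities are hypothetical. Because the negative log of a product is a sum of negative logs, minimising the summed weights with Dijkstra's algorithm maximises that product.

```python
import heapq
import math


def most_probable_path(edge_probs, source, target):
    """Return (path, probability) for the path maximising the product of
    edge probabilities, found as a shortest path over -log(p) weights.

    edge_probs: dict mapping (gene_u, gene_v) -> probability in (0, 1]
                that the interaction is direct (assumed given, undirected).
    Returns (None, 0.0) if target cannot be reached from source.
    """
    # Build an undirected adjacency list with weight = -log(probability).
    graph = {}
    for (u, v), p in edge_probs.items():
        if p <= 0.0:
            continue  # a zero-probability edge can never be on a path
        w = -math.log(p)
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))

    # Standard Dijkstra; all weights are non-negative because p <= 1.
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))

    if target not in dist:
        return None, 0.0

    # Walk predecessors back from the target to recover the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    # exp(-sum of -log p) equals the product of the edge probabilities.
    return path, math.exp(-dist[target])


# Toy example with made-up probabilities of directness.
toy = {("A", "B"): 0.9, ("B", "C"): 0.8, ("A", "C"): 0.5}
path, prob = most_probable_path(toy, "A", "C")
print(path, round(prob, 2))
```

In this toy graph the two-edge route through B has a joint probability of about 0.72, which exceeds the 0.5 of the single edge from A to C, so the longer path is preferred. Candidate paths could then be ranked by the returned probability, in the spirit of the ranking scheme the abstract describes.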
