Finding Supported Paths in Heterogeneous Networks

Subnetwork mining is a central task in the analysis of biological, social, and communication networks. Recent applications require mining several networks simultaneously on the same or a similar vertex set; that is, one searches for subnetworks that fulfill a different property in each input network. We study the case in which the input consists of a directed graph D and an undirected graph G on the same vertex set, and the sought pattern is a path P in D whose vertex set induces a connected subgraph of G. Three concrete problems arise in this setting: deciding whether such a path P exists, finding a longest such path, and (perhaps less intuitively) finding a shortest one. These problems have immediate applications in biological networks and foreseeable applications in social, information, and communication networks. We study the classical and parameterized complexity of these problems, identifying polynomial-time solvable and NP-complete cases as well as fixed-parameter tractable and W[1]-hard cases. We also propose two enumeration algorithms, which we evaluate on synthetic and biological data.
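
To make the definition of a supported path concrete, the following is a minimal brute-force sketch in Python. It is not one of the two enumeration algorithms proposed in the paper; it only illustrates the condition that a simple path in D must have a vertex set inducing a connected subgraph of G. The function names (`induces_connected`, `supported_paths`), the adjacency-dictionary representation, and the toy graphs are assumptions made for illustration, and the running time is exponential in general.

```python
from collections import deque


def induces_connected(vertices, g_adj):
    """Return True if `vertices` induces a connected subgraph of the
    undirected graph G, given as an adjacency dict (hypothetical format)."""
    vertices = set(vertices)
    if not vertices:
        return False
    start = next(iter(vertices))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in g_adj.get(u, ()):
            # Only follow G-edges that stay inside the candidate vertex set.
            if v in vertices and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == vertices


def supported_paths(d_adj, g_adj):
    """Yield every simple directed path of D (as a tuple of vertices) whose
    vertex set induces a connected subgraph of G. Brute force, for
    illustration only."""

    def extend(path, on_path):
        if induces_connected(on_path, g_adj):
            yield tuple(path)
        # No pruning here: a prefix that is not connected in G may still
        # become connected once later vertices are added.
        for v in d_adj.get(path[-1], ()):
            if v not in on_path:
                path.append(v)
                on_path.add(v)
                yield from extend(path, on_path)
                path.pop()
                on_path.remove(v)

    all_vertices = set(d_adj) | {v for nbrs in d_adj.values() for v in nbrs}
    for s in sorted(all_vertices):
        yield from extend([s], {s})


# Toy instance (assumed data): D is the directed path a -> b -> c,
# G has the undirected edges {a, c} and {b, c}.
d_adj = {"a": ["b"], "b": ["c"], "c": []}
g_adj = {"a": ["c"], "b": ["c"], "c": ["a", "b"]}

results = list(supported_paths(d_adj, g_adj))
longest = max(results, key=len)  # ("a", "b", "c")
```

In this toy instance the path (a, b) is not supported, because a and b are not adjacent in G, while its extension (a, b, c) is supported, because c connects both in G. This is why the sketch does not discard a path prefix that fails the connectivity test.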
