Item sets based graph mining algorithm and application in genetic regulatory networks

This paper presents graph-mining algorithms and their application to genetic regulatory networks. We use an item-set labeling method for directed graphs to find frequent subgraphs efficiently. Inexact matching of subgraphs is performed using an edit distance over the vertices and directed edges of the regulatory networks. This inexact frequent-subgraph algorithm extends the work of Koyuturk, Grama and Szpankowski on exact frequent subgraphs (Koyuturk et al. treat inexact matching as an open problem).

Index Terms—Inexact Match, Pattern Mining, Sub-Graph Mining, Genetic Regulatory Networks, Subsystem Variants
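As a rough illustration of the two ideas named in the abstract, the sketch below encodes each directed graph as an item set of edges and scores an inexact match with a simple edit distance. It is not the paper's algorithm. It assumes every vertex carries a unique label, so a subgraph is fully described by its set of (source, target) edges, and it assumes the edit distance counts the vertices and directed edges present in only one of the two subgraphs. The function names, the brute-force enumeration, and the toy networks are all hypothetical.

```python
from itertools import combinations


def frequent_edge_sets(graphs, min_support, max_size=2):
    """Brute-force search for edge sets contained in at least min_support graphs.

    Each graph is a frozenset of (source, target) items; this relies on the
    assumption that vertex labels are unique within and across graphs.
    """
    universe = sorted(set().union(*graphs))
    frequent = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(universe, size):
            items = frozenset(candidate)
            support = sum(1 for g in graphs if items <= g)
            if support >= min_support:
                frequent[items] = support
    return frequent


def vertices(edge_set):
    """Collect the vertex labels touched by a set of directed edges."""
    return {v for edge in edge_set for v in edge}


def edit_distance(a, b):
    """Assumed distance: vertices plus directed edges found in only one subgraph."""
    return len(vertices(a) ^ vertices(b)) + len(a ^ b)


def inexact_support(pattern, graphs, max_distance):
    """Count graphs whose overlap with the pattern is within max_distance of it."""
    return sum(
        1 for g in graphs if edit_distance(pattern, pattern & g) <= max_distance
    )


# Toy regulatory networks (hypothetical data).
graphs = [
    frozenset({("A", "B"), ("B", "C"), ("C", "D")}),
    frozenset({("A", "B"), ("B", "C")}),
    frozenset({("A", "B"), ("C", "D")}),
]
pattern = frozenset({("A", "B"), ("B", "C")})

exact = frequent_edge_sets(graphs, min_support=2)
print(exact[pattern])                                    # exact support
print(inexact_support(pattern, graphs, max_distance=2))  # inexact support
```

In this toy data the pattern occurs exactly in two of the three networks; the third lacks the edge B→C and the vertex C, so it counts as a match only once the distance threshold reaches 2. A practical miner would replace the brute-force enumeration with an Apriori-style or pattern-growth search.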

[1] Wojciech Szpankowski, et al. An efficient algorithm for detecting frequent subgraphs in biological networks, 2004, ISMB/ECCB.

[2] Ramakrishnan Srikant, et al. Mining sequential patterns, 1995, Proceedings of the Eleventh International Conference on Data Engineering.

[3] Takashi Washio, et al. Complete Mining of Frequent Patterns from Graphs: Mining Graph Data, 2003, Machine Learning.

[4] Kamran Sartipi, et al. A graph pattern matching approach to software architecture recovery, 2001, Proceedings IEEE International Conference on Software Maintenance. ICSM 2001.

[5] Lawrence B. Holder, et al. Discovery of Inexact Concepts from Structural Data, 1993, IEEE Trans. Knowl. Data Eng.

[6] Thorsten Meinl, et al. Graph based molecular data mining - an overview, 2004, 2004 IEEE International Conference on Systems, Man and Cybernetics (IEEE Cat. No.04CH37583).

[7] Ron Y. Pinter, et al. Alignment of metabolic pathways, 2005, Bioinform.

[8] Minoru Kanehisa, et al. Toward Pathway Engineering: A New Database of Genetic and Molecular Pathways, 1997.

[9] George Karypis, et al. Finding Frequent Patterns in a Large Sparse Graph*, 2005, Data Mining and Knowledge Discovery.

[10] Ross A. Overbeek, et al. Automatic detection of subsystem/pathway variants in genome analysis, 2005, ISMB.