In this paper we employ a recent algorithm by Zantema et al. for detecting maximal frequent subgraphs (MFS) in collections of graphs corresponding to biological networks from the KEGG database. Each graph in a collection corresponds to one organism and represents one pathway, or a union of pathways, of that organism. Previously, the MFS algorithm had been applied only to graphs that have enzymes as nodes. We introduce a new type of graph, the reaction graph, which contains more information than the enzyme graph, and we apply the MFS algorithm to reaction graphs obtained from the KEGG network. Earlier, the MFS algorithm was tested only on smaller graphs of individual metabolic pathways. We show that the algorithm can cope with large collections (more than 600 graphs) of large graphs (more than 5000 edges). Moreover, the results are produced in real time, within a few seconds, which is important for on-line applications of the algorithm. Our results also confirm the feasibility of the maximal frequent subgraph approach for finding similarities and relationships between organisms: the more similar the graphs in a collection, the larger the maximal frequent subgraphs found.
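To make the notion of a maximal frequent subgraph concrete, the following is a minimal sketch and not the algorithm of Zantema et al. It assumes that node labels are unique within each graph, so that a graph can be represented as a set of directed edges and a subgraph as a subset of that set, and it ignores connectivity. The function name `maximal_frequent_edge_sets`, the parameter `min_support`, and the toy labels are hypothetical and chosen only for illustration. The sketch relies on the fact that every edge set occurring in at least `min_support` graphs is contained in the intersection of some `min_support` of them, so the maximal frequent edge sets are the maximal elements among those intersections.

```python
from itertools import combinations


def maximal_frequent_edge_sets(graphs, min_support):
    """Return the maximal edge sets contained in at least `min_support` graphs.

    `graphs` is a list of frozensets of edges (pairs of node labels).
    This brute-force version enumerates all groups of `min_support` graphs,
    so it is only suitable for small toy collections.
    """
    candidates = set()
    for group in combinations(graphs, min_support):
        # Edges shared by every graph in the group form a frequent edge set.
        common = frozenset.intersection(*group)
        if common:
            candidates.add(common)
    # Keep only candidates that are not strictly contained in another one.
    return [c for c in candidates if not any(c < other for other in candidates)]


# Hypothetical toy collection: three "organisms", edges between made-up labels.
toy_graphs = [
    frozenset({("A", "B"), ("B", "C"), ("C", "D")}),
    frozenset({("A", "B"), ("B", "C"), ("X", "Y")}),
    frozenset({("A", "B"), ("C", "D"), ("X", "Y")}),
]

for support in (2, 3):
    result = maximal_frequent_edge_sets(toy_graphs, support)
    print(support, sorted(sorted(edges) for edges in result))
```

In this toy collection, lowering the support threshold from 3 to 2 yields larger common edge sets, which mirrors the observation that more similar graphs share larger maximal frequent subgraphs. The enumeration is exponential in the number of graphs and is not intended to reflect the performance reported above.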