Notions of similarity for systems biology models

Abstract

Systems biology models are rapidly increasing in complexity, size and number. When building large models, researchers rely on software tools for the retrieval, comparison, combination and merging of models, as well as for version control. These tools need to be able to quantify the differences and similarities between computational models. However, the notion of ‘similarity’ may vary greatly depending on the specific application, and a general notion of model similarity that applies to various types of models is still missing. Here we survey existing methods for the comparison of models, introduce quantitative measures for model similarity, and discuss potential applications of combined similarity measures. To frame model comparison as a general problem, we describe a theoretical approach to defining and computing similarities based on a combination of different model aspects. The six aspects that we define as potentially relevant for similarity are: underlying encoding; references to biological entities; quantitative behaviour; qualitative behaviour; mathematical equations and parameters; and network structure. We argue that future similarity measures will benefit from combining these model aspects in flexible, problem-specific ways to mimic users’ intuition about model similarity, and to support complex model searches in databases.
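The idea of combining per-aspect similarities in a problem-specific way can be illustrated with a minimal sketch. It is not the method defined in the paper: the weighted mean, the Jaccard index for annotation sets, the function names, and the aspect keys are all assumptions chosen for illustration. Each aspect is assumed to yield a score in [0, 1], and the weights express how much a given application cares about that aspect.

```python
from typing import Dict, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index of two sets, e.g. the ontology terms annotating two models.

    Assumption: two models without any annotations are treated as identical
    in this aspect (score 1.0).
    """
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def combined_similarity(scores: Dict[str, float],
                        weights: Dict[str, float]) -> float:
    """Weighted mean of per-aspect similarity scores.

    `scores` maps a (hypothetical) aspect name to a similarity in [0, 1].
    `weights` maps aspect names to non-negative weights; aspects missing
    from `weights` are ignored (weight 0).
    """
    for aspect, score in scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {aspect!r} must lie in [0, 1]")
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")

    total_weight = sum(weights.get(aspect, 0.0) for aspect in scores)
    if total_weight <= 0:
        raise ValueError("at least one scored aspect needs a positive weight")

    weighted_sum = sum(weights.get(aspect, 0.0) * score
                       for aspect, score in scores.items())
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Hypothetical annotation sets of two models (identifiers are made up).
    terms_a = {"term:1", "term:2", "term:3"}
    terms_b = {"term:2", "term:3", "term:4"}

    scores = {
        "biological_references": jaccard(terms_a, terms_b),
        "network_structure": 0.8,   # placeholder value, not a real result
    }
    # A search-oriented application might weight annotations more heavily.
    weights = {"biological_references": 2.0, "network_structure": 1.0}
    print(combined_similarity(scores, weights))
```

Changing only the weights lets the same per-aspect scores serve different applications, for example emphasising network structure for model merging and biological references for database search.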
