Analysis and Visualization of Gene Expressions and Protein Structures

This paper describes a web-based interactive framework for the analysis and visualization of gene expressions and protein structures. Formulating the framework posed many challenges because of the wide range of relevant analysis and visualization techniques and the diversity of biological data types on which these techniques operate. The main challenges that guided the formulation are: (a) integrating data from heterogeneous resources, such as expert-driven data from text, public domain databases, and diverse large-scale experimental data sets, and (b) integrating the most recent analysis and visualization tools despite the lack of standard I/O. The fundamental innovation of the proposed framework is therefore the integration of state-of-the-art analysis and visualization techniques for gene expressions and protein structures through a unified workflow. In addition, the framework supports a wide range of input data types and exports three-dimensional interactive outputs in Virtual Reality Modeling Language (VRML), ready for exploration on off-the-shelf monitors as well as in immersive 3D stereo display environments.
