Bioinformatics Adventures in Database Research

Informatics has helped launch molecular biology into the genomic era, and it appears certain that informatics will remain a major contributor to molecular biology in the post-genome era. We discuss here data integration and data mining in bioinformatics, as well as the role that database theory has played in these topics. We also describe laboratory information management systems (LIMS) as a third key topic in bioinformatics where advances in database systems and theory can be very relevant.
