Minimizing the Cross Validation Error to Mix Kernel Matrices of Heterogeneous Biological Data

Biological objects are often described by two or more representations. To classify such data, the representations must be combined in some way. For kernel machines, this task amounts to mixing several kernel matrices into one. In this paper, we present two ways to mix kernel matrices, in which the mixing weights are optimized to minimize the cross validation error. In bacteria classification and gene function prediction experiments, our methods significantly outperformed single kernel classifiers in most cases.
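The general idea of choosing a mixing weight by cross validation can be illustrated with a minimal sketch. It is not the paper's method: the two optimization procedures are not described in this abstract, so the sketch assumes a convex combination of two kernel matrices, a plain grid search over the weight, and a kernel ridge classifier scored by k-fold cross validation. All function names, the toy data, and the parameter values are hypothetical.

```python
import numpy as np

def cv_error(K, y, n_folds=5, ridge=1e-2):
    """k-fold cross validation error of a kernel ridge classifier.

    K is an (n, n) kernel matrix, y holds labels in {-1, +1}.
    The classifier and its ridge value are assumptions for illustration.
    """
    n = len(y)
    folds = np.arange(n) % n_folds          # deterministic fold assignment
    n_wrong = 0
    for f in range(n_folds):
        test = folds == f
        train = ~test
        K_tr = K[np.ix_(train, train)]
        # Solve (K_tr + ridge * I) alpha = y_train
        alpha = np.linalg.solve(K_tr + ridge * np.eye(int(train.sum())), y[train])
        pred = np.sign(K[np.ix_(test, train)] @ alpha)
        n_wrong += int(np.sum(pred != y[test]))
    return n_wrong / n

def mix_kernels(K1, K2, y, grid=None):
    """Grid search for the weight w in w*K1 + (1-w)*K2 with lowest CV error."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 11)
    errs = [cv_error(w * K1 + (1.0 - w) * K2, y) for w in grid]
    best = int(np.argmin(errs))
    return float(grid[best]), errs

# Toy data: two hypothetical representations of the same 60 objects.
rng = np.random.default_rng(0)
y = np.where(rng.random(60) < 0.5, -1.0, 1.0)
X1 = y[:, None] * 0.8 + rng.normal(size=(60, 3))   # weakly informative view
X2 = y[:, None] * 0.5 + rng.normal(size=(60, 4))   # second, noisier view
K1 = X1 @ X1.T                                     # linear kernels
K2 = X2 @ X2.T

w_best, errs = mix_kernels(K1, K2, y)
K_mix = w_best * K1 + (1.0 - w_best) * K2
print("selected weight:", w_best)
print("CV error per grid point:", np.round(errs, 3))
```

Because the grid contains the endpoints 0 and 1, the selected mixture is never worse in cross validation error than either single kernel on the same folds. A grid search scales poorly beyond a few kernels, which is one reason to optimize the weights directly.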
