On the Combination of Information-Theoretic Kernels with Generative Embeddings

Classical methods for building classifiers for structured objects (e.g., sequences, images) are based on generative models and adopt a Bayesian framework. To use discriminative approaches (namely, support vector machines), the objects must first be embedded in a Hilbert space; one proposed way to carry out such an embedding is via generative models, possibly learned from data. This type of hybrid generative/discriminative approach has recently been shown to outperform classifiers obtained directly from the generative model upon which the embedding is built.
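As a rough illustration of the embedding idea, the sketch below maps a sequence to a vector of posterior probabilities under two class-conditional generative models and then compares two such vectors with a Jensen-Shannon kernel, k(p, q) = ln 2 - JS(p, q). Everything here is an assumption made for illustration and is not taken from the paper: the smoothed unigram models, the posterior-based embedding, the choice of kernel, and all function names (`fit_unigram`, `embed`, `js_kernel`) are hypothetical.

```python
import math
from collections import Counter


def fit_unigram(sequences, alphabet, alpha=1.0):
    """Fit a Laplace-smoothed unigram model (illustrative generative model)."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq)  # counts individual symbols of each string
    total = sum(counts[s] for s in alphabet) + alpha * len(alphabet)
    return {s: (counts[s] + alpha) / total for s in alphabet}


def log_likelihood(seq, model):
    """Log-likelihood of a sequence; assumes every symbol is in the alphabet."""
    return sum(math.log(model[s]) for s in seq)


def embed(seq, models):
    """Generative embedding: posterior over the models under a uniform prior."""
    scores = [log_likelihood(seq, m) for m in models]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    z = sum(weights)
    return [w / z for w in weights]


def entropy(p):
    """Shannon entropy in nats, with 0 * log 0 treated as 0."""
    return -sum(x * math.log(x) for x in p if x > 0)


def js_kernel(p, q):
    """Jensen-Shannon kernel between two probability vectors."""
    mix = [(a + b) / 2 for a, b in zip(p, q)]
    js = entropy(mix) - (entropy(p) + entropy(q)) / 2
    return math.log(2) - js


# Toy data: one model per class, then embed two unseen sequences.
alphabet = "ab"
models = [
    fit_unigram(["aaab", "aaba"], alphabet),
    fit_unigram(["bbba", "babb"], alphabet),
]
x = embed("aaaa", models)
y = embed("bbbb", models)
print(js_kernel(x, x), js_kernel(x, y))
```

The resulting kernel values could be assembled into a Gram matrix and passed to any SVM implementation that accepts precomputed kernels; that step is omitted to keep the sketch dependency-free.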
