Hash Kernels for Structured Data

We propose hashing as a way to compute kernels efficiently. This generalizes previous work based on sampling, and we show a principled way to compute the kernel matrix for data streams and sparse feature spaces. Moreover, we give bounds on the deviation from the exact kernel matrix. The approach has applications to estimation on strings and graphs.
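To make the idea concrete, the following is a minimal sketch of a hashed kernel on token counts, not the authors' implementation. Each sparse feature is mapped by a hash function to one of a fixed number of buckets, colliding features are summed, and the kernel is the inner product of the resulting fixed-length vectors. The function names (`bucket`, `hashed_features`, `hash_kernel`, `exact_kernel`, `ngrams`), the use of MD5 as the hash, and the bucket counts are all assumptions made for illustration.

```python
import hashlib
from collections import Counter


def bucket(feature: str, n_buckets: int) -> int:
    """Map a feature name to a bucket index with a deterministic hash.

    MD5 is an arbitrary illustrative choice; any well-mixing hash would do.
    """
    digest = hashlib.md5(feature.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little") % n_buckets


def hashed_features(tokens, n_buckets: int):
    """Sum the counts of all tokens that fall into the same bucket."""
    vec = [0.0] * n_buckets
    for tok, count in Counter(tokens).items():
        vec[bucket(tok, n_buckets)] += count
    return vec


def hash_kernel(x_tokens, y_tokens, n_buckets: int = 1024) -> float:
    """Inner product of the two hashed feature vectors."""
    hx = hashed_features(x_tokens, n_buckets)
    hy = hashed_features(y_tokens, n_buckets)
    return sum(a * b for a, b in zip(hx, hy))


def exact_kernel(x_tokens, y_tokens) -> float:
    """Inner product of the unhashed token-count vectors, for comparison."""
    cx, cy = Counter(x_tokens), Counter(y_tokens)
    return float(sum(cx[t] * cy[t] for t in cx if t in cy))


def ngrams(s: str, n: int = 3):
    """Character n-grams of a string, a simple sparse feature map."""
    return [s[i:i + n] for i in range(len(s) - n + 1)]


if __name__ == "__main__":
    x = ngrams("hash kernels for structured data")
    y = ngrams("hash kernels for strings and graphs")
    print("exact :", exact_kernel(x, y))
    print("hashed:", hash_kernel(x, y, n_buckets=64))
```

With non-negative counts, as here, collisions can only add cross terms, so the hashed value never falls below the exact one, and the gap tends to shrink as the number of buckets grows. Memory per example is fixed by `n_buckets` regardless of how many distinct features appear, which is what makes this kind of scheme attractive for data streams and sparse feature spaces.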
