A Tutorial Review of RKHS Methods in Machine Learning
[1] Peter L. Bartlett,et al. Rademacher and Gaussian Complexities: Risk Bounds and Structural Results , 2003, J. Mach. Learn. Res..
[2] Shahar Mendelson,et al. Rademacher averages and phase transitions in Glivenko-Cantelli classes , 2002, IEEE Trans. Inf. Theory.
[3] Quoc V. Le,et al. Nonparametric Quantile Regression , 2005 .
[4] I. J. Schoenberg. Metric spaces and completely monotone functions , 1938 .
[5] Bernhard Schölkopf,et al. On a Kernel-Based Method for Pattern Recognition, Regression, Approximation, and Operator Inversion , 1998, Algorithmica.
[6] Gunnar Rätsch,et al. Soft Margins for AdaBoost , 2001, Machine Learning.
[7] D. Nolan. The excess-mass ellipsoid , 1991 .
[8] Andrew R. Barron,et al. Universal approximation bounds for superpositions of a sigmoidal function , 1993, IEEE Trans. Inf. Theory.
[9] William T. Freeman,et al. Understanding belief propagation and its generalizations , 2003 .
[10] F. Girosi,et al. Networks for approximation and learning , 1990, Proc. IEEE.
[11] Allan Pinkus,et al. Strictly Positive Definite Functions on a Real Inner Product Space , 2004, Adv. Comput. Math..
[12] John Shawe-Taylor,et al. A Column Generation Algorithm For Boosting , 2000, ICML.
[13] Ben Taskar,et al. Max-Margin Markov Networks , 2003, NIPS.
[14] Y. Makovoz. Random Approximants and Neural Networks , 1996 .
[15] Xiaojin Zhu,et al. Kernel conditional random fields: representation and clique selection , 2004, ICML.
[16] Martin J. Wainwright,et al. Semidefinite Relaxations for Approximate Inference on Graphs with Cycles , 2003, NIPS.
[17] Christopher J. C. Burges,et al. Simplified Support Vector Decision Rules , 1996, ICML.
[18] Thomas Gärtner,et al. Multi-Instance Kernels , 2002, ICML.
[19] D. Mason,et al. Generalized quantile processes , 1992 .
[20] Arthur E. Hoerl,et al. Ridge Regression: Biased Estimation for Nonorthogonal Problems , 2000, Technometrics.
[21] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[22] A Tikhonov,et al. Solution of Incorrectly Formulated Problems and the Regularization Method , 1963 .
[23] Thorsten Joachims,et al. A support vector method for multivariate performance measures , 2005, ICML.
[24] Bernhard Schölkopf,et al. A Generalized Representer Theorem , 2001, COLT/EuroCOLT.
[25] Michael I. Jordan,et al. Loopy Belief Propagation for Approximate Inference: An Empirical Study , 1999, UAI.
[26] Gunnar Rätsch,et al. Robust Boosting via Convex Optimization: Theory and Applications , 2007 .
[27] F. Girosi,et al. From regularization to radial, tensor and additive splines , 1993, Neural Networks for Signal Processing III - Proceedings of the 1993 IEEE-SP Workshop.
[28] R. Fletcher. Practical Methods of Optimization , 1988 .
[29] Thomas Hofmann,et al. Unifying collaborative and content-based filtering , 2004, ICML.
[30] Michael I. Jordan,et al. Convexity, Classification, and Risk Bounds , 2006 .
[31] Ben Taskar,et al. Max-Margin Parsing , 2004, EMNLP.
[32] Christian Berg,et al. Potential Theory on Locally Compact Abelian Groups , 1975 .
[33] Robin Sibson,et al. What is projection pursuit? , 1987 .
[34] John Shawe-Taylor,et al. Structural Risk Minimization Over Data-Dependent Hierarchies , 1998, IEEE Trans. Inf. Theory.
[35] J. Weston,et al. Support vector regression with ANOVA decomposition kernels , 1999 .
[36] Alexander J. Smola,et al. Binet-Cauchy Kernels , 2004, NIPS.
[37] Ralf Herbrich,et al. Learning Kernel Classifiers: Theory and Algorithms , 2001 .
[38] Thorsten Joachims,et al. Learning to classify text using support vector machines - methods, theory and algorithms , 2002, The Kluwer international series in engineering and computer science.
[39] Bernhard E. Boser,et al. A training algorithm for optimal margin classifiers , 1992, COLT '92.
[40] J. Kettenring,et al. Canonical Analysis of Several Sets of Variables , 1971 .
[41] Christopher K. I. Williams. Prediction with Gaussian Processes: From Linear Regression to Linear Prediction and Beyond , 1999, Learning in Graphical Models.
[42] Eugene Charniak,et al. Statistical Parsing with a Context-Free Grammar and Word Statistics , 1997, AAAI/IAAI.
[43] Jason Weston,et al. Mismatch String Kernels for SVM Protein Classification , 2002, NIPS.
[44] Eric R. Ziegel,et al. Generalized Linear Models , 2002, Technometrics.
[45] Hisashi Kashima,et al. Marginalized Kernels Between Labeled Graphs , 2003, ICML.
[46] G. Wahba,et al. Some results on Tchebycheffian spline functions , 1971 .
[47] Jason Weston,et al. A kernel method for multi-labelled classification , 2001, NIPS.
[48] Vladimir Koltchinskii,et al. Rademacher penalties and structural risk minimization , 2001, IEEE Trans. Inf. Theory.
[49] M. Fiedler. Algebraic connectivity of graphs , 1973 .
[50] Vladimir Vapnik,et al. SVMs for Histogram Based Image Classification , 1999 .
[51] Bernhard Schölkopf,et al. Support vector learning , 1997 .
[52] Bernhard Schölkopf,et al. Kernel Dependency Estimation , 2002, NIPS.
[53] N. Cristianini,et al. On Kernel-Target Alignment , 2001, NIPS.
[54] J. Darroch,et al. Generalized Iterative Scaling for Log-Linear Models , 1972 .
[55] Vladimir Vapnik,et al. Statistical learning theory , 1998 .
[56] Bernhard Schölkopf,et al. Measuring Statistical Dependence with Hilbert-Schmidt Norms , 2005, ALT.
[57] Bernhard Schölkopf,et al. Nonlinear Component Analysis as a Kernel Eigenvalue Problem , 1998, Neural Computation.
[58] J. Stewart. Positive definite functions and generalizations, an historical survey , 1976 .
[59] Thorsten Joachims,et al. Text Categorization with Support Vector Machines: Learning with Many Relevant Features , 1998, ECML.
[60] Gunnar Rätsch,et al. Adapting Codes and Embeddings for Polychotomies , 2002, NIPS.
[61] Zaïd Harchaoui,et al. A Machine Learning Approach to Conjoint Analysis , 2004, NIPS.
[62] P. Rujan. A Fast Method for Calculating the Perceptron with Maximal Stability , 1993 .
[63] S. Klinke,et al. Exploratory Projection Pursuit , 1995 .
[64] Alexander J. Smola,et al. Support Vector Method for Function Approximation, Regression Estimation and Signal Processing , 1996, NIPS.
[65] W. Krauth,et al. Learning algorithms with optimal stability in neural networks , 1987 .
[66] H. Hotelling. Relations Between Two Sets of Variates , 1936 .
[67] Koby Crammer,et al. On the Learnability and Design of Output Codes for Multiclass Problems , 2002, Machine Learning.
[68] V. Vapnik. Estimation of Dependences Based on Empirical Data , 2006 .
[69] B. Yandell,et al. Automatic Smoothing of Regression Functions in Generalized Linear Models , 1986 .
[70] O. Mangasarian,et al. Robust linear programming discrimination of two linearly inseparable sets , 1992 .
[71] Yoav Freund,et al. Boosting the margin: A new explanation for the effectiveness of voting methods , 1997, ICML.
[72] John C. Platt,et al. Fast training of support vector machines using sequential minimal optimization , 1999, Advances in Kernel Methods.
[73] C. Micchelli,et al. Functions that preserve families of positive semidefinite matrices , 1995 .
[74] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[75] Frank Jensen,et al. Optimal Junction Trees , 1994, UAI.
[76] Corinna Cortes,et al. Support-Vector Networks , 1995, Machine Learning.
[77] Bernhard Schölkopf,et al. Iterative kernel principal component analysis for image modeling , 2005, IEEE Trans. Pattern Anal. Mach. Intell..
[78] L. Baum,et al. An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process , 1972 .
[79] Vladimir Vapnik,et al. The Nature of Statistical Learning Theory , 1995 .
[80] G. Wahba. Spline models for observational data , 1990 .
[81] Thomas Gärtner,et al. A survey of kernels for structured data , 2003, SKDD.
[82] Sean R. Eddy,et al. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids , 1998 .
[83] Bernhard Schölkopf,et al. Comparison of View-Based Object Recognition Algorithms Using Realistic 3D Models , 1996, ICANN.
[84] Michael Collins,et al. Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms , 2002, EMNLP.
[85] D. Cox,et al. Asymptotic Analysis of Penalized Likelihood and Related Estimators , 1990 .
[86] Judea Pearl,et al. Probabilistic reasoning in intelligent systems , 1988 .
[87] Eleazar Eskin,et al. The Spectrum Kernel: A String Kernel for SVM Protein Classification , 2001, Pacific Symposium on Biocomputing.
[88] A. Tsybakov. On nonparametric estimation of density level sets , 1997 .
[89] David Haussler,et al. Convolution kernels on discrete structures , 1999 .
[90] Thomas Hofmann,et al. Hidden Markov Support Vector Machines , 2003, ICML.
[91] Alexander J. Smola,et al. Kernels and Regularization on Graphs , 2003, COLT.
[92] T. Sager. An Iterative Method for Estimating a Multivariate Mode and Isopleth , 1979 .
[93] G. Wahba,et al. Smoothing spline ANOVA for exponential families, with application to the Wisconsin Epidemiological Study of Diabetic Retinopathy : the 1994 Neyman Memorial Lecture , 1995 .
[94] Christopher J. C. Burges. A Tutorial on Support Vector Machines for Pattern Recognition , 1998 .
[95] Yoav Freund,et al. Experiments with a New Boosting Algorithm , 1996, ICML.
[96] T. Poggio,et al. On optimal nonlinear associative recall , 1975, Biological Cybernetics.
[97] Colin McDiarmid,et al. On the method of bounded differences , 1989, Surveys in Combinatorics.
[98] Koby Crammer. Online Learning for Complex Categorial Problems , 2005 .
[99] Ingo Steinwart,et al. Support Vector Machines are Universally Consistent , 2002, J. Complex..
[100] Alexander J. Smola,et al. Regression estimation with support vector learning machines , 1996 .
[101] Andrew McCallum,et al. A Conditional Random Field for Discriminatively-trained Finite-state String Edit Distance , 2005, UAI.
[102] J. Friedman. Additive logistic regression: A statistical view of boosting , 2000 .
[103] Alexander J. Smola,et al. Fast Kernels for String and Tree Matching , 2002, NIPS.
[104] Michael Gribskov,et al. Use of Receiver Operating Characteristic (ROC) Analysis to Evaluate Sequence Matching , 1996, Comput. Chem..
[105] Luke S. Zettlemoyer,et al. Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars , 2005, UAI.
[106] Tong Zhang. Statistical behavior and consistency of classification methods based on convex risk minimization , 2003 .
[107] Eugene Charniak,et al. A Maximum-Entropy-Inspired Parser , 2000, ANLP.
[108] Gunnar Rätsch,et al. Kernel PCA and De-Noising in Feature Spaces , 1998, NIPS.
[109] W. Rudin,et al. Fourier Analysis on Groups , 1965 .
[110] B. Schölkopf,et al. Efficient face detection by a cascaded support-vector machine expansion , 2004, Proceedings of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences.
[111] Bart De Moor,et al. Subspace angles between ARMA models , 2002, Syst. Control. Lett..
[112] Dan Gusfield,et al. Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology , 1997 .
[113] V. Vapnik. Pattern recognition using generalized portrait method , 1963 .
[114] C. Watkins. Dynamic Alignment Kernels , 1999 .
[115] Ralf Herbrich,et al. Algorithmic Luckiness , 2001, J. Mach. Learn. Res..
[116] Susan A. Murphy,et al. Monographs on statistics and applied probability , 1990 .
[117] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[118] Lior Wolf,et al. Learning over Sets using Kernel Principal Angles , 2003, J. Mach. Learn. Res..
[119] Michael Collins,et al. Discriminative Reranking for Natural Language Parsing , 2000, CL.
[120] E. Parzen. Statistical Inference on Time Series by RKHS Methods , 1970 .
[121] Fernando Pereira,et al. Shallow Parsing with Conditional Random Fields , 2003, NAACL.
[122] Bernhard Schölkopf,et al. The connection between regularization operators and support vector kernels , 1998, Neural Networks.
[123] K. Karhunen. Zur Spektraltheorie stochastischer Prozesse , 1946 .
[124] J. Besag. Spatial Interaction and the Statistical Analysis of Lattice Systems , 1974 .
[125] B. Ripley,et al. Robust Statistics , 2018, Encyclopedia of Mathematical Geosciences.
[126] Michael I. Jordan,et al. Kernel independent component analysis , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03).
[127] Bernhard Schölkopf,et al. New Support Vector Algorithms , 2000, Neural Computation.
[128] Herbert Meschkowski,et al. Hilbertsche Räume mit Kernfunktion , 1962 .
[129] F. A. Seiler,et al. Numerical Recipes in C: The Art of Scientific Computing , 1989 .
[130] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[131] Shahar Mendelson,et al. A Few Notes on Statistical Learning Theory , 2002, Machine Learning Summer School.
[132] Thomas Hofmann,et al. Support vector machine learning for interdependent and structured output spaces , 2004, ICML.
[133] P. Sen,et al. Restricted canonical correlations , 1994 .
[134] Thore Graepel,et al. Large Margin Rank Boundaries for Ordinal Regression , 2000 .
[135] J. Kahane. Some Random Series of Functions , 1985 .
[136] Michael A. Saunders,et al. Atomic Decomposition by Basis Pursuit , 1998, SIAM J. Sci. Comput..
[137] Bernhard Schölkopf,et al. Estimating the Support of a High-Dimensional Distribution , 2001, Neural Computation.
[138] J. Dauxois,et al. Nonlinear canonical analysis and independence tests , 1998 .
[139] H. Hotelling. Analysis of a complex of statistical variables into principal components , 1933 .
[140] Ingo Steinwart,et al. On the Influence of the Kernel on the Consistency of Support Vector Machines , 2002, J. Mach. Learn. Res..
[141] Nello Cristianini,et al. Kernel Methods for Pattern Analysis , 2003, ICTAI.
[142] Thomas Hofmann,et al. Gaussian process classification for segmenting and annotating sequences , 2004, ICML.
[143] Bernhard Schölkopf,et al. Sparse Greedy Matrix Approximation for Machine Learning , 2000, ICML.
[144] Vladimir Vapnik, A. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities , 1971 .
[145] B. Yandell,et al. Semi-Parametric Generalized Linear Models , 1985 .
[146] Andrew McCallum,et al. Gene Prediction with Conditional Random Fields , 2005 .
[147] Christopher K. I. Williams,et al. Pascal Visual Object Classes Challenge Results , 2005 .
[148] Koby Crammer,et al. Loss Bounds for Online Category Ranking , 2005, COLT.
[149] A. Tsybakov,et al. Optimal aggregation of classifiers in statistical learning , 2003 .
[150] D. Haussler,et al. Knowledge-based analysis of microarray gene expression data by using support vector machines , 2000, Proceedings of the National Academy of Sciences of the United States of America.
[151] Thomas Hofmann,et al. Large margin methods for label sequence learning , 2003, INTERSPEECH.
[152] Leo Breiman,et al. Prediction Games and Arcing Algorithms , 1999, Neural Computation.
[153] Risi Kondor,et al. Diffusion kernels on graphs and other discrete structures , 2002, ICML.
[154] Bernhard Schölkopf,et al. Support Vector Novelty Detection Applied to Jet Engine Vibration Spectra , 2000, NIPS.
[155] Steven A. Orszag,et al. CBMS-NSF REGIONAL CONFERENCE SERIES IN APPLIED MATHEMATICS , 1978 .
[156] J. Hartigan. Estimation of a Convex Density Contour in Two Dimensions , 1987 .
[157] W. Steiger,et al. Least Absolute Deviations: Theory, Applications and Algorithms , 1984 .
[158] J. Mercer. Functions of Positive and Negative Type, and their Connection with the Theory of Integral Equations , 1909 .
[159] Yoram Singer,et al. Logistic Regression, AdaBoost and Bregman Distances , 2000, Machine Learning.
[160] V. A. Morozov,et al. Methods for Solving Incorrectly Posed Problems , 1984 .
[161] Bernhard Schölkopf,et al. Kernel Methods in Computational Biology , 2005 .
[162] Christopher K. I. Williams,et al. Using the Nyström Method to Speed Up Kernel Machines , 2000, NIPS.
[163] Richard F. Gunst,et al. Applied Regression Analysis , 1999, Technometrics.
[164] John D. Lafferty,et al. Inducing Features of Random Fields , 1995, IEEE Trans. Pattern Anal. Mach. Intell..
[165] A. P. Dawid,et al. Applications of a general propagation algorithm for probabilistic expert systems , 1992 .
[166] C. Berg,et al. Harmonic Analysis on Semigroups , 1984 .
[167] Richard Cole,et al. Faster suffix tree construction with missing suffix links , 2000, STOC '00.
[168] Bernhard Schölkopf,et al. Training Invariant Support Vector Machines , 2002, Machine Learning.
[169] Gunnar Rätsch,et al. Engineering Support Vector Machine Kernels That Recognize Translation Initiation Sites , 2000, German Conference on Bioinformatics.
[170] W. Polonik. Minimum volume sets and generalized quantile processes , 1997 .
[171] D K Smith,et al. Numerical Optimization , 2001, J. Oper. Res. Soc..
[172] Thomas Hofmann,et al. Exponential Families for Conditional Random Fields , 2004, UAI.
[173] Mathieu Raffinot,et al. Fast Regular Expression Search , 1999, WAE.
[174] David Haussler,et al. Probabilistic kernel regression models , 1999, AISTATS.
[175] Kenneth O. Kortanek,et al. Semi-Infinite Programming: Theory, Methods, and Applications , 1993, SIAM Rev..
[176] M. Aizerman,et al. Theoretical Foundations of the Potential Function Method in Pattern Recognition Learning , 1964 .
[177] David J. Spiegelhalter,et al. Probabilistic Networks and Expert Systems , 1999, Information Science and Statistics.
[178] Alexander J. Smola,et al. Advances in Large Margin Classifiers , 2000 .
[179] A. Buja,et al. Projection Pursuit Indexes Based on Orthonormal Function Expansions , 1993 .
[180] Yoram Singer,et al. Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers , 2000, J. Mach. Learn. Res..
[181] O. Mangasarian. Linear and Nonlinear Separation of Patterns by Linear Programming , 1965 .
[182] Koby Crammer,et al. On the Algorithmic Implementation of Multiclass Kernel-based Vector Machines , 2002, J. Mach. Learn. Res..
[183] K. Schittkowski,et al. Nonlinear Programming , 2022 .
[184] Alexander J. Smola,et al. Learning with kernels , 1998 .
[185] Thomas Hofmann,et al. Large Margin Methods for Structured and Interdependent Output Variables , 2005, J. Mach. Learn. Res..
[186] D. Bamber. The area above the ordinal dominance graph and the area below the receiver operating characteristic graph , 1975 .
[187] Yukio Shibata,et al. On the tree representation of chordal graphs , 1988, J. Graph Theory.
[188] S. Sinha. A Duality Theorem for Nonlinear Programming , 1966 .
[189] David M. Magerman,et al. Learning grammatical structure using statistical decision-trees , 1996, ICGI.
[190] Bernhard Schölkopf,et al. Kernel Constrained Covariance for Dependence Measurement , 2005, AISTATS.
[191] Bernhard Schölkopf,et al. Learning from labeled and unlabeled data on a directed graph , 2005, ICML.
[192] Noga Alon,et al. Scale-sensitive dimensions, uniform convergence, and learnability , 1997, JACM.
[193] John W. Tukey,et al. A Projection Pursuit Algorithm for Exploratory Data Analysis , 1974, IEEE Transactions on Computers.
[194] Michael I. Jordan. Graphical Models , 2003 .
[195] David Haussler,et al. Exploiting Generative Models in Discriminative Classifiers , 1998, NIPS.
[196] Alexander J. Smola,et al. Support Vector Regression Machines , 1996, NIPS.
[197] Michael Collins,et al. Convolution Kernels for Natural Language , 2001, NIPS.
[198] Dustin Boswell,et al. Introduction to Support Vector Machines , 2002 .
[199] H. Kashima,et al. Kernels for graphs , 2004 .
[200] Richard J. Martin. A metric for ARMA processes , 2000, IEEE Trans. Signal Process..
[201] J. M. Hammersley,et al. Markov fields on finite graphs and lattices , 1971 .
[202] O. Bousquet. Theory of Classification: A Survey of Recent Advances , 2004 .
[203] Nello Cristianini,et al. Classification using String Kernels , 2000 .
[204] Gunnar Rätsch,et al. Predicting Time Series with Support Vector Machines , 1997, ICANN.
[205] Andrew McCallum,et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.
[206] R. Kondor,et al. Bhattacharyya and Expected Likelihood Kernels , 2003 .
[207] Walter W Garvin,et al. Introduction to Linear Programming , 2018, Linear Programming and Resource Allocation Modeling.
[208] S. Bochner. Monotone Funktionen, Stieltjessche Integrale und harmonische Analyse , 1933 .
[209] Marvin Minsky,et al. Perceptrons: An Introduction to Computational Geometry , 1969 .
[210] Shai Ben-David,et al. On the difficulty of approximately maximizing agreements , 2000, J. Comput. Syst. Sci..
[211] Philip M. Long,et al. Fat-shattering and the learnability of real-valued functions , 1994, COLT '94.
[212] N. Aronszajn. Theory of Reproducing Kernels , 1950 .
[213] Yoram Singer,et al. Log-Linear Models for Label Ranking , 2003, NIPS.
[214] Pietro Perona,et al. Combining generative models and Fisher kernels for object recognition , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.
[215] Matthias Hein,et al. Maximal Margin Classification for Metric Spaces , 2003, COLT.