Advanced Lectures on Machine Learning


[1]  Manfred K. Warmuth,et al.  Relative loss bounds for single neurons , 1999, IEEE Trans. Neural Networks.

[2]  Shie Mannor,et al.  The Consistency of Greedy Algorithms for Classification , 2002, COLT.

[3]  Robert E. Schapire,et al.  The strength of weak learnability , 1990, Mach. Learn..

[4]  John Shawe-Taylor,et al.  Sparsity vs. Large Margins for Linear Classifiers , 2000, COLT.

[5]  Tong Zhang,et al.  On the Dual Formulation of Regularized Linear Systems with Convex Risks , 2002, Machine Learning.

[6]  David Haussler,et al.  Convolution kernels on discrete structures , 1999 .

[7]  Glenn Fung,et al.  Data selection for support vector machine classifiers , 2000, KDD '00.

[8]  H. Cramér Mathematical methods of statistics , 1947 .

[9]  Gunnar Rätsch,et al.  Soft Margins for AdaBoost , 2001, Machine Learning.

[10]  Llew Mason,et al.  Margins and combined classifiers , 1999 .

[11]  R. Fletcher Practical Methods of Optimization , 1988 .

[12]  Gunnar Rätsch,et al.  Sparse Regression Ensembles in Infinite and Finite Hypothesis Spaces , 2002, Machine Learning.

[13]  Yoram Singer,et al.  Boosting for document routing , 2000, CIKM '00.

[14]  J. Lafferty Additive models, boosting, and inference for generalized divergences , 1999, COLT '99.

[15]  David B. Dunson,et al.  Bayesian Data Analysis , 2010 .

[16]  L. Milne‐Thomson A Treatise on the Theory of Bessel Functions , 1945, Nature.

[17]  C. Watkins Dynamic Alignment Kernels , 1999 .

[18]  Tommi S. Jaakkola,et al.  Maximum Entropy Discrimination , 1999, NIPS.

[19]  Pal Rujan,et al.  Playing Billiards in Version Space , 1997, Neural Computation.

[20]  E. Polak Introduction to linear and nonlinear programming , 1973 .

[21]  Ralf Herbrich Learning linear classifiers: theory and algorithms , 2001 .

[22]  John Shawe-Taylor,et al.  Structural Risk Minimization Over Data-Dependent Hierarchies , 1998, IEEE Trans. Inf. Theory.

[23]  Gábor Lugosi,et al.  A Consistent Strategy for Boosting Algorithms , 2002, COLT.

[24]  Grace Wahba,et al.  Spline Models for Observational Data , 1990 .

[25]  M. Gibbs,et al.  Efficient implementation of Gaussian processes , 1997 .

[26]  Alexander J. Smola,et al.  Sparse Greedy Gaussian Process Regression , 2000, NIPS.

[27]  Gunnar Rätsch,et al.  An introduction to kernel-based learning algorithms , 2001, IEEE Trans. Neural Networks.

[28]  Stuart J. Russell,et al.  Experimental comparisons of online and batch versions of bagging and boosting , 2001, KDD '01.

[29]  V. Koltchinskii,et al.  Empirical margin distributions and bounding the generalization error of combined classifiers , 2002, math/0405343.

[30]  D. Mackay,et al.  Bayesian methods for adaptive models , 1992 .

[31]  Klaus Obermayer,et al.  Classification on Pairwise Proximity , 2007 .

[32]  C. Stein,et al.  Estimation with Quadratic Loss , 1992 .

[33]  C. Robert The Bayesian choice : a decision-theoretic motivation , 1996 .

[34]  Bernhard Schölkopf,et al.  New Support Vector Algorithms , 2000, Neural Computation.

[35]  M. Seeger Bayesian methods for Support Vector machines and Gaussian processes , 1999 .

[36]  David J. Field,et al.  Emergence of simple-cell receptive field properties by learning a sparse code for natural images , 1996, Nature.

[37]  Volker Tresp,et al.  A Bayesian Committee Machine , 2000, Neural Computation.

[38]  W. Press,et al.  Numerical Recipes in C++: The Art of Scientific Computing (2nd edn) , 2003 .

[39]  Rocco A. Servedio,et al.  Smooth boosting and learning with malicious noise , 2003 .

[40]  Alexander J. Smola,et al.  Learning with kernels , 1998 .

[41]  Tsuhan Chen,et al.  Pose invariant face recognition , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[42]  O. Mangasarian,et al.  Robust linear programming discrimination of two linearly inseparable sets , 1992 .

[43]  Manfred K. Warmuth,et al.  Additive versus exponentiated gradient updates for linear prediction , 1995, STOC '95.

[44]  Bernhard Schölkopf,et al.  Computing the Bayes Kernel Classifier , 2000 .

[45]  Matthias W. Seeger,et al.  Using the Nyström Method to Speed Up Kernel Machines , 2000, NIPS.

[46]  David Haussler,et al.  Decision Theoretic Generalizations of the PAC Model for Neural Net and Other Learning Applications , 1992, Inf. Comput..

[47]  Yoram Singer,et al.  Leveraged Vector Machines , 1999, NIPS.

[48]  David G. Luenberger,et al.  Linear and Nonlinear Programming: Second Edition , 2003 .

[49]  S. Nash,et al.  Linear and Nonlinear Programming , 1987 .

[50]  Tommi S. Jaakkola,et al.  Feature Selection and Dualities in Maximum Entropy Discrimination , 2000, UAI.

[51]  Toniann Pitassi,et al.  A Gradient-Based Boosting Algorithm for Regression Problems , 2000, NIPS.

[52]  Robert P. W. Duin,et al.  Support vector domain description , 1999, Pattern Recognit. Lett..

[53]  A. Kennedy,et al.  Hybrid Monte Carlo , 1988 .

[54]  Katya Scheinberg,et al.  Efficient SVM Training Using Low-Rank Kernel Representations , 2002, J. Mach. Learn. Res..

[55]  Christopher M. Bishop,et al.  Variational Relevance Vector Machines , 2000, UAI.

[56]  David J. Spiegelhalter,et al.  Sequential updating of conditional probabilities on directed graphical structures , 1990, Networks.

[57]  Balas K. Natarajan,et al.  Sparse Approximate Solutions to Linear Systems , 1995, SIAM J. Comput..

[58]  John E. Moody,et al.  The Effective Number of Parameters: An Analysis of Generalization and Regularization in Nonlinear Learning Systems , 1991, NIPS.

[59]  David Haussler,et al.  Exploiting Generative Models in Discriminative Classifiers , 1998, NIPS.

[60]  Carl E. Rasmussen,et al.  In Advances in Neural Information Processing Systems , 2011 .

[61]  Ran El-Yaniv,et al.  Variance Optimized Bagging , 2002, ECML.

[62]  Allan Pinkus,et al.  Multilayer Feedforward Networks with a Non-Polynomial Activation Function Can Approximate Any Function , 1991, Neural Networks.

[63]  Franco P. Preparata,et al.  The Densest Hemisphere Problem , 1978, Theor. Comput. Sci..

[64]  I. S. Gradshteyn,et al.  Table of Integrals, Series, and Products , 1976 .

[65]  Umesh V. Vazirani,et al.  An Introduction to Computational Learning Theory , 1994 .

[66]  Peter Stone,et al.  Modeling Auction Price Uncertainty Using Boosting-based Conditional Density Estimation , 2002, ICML.

[67]  David J. C. MacKay,et al.  Variational Gaussian process classifiers , 2000, IEEE Trans. Neural Networks Learn. Syst..

[68]  Gunnar Rätsch,et al.  Input space versus feature space in kernel-based methods , 1999, IEEE Trans. Neural Networks.

[69]  Robert E. Schapire,et al.  A Brief Introduction to Boosting , 1999, IJCAI.

[70]  Gunnar Rätsch,et al.  Maximizing the Margin with Boosting , 2002, COLT.

[71]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM algorithm, with discussion , 1977 .

[72]  Bernhard Schölkopf,et al.  Kernel Principal Component Analysis , 1997, ICANN.

[73]  Stéphane Mallat,et al.  Matching pursuits with time-frequency dictionaries , 1993, IEEE Trans. Signal Process..

[74]  Tong Zhang Statistical behavior and consistency of classification methods based on convex risk minimization , 2003 .

[75]  Harris Drucker,et al.  Comparison of learning algorithms for handwritten digit recognition , 1995 .

[76]  Leslie G. Valiant,et al.  A theory of the learnable , 1984, CACM.

[77]  Bernhard Schölkopf,et al.  Sparse Greedy Matrix Approximation for Machine Learning , 2000, International Conference on Machine Learning.

[78]  John Shawe-Taylor,et al.  A Column Generation Algorithm For Boosting , 2000, ICML.

[79]  David Barber,et al.  Bayesian Classification With Gaussian Processes , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[80]  Yoram Singer,et al.  Boosting and Rocchio applied to text filtering , 1998, SIGIR '98.

[81]  G. Rätsch Robust Boosting via Convex Optimization , 2001 .

[82]  P. Tseng,et al.  On the convergence of the coordinate descent method for convex differentiable minimization , 1992 .

[83]  Cesare Furlanello,et al.  Tuning Cost-Sensitive Boosting and Its Application to Melanoma Diagnosis , 2001, Multiple Classifier Systems.

[84]  Michael I. Jordan,et al.  An Introduction to Variational Methods for Graphical Models , 1999, Machine Learning.

[85]  Ron Meir,et al.  Data-Dependent Bounds for Bayesian Mixture Methods , 2002, NIPS.

[86]  Robert A. Jacobs,et al.  Hierarchical Mixtures of Experts and the EM Algorithm , 1993, Neural Computation.

[87]  Michael I. Jordan,et al.  Computing upper and lower bounds on likelihoods in intractable networks , 1996, UAI.

[88]  Thomas M. Cover,et al.  Elements of Information Theory , 2005 .

[89]  K. Kiwiel Relaxation Methods for Strictly Convex Regularizations of Piecewise Linear Programs , 1998 .

[90]  J. Mercer Functions of positive and negative type, and their connection with the theory of integral equations , 1909 .

[91]  Paul S. Bradley,et al.  Feature Selection via Concave Minimization and Support Vector Machines , 1998, ICML.

[92]  John D. Lafferty,et al.  Boosting and Maximum Likelihood for Exponential Models , 2001, NIPS.

[93]  Ralf Herbrich,et al.  Bayes Point Machines: Estimating the Bayes Point in Kernel Space , 1999 .

[94]  Christopher K. I. Williams Prediction with Gaussian Processes: From Linear Regression to Linear Prediction and Beyond , 1999, Learning in Graphical Models.

[95]  Yoshua Bengio,et al.  Boosting Neural Networks , 2000, Neural Computation.

[96]  J. Rissanen,et al.  Modeling By Shortest Data Description* , 1978, Autom..

[97]  R. Hettich,et al.  Semi-infinite programming , 1979 .

[98]  Yoram Singer,et al.  Improved Boosting Algorithms Using Confidence-rated Predictions , 1998, COLT' 98.

[99]  Ole Winther,et al.  Mean Field Methods for Classification with Gaussian Processes , 1998, NIPS.

[100]  O. Mangasarian Linear and Nonlinear Separation of Patterns by Linear Programming , 1965 .

[101]  Gunnar Rätsch,et al.  On the Convergence of Leveraging , 2001, NIPS.

[102]  Peter L. Bartlett,et al.  Improved Generalization Through Explicit Optimization of Margins , 2000, Machine Learning.

[103]  T Poggio,et al.  Regularization Algorithms for Learning That Are Equivalent to Multilayer Networks , 1990, Science.

[104]  Richard Nock,et al.  A Robust Boosting Algorithm , 2002, ECML.

[105]  Ole Winther,et al.  Gaussian processes and SVM: Mean field and leave-one-out estimator , 2000 .

[106]  Aiko M. Hormann,et al.  Programs for Machine Learning. Part I , 1962, Inf. Control..

[107]  Marc Sebban,et al.  Boosting Density Function Estimators , 2002, ECML.

[108]  Heekuck Oh,et al.  Neural Networks for Pattern Recognition , 1993, Adv. Comput..

[109]  John Skilling,et al.  Maximum Entropy and Bayesian Methods , 1989 .

[110]  P. Bartlett,et al.  Probabilities for SV Machines , 2000 .

[111]  Shie Mannor,et al.  Geometric Bounds for Generalization in Boosting , 2001, COLT/EuroCOLT.

[112]  Yoshua Bengio,et al.  Pattern Recognition and Neural Networks , 1995 .

[113]  Yishay Mansour,et al.  On the boosting ability of top-down decision tree learning algorithms , 1996, STOC '96.

[114]  Manfred K. Warmuth,et al.  Boosting as entropy projection , 1999, COLT '99.

[115]  Yoram Singer,et al.  BoosTexter: A Boosting-based System for Text Categorization , 2000, Machine Learning.

[116]  Bernhard Schölkopf,et al.  Estimating the Support of a High-Dimensional Distribution , 2001, Neural Computation.

[117]  Marilyn A. Walker,et al.  SPoT: A Trainable Sentence Planner , 2001, NAACL.

[118]  Radford M. Neal Priors for Infinite Networks , 1996 .

[119]  Terrence J. Sejnowski,et al.  Learning Nonlinear Overcomplete Representations for Efficient Coding , 1997, NIPS.

[120]  Wenxin Jiang,et al.  Some Theoretical Aspects of Boosting in the Presence of Noisy Data , 2001, ICML.

[121]  Manfred K. Warmuth,et al.  The perceptron algorithm vs. Winnow: linear vs. logarithmic mistake bounds when few input variables are relevant , 1995, COLT '95.

[122]  B. Schölkopf,et al.  Linear programs for automatic accuracy control in regression , 1999 .

[123]  Bernhard Schölkopf,et al.  A Generalized Representer Theorem , 2001, COLT/EuroCOLT.

[124]  Gunnar Rätsch,et al.  Robust Ensemble Learning , 2000 .

[125]  F. Girosi Models of Noise and Robust Estimates , 1991 .

[126]  Geoffrey E. Hinton,et al.  Evaluation of Gaussian processes and other methods for non-linear regression , 1997 .

[127]  Michael A. Saunders,et al.  Atomic Decomposition by Basis Pursuit , 1998, SIAM J. Sci. Comput..

[128]  A. W. van der Vaart,et al.  Weak Convergence and Empirical Processes , 1997 .

[129]  Philip E. Gill,et al.  Practical optimization , 1981 .

[130]  Ran El-Yaniv,et al.  Localized Boosting , 2000, COLT.

[131]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[132]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[133]  G. Wahba,et al.  Some results on Tchebycheffian spline functions , 1971 .

[134]  Michael E. Tipping  Sparse Bayesian Learning and the Relevance Vector Machine , 2001 .

[135]  Thomas Richardson,et al.  Boosting methodology for regression problems , 1999, AISTATS.

[136]  Klaus-Robert Müller,et al.  Subspace information criterion for nonquadratic regularizers-Model selection for sparse regressors , 2002, IEEE Trans. Neural Networks.

[137]  Katya Scheinberg,et al.  A product-form Cholesky factorization method for handling dense columns in interior point methods for linear programming , 2004, Math. Program..

[138]  J. Langford,et al.  FeatureBoost: A Meta-Learning Algorithm that Improves Model Robustness , 2000, ICML.

[139]  John Shawe-Taylor,et al.  Towards a strategy for boosting regressors , 2000 .

[140]  Robert E. Schapire,et al.  Using output codes to boost multiclass learning problems , 1997, ICML.

[141]  Gunnar Rätsch,et al.  Constructing Boosting Algorithms from SVMs: An Application to One-Class Classification , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[142]  Olvi L. Mangasarian,et al.  Arbitrary-norm separating plane , 1999, Oper. Res. Lett..

[143]  Robert E. Schapire,et al.  The Boosting Approach to Machine Learning An Overview , 2003 .

[144]  J. Stoer,et al.  Introduction to Numerical Analysis , 2002 .

[145]  Rocco A. Servedio,et al.  PAC Analogues of Perceptron and Winnow Via Boosting the Margin , 2000, Machine Learning.

[146]  Gunnar Rätsch,et al.  Adapting Codes and Embeddings for Polychotomies , 2002, NIPS.

[147]  Nello Cristianini,et al.  On the generalization of soft margin algorithms , 2002, IEEE Trans. Inf. Theory.

[148]  Yoav Freund,et al.  Boosting the margin: A new explanation for the effectiveness of voting methods , 1997, ICML.

[149]  J. Ross Quinlan,et al.  Boosting First-Order Learning , 1996, ALT.

[150]  Shun-ichi Amari,et al.  Network information criterion-determining the number of hidden units for an artificial neural network model , 1994, IEEE Trans. Neural Networks.

[151]  Nello Cristianini,et al.  Further results on the margin distribution , 1999, COLT '99.

[152]  Chuan Long,et al.  Boosting Noisy Data , 2001, ICML.

[153]  Simon Haykin,et al.  Neural Networks: A Comprehensive Foundation , 1998 .

[154]  Gunnar Rätsch,et al.  An asymptotic analysis of AdaBoost in the binary classification case , 1998 .

[155]  David J. C. MacKay,et al.  The Evidence Framework Applied to Classification Networks , 1992, Neural Computation.

[156]  Tong Zhang,et al.  A General Greedy Approximation Algorithm with Applications , 2001, NIPS.

[157]  Vladimir Vapnik,et al.  On the uniform convergence of relative frequencies of events to their probabilities , 1971 .

[158]  Peter L. Bartlett,et al.  Functional Gradient Techniques for Combining Hypotheses , 2000 .

[159]  Charles R. Johnson,et al.  Matrix analysis , 1985, Statistical Inference for Engineers and Data Scientists.

[160]  Alexander J. Smola,et al.  Support Vector Method for Function Approximation, Regression Estimation and Signal Processing , 1996, NIPS.

[161]  Srinivas Bangalore,et al.  Combining prior knowledge and boosting for call classification in spoken language dialogue , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[162]  Tong Zhang,et al.  Sequential greedy approximation for certain convex optimization problems , 2003, IEEE Trans. Inf. Theory.

[163]  Gene H. Golub,et al.  Matrix computations , 1983 .

[164]  Ralf Herbrich,et al.  Learning Kernel Classifiers: Theory and Algorithms , 2001 .

[165]  Leslie G. Valiant,et al.  Cryptographic Limitations on Learning Boolean Formulae and Finite Automata , 1993, Machine Learning: From Theory to Applications.

[166]  Michael I. Jordan,et al.  An Introduction to Graphical Models , 2001 .

[167]  T. Poggio,et al.  On optimal nonlinear associative recall , 1975, Biological Cybernetics.

[168]  Vladimir Vapnik,et al.  The Nature of Statistical Learning , 1995 .

[169]  Gunnar Rätsch  Robustes Boosting durch konvexe Optimierung [Robust Boosting via Convex Optimization] , 2001, Ausgezeichnete Informatikdissertationen.

[170]  H. Luetkepohl The Handbook of Matrices , 1996 .

[171]  Philip M. Long,et al.  On-line learning of linear functions , 1991, STOC '91.

[172]  Manfred Opper,et al.  Sparse Representation for Gaussian Process Models , 2000, NIPS.