Advanced Lectures on Machine Learning


[1]  Manfred K. Warmuth,et al.  Relative loss bounds for single neurons , 1999, IEEE Trans. Neural Networks.

[2]  Shie Mannor,et al.  The Consistency of Greedy Algorithms for Classification , 2002, COLT.

[3]  Robert E. Schapire,et al.  The strength of weak learnability , 1990, Mach. Learn..

[4]  John Shawe-Taylor,et al.  Sparsity vs. Large Margins for Linear Classifiers , 2000, COLT.

[5]  Tong Zhang,et al.  On the Dual Formulation of Regularized Linear Systems with Convex Risks , 2002, Machine Learning.

[6]  David Haussler,et al.  Convolution kernels on discrete structures , 1999 .

[7]  Glenn Fung,et al.  Data selection for support vector machine classifiers , 2000, KDD '00.

[8]  H. Cramér Mathematical methods of statistics , 1947 .

[9]  Gunnar Rätsch,et al.  Soft Margins for AdaBoost , 2001, Machine Learning.

[10]  Llew Mason,et al.  Margins and combined classifiers , 1999 .

[11]  R. Fletcher Practical Methods of Optimization , 1988 .

[12]  Gunnar Rätsch,et al.  Sparse Regression Ensembles in Infinite and Finite Hypothesis Spaces , 2002, Machine Learning.

[13]  Yoram Singer,et al.  Boosting for document routing , 2000, CIKM '00.

[14]  J. Lafferty Additive models, boosting, and inference for generalized divergences , 1999, COLT '99.

[15]  David B. Dunson,et al.  Bayesian Data Analysis , 2010 .

[16]  L. Milne‐Thomson A Treatise on the Theory of Bessel Functions , 1945, Nature.

[17]  C. Watkins Dynamic Alignment Kernels , 1999 .

[18]  Tommi S. Jaakkola,et al.  Maximum Entropy Discrimination , 1999, NIPS.

[19]  Pal Rujan,et al.  Playing Billiards in Version Space , 1997, Neural Computation.

[20]  E. Polak Introduction to linear and nonlinear programming , 1973 .

[21]  Ralf Herbrich Learning linear classifiers: theory and algorithms , 2001 .

[22]  John Shawe-Taylor,et al.  Structural Risk Minimization Over Data-Dependent Hierarchies , 1998, IEEE Trans. Inf. Theory.

[23]  Gábor Lugosi,et al.  A Consistent Strategy for Boosting Algorithms , 2002, COLT.

[24]  Grace Wahba,et al.  Spline Models for Observational Data , 1990 .

[25]  M. Gibbs,et al.  Efficient implementation of Gaussian processes , 1997 .

[26]  Alexander J. Smola,et al.  Sparse Greedy Gaussian Process Regression , 2000, NIPS.

[27]  Gunnar Rätsch,et al.  An introduction to kernel-based learning algorithms , 2001, IEEE Trans. Neural Networks.

[28]  Stuart J. Russell,et al.  Experimental comparisons of online and batch versions of bagging and boosting , 2001, KDD '01.

[29]  V. Koltchinskii,et al.  Empirical margin distributions and bounding the generalization error of combined classifiers , 2002, math/0405343.

[30]  D. Mackay,et al.  Bayesian methods for adaptive models , 1992 .

[31]  Klaus Obermayer,et al.  Classification on Pairwise Proximity , 2007 .

[32]  C. Stein,et al.  Estimation with Quadratic Loss , 1992 .

[33]  C. Robert The Bayesian choice : a decision-theoretic motivation , 1996 .

[34]  Bernhard Schölkopf,et al.  New Support Vector Algorithms , 2000, Neural Computation.

[35]  M. Seeger Bayesian methods for Support Vector machines and Gaussian processes , 1999 .

[36]  David J. Field,et al.  Emergence of simple-cell receptive field properties by learning a sparse code for natural images , 1996, Nature.

[37]  Volker Tresp,et al.  A Bayesian Committee Machine , 2000, Neural Computation.

[38]  W. Press,et al.  Numerical Recipes in C++: The Art of Scientific Computing (2nd edn) , 2003 .

[39]  Rocco A. Servedio,et al.  Smooth boosting and learning with malicious noise , 2003 .

[40]  Alexander J. Smola,et al.  Learning with kernels , 1998 .

[41]  Tsuhan Chen,et al.  Pose invariant face recognition , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[42]  O. Mangasarian,et al.  Robust linear programming discrimination of two linearly inseparable sets , 1992 .

[43]  Manfred K. Warmuth,et al.  Additive versus exponentiated gradient updates for linear prediction , 1995, STOC '95.

[44]  Bernhard Schölkopf,et al.  Computing the Bayes Kernel Classifier , 2000 .

[45]  Matthias W. Seeger,et al.  Using the Nyström Method to Speed Up Kernel Machines , 2000, NIPS.

[46]  David Haussler,et al.  Decision Theoretic Generalizations of the PAC Model for Neural Net and Other Learning Applications , 1992, Inf. Comput..

[47]  Yoram Singer,et al.  Leveraged Vector Machines , 1999, NIPS.

[48]  David G. Luenberger,et al.  Linear and Nonlinear Programming: Second Edition , 2003 .

[49]  S. Nash,et al.  Linear and Nonlinear Programming , 1987 .

[50]  Tommi S. Jaakkola,et al.  Feature Selection and Dualities in Maximum Entropy Discrimination , 2000, UAI.

[51]  Toniann Pitassi,et al.  A Gradient-Based Boosting Algorithm for Regression Problems , 2000, NIPS.

[52]  Robert P. W. Duin,et al.  Support vector domain description , 1999, Pattern Recognit. Lett..

[53]  A. Kennedy,et al.  Hybrid Monte Carlo , 1988 .

[54]  Katya Scheinberg,et al.  Efficient SVM Training Using Low-Rank Kernel Representations , 2002, J. Mach. Learn. Res..

[55]  Christopher M. Bishop,et al.  Variational Relevance Vector Machines , 2000, UAI.

[56]  David J. Spiegelhalter,et al.  Sequential updating of conditional probabilities on directed graphical structures , 1990, Networks.

[57]  Balas K. Natarajan,et al.  Sparse Approximate Solutions to Linear Systems , 1995, SIAM J. Comput..

[58]  John E. Moody,et al.  The Effective Number of Parameters: An Analysis of Generalization and Regularization in Nonlinear Learning Systems , 1991, NIPS.

[59]  David Haussler,et al.  Exploiting Generative Models in Discriminative Classifiers , 1998, NIPS.

[60]  Carl E. Rasmussen,et al.  In Advances in Neural Information Processing Systems , 2011 .

[61]  Ran El-Yaniv,et al.  Variance Optimized Bagging , 2002, ECML.

[62]  Allan Pinkus,et al.  Multilayer Feedforward Networks with a Non-Polynomial Activation Function Can Approximate Any Function , 1991, Neural Networks.

[63]  Franco P. Preparata,et al.  The Densest Hemisphere Problem , 1978, Theor. Comput. Sci..

[64]  I. S. Gradshteyn,et al.  Table of Integrals, Series, and Products , 1976 .

[65]  Umesh V. Vazirani,et al.  An Introduction to Computational Learning Theory , 1994 .

[66]  Peter Stone,et al.  Modeling Auction Price Uncertainty Using Boosting-based Conditional Density Estimation , 2002, ICML.

[67]  David J. C. MacKay,et al.  Variational Gaussian process classifiers , 2000, IEEE Trans. Neural Networks Learn. Syst..

[68]  Gunnar Rätsch,et al.  Input space versus feature space in kernel-based methods , 1999, IEEE Trans. Neural Networks.

[69]  Robert E. Schapire,et al.  A Brief Introduction to Boosting , 1999, IJCAI.

[70]  Gunnar Rätsch,et al.  Maximizing the Margin with Boosting , 2002, COLT.

[71]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM algorithm, with discussion , 1977 .

[72]  Bernhard Schölkopf,et al.  Kernel Principal Component Analysis , 1997, ICANN.

[73]  Stéphane Mallat,et al.  Matching pursuits with time-frequency dictionaries , 1993, IEEE Trans. Signal Process..

[74]  Tong Zhang Statistical behavior and consistency of classification methods based on convex risk minimization , 2003 .

[75]  Harris Drucker,et al.  Comparison of learning algorithms for handwritten digit recognition , 1995 .

[76]  Leslie G. Valiant,et al.  A theory of the learnable , 1984, CACM.

[77]  Bernhard Schölkopf,et al.  Sparse Greedy Matrix Approximation for Machine Learning , 2000, International Conference on Machine Learning.

[78]  John Shawe-Taylor,et al.  A Column Generation Algorithm For Boosting , 2000, ICML.

[79]  David Barber,et al.  Bayesian Classification With Gaussian Processes , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[80]  Yoram Singer,et al.  Boosting and Rocchio applied to text filtering , 1998, SIGIR '98.

[81]  G. Rätsch Robust Boosting via Convex Optimization , 2001 .

[82]  P. Tseng,et al.  On the convergence of the coordinate descent method for convex differentiable minimization , 1992 .

[83]  Cesare Furlanello,et al.  Tuning Cost-Sensitive Boosting and Its Application to Melanoma Diagnosis , 2001, Multiple Classifier Systems.

[84]  Michael I. Jordan,et al.  An Introduction to Variational Methods for Graphical Models , 1999, Machine Learning.

[85]  Ron Meir,et al.  Data-Dependent Bounds for Bayesian Mixture Methods , 2002, NIPS.

[86]  Robert A. Jacobs,et al.  Hierarchical Mixtures of Experts and the EM Algorithm , 1993, Neural Computation.

[87]  Michael I. Jordan,et al.  Computing upper and lower bounds on likelihoods in intractable networks , 1996, UAI.

[88]  Thomas M. Cover,et al.  Elements of Information Theory , 2005 .

[89]  K. Kiwiel Relaxation Methods for Strictly Convex Regularizations of Piecewise Linear Programs , 1998 .

[90]  J. Mercer Functions of positive and negative type, and their connection with the theory of integral equations , 1909 .

[91]  Paul S. Bradley,et al.  Feature Selection via Concave Minimization and Support Vector Machines , 1998, ICML.

[92]  John D. Lafferty,et al.  Boosting and Maximum Likelihood for Exponential Models , 2001, NIPS.

[93]  Ralf Herbrich,et al.  Bayes Point Machines: Estimating the Bayes Point in Kernel Space , 1999 .

[94]  Christopher K. I. Williams Prediction with Gaussian Processes: From Linear Regression to Linear Prediction and Beyond , 1999, Learning in Graphical Models.

[95]  Yoshua Bengio,et al.  Boosting Neural Networks , 2000, Neural Computation.

[96]  J. Rissanen,et al.  Modeling By Shortest Data Description* , 1978, Autom..

[97]  R. Hettich,et al.  Semi-infinite programming , 1979 .

[98]  Yoram Singer,et al.  Improved Boosting Algorithms Using Confidence-rated Predictions , 1998, COLT' 98.

[99]  Ole Winther,et al.  Mean Field Methods for Classification with Gaussian Processes , 1998, NIPS.

[100]  O. Mangasarian Linear and Nonlinear Separation of Patterns by Linear Programming , 1965 .

[101]  Gunnar Rätsch,et al.  On the Convergence of Leveraging , 2001, NIPS.

[102]  Peter L. Bartlett,et al.  Improved Generalization Through Explicit Optimization of Margins , 2000, Machine Learning.

[103]  T Poggio,et al.  Regularization Algorithms for Learning That Are Equivalent to Multilayer Networks , 1990, Science.

[104]  Richard Nock,et al.  A Robust Boosting Algorithm , 2002, ECML.

[105]  Ole Winther,et al.  Gaussian processes and SVM: Mean field and leave-one-out estimator , 2000 .

[106]  Aiko M. Hormann,et al.  Programs for Machine Learning. Part I , 1962, Inf. Control..

[107]  Marc Sebban,et al.  Boosting Density Function Estimators , 2002, ECML.

[108]  Heekuck Oh,et al.  Neural Networks for Pattern Recognition , 1993, Adv. Comput..

[109]  John Skilling,et al.  Maximum Entropy and Bayesian Methods , 1989 .

[110]  P. Bartlett,et al.  Probabilities for SV Machines , 2000 .

[111]  Shie Mannor,et al.  Geometric Bounds for Generalization in Boosting , 2001, COLT/EuroCOLT.

[112]  Yoshua Bengio,et al.  Pattern Recognition and Neural Networks , 1995 .

[113]  Yishay Mansour,et al.  On the boosting ability of top-down decision tree learning algorithms , 1996, STOC '96.

[114]  Manfred K. Warmuth,et al.  Boosting as entropy projection , 1999, COLT '99.

[115]  Yoram Singer,et al.  BoosTexter: A Boosting-based System for Text Categorization , 2000, Machine Learning.

[116]  Bernhard Schölkopf,et al.  Estimating the Support of a High-Dimensional Distribution , 2001, Neural Computation.

[117]  Marilyn A. Walker,et al.  SPoT: A Trainable Sentence Planner , 2001, NAACL.

[118]  Radford M. Neal Priors for Infinite Networks , 1996 .

[119]  Terrence J. Sejnowski,et al.  Learning Nonlinear Overcomplete Representations for Efficient Coding , 1997, NIPS.

[120]  Wenxin Jiang,et al.  Some Theoretical Aspects of Boosting in the Presence of Noisy Data , 2001, ICML.

[121]  Manfred K. Warmuth,et al.  The perceptron algorithm vs. Winnow: linear vs. logarithmic mistake bounds when few input variables are relevant , 1995, COLT '95.

[122]  B. Schölkopf,et al.  Linear programs for automatic accuracy control in regression , 1999 .

[123]  Bernhard Schölkopf,et al.  A Generalized Representer Theorem , 2001, COLT/EuroCOLT.

[124]  Gunnar Rätsch,et al.  Robust Ensemble Learning , 2000 .

[125]  F. Girosi Models of Noise and Robust Estimates , 1991 .

[126]  Geoffrey E. Hinton,et al.  Evaluation of Gaussian processes and other methods for non-linear regression , 1997 .

[127]  Michael A. Saunders,et al.  Atomic Decomposition by Basis Pursuit , 1998, SIAM J. Sci. Comput..

[128]  A. W. van der Vaart,et al.  Weak Convergence and Empirical Processes , 1997 .

[129]  Philip E. Gill,et al.  Practical optimization , 1981 .

[130]  Ran El-Yaniv,et al.  Localized Boosting , 2000, COLT.

[131]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[132]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[133]  G. Wahba,et al.  Some results on Tchebycheffian spline functions , 1971 .

[134]  Michael E. Tipping  Sparse Bayesian Learning and the Relevance Vector Machine , 2001 .

[135]  Thomas Richardson,et al.  Boosting methodology for regression problems , 1999, AISTATS.

[136]  Klaus-Robert Müller,et al.  Subspace information criterion for nonquadratic regularizers-Model selection for sparse regressors , 2002, IEEE Trans. Neural Networks.

[137]  Katya Scheinberg,et al.  A product-form Cholesky factorization method for handling dense columns in interior point methods for linear programming , 2004, Math. Program..

[138]  J. Langford,et al.  FeatureBoost: A Meta-Learning Algorithm that Improves Model Robustness , 2000, ICML.

[139]  John Shawe-Taylor,et al.  Towards a strategy for boosting regressors , 2000 .

[140]  Robert E. Schapire,et al.  Using output codes to boost multiclass learning problems , 1997, ICML.

[141]  Gunnar Rätsch,et al.  Constructing Boosting Algorithms from SVMs: An Application to One-Class Classification , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[142]  Olvi L. Mangasarian,et al.  Arbitrary-norm separating plane , 1999, Oper. Res. Lett..

[143]  Robert E. Schapire,et al.  The Boosting Approach to Machine Learning An Overview , 2003 .

[144]  J. Stoer,et al.  Introduction to Numerical Analysis , 2002 .

[145]  Rocco A. Servedio,et al.  PAC Analogues of Perceptron and Winnow Via Boosting the Margin , 2000, Machine Learning.

[146]  Gunnar Rätsch,et al.  Adapting Codes and Embeddings for Polychotomies , 2002, NIPS.

[147]  Nello Cristianini,et al.  On the generalization of soft margin algorithms , 2002, IEEE Trans. Inf. Theory.

[148]  Yoav Freund,et al.  Boosting the margin: A new explanation for the effectiveness of voting methods , 1997, ICML.

[149]  J. Ross Quinlan,et al.  Boosting First-Order Learning , 1996, ALT.

[150]  Shun-ichi Amari,et al.  Network information criterion-determining the number of hidden units for an artificial neural network model , 1994, IEEE Trans. Neural Networks.

[151]  Nello Cristianini,et al.  Further results on the margin distribution , 1999, COLT '99.

[152]  Chuan Long,et al.  Boosting Noisy Data , 2001, ICML.

[153]  Simon Haykin,et al.  Neural Networks: A Comprehensive Foundation , 1998 .

[154]  Gunnar Rätsch,et al.  An asymptotic analysis of AdaBoost in the binary classification case , 1998 .

[155]  David J. C. MacKay,et al.  The Evidence Framework Applied to Classification Networks , 1992, Neural Computation.

[156]  Tong Zhang,et al.  A General Greedy Approximation Algorithm with Applications , 2001, NIPS.

[157]  Vladimir Vapnik,et al.  On the uniform convergence of relative frequencies of events to their probabilities , 1971 .

[158]  Peter L. Bartlett,et al.  Functional Gradient Techniques for Combining Hypotheses , 2000 .

[159]  Charles R. Johnson,et al.  Matrix analysis , 1985, Statistical Inference for Engineers and Data Scientists.

[160]  Alexander J. Smola,et al.  Support Vector Method for Function Approximation, Regression Estimation and Signal Processing , 1996, NIPS.

[161]  Srinivas Bangalore,et al.  Combining prior knowledge and boosting for call classification in spoken language dialogue , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[162]  Tong Zhang,et al.  Sequential greedy approximation for certain convex optimization problems , 2003, IEEE Trans. Inf. Theory.

[163]  Gene H. Golub,et al.  Matrix computations , 1983 .

[164]  Ralf Herbrich,et al.  Learning Kernel Classifiers: Theory and Algorithms , 2001 .

[165]  Leslie G. Valiant,et al.  Cryptographic Limitations on Learning Boolean Formulae and Finite Automata , 1993, Machine Learning: From Theory to Applications.

[166]  Michael I. Jordan,et al.  An Introduction to Graphical Models , 2001 .

[167]  T. Poggio,et al.  On optimal nonlinear associative recall , 1975, Biological Cybernetics.

[168]  Vladimir Vapnik,et al.  The Nature of Statistical Learning , 1995 .

[169]  Gunnar Rätsch  Robustes Boosting durch konvexe Optimierung [Robust Boosting via Convex Optimization] , 2001, Ausgezeichnete Informatikdissertationen.

[170]  H. Luetkepohl The Handbook of Matrices , 1996 .

[171]  Philip M. Long,et al.  On-line learning of linear functions , 1991, STOC '91.

[172]  Manfred Opper,et al.  Sparse Representation for Gaussian Process Models , 2000, NIPS.