Robust Boosting via Convex Optimization: Theory and Applications