Interactive Martingale Boosting

We present an approach and a system that explore the application of interactive machine learning to a branching-program-based boosting algorithm, Martingale Boosting. Typically, the performance of such an algorithm is measured by the learner's ability to meet a fixed objective, without accounting for preferences (e.g., low false positives) arising from the underlying classification problem. We gather user preferences on holdout data and use them to guide the two-sided advantages of the individual weak learners, tuning them to meet these preferences. Extensive experiments show that while arbitrary preferences may be difficult for a single classifier to meet, a non-linear ensemble of classifiers, such as the one constructed by martingale boosting, performs better.
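
To make the mechanics concrete, below is a minimal Python sketch of how a martingale-boosted branching program classifies an example, together with one way a preference such as a false-positive budget could be enforced on holdout data. The class and the tune_threshold helper are illustrative assumptions, not the paper's implementation: the paper steers the two-sided advantages of the individual weak learners, whereas this sketch exposes only the final decision threshold as the preference knob.

    class MartingaleBoostingEnsemble:
        """Sketch of prediction in a martingale-boosted branching program.

        weak_learners[t][i] is the hypothesis at node (i, t): the classifier
        consulted by examples that have collected i positive votes after the
        first t levels. Each hypothesis maps a feature vector to {0, 1}.
        """

        def __init__(self, weak_learners, threshold=None):
            self.weak_learners = weak_learners       # level t holds t + 1 nodes
            self.levels = len(weak_learners)
            # Default rule: a majority of positive steps labels the example 1.
            self.threshold = self.levels / 2 if threshold is None else threshold

        def predict(self, x):
            i = 0                                    # start at node (0, 0)
            for t in range(self.levels):
                i += self.weak_learners[t][i](x)     # step right on a positive vote
            return int(i >= self.threshold)          # final position decides the label


    def tune_threshold(ensemble, X_holdout, y_holdout, max_fpr=0.05):
        # Hypothetical preference knob (an assumption, not the paper's mechanism):
        # pick the most permissive threshold whose false-positive rate on the
        # holdout negatives stays within the user's tolerance.
        negatives = [x for x, y in zip(X_holdout, y_holdout) if y == 0]
        for threshold in range(ensemble.levels + 1):
            ensemble.threshold = threshold
            fpr = sum(ensemble.predict(x) for x in negatives) / max(len(negatives), 1)
            if fpr <= max_fpr:
                return threshold
        ensemble.threshold = ensemble.levels + 1     # nothing met the budget: reject all
        return ensemble.threshold

Raising the threshold trades false positives for false negatives; in the paper's setting the analogous adjustment is pushed down into the weak learners themselves, which is what lets the ensemble satisfy preferences that a single classifier cannot.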
