Learning Monotonic Linear Functions

Learning probabilities (p-concepts [13]) and other real-valued concepts (regression) is an important task in machine learning. For example, a doctor may need to predict the probability P[y|x] of a patient getting a disease, where the probability depends on a number of risk factors.

[1]  Yishay Mansour, et al. On the Boosting Ability of Top-Down Decision Tree Learning Algorithms, 1999, J. Comput. Syst. Sci.

[2]  Michael Kearns, et al. Efficient noise-tolerant learning from statistical queries, 1993, STOC.

[3]  Ron Kohavi, et al. Wrappers for performance enhancement and oblivious decision graphs, 1995.

[4]  Bianca Zadrozny, et al. Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers, 2001, ICML.

[5]  P. McCullagh, et al. Generalized Linear Models, 1992.

[6]  Javed A. Aslam, et al. Specification and simulation of statistical query algorithms for efficiency and noise tolerance, 1995, Annual Conference Computational Learning Theory.

[7]  Yishay Mansour, et al. Boosting Using Branching Programs, 2000, J. Comput. Syst. Sci.

[8]  R. Gray, et al. Applications of information theory to pattern recognition and the design of decision trees and trellises, 1988.

[9]  Leslie G. Valiant, et al. A theory of the learnable, 1984, CACM.

[10]  Robert E. Schapire, et al. The strength of weak learnability, 1990, Mach. Learn.

[11]  Emili Montserrat, et al. A predictive model for aggressive non-Hodgkin's lymphoma, 1993, The New England journal of medicine.

[12]  P. McCullagh, et al. Generalized Linear Models, 1984.

[13]  Tom Bylander. Polynomial learnability of linear threshold approximations, 1993, COLT '93.

[14]  Rocco A. Servedio, et al. Boosting in the presence of noise, 2003, STOC '03.

[15]  R. Tibshirani, et al. Generalized Additive Models, 1991.

[16]  Leo Breiman, et al. Classification and Regression Trees, 1984.

[17]  Alan M. Frieze, et al. A Polynomial-Time Algorithm for Learning Noisy Linear Threshold Functions, 1996, Algorithmica.

[18]  Lalit R. Bahl, et al. A tree-based statistical language model for natural language speech recognition, 1989, IEEE Trans. Acoust. Speech Signal Process.

[19]  Eric B. Baum, et al. A Polynomial Time Algorithm That Learns Two Hidden Unit Nets, 1990, Neural Computation.

[20]  Jonathan J. Oliver. Decision Graphs - An Extension of Decision Trees, 1993.

[21]  Robert E. Schapire, et al. Efficient distribution-free learning of probabilistic concepts, 1990, Proceedings [1990] 31st Annual Symposium on Foundations of Computer Science.

[22]  Y. Freund, et al. Discussion of the Paper "Additive Logistic Regression: a Statistical View of Boosting" By , 2000.