Combining classification algorithms

Doctoral dissertation in Computer Science, submitted to the Faculdade de Ciências da Universidade do Porto

[1]  Ivan Bratko Prolog Programming for Artificial Intelligence , 1990 .

[2]  Ryszard S. Michalski,et al.  A Theory and Methodology of Inductive Learning , 1983, Artificial Intelligence.

[3]  Yoshua Bengio,et al.  Pattern Recognition and Neural Networks , 1995 .

[4]  Zijian Zheng Naive Bayesian Classifier Committees , 1998 .

[5]  George H. John Enhancements to the data mining process , 1997 .

[6]  Kai Ming Ting The Characterisation of Predictive Accuracy and Decision Combination , 1996, ICML.

[7]  D. Wolpert On Overfitting Avoidance as Bias , 1993 .

[8]  Kagan Tumer,et al.  Theoretical Foundations Of Linear And Order Statistics Combiners For Neural Pattern Classifiers , 1995 .

[9]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[10]  Simon Kasif,et al.  Induction of Oblique Decision Trees , 1993, IJCAI.

[11]  Pat Langley,et al.  Machine learning as an experimental science , 2004, Machine Learning.

[12]  Leo Breiman,et al.  Bagging Predictors , 1996, Machine Learning.

[13]  J. R. Quinlan Discovering rules by induction from large collections of examples , 1979 .

[14]  Ron Kohavi,et al.  Feature Selection for Knowledge Discovery and Data Mining , 1998 .

[15]  Sholom M. Weiss,et al.  Predictive data mining - a practical guide , 1997 .

[16]  Kagan Tumer,et al.  Classifier Combining: Analytical Results and Implications , 1995 .

[17]  P. Utgoff,et al.  Multivariate Versus Univariate Decision Trees , 1992 .

[18]  Larry A. Rendell,et al.  Rerepresenting and Restructuring Domain Theories: A Constructive Induction Approach , 1994, J. Artif. Intell. Res..

[19]  J. Friedman Greedy function approximation: A gradient boosting machine. , 2001 .

[20]  Ron Kohavi,et al.  Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid , 1996, KDD.

[21]  Ron Kohavi,et al.  Supervised and Unsupervised Discretization of Continuous Features , 1995, ICML.

[22]  Salvatore J. Stolfo,et al.  A Comparative Evaluation of Voting and Meta-learning on Partitioned Data , 1995, ICML.

[23]  Carla E. Brodley,et al.  Addressing the Selective Superiority Problem: Automatic Algorithm/Model Class Selection , 1993 .

[24]  Thomas G. Dietterich,et al.  Pruning Adaptive Boosting , 1997, ICML.

[25]  Ron Kohavi,et al.  Option Decision Trees with Majority Votes , 1997, ICML.

[26]  Larry A. Rendell,et al.  Constructive Induction On Decision Trees , 1989, IJCAI.

[27]  Ron Kohavi,et al.  Lazy Decision Trees , 1996, AAAI/IAAI, Vol. 1.

[28]  Kai Ming Ting,et al.  Decision Combination Based on the Characterisation of Predictive Accuracy , 1997, Intell. Data Anal..

[29]  Yoav Freund,et al.  Boosting the margin: A new explanation for the effectiveness of voting methods , 1997, ICML.

[30]  George H. John,et al.  Robust Linear Discriminant Trees , 1995, AISTATS.

[31]  David W. Aha,et al.  Error-Correcting Output Codes for Local Learners , 1998, ECML.

[32]  J. R. Quinlan Miniboosting Decision Trees , 1999 .

[33]  Usama M. Fayyad,et al.  Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning , 1993, IJCAI.

[34]  Ron Kohavi,et al.  Bias Plus Variance Decomposition for Zero-One Loss Functions , 1996, ICML.

[35]  Sebastian Thrun,et al.  The MONK's Problems - A Performance Comparison of Different Learning Algorithms, CMU-CS-91-197, Sch , 1991 .

[36]  J. Ross Quinlan,et al.  Improved Use of Continuous Attributes in C4.5 , 1996, J. Artif. Intell. Res..

[37]  F. A. Seiler,et al.  Numerical Recipes in C: The Art of Scientific Computing , 1989 .

[38]  Carla E. Brodley,et al.  Automatic Selection of Split Criterion during Tree Growing Based on Node Location , 1995, ICML.

[39]  Thomas G. Dietterich Machine-Learning Research: Four Current Directions , 1997 .

[40]  Igor Kononenko,et al.  Semi-Naive Bayesian Classifier , 1991, EWSL.

[41]  J. R. Quinlan Learning With Continuous Classes , 1992 .

[42]  Michael Schlosser,et al.  Non-Linear Decision Trees - NDT , 1996, ICML.

[43]  R. Cranley,et al.  Multivariate Analysis—Methods and Applications , 1985 .

[44]  Robert A. Jacobs,et al.  Hierarchical Mixtures of Experts and the EM Algorithm , 1993, Neural Computation.

[45]  Zijian Zheng,et al.  Naive Bayesian Classifier Committees , 1998, ECML.

[46]  Pedro M. Domingos A Process-Oriented Heuristic for Model Selection , 1998, ICML.

[47]  Carla E. Brodley,et al.  Linear Machine Decision Trees , 1991 .

[48]  Donato Malerba,et al.  Decision Tree Pruning as a Search in the State Space , 1993, ECML.

[49]  H. Lounis,et al.  Evaluation of Learning Systems: An Artificial Data-Based Approach , 1991, EWSL.

[50]  Simon Kasif,et al.  A System for Induction of Oblique Decision Trees , 1994, J. Artif. Intell. Res..

[51]  J. Rissanen A UNIVERSAL PRIOR FOR INTEGERS AND ESTIMATION BY MINIMUM DESCRIPTION LENGTH , 1983 .

[52]  Ron Kohavi,et al.  Data Mining Using MLC++: A Machine Learning Library in C++ , 1996, Int. J. Artif. Intell. Tools.

[53]  Geoffrey I. Webb,et al.  Incorporating canonical discriminant attributes in classification learning , 1994 .

[54]  Sargur N. Srihari,et al.  Decision Combination in Multiple Classifier Systems , 1994, IEEE Trans. Pattern Anal. Mach. Intell..

[55]  Pedro M. Domingos Why Does Bagging Work? A Bayesian Account and its Implications , 1997, KDD.

[56]  David J. Spiegelhalter,et al.  Machine Learning, Neural and Statistical Classification , 2009 .

[57]  Pedro Domingos Bayesian Model Averaging in Rule Induction , 1997 .

[58]  Christian Lebiere,et al.  The Cascade-Correlation Learning Architecture , 1989, NIPS.

[59]  Ron Kohavi,et al.  A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection , 1995, IJCAI.

[60]  Thierry Van de Merckt Decision Trees in Numerical Attribute Spaces , 1993, IJCAI.

[61]  Kagan Tumer,et al.  Error Correlation and Error Reduction in Ensemble Classifiers , 1996, Connect. Sci..

[62]  Luís Torgo,et al.  Knowledge Acquisition via Knowledge Integration , 1990 .

[63]  O. Mangasarian,et al.  Multicategory discrimination via linear programming , 1994 .

[64]  Ron Kohavi,et al.  Wrappers for performance enhancement and oblivious decision graphs , 1995 .

[65]  Larry A. Rendell,et al.  Empirical learning as a function of concept character , 2004, Machine Learning.

[66]  Cullen Schaffer,et al.  A Conservation Law for Generalization Performance , 1994, ICML.

[67]  J. R. Quinlan,et al.  Comparing connectionist and symbolic learning methods , 1994, COLT 1994.

[68]  Pat Langley,et al.  Induction of Recursive Bayesian Classifiers , 1993, ECML.

[69]  Carla E. Brodley,et al.  An Incremental Method for Finding Multivariate Splits for Decision Trees , 1990, ML.

[70]  Ivan Bratko,et al.  ASSISTANT 86: A Knowledge-Elicitation Tool for Sophisticated Users , 1987, EWSL.

[71]  William W. Cohen Fast Effective Rule Induction , 1995 .

[72]  Casimir A. Kulikowski,et al.  Computer Systems That Learn: Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning and Expert Systems , 1990 .

[73]  Leo Breiman,et al.  Bias, Variance, and Arcing Classifiers , 1996 .

[74]  Pedro M. Domingos Unifying Instance-Based and Rule-Based Induction , 1996, Machine Learning.

[75]  Thomas G. Dietterich,et al.  Error-Correcting Output Coding Corrects Bias and Variance , 1995, ICML.

[76]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[77]  Ronald L. Rivest,et al.  Learning decision lists , 2004, Machine Learning.

[78]  Stephen D. Bay Combining Nearest Neighbor Classifiers Through Multiple Feature Subsets , 1998, ICML.

[79]  Yoav Freund,et al.  Experiments with a New Boosting Algorithm , 1996, ICML.

[80]  Pat Langley,et al.  Elements of Machine Learning , 1995 .

[81]  Wray L. Buntine,et al.  A theory of learning classification rules , 1990 .

[82]  João Gama,et al.  Combining Classifiers by Constructive Induction , 1998, ECML.

[83]  David B. Skalak,et al.  Prototype Selection for Composite Nearest Neighbor Classifiers , 1995 .

[84]  Catherine Blake,et al.  UCI Repository of machine learning databases , 1998 .

[85]  J. Ross Quinlan,et al.  Simplifying decision trees , 1987, Int. J. Hum. Comput. Stud..

[86]  Michael J. Pazzani,et al.  Classification and regression by combining models , 1998 .

[87]  Richard O. Duda,et al.  Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.

[88]  Temple F. Smith Occam's razor , 1980, Nature.

[89]  J. Ross Quinlan,et al.  Bagging, Boosting, and C4.5 , 1996, AAAI/IAAI, Vol. 1.

[90]  O. Mangasarian,et al.  Pattern Recognition Via Linear Programming: Theory and Application to Medical Diagnosis , 1989 .

[91]  Olivier Gascuel,et al.  Statistical Significance in Inductive Learning , 1992, ECAI.

[92]  Luís Torgo,et al.  Dynamic Discretization of Continuous Attributes , 1998, IBERAMIA.

[93]  Zijian Zheng Constructing New Attributes for Decision Tree Learning , 1996 .

[94]  G. K. Bhattacharyya,et al.  Statistical Concepts And Methods , 1978 .

[95]  Pedro M. Domingos Knowledge Discovery Via Multiple Models , 1998, Intell. Data Anal..

[96]  J. Ross Quinlan,et al.  C4.5: Programs for Machine Learning , 1992 .

[97]  Tom M. Mitchell,et al.  Generalization as Search , 2002 .

[98]  Ian H. Witten,et al.  Stacked generalization: when does it work? , 1997, IJCAI 1997.

[99]  Leo Breiman,et al.  Stacked regressions , 2004, Machine Learning.

[100]  Donato Malerba,et al.  A Comparative Analysis of Methods for Pruning Decision Trees , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[101]  W. Loh,et al.  SPLIT SELECTION METHODS FOR CLASSIFICATION TREES , 1997 .

[102]  Thomas G. Dietterich What is machine learning? , 2020, Archives of Disease in Childhood.

[103]  Christopher J. Matheus,et al.  The Need for Constructive Induction , 1991, ML.

[104]  William W. Cohen Fast Effective Rule Induction , 1995, ICML.

[105]  A. J. Feelders,et al.  Using Machine Learning, Neural Networks and Statistics to Predict Corporate Bankruptcy: A Comparative Study , 1996 .

[106]  Pedro M. Domingos Towards a Unified Approach to Concept Learning , 1996, AAAI/IAAI, Vol. 2.

[107]  David H. Wolpert,et al.  Stacked generalization , 1992, Neural Networks.

[108]  H. Jose Exploiting Multiple Existing Models and Learning Algorithms , 1995 .

[109]  Jason Catlett,et al.  On Changing Continuous Attributes into Ordered Discrete Attributes , 1991, EWSL.

[110]  J. R. Quinlan,et al.  MDL and Categorical Theories (Continued) , 1995, ICML.

[111]  Ian H. Witten,et al.  Generating Accurate Rule Sets Without Global Optimization , 1998, ICML.

[112]  Pedro M. Domingos Occam's Two Razors: The Sharp and the Blunt , 1998, KDD.

[113]  Salvatore J. Stolfo,et al.  Learning Arbiter and Combiner Trees from Partitioned Data for Scaling Machine Learning , 1995, KDD.

[114]  João Gama Combining Classifiers by Constructive Induction , 1998 .

[115]  W. Loh,et al.  Tree-Structured Classification via Generalized Discriminant Analysis. , 1988 .

[116]  Thomas G. Dietterich Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms , 1998, Neural Computation.

[117]  Ron Kohavi,et al.  Error-Based and Entropy-Based Discretization of Continuous Features , 1996, KDD.