Modular learning models in forecasting natural phenomena

A modular model is a particular type of committee machine: it comprises a set of specialized (local) models, each responsible for a particular region of the input space and possibly trained on a subset of the training set. Algorithms for allocating such regions to local models typically do so automatically. In forecasting natural processes, however, domain experts want to bring more knowledge into this allocation and to retain some control over the choice of models. This paper presents a number of approaches to building modular models based on various types of training-set splits and ways of combining the models' outputs: hard splits, statistically and deterministically driven soft combinations of models, 'fuzzy committees', and others. The issue of including a domain expert in the modeling process is also discussed, and new algorithms in the class of model trees (piecewise-linear modular regression models) are presented. Comparisons of the modular local-modeling algorithms with more traditional 'global' learning models on a number of benchmark tests and river-flow forecasting problems show higher accuracy and greater transparency of the resulting models.
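To make the notions of a hard split and a soft combination concrete, below is a minimal sketch, not the paper's algorithm: two local linear models are allocated by a hard split on one attribute, and their outputs can be combined either by the split itself or by a smooth (sigmoid) weighting. The class name `TwoRegionModularModel` and the parameters `split_attr` and `tau` are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of a modular regression model (illustrative, not the
# paper's method): a hard split of the input space on one attribute,
# one linear model per region, and an optional soft combination.
import numpy as np

def fit_linear(X, y):
    """Least-squares linear model with intercept."""
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    A = np.hstack([X, np.ones((len(X), 1))])
    return A @ coef

class TwoRegionModularModel:
    """Two local linear models allocated by a hard split at the median
    of one attribute. `split_attr` and the soft-combination width `tau`
    are hypothetical parameters chosen for illustration."""

    def __init__(self, split_attr=0, tau=1.0):
        self.split_attr = split_attr
        self.tau = tau

    def fit(self, X, y):
        a = self.split_attr
        self.threshold = np.median(X[:, a])      # hard split point
        left = X[:, a] <= self.threshold
        self.coef_left = fit_linear(X[left], y[left])
        self.coef_right = fit_linear(X[~left], y[~left])
        return self

    def predict(self, X, soft=False):
        yl = predict_linear(self.coef_left, X)
        yr = predict_linear(self.coef_right, X)
        if not soft:
            # Hard split: exactly one local model answers for each point.
            return np.where(X[:, self.split_attr] <= self.threshold, yl, yr)
        # Soft combination: a sigmoid weight smooths the transition
        # between the two local models near the split point.
        w = 1.0 / (1.0 + np.exp((X[:, self.split_attr] - self.threshold) / self.tau))
        return w * yl + (1.0 - w) * yr

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-2, 2, size=(200, 1))
    # Piecewise-linear target: a different linear regime in each region.
    y = np.where(X[:, 0] <= 0, 1.0 + 0.5 * X[:, 0], -1.0 + 2.0 * X[:, 0])
    y += rng.normal(0, 0.05, size=200)
    model = TwoRegionModularModel().fit(X, y)
    print("hard RMSE:", np.sqrt(np.mean((model.predict(X) - y) ** 2)))
    print("soft RMSE:", np.sqrt(np.mean((model.predict(X, soft=True) - y) ** 2)))
```

A model tree in the sense of the paper generalizes this idea: the splits are chosen recursively over several attributes, yielding a piecewise-linear model with one linear expert per leaf region.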
