An alternative approach for the prediction of significant wave heights based on classification and regression trees

Abstract: This study investigates the performance of classification and regression trees for predicting significant wave heights. The data set comprises 5 years of wave and wind data gathered at a deep-water location in Lake Michigan. Training and testing data use wind speed and wind direction as the input variables and significant wave height (H_s) as the output variable. Classification trees were built with the C5 algorithm. For this purpose, the significant wave heights of the whole data set were grouped into bins of 0.25 m, and a class was assigned to each bin. To evaluate the developed model, the index of each predicted class was compared with that of the observed data. Regression trees were built and evaluated with the CART algorithm. The results of the decision trees were then compared with those of artificial neural networks (ANNs). The error statistics of the decision trees and the ANNs were similar. The results indicate that decision trees, an efficient novel approach with an acceptable range of error, can be used successfully to predict H_s. An argued advantage of decision trees over neural networks is that they represent rules.
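The binning and class-index comparison described in the abstract can be illustrated with a minimal sketch. The abstract gives only the 0.25 m bin width, so several details here are assumptions: bins are taken as half-open intervals starting at 0 m, class indices are zero-based, and the function names (`wave_height_class`, `class_midpoint`, `class_index_errors`) are hypothetical. The sample values are made up for illustration and are not from the study.

```python
import math

BIN_WIDTH = 0.25  # bin width in metres, as stated in the abstract


def wave_height_class(hs, bin_width=BIN_WIDTH):
    """Return the zero-based class index of the bin containing hs.

    Assumed convention: class k covers [k * bin_width, (k + 1) * bin_width).
    """
    if hs < 0:
        raise ValueError("significant wave height must be non-negative")
    return int(math.floor(hs / bin_width))


def class_midpoint(index, bin_width=BIN_WIDTH):
    """Return the mid-height (m) of a class, usable as a point estimate."""
    return (index + 0.5) * bin_width


def class_index_errors(observed_hs, predicted_classes, bin_width=BIN_WIDTH):
    """Compare each predicted class index with the observed one.

    Returns the signed difference (predicted - observed) per sample.
    """
    observed_classes = [wave_height_class(h, bin_width) for h in observed_hs]
    return [p - o for p, o in zip(predicted_classes, observed_classes)]


# Illustrative values only: observed heights in metres and the class
# indices a hypothetical classifier might have predicted for them.
observed = [0.10, 0.40, 1.30]
predicted = [0, 2, 5]
errors = class_index_errors(observed, predicted)
```

With these sample values, the second prediction is one class too high, which corresponds to an overestimate of one bin width (0.25 m) in terms of class midpoints.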
