Partial least squares regression based variables selection for water level predictions

Floods are a common phenomenon in Kuala Krai, a district of Kelantan, Malaysia. Every year, floods affect the biodiversity of this region and cause property loss in its residential areas. Residents of Kelantan suffer from floods whenever water overflows into the areas adjoining rivers, lakes, or dams. In this study, month, average monthly rainfall, temperature, relative humidity, and surface wind were used as predictors, and the water level of the Galas River was used as the response. Selecting suitable predictor variables is an important issue in developing a prediction model, since the data comprise many variables from the meteorological and hydrological departments. We conduct K-fold cross-validation (CV) to select the important variables for water level prediction. To forecast the water level of the Galas River, we adopt Ordinary Linear Regression (OLR) and Partial Least Squares Regression (PLSR). Because the original data contain missing values, the datasets must be pre-processed first. We perform two types of pre-processing: replacing missing values with the mean of the corresponding month (type I pre-processing) and estimating missing values with OLR (type II pre-processing). Based on the experiments, PLSR is a more suitable model than OLR for predicting the water level of the Galas River, and type I pre-processing gives higher accuracy than type II pre-processing.
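The two procedural ideas in the abstract, filling a missing value with the mean of its calendar month (type I pre-processing) and scoring a model by K-fold cross-validation, can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the function names, the toy data, the contiguous fold layout, and the single-predictor least-squares line (a stand-in for OLR or PLSR) are all hypothetical.

```python
from statistics import mean


def impute_by_month_mean(months, values):
    """Replace None entries with the mean of the observed values
    recorded for the same calendar month (type I pre-processing)."""
    by_month = {}
    for m, v in zip(months, values):
        if v is not None:
            by_month.setdefault(m, []).append(v)
    month_means = {m: mean(vs) for m, vs in by_month.items()}
    # Assumption: every month has at least one observed value;
    # otherwise the lookup below raises KeyError.
    return [month_means[m] if v is None else v
            for m, v in zip(months, values)]


def kfold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_line(x, y):
    """Least-squares fit of y = a + b * x for a single predictor."""
    x_mean, y_mean = mean(x), mean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_mean - b * x_mean, b


def cv_mse(x, y, k):
    """K-fold cross-validated mean squared prediction error."""
    squared_errors = []
    for test in kfold_indices(len(x), k):
        held_out = set(test)
        train = [i for i in range(len(x)) if i not in held_out]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        squared_errors += [(y[i] - (a + b * x[i])) ** 2 for i in test]
    return mean(squared_errors)


# Toy data invented for illustration: month, rainfall, water level.
months = [1, 2, 1, 2, 1, 2]
rain = [10.0, 4.0, None, 6.0, 14.0, None]
level = [21.0, 9.0, 25.0, 13.0, 29.0, 11.0]

rain_filled = impute_by_month_mean(months, rain)
error = cv_mse(rain_filled, level, k=3)
```

To compare candidate predictor sets or models in this spirit, one would compute the cross-validated error for each candidate and prefer the one with the smallest value; the selection criterion actually used in the study may differ.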
