Group LASSO for Rainfall Data Modeling in Indramayu District, West Java, Indonesia

Abstract. Regression problems with many potential predictors cannot be modeled adequately using classical regression. In high-dimensional data, the number of predictors is much greater than the number of observations, so penalized regression is a better approach than classical regression for selecting the predictors in the model. Group LASSO is a penalized regression method in which the predictors form groups. Predictors can be grouped according to high correlation among the predictors within a group, or according to similarity in shape, behavior, or location. This paper models rainfall data using Group LASSO, fitted with the GLMSELECT procedure in SAS. The groups of predictors are formed using Principal Component Analysis. The response is rainfall in Indramayu District, West Java, Indonesia, and the predictors are 49 variables from the Global Precipitation Climatology Project (GPCP). The best model for the rainfall data is chosen on the basis of the largest adjusted R-squared and the smallest Schwarz Bayesian criterion (SBC) and root MSE.
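
For orientation, the Group LASSO estimator is commonly written in the following standard form. The notation is an assumption introduced here and does not come from the abstract: y is the response vector, X_g is the design matrix of the predictors in group g, \beta_g is the corresponding coefficient vector, p_g is the number of predictors in group g, G is the number of groups, and \lambda \ge 0 is the tuning parameter. The exact scaling used by a particular software implementation may differ.

\[
\hat{\beta} = \arg\min_{\beta} \; \frac{1}{2} \Bigl\| y - \sum_{g=1}^{G} X_g \beta_g \Bigr\|_2^2 + \lambda \sum_{g=1}^{G} \sqrt{p_g} \, \| \beta_g \|_2
\]

Because the penalty applies the unsquared Euclidean norm to each \beta_g, the coefficients of a group tend to enter or leave the model together, which is why the method selects predictors at the group level.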