GAUSSIAN COPULA MARGINAL REGRESSION FOR MODELING EXTREME DATA WITH APPLICATION

Regression is commonly used to determine the relationship between a response variable and one or more predictor variables, with the parameters typically estimated by Ordinary Least Squares (OLS). OLS inference assumes that the residuals are normally distributed, N(0, σ²). However, this normality assumption is often violated by extreme observations, which are common in climate data. Modeling rice harvested area with rainfall as the predictor variable is one setting in which extreme observations occur, so an alternative approach is needed to handle them. The method used here is Gaussian Copula Marginal Regression (GCMR), a copula-based regression. As a case study, the method is applied to model the rice harvested area of rice production centers in East Java, Indonesia, covering five districts: Banyuwangi, Lamongan, Bojonegoro, Ngawi, and Jember. The copula approach is chosen because it does not impose strict distributional assumptions, in particular normality, and because it can clearly describe dependence at extreme points. The performance of GCMR is compared with that of OLS and Generalized Linear Models (GLM). Identification of the dependence structure between Rice Harvest per period (RH) and monthly rainfall showed dependence in all study areas, and testing indicated that the copula type is mostly Gumbel. The comparison of model goodness of fit showed that GCMR, having the lowest AICc, was the most suitable method for RH1 in the five districts and for RH2 in Jember district. Judging from the distribution pattern of the response variables, it can be concluded that GCMR is well suited to modeling response variables that are not normally distributed and tend to be strongly skewed.
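The two quantities the abstract relies on, the AICc used to rank the models and the Gumbel copula identified for the dependence structure, can be illustrated with a minimal sketch. It uses the standard textbook definitions (the small-sample correction of AIC and the bivariate Gumbel copula with parameter θ ≥ 1). The function names `aicc`, `gumbel_copula_cdf`, and `gumbel_upper_tail_dependence` are assumptions made for illustration and do not come from the paper or from any GCMR software.

```python
import math


def aicc(log_likelihood: float, k: int, n: int) -> float:
    """Corrected Akaike Information Criterion.

    AICc = AIC + 2k(k + 1) / (n - k - 1), with AIC = 2k - 2 log L,
    where k is the number of estimated parameters and n the sample size.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def gumbel_copula_cdf(u: float, v: float, theta: float) -> float:
    """Bivariate Gumbel copula C(u, v) for u, v in (0, 1] and theta >= 1.

    theta = 1 gives independence, C(u, v) = u * v; larger theta means
    stronger dependence, concentrated in the upper tail.
    """
    if theta < 1:
        raise ValueError("theta must be >= 1")
    if not (0 < u <= 1 and 0 < v <= 1):
        raise ValueError("u and v must lie in (0, 1]")
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def gumbel_upper_tail_dependence(theta: float) -> float:
    """Upper tail dependence coefficient of the Gumbel copula: 2 - 2^(1/theta)."""
    if theta < 1:
        raise ValueError("theta must be >= 1")
    return 2.0 - 2.0 ** (1.0 / theta)
```

Under these definitions, candidate models fitted to the same data would be ranked by calling `aicc` on each fitted log-likelihood and preferring the smallest value, and a Gumbel parameter above 1 corresponds to a positive upper tail dependence coefficient, which is one way to read the statement that the copula describes dependence at extreme points.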
