New links for binary regression: an application to coca cultivation in Peru

Binary response data arise naturally in applications. In practice, the well-known logistic and probit regression models form the basis for analyzing such data. These regression models use symmetric link functions (the logit and probit links). However, many authors have emphasized the need for asymmetric links in modeling binary response data. In this paper, we consider a broad class of parametric link functions that contains both symmetric and asymmetric links as special cases. This class of links is flexible and simple, and may be an interesting alternative to the usual regression models for binary data. We adopt a frequentist approach to inference and estimate the model parameters by maximum likelihood. We also propose residuals for the link models to assess departures from model assumptions and to detect outlying observations. Additionally, the local influence method is discussed, and the normal curvatures for studying local influence are derived under two specific perturbation schemes. Finally, an application to coca leaf cultivation in Peru illustrates the usefulness of the proposed link models in practice.
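
To make the distinction between symmetric and asymmetric links concrete, the following is a minimal sketch, not the class of links proposed in this paper. It assumes a simple illustrative family, the logistic CDF raised to a power `lam` (all function names here are hypothetical), in which `lam = 1` recovers the symmetric logit link and any other `lam > 0` gives an asymmetric link. It also writes out the Bernoulli log-likelihood that a maximum likelihood fit would maximize over the regression coefficients and `lam`.

```python
import math


def logistic_cdf(eta):
    """Inverse logit link; symmetric about eta = 0."""
    return 1.0 / (1.0 + math.exp(-eta))


def power_logit_cdf(eta, lam):
    """Illustrative family (an assumption, not the paper's class):
    the logistic CDF raised to the power lam > 0.
    lam = 1 gives the logit link; lam != 1 gives an asymmetric link."""
    return logistic_cdf(eta) ** lam


def asymmetry_gap(eta, lam):
    """A symmetric link satisfies F(eta) + F(-eta) = 1.
    Return the deviation from that identity at the point eta."""
    return power_logit_cdf(eta, lam) + power_logit_cdf(-eta, lam) - 1.0


def log_likelihood(beta0, beta1, lam, xs, ys):
    """Bernoulli log-likelihood for one covariate with an intercept,
    with success probability p_i = F(beta0 + beta1 * x_i) ** lam."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = power_logit_cdf(beta0 + beta1 * x, lam)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


# Toy data, made up purely for illustration.
xs = [-1.0, 0.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

gap_symmetric = asymmetry_gap(1.0, 1.0)   # logit case: zero up to rounding
gap_asymmetric = asymmetry_gap(1.0, 2.0)  # lam = 2: clearly nonzero
ll_logit = log_likelihood(0.0, 1.0, 1.0, xs, ys)
ll_skewed = log_likelihood(0.0, 1.0, 2.0, xs, ys)
```

In an actual analysis the log-likelihood would be passed to a numerical optimizer to estimate the coefficients and the link parameter jointly; the sketch only evaluates it at fixed values.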
