Zero-inflated generalized Poisson regression mixture model for mapping quantitative trait loci underlying count trait with many zeros.

Phenotypes measured in counts are commonly observed in nature. Statistical methods for mapping quantitative trait loci (QTL) underlying count traits are documented in the literature. Most of them assume that the count phenotype follows a Poisson distribution, with appropriate techniques applied to handle data dispersion. When a count trait has a genetic basis, "naturally occurring" zero status also reflects the underlying gene effects. Simply ignoring or mishandling the zero data may lead to wrong QTL inference. In this article, we propose an interval mapping approach for mapping QTL underlying count phenotypes containing many zeros. The effects of QTLs on the zero-inflated count trait are modelled through the zero-inflated generalized Poisson regression mixture model, which can handle zero inflation and Poisson dispersion in the same distribution. We implement the approach using the EM algorithm with the Newton-Raphson algorithm embedded in the M-step, and provide a genome-wide scan for testing and estimating the QTL effects. The performance of the proposed method is evaluated through extensive simulation studies. Extensions to composite and multiple interval mapping are discussed. The utility of the developed approach is illustrated with a mouse F2 intercross data set, in which significant QTLs controlling cholesterol gallstone formation are detected.
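
To make the model concrete, the following is a minimal sketch of a zero-inflated generalized Poisson (ZIGP) probability mass function and of the E-step that weights each QTL genotype for one individual. The abstract does not state the parameterization, so the sketch assumes the common mean form of the generalized Poisson (mean `mu`, dispersion `alpha`, with `alpha = 0` giving the ordinary Poisson) and a zero-inflation probability `phi`. All function and parameter names are hypothetical, and the Newton-Raphson M-step is not shown.

```python
import math


def gp_pmf(y, mu, alpha):
    """Generalized Poisson pmf in an assumed mean parameterization.

    mu is the mean and alpha the dispersion; alpha = 0 recovers Poisson(mu).
    """
    base = 1.0 + alpha * y
    if y < 0 or base <= 0.0:
        return 0.0
    log_p = (
        y * (math.log(mu) - math.log1p(alpha * mu))
        + (y - 1) * math.log(base)
        - math.lgamma(y + 1)
        - mu * base / (1.0 + alpha * mu)
    )
    return math.exp(log_p)


def zigp_pmf(y, mu, alpha, phi):
    """ZIGP pmf: extra point mass phi at zero, (1 - phi) times the GP pmf elsewhere."""
    p = (1.0 - phi) * gp_pmf(y, mu, alpha)
    return phi + p if y == 0 else p


def posterior_genotype_probs(y, priors, mus, alpha, phis):
    """E-step for one individual in a mixture over QTL genotypes.

    priors[j] is the conditional probability of genotype j given the flanking
    markers; mus[j] and phis[j] are the genotype-specific mean and
    zero-inflation probability (hypothetical inputs for illustration).
    """
    weighted = [
        pi * zigp_pmf(y, mu, alpha, phi)
        for pi, mu, phi in zip(priors, mus, phis)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]
```

For example, `posterior_genotype_probs(0, [0.25, 0.5, 0.25], [1.0, 2.0, 3.0], 0.1, [0.1, 0.2, 0.3])` returns the updated weights of three F2 genotypes for an individual with a zero count. Under this assumed setup, genotypes with a larger `phi` gain weight when the observed count is zero, which is how zero status can carry information about the QTL.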
