Gene Copy Number Analysis for Family Data Using Semiparametric Copula Model

Gene copy number changes are a common characteristic of many genetic disorders. Array comparative genomic hybridization (a-CGH), a relatively new technology, is now widely used to screen for gains and losses in cancers and other genetic diseases, with high resolution either across the genome or within specific chromosomal regions. Statistical methods for analyzing a-CGH data have been developed. However, most existing methods are designed for data from unrelated individuals, and their results explain only the horizontal variation in copy number changes. It is potentially meaningful to develop a statistical method for family data that also captures vertical kinship effects. Here we consider a semiparametric model based on a clustering method, in which the marginal distributions are estimated nonparametrically and the familial dependence structure is modeled by a copula. The model is illustrated and evaluated using simulated data. Our results show that the proposed method is more robust than the commonly used multivariate normal model. Finally, we demonstrate the utility of our method using a real dataset.
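
The general idea of a semiparametric copula model, with nonparametric margins and a parametric dependence structure, can be sketched as follows. This is only an illustrative sketch and not the authors' implementation: the Gaussian copula, the rank-based normal-scores estimator, the function names, and the simulated parent-offspring pairs are all assumptions made for the example, and the clustering step of the proposed method is omitted.

```python
import random
from statistics import NormalDist


def pseudo_observations(x):
    """Nonparametric margin: rescaled ranks rank/(n+1), which lie in (0, 1)."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def gaussian_copula_corr(x, y):
    """Normal-scores estimate of the correlation of an assumed Gaussian copula.

    Only the ranks of x and y are used, so the estimate does not depend on
    the shape of either marginal distribution.
    """
    nd = NormalDist()
    zx = [nd.inv_cdf(u) for u in pseudo_observations(x)]
    zy = [nd.inv_cdf(u) for u in pseudo_observations(y)]
    # The scores of the ranks 1..n are symmetric about zero, so no centering
    # is needed before computing the correlation.
    sxy = sum(a * b for a, b in zip(zx, zy))
    sxx = sum(a * a for a in zx)
    syy = sum(b * b for b in zy)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical simulated data: dependent pairs with non-normal margins.
rng = random.Random(0)
rho = 0.6  # assumed true dependence parameter for the simulation
parent, child = [], []
for _ in range(2000):
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
    parent.append(2.0 ** z1)  # skewed margin
    child.append(z2 ** 3)     # heavy-tailed margin

print(round(gaussian_copula_corr(parent, child), 3))
```

Because the estimator uses only ranks, strictly increasing transformations of either margin leave it unchanged. A fit that assumes multivariate normality on the raw values has no such invariance.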
