Statistical properties of the quantile normalization method for density curve alignment.

This article investigates the large-sample properties of the quantile normalization method of Bolstad et al. (2003) [4], which has become one of the most popular methods for aligning density curves in microarray data analysis. We prove the consistency of this method, viewed as a particular case of the structural expectation procedure for curve alignment, which in turn corresponds to a notion of barycenter of measures in the Wasserstein space. Moreover, we show that this method fails in some cases of mixtures, and we propose a new methodology to cope with this issue.
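As an illustration of the procedure discussed in the abstract, the following is a minimal sketch of quantile normalization in its usual form, not the authors' own code. It assumes every sample has the same number of observations and that ties are broken by position; the function name `quantile_normalize` and its return values are hypothetical choices made for this example. Averaging the order statistics across samples gives the empirical quantile function of the one-dimensional Wasserstein barycenter, which serves as the common reference distribution.

```python
import numpy as np


def quantile_normalize(samples):
    """Quantile-normalize J samples of equal size n (illustrative sketch).

    samples: array-like of shape (J, n), one row per sample.
    Returns (normalized, reference), where reference holds the averaged
    order statistics and normalized has the same shape as samples.
    """
    x = np.asarray(samples, dtype=float)
    # Order statistics of each sample, i.e. its empirical quantiles.
    sorted_x = np.sort(x, axis=1)
    # Mean of the k-th order statistics across samples: reference quantiles.
    reference = sorted_x.mean(axis=0)
    # 0-based rank of each observation within its own sample.
    ranks = np.argsort(np.argsort(x, axis=1), axis=1)
    # Replace each observation by the reference quantile of the same rank.
    normalized = reference[ranks]
    return normalized, reference


# Two small samples, assumed here purely for demonstration.
normalized, reference = quantile_normalize([[5.0, 2.0, 3.0],
                                            [4.0, 1.0, 6.0]])
print(reference)   # averaged order statistics
print(normalized)  # every row now shares the same set of values
```

After normalization every sample has exactly the same empirical distribution. This is why the method can be expected to behave poorly when the samples come from genuinely different mixtures, the failure case mentioned in the abstract.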

[1] Thibaut Le Gouic, et al. Distribution's template estimate with Wasserstein metrics, 2011, 1111.5927.

[2] M. Talagrand, et al. Probability in Banach Spaces: Isoperimetry and Processes, 1991.

[3] T. Gasser, et al. Synchronizing sample curves nonparametrically, 1999.

[4] T. Gasser, et al. Alignment of curves by dynamic time warping, 1997.

[5] J. Ramsay, et al. Curve registration by local regression, 2000.

[6] Terry Speed, et al. Normalization of cDNA microarray data, 2003, Methods.

[7] Gordon K. Smyth, et al. Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments, 2011, Statistical Applications in Genetics and Molecular Biology.

[8] J. Tenenbaum, et al. A global geometric framework for nonlinear dimensionality reduction, 2000, Science.

[9] Yee Hwa Yang, et al. Preprocessing Two-Color Spotted Arrays, 2005.

[10] Jean-Michel Loubes, et al. Non parametric estimation of the structural expectation of a stochastic increasing function, 2008, Stat. Comput.

[11] Yee Hwa Yang, et al. Normalization for two-color cDNA microarray data, 2003.

[12] Gareth M. James. Curve alignment by moments, 2007, 0712.1425.

[13] Hans-Georg Müller, et al. Functional Data Analysis, 2016.

[14] Jon A. Wellner, et al. Weak Convergence and Empirical Processes: With Applications to Statistics, 1996.

[15] J. Pech, et al. Regulatory Features Underlying Pollination-Dependent and -Independent Tomato Fruit Set Revealed by Transcript and Primary Metabolite Profiling, 2009, The Plant Cell Online.

[16] Fabrice Gamboa, et al. Semi-parametric estimation of shifts, 2007, 0712.1936.

[17] H. A. David, et al. The distribution of the ratio, in a single normal sample, of range to standard deviation, 1954.

[18] Bernard W. Silverman, et al. Incorporating parametric effects into functional principal components analysis, 1995.

[19] H. Müller, et al. Functional Convex Averaging and Synchronization for Time-Warped Random Curves, 2004.

[20] T. Gasser, et al. Searching for Structure in Curve Samples, 1995.

[21] F. N. David, et al. Statistical treatment of censored data. Part I. Fundamental formulae, 1954.

[22] R. Serfling. Approximation Theorems of Mathematical Statistics, 1980.

[23] Jean-Michel Loubes, et al. Manifold embedding for curve registration, 2011, 1105.5565.

[24] J. Ramsay, et al. Curve registration, 2018, Oxford Handbooks Online.

[25] Terence P. Speed, et al. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias, 2003, Bioinform.

[26] Birgitte B. Rønn, et al. Nonparametric maximum likelihood estimation for shifted curves, 2001.

[27] T. Gasser, et al. Statistical Tools to Analyze Data Representing a Sample of Curves, 1992.

[28] Terry Speed, et al. Design and analysis of comparative microarray experiments, 2003.

[29] Guillaume Carlier, et al. Barycenters in the Wasserstein Space, 2011, SIAM J. Math. Anal.

[30] J. Ramsay, et al. Combining Registration and Fitting for Functional Models, 2008.

[31] Sandrine Dudoit, et al. Bioconductor R Packages for Exploratory Analysis and Normalization of cDNA Microarray Data, 2003.

[32] Herbert A. David, et al. Order Statistics, Third Edition, 2003, Wiley Series in Probability and Statistics.

[33] Gordon K. Smyth, et al. limma: Linear Models for Microarray Data, 2005.

[34] Mark Pollitt, et al. Exploration, 2006, J. Digit. Forensic Pract.

[35] Rafael A. Irizarry, et al. Exploration, normalization, and summaries of high density oligonucleotide array probe level data, 2003, Biostatistics.

[36] B. Arnold, et al. A first course in order statistics, 2008.