NITPicker: selecting time points for follow-up experiments

Background: The design of an experiment influences both what a researcher can measure and how much confidence can be placed in the results. As such, it is vitally important that experimental design decisions do not systematically bias research outcomes. At the same time, optimal design decisions can yield statistically stronger conclusions. Deciding where and when to sample is among the most critical aspects of many experimental designs; for example, we might have to choose the time points at which to measure some quantity in a time series experiment. Choosing time points that are too far apart could result in missing short bursts of activity. On the other hand, some time points may provide very little information about the overall behaviour of the quantity in question.

Results: In this study, we develop a tool called NITPicker (Next Iteration Time-point Picker) for selecting optimal time points (or spatial points along a single axis) that eliminates some of the biases caused by human decision-making while maximising information about the shape of the underlying curves. NITPicker uses ideas from the field of functional data analysis. NITPicker is available on the Comprehensive R Archive Network (CRAN), and code for drawing figures is available on GitHub (https://github.com/ezer/NITPicker).

Conclusions: NITPicker performs well on diverse real-world datasets relevant to varied biological applications, including designing follow-up experiments for longitudinal gene expression data, weather pattern changes over time, and growth curves.
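To make the time-point selection problem concrete, the following is a minimal sketch of one simple way to pose it. It is not NITPicker's algorithm and does not use functional data analysis. It assumes the curves have already been measured on a dense pilot grid, and it greedily keeps the grid points from which linear interpolation best reconstructs every curve. The function names `interpolation_error` and `pick_time_points` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def interpolation_error(times: Sequence[float],
                        curve: Sequence[float],
                        chosen: Sequence[int]) -> float:
    """Sum of squared errors when `curve` is rebuilt by linear
    interpolation through the grid indices in `chosen` (sorted)."""
    error = 0.0
    for left, right in zip(chosen, chosen[1:]):
        span = times[right] - times[left]
        for i in range(left + 1, right):
            weight = (times[i] - times[left]) / span
            estimate = (1 - weight) * curve[left] + weight * curve[right]
            error += (curve[i] - estimate) ** 2
    return error


def pick_time_points(times: Sequence[float],
                     curves: Sequence[Sequence[float]],
                     n_points: int) -> List[int]:
    """Greedy heuristic (an assumption of this sketch): start from the two
    endpoints and repeatedly add the grid index that most reduces the
    total interpolation error over all curves."""
    if n_points < 2 or n_points > len(times):
        raise ValueError("n_points must be between 2 and len(times)")
    chosen = [0, len(times) - 1]
    while len(chosen) < n_points:
        best_index, best_error = -1, float("inf")
        for candidate in range(1, len(times) - 1):
            if candidate in chosen:
                continue
            trial = sorted(chosen + [candidate])
            total = sum(interpolation_error(times, c, trial) for c in curves)
            if total < best_error:
                best_index, best_error = candidate, total
        chosen = sorted(chosen + [best_index])
    return chosen


# Toy pilot data: two curves that change slope at t = 3 and are flat after.
times = list(range(11))
curves = [
    [0, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3],
    [5, 4, 3, 2, 2, 2, 2, 2, 2, 2, 2],
]
selected = pick_time_points(times, curves, n_points=3)
print([times[i] for i in selected])  # [0, 3, 10]
```

With three points allowed, the sketch keeps the two endpoints and the time at which both toy curves change slope, since the points in between add nothing to a piecewise-linear reconstruction. A greedy search like this can miss very short bursts, because a single sample at a burst peak may worsen a linear reconstruction; that is one reason a more principled criterion is preferable to this heuristic.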
