Automatic time-series phenotyping using massive feature extraction

Phenotype measurements frequently take the form of time series, but we currently lack a systematic method for relating these complex data streams to scientifically meaningful outcomes, such as relating the movement dynamics of a model organism to its genotype, or measurements of a patient's brain dynamics to their disease diagnosis. Here we report a new tool, hctsa, that automatically selects interpretable and useful properties of time series by comparing over 7700 time-series features drawn from diverse scientific literatures. Using exemplar applications to high-throughput phenotyping experiments, we show how hctsa allows researchers to leverage decades of time-series research to understand and quantify informative structure in time-series data.
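The idea of comparing many features and keeping those that best separate labeled groups can be illustrated with a toy sketch. The code below is not hctsa and does not use its interface; the four features, the Welch t-statistic used as a score, the helper names, and the synthetic data (white noise versus an autoregressive process) are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def lag1_autocorr(x):
    """Lag-1 autocorrelation of a sequence."""
    m = statistics.fmean(x)
    num = sum((a - m) * (b - m) for a, b in zip(x, x[1:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den if den else 0.0


def mean_abs_diff(x):
    """Mean absolute difference between successive samples."""
    return statistics.fmean([abs(b - a) for a, b in zip(x, x[1:])])


# A tiny, hypothetical feature library; a real one would hold thousands.
FEATURES = {
    "mean": statistics.fmean,
    "std": statistics.pstdev,
    "lag1_autocorr": lag1_autocorr,
    "mean_abs_diff": mean_abs_diff,
}


def welch_t(a, b):
    """Absolute Welch t-statistic between two samples of feature values."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return abs(statistics.fmean(a) - statistics.fmean(b)) / se if se else 0.0


def rank_features(series, labels):
    """Score every feature by how well it separates two labeled groups."""
    groups = sorted(set(labels))
    if len(groups) != 2:
        raise ValueError("this sketch handles exactly two groups")
    scores = {}
    for name, func in FEATURES.items():
        values = [func(x) for x in series]
        a = [v for v, lab in zip(values, labels) if lab == groups[0]]
        b = [v for v, lab in zip(values, labels) if lab == groups[1]]
        scores[name] = welch_t(a, b)
    # Highest score first: the most discriminative feature leads the list.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def simulate_ar1(phi, n, rng):
    """Simulate x[t] = phi * x[t-1] + Gaussian noise (phi = 0 is white noise)."""
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


rng = random.Random(0)
series = [simulate_ar1(0.0, 200, rng) for _ in range(20)]
series += [simulate_ar1(0.8, 200, rng) for _ in range(20)]
labels = ["white_noise"] * 20 + ["ar1"] * 20

ranking = rank_features(series, labels)
for name, score in ranking:
    print(f"{name:15s} {score:8.2f}")
```

In this toy setting the top-ranked features are ones with a direct interpretation (for example, how strongly each sample depends on the previous one), which is the kind of interpretable summary the abstract describes, here at a vastly smaller scale.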
