Automatic time-series phenotyping using massive feature extraction

Across a wide range of scientific and industrial applications, a key problem is relating the structure of time-series data to a meaningful outcome, such as detecting anomalous events in sensor recordings or diagnosing patients from physiological time-series measurements like heart rate or brain activity. Currently, researchers must devote considerable effort to manually devising, or searching for, properties of their time series that suit the particular analysis problem at hand. To address this non-systematic and time-consuming procedure, we introduce a new tool, hctsa, that automatically selects interpretable and useful properties of time series by comparing implementations of over 7700 time-series features drawn from diverse scientific literatures. Using two exemplar biological applications, we show how hctsa allows researchers to leverage decades of time-series research to quantify and understand informative structure in their time-series data.
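The idea of comparing many candidate features and keeping the informative ones can be illustrated with a toy sketch. The code below is not hctsa and does not use its interface; it is a minimal Python illustration, assuming just three hand-picked features (mean, standard deviation, lag-1 autocorrelation), two synthetic groups of series, and a Welch-style t-statistic as the score for how well each feature separates the groups. All function names and the scoring choice are hypothetical and chosen for illustration only.

```python
import math
import random
import statistics


def lag1_autocorr(x):
    """Sample autocorrelation of a series at lag 1."""
    m = statistics.fmean(x)
    num = sum((a - m) * (b - m) for a, b in zip(x, x[1:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den


# A tiny, hypothetical feature library (a real library would hold thousands).
FEATURES = {
    "mean": statistics.fmean,
    "std": statistics.stdev,
    "lag1_autocorr": lag1_autocorr,
}


def extract(series):
    """Map one time series to a dictionary of feature values."""
    return {name: func(series) for name, func in FEATURES.items()}


def welch_t(a, b):
    """Welch-style t-statistic between two samples of feature values."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se if se > 0 else 0.0


def rank_features(group_a, group_b):
    """Rank features by how strongly they separate the two groups."""
    fa = [extract(s) for s in group_a]
    fb = [extract(s) for s in group_b]
    scores = {
        name: abs(welch_t([f[name] for f in fa], [f[name] for f in fb]))
        for name in FEATURES
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def white_noise(rng, n):
    """Uncorrelated Gaussian noise with unit variance."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def ar1(rng, n, phi=0.8):
    """AR(1) process scaled to unit marginal variance, so that only its
    temporal correlation differs from white noise."""
    scale = math.sqrt(1.0 - phi ** 2)
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + scale * rng.gauss(0.0, 1.0))
    return x


# Synthetic stand-ins for two labelled groups of recordings.
rng = random.Random(0)
group_a = [white_noise(rng, 200) for _ in range(20)]
group_b = [ar1(rng, 200) for _ in range(20)]

ranking = rank_features(group_a, group_b)
for name, score in ranking:
    print(f"{name:>15s}  |t| = {score:.2f}")
```

Because the two synthetic groups share the same mean and variance by construction, the autocorrelation feature should dominate the ranking. The top-ranked feature is also directly interpretable: it says the groups differ in temporal correlation rather than in level or spread.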
