DeepFly3D: A deep learning-based approach for 3D limb and appendage tracking in tethered, adult Drosophila

Studying how neural circuits orchestrate limbed behaviors requires precise measurement of the position of each appendage in 3-dimensional (3D) space. Deep neural networks can estimate 2-dimensional (2D) pose in freely behaving and tethered animals. However, the unique challenges of transforming these 2D measurements into reliable and precise 3D poses have not been addressed for small animals, including the fly Drosophila melanogaster. Here we present DeepFly3D, a software tool that infers the 3D pose of tethered, adult Drosophila, or of other animals, from multiple camera images. DeepFly3D requires no manual calibration, uses pictorial structures to automatically detect and correct pose estimation errors, and uses active learning to iteratively improve performance. We demonstrate that unsupervised behavioral embeddings computed from 3D joint angles are more accurate than those computed from the commonly used 2D pose data. Thus, DeepFly3D enables automated acquisition of behavioral measurements at an unprecedented level of resolution for a variety of biological applications.
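
The central geometric step, lifting per-camera 2D joint detections into a single 3D pose, can be illustrated with standard linear (DLT) triangulation. The sketch below is a minimal NumPy illustration written for this summary, not DeepFly3D's actual implementation: the function names and array shapes are assumptions, and the steps highlighted in the abstract (automatic camera calibration, error correction with pictorial structures, active learning) are omitted.

    import numpy as np

    def triangulate_point(proj_mats, points_2d):
        """Triangulate one 3D point from its 2D detections in several views
        using the linear DLT method.

        proj_mats : list of (3, 4) camera projection matrices
        points_2d : list of (u, v) pixel coordinates, one per camera
        """
        A = []
        for P, (u, v) in zip(proj_mats, points_2d):
            # Each view contributes two linear constraints on the
            # homogeneous 3D point X:
            #   u * (P[2] @ X) - P[0] @ X = 0
            #   v * (P[2] @ X) - P[1] @ X = 0
            A.append(u * P[2] - P[0])
            A.append(v * P[2] - P[1])
        A = np.stack(A)
        # The least-squares solution is the right singular vector
        # associated with the smallest singular value.
        _, _, vt = np.linalg.svd(A)
        X = vt[-1]
        return X[:3] / X[3]  # de-homogenize

    def triangulate_pose(proj_mats, keypoints_2d):
        """keypoints_2d : (n_cameras, n_joints, 2) array of 2D detections.
        Returns an (n_joints, 3) array of triangulated 3D joint positions."""
        n_joints = keypoints_2d.shape[1]
        return np.stack([
            triangulate_point(proj_mats, keypoints_2d[:, j, :])
            for j in range(n_joints)
        ])

Because DeepFly3D avoids manual calibration, the camera matrices used above would themselves be estimated from tracked keypoints (for example, by bundle adjustment) rather than supplied from a calibration pattern, and unreliable 2D detections would be flagged and corrected before triangulation.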
