Few shot domain adaptation for in situ macromolecule structural classification in cryo-electron tomograms

MOTIVATION Cryo-electron tomography (cryo-ET) visualizes the structure and spatial organization of macromolecules, and their interactions with other subcellular components, inside single cells in a close-to-native state at sub-molecular resolution. Such information is critical for an accurate understanding of cellular processes. However, subtomogram classification remains one of the major challenges for the systematic recognition and recovery of macromolecule structures in cryo-ET because of imaging limits and data quantity. Recently, deep learning has significantly improved the throughput and accuracy of large-scale subtomogram classification. However, it is often difficult to obtain enough high-quality annotated subtomogram data for supervised training because labeling is extremely expensive. To tackle this problem, it is beneficial to use another, already annotated dataset to assist the training process. Yet because the image intensity distributions of the source and target domains differ, a model trained on source-domain subtomograms may perform poorly when predicting subtomogram classes in the target domain.

RESULTS In this paper, we adapt a few-shot domain adaptation method for deep learning-based cross-domain subtomogram classification. The essential idea of our method consists of two parts: 1) take full advantage of the distribution of plentiful unlabeled target-domain data, and 2) exploit the correlation between the whole source-domain dataset and the few labeled target-domain data. Experiments conducted on simulated and real datasets show that our method achieves significant improvement on cross-domain subtomogram classification compared with baseline methods.
