Automatic Pipeline for Cryo-EM Data Preprocessing

Cryo-electron microscopy (cryo-EM) is arguably the fastest-growing technique in structural biology (Callaway, 2020). Since 2012, the number of deposited structures determined with cryo-EM has been increasing at rates unmatched by the other techniques, namely X-ray crystallography and nuclear magnetic resonance (NMR). This has led to huge investments in the required infrastructure by universities and government funding agencies worldwide. Moreover, with recent improvements in sample preparation and data collection (Jain et al., 2012; Arnold et al., 2017; Cheng et al., 2018; Zivanov et al., 2018; Darrow et al., 2019; Ravelli et al., 2019), a single cryo-EM instrument can easily generate more than one dataset (5,000–8,000 movies) per day. Data quality assessment and user-free preprocessing will soon become the bottleneck in the “high-throughput” era of cryo-EM.