GEOMetaCuration: a web-based application for accurate manual curation of Gene Expression Omnibus metadata

Abstract: Metadata curation has become increasingly important for biological discovery and biomedical research because a large amount of heterogeneous biological data is now freely available. To make metadata curation more efficient, we developed GEOMetaCuration, an easy-to-use web-based application for curating the metadata of Gene Expression Omnibus datasets. It eliminates mechanical operations that consume valuable curation time and helps coordinate curation efforts among multiple curators. It improves the curation process through features that are critical to metadata curation, such as a back-end curation management system and a curator-friendly front-end. The application is built on the widely used Python/Django web framework and is open-sourced under the GNU General Public License V3. GEOMetaCuration is expected to benefit the biocuration community and to contribute to the computational generation of biological insights from large-scale biological data. An example use case can be found at the demo website: http://geometacuration.yubiolab.org.

Database URL: https://bitbucket.com/yubiolab/GEOMetaCuration
