The Center for Expanded Data Annotation and Retrieval

The Center for Expanded Data Annotation and Retrieval is studying the creation of comprehensive and expressive metadata for biomedical datasets to facilitate data discovery, data interpretation, and data reuse. We take advantage of emerging community-based standard templates for describing different kinds of biomedical datasets, and we investigate the use of computational techniques to help investigators assemble templates and fill in their values. We are creating a repository of metadata from which we plan to identify metadata patterns that will drive predictive data entry when investigators fill in metadata templates. The metadata repository will capture not only the annotations specified when experimental datasets are initially created but also links to the published literature, including secondary analyses and possible refinements or retractions of experimental interpretations. By working initially with the Human Immunology Project Consortium and the developers of the ImmPort data repository, we are developing and evaluating an end-to-end solution to the problems of metadata authoring and management that will generalize to other data-management environments.
