EXTRACTING AND STRUCTURING SUBCELLULAR LOCATION INFORMATION FROM ON-LINE JOURNAL ARTICLES: THE SUBCELLULAR LOCATION IMAGE FINDER

Previous applications of information extraction methods to articles in biomedical journals have predominantly been based on interpreting article text. This often leaves it uncertain whether an extracted statement is a review or summary of data in other papers, a conjecture or opinion, or a conclusion drawn from evidence presented in the paper at hand. The ability to extract information from the primary data presented in an article, which are often in the form of images, would allow more accurate information to be extracted. Toward this end, we have built a system that extracts information on one particular aspect of biology, subcellular location, from a combination of text and images in journal articles. The design and performance of this system are described here, along with conclusions about possible improvements to the scientific publishing process that we have drawn from implementing it.
