Are figure legends sufficient? Evaluating the contribution of associated text to biomedical figure comprehension

BackgroundBiomedical scientists need to access figures to validate research facts and to formulate or to test novel research hypotheses. However, figures are difficult to comprehend without associated text (e.g., figure legend and other reference text). We are developing automated systems to extract the relevant explanatory information along with figures extracted from full text articles. Such systems could be very useful in improving figure retrieval and in reducing the workload of biomedical scientists, who otherwise have to retrieve and read the entire full-text journal article to determine which figures are relevant to their research. As a crucial step, we studied the importance of associated text in biomedical figure comprehension.MethodsTwenty subjects evaluated three figure-text combinations: figure+legend, figure+legend+title+abstract, and figure+full-text. Using a Likert scale, each subject scored each figure+text according to the extent to which the subject thought he/she understood the meaning of the figure and the confidence in providing the assigned score. Additionally, each subject entered a free text summary for each figure-text. We identified missing information using indicator words present within the text summaries. Both the Likert scores and the missing information were statistically analyzed for differences among the figure-text types. We also evaluated the quality of text summaries with the text-summarization evaluation method the ROUGE score.ResultsOur results showed statistically significant differences in figure comprehension when varying levels of text were provided. When the full-text article is not available, presenting just the figure+legend left biomedical researchers lacking 39–68% of the information about a figure as compared to having complete figure comprehension; adding the title and abstract improved the situation, but still left biomedical researchers missing 30% of the information. When the full-text article is available, figure comprehension increased to 86–97%; this indicates that researchers felt that only 3–14% of the necessary information for full figure comprehension was missing when full text was available to them. Clearly there is information in the abstract and in the full text that biomedical scientists deem important for understanding the figures that appear in full-text biomedical articles.ConclusionWe conclude that the texts that appear in full-text biomedical articles are useful for understanding the meaning of a figure, and an effective figure-mining system needs to unlock the information beyond figure legend. Our work provides important guidance to the figure mining systems that extract information only from figure and figure legend.

[1]  B. Everitt,et al.  Statistical methods for rates and proportions , 1973 .

[2]  Jie Yao,et al.  Searching online journals for fluorescence microscope images depicting protein subcellular location patterns , 2001, Proceedings 2nd Annual IEEE International Symposium on Bioinformatics and Bioengineering (BIBE 2001).

[3]  Kathleen R. McKeown,et al.  Summarization Evaluation Methods: Experiments and Analysis , 1998 .

[4]  Julia Hirschberg,et al.  Do Summaries Help? A Task-Based Evaluation of Multi-Document Summarization , 2005 .

[5]  Vimla L. Patel,et al.  Comprehending instructions for using pharmaceutical products in rural Kenya , 1990 .

[6]  Nigel Collier,et al.  A baseline feature set for learning rhetorical zones using full articles in the biomedical domain , 2005, SKDD.

[7]  K. Bretonnel Cohen,et al.  Frontiers of biomedical text mining: current progress , 2007, Briefings Bioinform..

[8]  Marti A. Hearst,et al.  Exploring the Efficacy of Caption Search for Bioscience Journal Search Interfaces , 2007, BioNLP@ACL.

[9]  Marti A. Hearst,et al.  TREC 2007 Genomics Track Overview , 2007, TREC.

[10]  William R. Hersh,et al.  A comparative analysis of retrieval features used in the TREC 2006 Genomics Track passage retrieval task , 2007, AMIA.

[11]  Shih-Fu Chang,et al.  Exploring Text and Image Features to Classify Images in Bioscience Literature , 2006, BioNLP@NAACL-HLT.

[12]  Hagit Shatkay,et al.  Integrating image data into biomedical text categorization , 2006, ISMB.

[13]  Padmini Srinivasan,et al.  Categorization of Sentence Types in Medical Abstracts , 2003, AMIA.

[14]  Robert F. Murphy,et al.  EXTRACTING AND STRUCTURING SUBCELLULAR LOCATION INFORMATION FROM ON-LINE JOURNAL ARTICLES: THE SUBCELLULAR LOCATION IMAGE FINDER , 2004 .

[15]  Chin-Yew Lin,et al.  ROUGE: A Package for Automatic Evaluation of Summaries , 2004, ACL 2004.

[16]  Hong Yu,et al.  BioEx: A Novel User-Interface that Accesses Images from Abstract Sentences , 2006, HLT-NAACL.

[17]  H. Toutenburg Fleiss, J. L.: Statistical Methods for Rates and Proportions. John Wiley & Sons, New York‐London‐Sydney‐Toronto 1973. XIII, 233 S. , 1974 .

[18]  Michael Krauthammer,et al.  Of truth and pathways: chasing bits of information through myriads of articles , 2002, ISMB.

[19]  Mirella Lapata,et al.  Automatic Evaluation of Text Coherence: Models and Representations , 2005, IJCAI.

[20]  Hong Yu,et al.  Towards Answering Biological Questions with Experimental Evidence: Automatically Identifying Text that Summarize Image Content in Full-Text Articles , 2006, AMIA.

[21]  Hong Yu,et al.  Accessing bioscience images from abstract sentences , 2006, ISMB.

[22]  Johnathan A Napier,et al.  A Saccharomyces cerevisiae Gene Required for Heterologous Fatty Acid Elongase Activity Encodes a Microsomal β-Keto-reductase* , 2002, The Journal of Biological Chemistry.

[23]  Ronen Feldman,et al.  Rule-based extraction of experimental evidence in the biomedical domain: the KDD Cup 2002 (task 1) , 2002, SKDD.

[24]  Michael Krauthammer,et al.  Yale Image Finder (YIF): a new search engine for retrieving biomedical images , 2008, Bioinform..

[25]  Preslav Nakov,et al.  BioText Search Engine: beyond abstract search , 2007, Bioinform..

[26]  Inderjeet Mani,et al.  SUMMAC: a text summarization evaluation , 2002, Natural Language Engineering.

[27]  Alexander A. Morgan,et al.  Evaluation of text data mining for database curation: lessons learned from the KDD Challenge Cup , 2003, ISMB.