Full Text and Figure Display Improves Bioscience Literature Search

When reading bioscience journal articles, many researchers focus their attention on the figures and their captions. This observation led to the development of the BioText literature search engine [1], a freely available Web-based application that allows biologists to search over the contents of Open Access journals and to see figures from the articles displayed directly in the search results. This article presents a qualitative assessment of the system in the form of a usability study in which 20 biologist participants used and commented on the system. Of the 20 participants, 19 expressed a desire to use a bioscience literature search engine that displays articles' figures alongside the full text search results. Fifteen said they would use a caption search and figure display interface either frequently or sometimes, while 4 said rarely and 1 was undecided. Ten said they would use a tool for searching the text of tables and their captions either frequently or sometimes, while 7 said they would use it rarely if at all, 2 said they would never use it, and 1 was undecided. The study found evidence, supporting the results of an earlier study, that bioscience literature search systems such as PubMed should show figures from articles alongside search results. It also found evidence that full text and captions should be searched along with the article title, metadata, and abstract. Finally, for a subset of users and information needs, explicit search within the captions of figures and tables is a useful function, but it is not yet clear how to integrate it cleanly into a more general literature search interface. Such a facility supports Open Access publishing efforts, because it requires access to the full text of documents and the lifting of restrictions in order to show figures in the search interface.

[1]  Marti A. Hearst.  Search User Interfaces, 2009.

[2]  Raman Chandrasekar, et al.  Do thumbnail previews help users make better relevance decisions about web search results?, 2002, SIGIR '02.

[3]  Alfonso Valencia, et al.  Overview of BioCreAtIvE: critical assessment of information extraction for biology, 2005, BMC Bioinformatics.

[4]  Dah-Jye Lee, et al.  Finding relevant PDF medical journal articles by the content of their figures, 2007, SPIE Medical Imaging.

[5]  Alexander A. Morgan, et al.  Evaluation of text data mining for database curation: lessons learned from the KDD Challenge Cup, 2003, ISMB.

[6]  Marti A. Hearst, et al.  Evidence for Showing Gene/Protein Name Suggestions in Bioscience Literature Search Interfaces, 2007, Pacific Symposium on Biocomputing.

[7]  Hao Chen, et al.  Content-rich biological network constructed by mining PubMed abstracts, 2004, BMC Bioinformatics.

[8]  Susan T. Dumais, et al.  Learning user interaction models for predicting web search result preferences, 2006, SIGIR.

[9]  Saul Greenberg, et al.  How People Recognise Previously Seen Web Pages from Titles, URLs and Thumbnails, 2001.

[10]  Rohini K. Srihari, et al.  Automatic Indexing and Content-Based Retrieval of Captioned Images, 1995, Computer.

[11]  Michael Krauthammer, et al.  Yale Image Finder (YIF): a new search engine for retrieving biomedical images, 2008, Bioinform.

[12]  Preslav Nakov, et al.  BioText Search Engine: beyond abstract search, 2007, Bioinform.

[13]  Sameer Antani, et al.  Exploring access to scientific literature using content-based image retrieval, 2007, SPIE Medical Imaging.

[14]  Marti A. Hearst, et al.  Exploring the Efficacy of Caption Search for Bioscience Journal Search Interfaces, 2007, BioNLP@ACL.

[15]  Fang Liu, et al.  FigSearch: a figure legend indexing and classification system, 2004, Bioinform.

[16]  Shih-Fu Chang, et al.  Exploring Text and Image Features to Classify Images in Bioscience Literature, 2006, BioNLP@NAACL-HLT.

[17]  Hong Yu, et al.  Are figure legends sufficient? Evaluating the contribution of associated text to biomedical figure comprehension, 2009, Journal of biomedical discovery and collaboration.

[18]  Edward Cutrell, et al.  An eye tracking study of the effect of target rank on web search, 2007, CHI.

[19]  Ted Boren, et al.  Thinking aloud: reconciling theory and practice, 2000.

[20]  Hong Yu, et al.  Accessing bioscience images from abstract sentences, 2006, ISMB.

[21]  William W. Cohen, et al.  Understanding captions in biomedical publications, 2003, KDD '03.

[22]  Marti A. Hearst, et al.  Improving Search Results Quality by Customizing Summary Lengths, 2008, ACL.

[23]  A. Valencia, et al.  A gene network for navigating the literature, 2004, Nature Genetics.

[24]  Rohini K. Srihari, et al.  Piction: A System That Uses Captions to Label Human Faces in Newspaper Photographs, 1991, AAAI.

[25]  Gordon B. Davis, et al.  User Acceptance of Information Technology: Toward a Unified View, 2003, MIS Q.

[26]  Mary Czerwinski, et al.  The Contribution of Thumbnail Image, Mouse-over Text and Spatial Location Memory to Web Page Retrieval in 3D, 1999, INTERACT.

[27]  Heshan Sun, et al.  The role of moderating factors in user technology acceptance, 2006, Int. J. Hum. Comput. Stud.

[28]  A. Valencia, et al.  Evaluation of text-mining systems for biology: overview of the Second BioCreative community challenge, 2008, Genome Biology.

[29]  Marti A. Hearst, et al.  Finding the flow in web site search, 2002, CACM.

[30]  Hagit Shatkay, et al.  Integrating image data into biomedical text categorization, 2006, ISMB.

[31]  Jimmy J. Lin, et al.  Navigating information spaces: A case study of related article search in PubMed, 2008, Inf. Process. Manag.

[32]  Miguel A. Andrade-Navarro, et al.  Information extraction from full text scientific articles: Where are the keywords?, 2003, BMC Bioinformatics.

[33]  Kevin Li, et al.  Faceted metadata for image search and browsing, 2003, CHI '03.

[34]  Jimmy J. Lin, et al.  PubMed related articles: a probabilistic topic-based model for content similarity, 2007, BMC Bioinformatics.

[35]  Robert F. Murphy, et al.  Robust Numerical Features for Description and Classification of Subcellular Location Patterns in Fluorescence Microscope Images, 2003, J. VLSI Signal Process.

[36]  George R. S. Weir, et al.  People and Computers IX: Crafting Interaction: Styles, Metaphors, Modalities and Agents, 1994.

[37]  Marti A. Hearst, et al.  TREC 2007 Genomics Track Overview, 2007, TREC.

[38]  Yuntao Qian, et al.  Improved recognition of figures containing fluorescence microscope images in online journal articles using graphical models, 2008, Bioinform.