Studies on Relevance, Ranking and Results Display

This study considers the extent to which users who issue the same query agree on what is relevant, and how their notion of relevance may translate into a retrieval algorithm and results display. To combine user perceptions of relevance with algorithmic ranking and to present results, we created a prototype digital library of scholarly literature. We confine our studies to one population of scientists (paleontologists), one domain of scholarly scientific articles (paleo-related), and one prototype system (PaleoLit) that we built for the purpose. Based on the principle that users do not presuppose answers to a given query but will recognize what they want when they see it, our system uses a rules-based algorithm to cluster results into fuzzy categories with three relevance levels. Our system matches at least 1/3 of our participants' relevancy ratings 87% of the time. Our subsequent usability study found that participants trusted our uncertainty labels but did not value our color-coded horizontal results layout above a standard retrieval list. We posit that users make such judgments in limited time, and that time optimization per task might help explain some of our findings.

Index Terms: knowledge retrieval; uncertainty, "fuzzy," and probabilistic reasoning; knowledge representation formalisms and methods
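The agreement figure reported above can be read as a per-document check: does the system's relevance level coincide with the level chosen by at least one third of the participants who rated that document? The abstract does not say how the figure was actually computed, so the following is only a minimal sketch of one plausible reading. The three label names, the function names, and the toy data are all hypothetical and do not come from the study.

```python
# Hypothetical names for the three relevance levels; the abstract does not name them.
LEVELS = ("relevant", "maybe relevant", "not relevant")


def matches_one_third(system_label, participant_labels):
    """Return True if at least 1/3 of participants chose the system's label."""
    if not participant_labels:
        return False
    agreeing = sum(1 for label in participant_labels if label == system_label)
    # Integer comparison avoids floating-point issues with the 1/3 threshold.
    return 3 * agreeing >= len(participant_labels)


def agreement_rate(system_labels, ratings):
    """Fraction of documents whose system label matches >= 1/3 of the raters.

    system_labels: dict mapping document id -> system-assigned level
    ratings: dict mapping document id -> list of participant-assigned levels
    """
    hits = sum(
        matches_one_third(system_labels[doc], ratings[doc])
        for doc in system_labels
    )
    return hits / len(system_labels)


# Toy data, invented purely for illustration.
system = {"d1": "relevant", "d2": "not relevant", "d3": "maybe relevant"}
ratings = {
    "d1": ["relevant", "relevant", "not relevant"],
    "d2": ["relevant", "relevant", "maybe relevant"],
    "d3": ["maybe relevant", "relevant", "not relevant"],
}
rate = agreement_rate(system, ratings)
```

In this toy data the system agrees with at least a third of the raters on `d1` and `d3` but not on `d2`, so `rate` is two thirds. A one-third threshold is lenient by design: with three levels and disagreeing raters, it asks only that the system's label be one that a sizable minority of users would also choose.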
