Augmenting Scientific Creativity with Retrieval across Knowledge Domains

Exposure to ideas from domains outside a scientist's own may help her reformulate existing research problems in novel ways and discover new application domains for existing solution ideas. While improved scholarly search engines can help scientists efficiently identify relevant advances in domains they are already familiar with, they may fall short of helping them explore diverse ideas outside such domains. In this paper, we explore the design of systems aimed at augmenting end-users' ability to explore across domains through flexible query specification. To this end, we develop an exploratory search system in which end-users can select a portion of text core to their interest from a paper abstract and retrieve papers that are highly similar to the user-selected core aspect but differ in domain. Furthermore, end-users can 'zoom in' on specific domain clusters to retrieve more papers from them and understand nuanced differences within the clusters. Our case studies with scientists uncover opportunities and design implications for systems aimed at facilitating cross-domain exploration and inspiration.
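The retrieval behavior described above can be illustrated with a minimal sketch. It is not the system's actual implementation: the abstract does not say how text is embedded or how domain clusters are formed, so the toy 2-D vectors, the domain labels, and the function names `retrieve_across_domains` and `zoom_in` are all assumptions made for illustration. The sketch ranks papers by cosine similarity to the selected aspect, keeps only the best matches per domain so that results span domains, and then lists more papers from one chosen domain.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve_across_domains(query_vec, paper_vecs, domains, per_domain=1):
    """Rank papers by similarity to the selected aspect, keeping at most
    `per_domain` papers from each domain (hypothetical diversification rule)."""
    scores = [cosine(query_vec, p) for p in paper_vecs]
    order = sorted(range(len(paper_vecs)), key=lambda i: -scores[i])
    taken = {}
    results = []
    for i in order:
        d = domains[i]
        if taken.get(d, 0) < per_domain:
            taken[d] = taken.get(d, 0) + 1
            results.append((i, d, scores[i]))
    return results


def zoom_in(query_vec, paper_vecs, domains, domain, k=5):
    """Return up to `k` papers from a single domain, most similar first."""
    idx = [i for i, d in enumerate(domains) if d == domain]
    idx.sort(key=lambda i: -cosine(query_vec, paper_vecs[i]))
    return [(i, cosine(query_vec, paper_vecs[i])) for i in idx[:k]]


# Toy data (assumed): an embedding of the user-selected text span,
# four paper embeddings, and one domain label per paper.
query = [1.0, 0.0]
papers = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.6], [0.0, 1.0]]
domains = ["materials", "materials", "biology", "hci"]

overview = retrieve_across_domains(query, papers, domains, per_domain=1)
detail = zoom_in(query, papers, domains, "materials")
```

Here `overview` holds one paper per domain, ordered by similarity to the selected aspect, while `detail` holds every paper in the "materials" domain. A real system would replace the toy vectors with learned aspect-level embeddings and the hand-assigned labels with inferred domain clusters.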
