Information Retrieval and Visualization for Searching Scientific Articles and Patents

Given the rapidly changing face of technology, keeping up with trends and identifying potential areas to explore for research or commercialization is a challenging task. Decision makers, research analysts, scholars, and research directors all rely on digital collections, access to which is facilitated by search applications built on top of them. However, search is a human-driven activity, and the result of such analysis depends largely on the initial inputs provided by the expert. Moreover, aggregating and assimilating all the information returned by a search engine is no less daunting. In this paper, we propose intelligent methods for presenting search results to help information assimilation. We also present methods for automatically analyzing large collections of documents to generate insights that can be useful to analysts. Starting from time-stamped collections of research publications and patent documents, we present several information retrieval (IR) techniques that can extract and present insights about emerging, popular, and receding trends in research, along with their current levels of commercialization. We present results of experiments based on research abstracts made available by digital libraries and the US Patent Office.
