Analysis of the Trends in Biochemical Research Using Latent Dirichlet Allocation (LDA)

Biochemistry has been broadly defined as the “chemistry of molecules included or related to living systems”, but it is becoming increasingly hard to distinguish from related fields. Its research targets evolve rapidly: some emerge, while others disappear, merge, or resurface from a fresh viewpoint. Its methodologies have become extremely diverse, thanks particularly to techniques adopted from molecular biology, synthetic chemistry, and biophysics. To map this shifting landscape, this paper applies topic modeling, a text mining technique, to identify the research topics in biochemistry over the past twenty years and to quantitatively analyze how its trends have changed. The results provide a helpful tool for researchers, journal editors, publishers, and funding agencies to understand the connections among the diverse sub-fields of biochemical research and to see how research topics branch out and integrate with other fields.
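As a rough illustration of what quantitatively analyzing topic trends can involve, the sketch below averages per-document topic proportions by publication year and fits a least-squares slope to each topic's yearly share. It is not the procedure used in this paper: the function names `yearly_topic_share` and `trend_slope`, the two-topic toy data, and the choice of a linear slope as the trend measure are all assumptions made for illustration, and the topic proportions are hard-coded stand-ins for the output of a fitted LDA model.

```python
from collections import defaultdict

def yearly_topic_share(doc_years, doc_topics):
    """Average each topic's proportion over the documents published in each year."""
    sums = {}
    counts = defaultdict(int)
    for year, theta in zip(doc_years, doc_topics):
        if year not in sums:
            sums[year] = [0.0] * len(theta)
        for k, p in enumerate(theta):
            sums[year][k] += p
        counts[year] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sorted(sums)}

def trend_slope(years, values):
    """Ordinary least-squares slope of values regressed on years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx

# Hypothetical document-topic proportions (each row sums to 1), standing in
# for what a fitted LDA model would assign to six abstracts.
doc_years = [2000, 2000, 2010, 2010, 2020, 2020]
doc_topics = [
    [0.8, 0.2], [0.6, 0.4],
    [0.5, 0.5], [0.5, 0.5],
    [0.2, 0.8], [0.4, 0.6],
]

shares = yearly_topic_share(doc_years, doc_topics)
years = list(shares)
slopes = [trend_slope(years, [shares[y][k] for y in years]) for k in range(2)]

for k, slope in enumerate(slopes):
    label = "rising" if slope > 0 else "falling"
    print(f"topic {k}: slope {slope:+.3f} per year ({label})")
```

In this toy setting a positive slope marks a topic whose share of the literature grows over time and a negative slope marks one that shrinks; a real analysis would obtain the proportions from a model fitted to the corpus and would likely need a more careful trend test.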
