Can literature analysis identify innovation drivers in drug discovery?

Drug discovery must be guided not only by medical need and commercial potential, but also by the areas in which new science is creating therapeutic opportunities, such as target identification and the understanding of disease mechanisms. To systematically identify such areas of high scientific activity, we use bibliometrics and related data-mining methods to analyse over half a terabyte of data, including PubMed abstracts, literature citation data and patent filings. These analyses reveal trends in disease-related scientific activity at varying levels of detail, from whole diseases down to individual genes and pathways, and provide methods for monitoring areas in which scientific advances are likely to create new therapeutic opportunities.
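
As a rough illustration of the kind of trend analysis described above, the sketch below computes the yearly share of records that mention a topic and fits a least-squares slope to that share. It is a minimal sketch under stated assumptions, not the method used in the paper: the function names `yearly_share` and `trend_slope` are hypothetical, the input is assumed to be a plain list of publication years already extracted from abstracts, and the numbers are toy data.

```python
from collections import Counter


def yearly_share(topic_years, all_years):
    """Return {year: fraction of that year's records that mention the topic}.

    Both arguments are lists of publication years, one entry per record
    (hypothetical input format, assumed to be extracted beforehand).
    """
    topic = Counter(topic_years)
    total = Counter(all_years)
    # Normalising by the yearly total corrects for overall literature growth.
    return {year: topic.get(year, 0) / n for year, n in sorted(total.items())}


def trend_slope(series):
    """Ordinary least-squares slope of the share against the year."""
    xs = list(series.keys())
    ys = list(series.values())
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0  # a single year carries no trend information
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy data: 10 records per year overall, with 1, 2 and 3 topic mentions.
all_years = [2005] * 10 + [2006] * 10 + [2007] * 10
topic_years = [2005] * 1 + [2006] * 2 + [2007] * 3

share = yearly_share(topic_years, all_years)
slope = trend_slope(share)
print(share)
print(slope)
```

A positive slope suggests that the topic is growing faster than the literature as a whole. In practice the same calculation could be repeated per disease, gene or pathway term to compare areas of activity.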
