Twitter K-H networks in action: Advancing biomedical literature for drug search

The importance of searching biomedical literature for drug interactions and side effects is apparent. Current digital libraries (e.g., PubMed) suffer from infrequent updates to tagging and metadata annotation. As a result, literature is often not linked to new scientific evidence, which poses a considerable challenge for scientists searching biomedical repositories. In this paper, we present a network mining approach that provides a bridge for linking and searching drug-related literature. Our contributions are twofold: (1) an efficient algorithm called HashPairMiner, which addresses the run-time complexity issues of its predecessor algorithm, HashnetMiner, and (2) a web-hosted database of discoveries that facilitates literature search using the results produced by HashPairMiner. Though the K-H network model and the HashPairMiner algorithm are fairly young, their results show considerable promise for the biomedical science community in general and the drug research community in particular.
