MedRank: Discovering Influential Medical Treatments from Literature by Information Network Analysis

Medical literature is an important information source for clinical professionals. As the body of medical literature expands rapidly, keeping their knowledge up to date becomes a challenge for medical professionals. One open question is how, for a given disease, to find the most influential treatments currently available from online medical publications. In this paper we propose MedRank, a new network-based algorithm that ranks heterogeneous objects in a medical information network. The network is extracted from MEDLINE, a large collection of semi-structured medical literature. Objects of different types, such as journal articles, pathological symptoms, diseases, clinical trials, treatments, authors, and journals, are linked together through their relationships. The experimental results are compared with expert rankings collected from doctors and with two baseline methods, namely degree centrality and NetClus. The evaluation shows that our algorithm is effective and efficient. The success of categorized entity ranking in the medical literature domain suggests a new methodology with potential for ranking semi-structured data in other domains.
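To make the degree centrality baseline mentioned above concrete, here is a minimal sketch in Python. It is not the MedRank algorithm, whose details the abstract does not give. It only illustrates ranking treatments by how many distinct articles link to them in a typed network. The edge list, the object names, and the `degree_rank` helper are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical toy network: each edge links an article to another typed object.
# These names are illustrative only and do not come from MEDLINE.
EDGES = [
    ("article_1", "treatment_A", "treatment"),
    ("article_1", "author_X", "author"),
    ("article_2", "treatment_A", "treatment"),
    ("article_2", "treatment_B", "treatment"),
    ("article_3", "treatment_A", "treatment"),
    ("article_3", "journal_J", "journal"),
]


def degree_rank(edges, object_type):
    """Rank objects of one type by the number of distinct articles linked to them."""
    neighbors = defaultdict(set)
    for article, obj, obj_type in edges:
        if obj_type == object_type:
            neighbors[obj].add(article)
    # Sort by degree (descending), then by name so ties are deterministic.
    return sorted(
        ((obj, len(articles)) for obj, articles in neighbors.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )


ranking = degree_rank(EDGES, "treatment")
print(ranking)
```

A baseline like this counts only direct links. It ignores the other object types, such as authors and journals, which a ranking over the full heterogeneous network could also draw on.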
