An improved rank based disease prediction using web navigation patterns on bio-medical databases

Abstract Applying machine learning techniques to online biomedical databases is challenging because the data are multi-dimensional and collected from a large number of sources. Retrieving relevant documents from a large repository, such as a collection of gene documents, also incurs long processing times and an increased false positive rate. Generally, the extraction of biomedical documents is based on a stream of prior observations of gene parameters taken at different time periods. Traditional web usage models such as Markov, Bayesian and clustering models have difficulty analyzing user navigation patterns and identifying sessions in online biomedical databases. Moreover, most document ranking models for biomedical databases are sensitive to sparsity and outliers. In this paper, a novel user recommendation system is implemented to predict the top-ranked biomedical documents using the disease type, gene entities and user navigation patterns. The system uses dynamic session identification, dynamic user identification and document ranking techniques to extract highly relevant disease documents from the online PubMed repository. To verify the performance of the proposed model, its true positive rate and runtime were compared with those of traditional static models such as Bayesian and Fuzzy rank. Experimental results show that the proposed ranking model performs better than the traditional models.
