A study of similar question retrieval method in online health communities

Purpose The purpose of this paper is to propose a new approach to retrieving similar questions in online health communities, in order to improve the efficiency of health information retrieval and sharing.

Design/methodology/approach This paper proposes a hybrid approach that combines domain knowledge similarity and topic similarity to retrieve similar questions in online health communities. The domain knowledge similarity evaluates the domain distance between questions, and the topic similarity measures the relationship between questions based on the latent topics extracted from them.

Findings The experimental results show that the proposed method outperforms the baseline methods.

Originality/value This method addresses the word mismatch problem and takes into account the named entities contained in questions, which most existing studies do not.
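The abstract does not give the scoring formulas, so the following is only a minimal sketch of how a hybrid score of this kind might be assembled, not the paper's implementation. It assumes that domain knowledge similarity can be approximated by the Jaccard overlap of the named entities in two questions, that topic similarity can be approximated by the cosine similarity of their latent topic distributions, and that the two are combined linearly with a weight `alpha`. The function names, the dictionary layout and `alpha` are all hypothetical.

```python
import math


def entity_similarity(entities_a, entities_b):
    """Jaccard overlap of named-entity sets (assumed stand-in for domain knowledge similarity)."""
    a, b = set(entities_a), set(entities_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def topic_similarity(topics_a, topics_b):
    """Cosine similarity of topic distributions (assumed stand-in for topic similarity)."""
    dot = sum(x * y for x, y in zip(topics_a, topics_b))
    norm = math.sqrt(sum(x * x for x in topics_a)) * math.sqrt(sum(y * y for y in topics_b))
    return dot / norm if norm else 0.0


def hybrid_similarity(question_a, question_b, alpha=0.5):
    """Linear combination of the two scores; alpha is a hypothetical tuning weight."""
    domain = entity_similarity(question_a["entities"], question_b["entities"])
    topic = topic_similarity(question_a["topics"], question_b["topics"])
    return alpha * domain + (1 - alpha) * topic


def rank_similar(query, archive, alpha=0.5):
    """Return archived questions sorted from most to least similar to the query."""
    return sorted(archive, key=lambda q: hybrid_similarity(query, q, alpha), reverse=True)


# Illustrative, made-up questions: entities and topic vectors would come from
# a named entity recognizer and a topic model such as LDA in a real pipeline.
query = {"id": "q0", "entities": ["diabetes", "insulin"], "topics": [0.8, 0.1, 0.1]}
archive = [
    {"id": "q1", "entities": ["migraine"], "topics": [0.1, 0.8, 0.1]},
    {"id": "q2", "entities": ["diabetes", "metformin"], "topics": [0.7, 0.2, 0.1]},
]
ranked_ids = [q["id"] for q in rank_similar(query, archive)]
```

In this sketch, a question that shares a named entity with the query can rank highly even when the two share few surface words, which is the kind of word mismatch the abstract refers to.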
