Towards organizing health knowledge on community-based health services

Online community-based health services accumulate a large and continuously growing volume of unstructured health question answering (QA) records. Organizing these health QA records has been found to be effective for data access. Existing approaches to organizing information are often not applicable to the health domain because of its nature, which is characterized by complex relations among entities, a large vocabulary gap, and heterogeneous users. To tackle these challenges, we propose a top-down organization scheme that automatically assigns unstructured health-related records to a hierarchy with prior domain knowledge. In addition to generating the hierarchy prototype automatically, the scheme allows each data instance to be associated with multiple leaf nodes and profiles each node with terminologies. Based on this scheme, we design a hierarchy-based health information retrieval system. Experiments on a real-world dataset demonstrate the effectiveness of our scheme in organizing health QA records into a topic hierarchy and in retrieving health QA records from that hierarchy.
