Online detection of concerned HIV-related messages in web forums

Web forums have become a common means of online communication and a shared source of information about health care and related treatments. Using web crawlers and natural language processing techniques, an approach for automatically identifying concerned HIV-related messages is proposed to help health authorities and social support groups provide prompt counseling. The proposed supervised GA/k-means classification approach helps construct an effective identification and classification model with acceptable classification performance, and it remains fully flexible in that different fitness functions can be developed to suit different requirements. Furthermore, correspondence analysis identifies the terms used most frequently in concerned HIV-related messages, which focus on risky sexual behavior, whereas unconcerned messages are mostly those of the worried well.
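To make the idea of a supervised GA/k-means classifier with a pluggable fitness function more concrete, the following is a minimal sketch and not the method as implemented in this work. It assumes that GA denotes a genetic algorithm, that each chromosome encodes k cluster centroids, and that fitness is the share of messages whose label matches the majority label of their cluster. The function names, the genetic operators, and the two-dimensional toy feature vectors are all hypothetical.

```python
import random

def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid (squared Euclidean)."""
    return [min(range(len(centroids)),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(x, centroids[c])))
            for x in points]

def accuracy_fitness(points, labels, centroids):
    """Fraction of points whose label equals the majority label of their cluster."""
    clusters = assign(points, centroids)
    correct = 0
    for c in range(len(centroids)):
        members = [labels[i] for i, a in enumerate(clusters) if a == c]
        if members:
            correct += max(members.count(l) for l in set(members))
    return correct / len(points)

def evolve(points, labels, k=2, pop_size=20, generations=30,
           fitness=accuracy_fitness, seed=0):
    """Evolve k centroids with a simple elitist GA; `fitness` can be swapped out."""
    rng = random.Random(seed)
    # Each chromosome is a list of k centroids, initialised from random data points.
    pop = [[list(rng.choice(points)) for _ in range(k)] for _ in range(pop_size)]

    def score(chromosome):
        return fitness(points, labels, chromosome)

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]          # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover per centroid, followed by small Gaussian mutation.
            child = [list(rng.choice((ca, cb))) for ca, cb in zip(a, b)]
            child = [[v + rng.gauss(0, 0.1) for v in c] for c in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=score)

# Hypothetical toy features, e.g. (risk-term frequency, reassurance-term frequency).
points = [(0.9, 0.1), (0.8, 0.2), (1.0, 0.0), (0.7, 0.1),
          (0.1, 0.9), (0.2, 0.8), (0.0, 1.0), (0.1, 0.7)]
labels = ["concerned"] * 4 + ["unconcerned"] * 4

best = evolve(points, labels)
print(accuracy_fitness(points, labels, best))
```

Because the fitness function is passed in as a parameter, a different criterion (for example, one that penalizes missed concerned messages more heavily than false alarms) could be substituted without changing the rest of the search loop.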