Smart text-classification of user-generated data in educational social networks

Face-to-face interpersonal communication is gradually being replaced by communication over virtual social network platforms, and this shift extends to the new generation of education. The amount of user-generated data on social networking sites grows daily, which makes the data increasingly hard to understand and consume. Classifying user-generated data (mainly text) can simplify the user experience by supporting dynamic, personalized recommendations: filtering the data and showing users what is relevant to them helps them use it more effectively. In education, recommending relevant learning content to learners on educational social networking sites saves them the arduous task of sifting through a huge amount of information. This paper introduces partially supervised learning for the Hierarchical Dirichlet Process (HDP), applied to classifying educational text with an inherent hierarchical structure. The approach allows partially known model structure and labels to serve as expert knowledge that guides model learning from unlabelled text. Compared with existing partially supervised and semi-supervised HDP methods, the proposed method uses the known labels not only for structure construction but also for parameter learning. This enhancement gives more flexible and more effective guidance for learning the model from unlabelled documents. We experimentally investigate how much partial knowledge contributes to guiding the model learning process. The proposed partially supervised HDP is applied to the automatic classification of student micro-blogs and adds intelligence to our student social media platform (ELSE).
