Analyzing the language evolution of a science classroom via a topic model

In this paper, we introduce a topic model to analyze temporal change in the spoken language of a science classroom, based on a dataset of conversations between a teacher and students. One of the key goals is to discover the root of the change in students' language usage. To accomplish this, we define four categories that generate words: 1) background (general), 2) activity, 3) session subject, and 4) personal. Our experimental results support the hypothesis that the change in students' language consists mainly of greater use of activity-based language, which can be interpreted as increased use of scientific discourse. Such an approach can be used to investigate the effect of teaching methods or to represent an individual’s progress.
