Novice Programmers Talking about Projects: What Automated Text Analysis Reveals about Online Scratch Users' Comments

In this paper we examine the possibilities of applying predictive analysis to users' written comments in an open-ended online social networking forum, Scratch.mit.edu. Scratch is used primarily by youth aged 8 to 16 to program software such as games, animations, and stories; their social interactions center on commenting on, remixing, and sharing computer programs (called projects). This exploratory study contributes to educational data mining by broadly describing and comparing comments about projects with comments about other topics in Scratch. Drawing on communication accommodation theory, we found that user comments about projects exhibited different linguistic cues than other comments, and that these cues were successfully used to classify comment topic. The results also suggest that project comments embody richer language than other comments. These findings point to several avenues for future research on youth's online comments about programming and other technical projects, which may reveal educational opportunities in creating and sharing projects.
