This paper presents research on using Word2Vec to detect implicit links in multi-participant Computer-Supported Collaborative Learning (CSCL) chat conversations. Word2Vec is a powerful and relatively recent Natural Language Processing semantic model used for computing text cohesion and similarity between documents. This research treats the cohesion score as the strength of the semantic relation established between two utterances: the higher the score, the stronger the similarity between them. An implicit link is established, based on cohesion, to the most similar previous utterance within an imposed window. Three similarity formulas were used to compute the cohesion score: an unnormalized score, a score normalized with distance, and Mihalcea's formula. Our corpus of conversations included explicit references provided by authors, which were used for validation. A window of 5 utterances and a 1-minute time frame yielded the highest detection rate, both for exact matching and for matching a block of consecutive utterances belonging to the same speaker. Moreover, the unnormalized score correctly identified the largest number of implicit links.
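The windowed link selection described above can be illustrated with a minimal Python sketch. It is not the paper's implementation. The abstract does not define the three formulas, so the sketch assumes an utterance is represented by the average of its word vectors and that the unnormalized score is the cosine similarity between two such averages. It also assumes the 5-utterance and 1-minute limits apply together. The names `Utterance`, `cohesion`, and `implicit_link`, and the two-dimensional toy embeddings, are hypothetical stand-ins for a trained Word2Vec model.

```python
import math
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str
    time_s: float  # seconds since the start of the conversation
    tokens: list


def mean_vector(tokens, embeddings):
    """Average the vectors of known tokens; None if no token is known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cohesion(a, b, embeddings):
    """Assumed unnormalized score: cosine of averaged word vectors."""
    va = mean_vector(a.tokens, embeddings)
    vb = mean_vector(b.tokens, embeddings)
    if va is None or vb is None:
        return 0.0
    return cosine(va, vb)


def implicit_link(utterances, index, embeddings, max_back=5, max_seconds=60.0):
    """Return (index, score) of the most similar previous utterance in the window.

    Assumption: both limits apply, at most `max_back` utterances back
    and at most `max_seconds` earlier. Returns (None, 0.0) if no
    candidate has a positive score.
    """
    current = utterances[index]
    best_index, best_score = None, 0.0
    for j in range(index - 1, max(index - max_back, 0) - 1, -1):
        candidate = utterances[j]
        if current.time_s - candidate.time_s > max_seconds:
            break  # utterances are time-ordered, so earlier ones are older still
        score = cohesion(current, candidate, embeddings)
        if score > best_score:
            best_index, best_score = j, score
    return best_index, best_score


# Toy 2-D embeddings standing in for a trained Word2Vec model.
embeddings = {
    "chat": [1.0, 0.0],
    "forum": [0.9, 0.1],
    "wiki": [0.0, 1.0],
    "blog": [0.1, 0.9],
}
chat = [
    Utterance("A", 0.0, ["wiki"]),
    Utterance("B", 10.0, ["chat"]),
    Utterance("A", 20.0, ["blog"]),
    Utterance("C", 30.0, ["forum"]),
]
link, score = implicit_link(chat, 3, embeddings)
```

In this toy conversation the last utterance ("forum") links to the second one ("chat") rather than to its immediate predecessor, because the link follows semantic similarity and not adjacency. A distance-normalized variant or Mihalcea's formula could be swapped in by replacing `cohesion`.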
[1] Traian Rebedea, et al. A Polyphonic Model and System for Inter-animation Analysis in Chat Conversations with Multiple Participants. CICLing, 2010.
[2] Danielle S. McNamara, et al. ReaderBench: Automated evaluation of collaboration based on cohesion and dialogism. International Journal of Computer-Supported Collaborative Learning, 2015.
[3] Traian Rebedea, et al. Time and Semantic Similarity - What is the Best Alternative to Capture Implicit Links in CSCL Conversations? CSCL, 2017.
[4] Carlo Strapparava, et al. Corpus-based and Knowledge-based Measures of Text Semantic Similarity. AAAI, 2006.
[5] Jeffrey Dean, et al. Efficient Estimation of Word Representations in Vector Space. ICLR, 2013.
[6] Victor Kaptelinin, et al. Group Cognition: Computer Support for Building Collaborative Knowledge. 2007.