A design for a listening learner corpus as a language resource for computer-assisted language learning systems is proposed, and a pilot learner corpus compiled from 20 university learners of English as a foreign language is reported. The learners wrote down a news report from dictation. The corpus was annotated with part-of-speech tags and with error tags for the dictation. The validity of the proposed design was assessed on the pilot corpus by examining the distribution of errors and whether that distribution properly reflected the learners’ listening ability. Validity was further assessed by developing a listenability measurement method that explains how easy a listening material is for learners. The results suggest that the dictation-based corpus data are useful for assessing the ease of listening to English materials for learners, which could lead to the development of a tool for computer-assisted language learning systems.