Classic Children's Literature - Difficult to Read?

Classic children's literature such as Alice in Wonderland is now freely available thanks to initiatives such as Project Gutenberg. Because their vocabulary and style diverge from present-day usage, these texts are often not readily understandable to today's children. Our goal is to make such texts more accessible by aiding children in the reading process, in particular by automatically identifying the terms that lower readability. As a first step, this poster reports on a preliminary user study that investigates the extent of the vocabulary problem. We also propose and evaluate a basic approach to detecting such difficult terminology.
