Comparing child and adult language: exploring semantic constraints

Current research on child-machine interaction indicates that children differ from adults with respect to various acoustic, linguistic [7], psychological, cultural and social factors. We address the linguistic factor, focusing on the semantic knowledge that a computer system designed to interact with children needs to master. Our work is intentionally usage-based and application-driven. The research was conducted within the EmotiRob project, which aims at building a companion robot for children experiencing emotional difficulties. The robot is expected to understand the emotional state of the child and to respond adequately, albeit non-linguistically [1]. Its interactional capacities depend heavily on the results of the comprehension module, which incorporates semantic knowledge such as child-oriented ontologies and specific semantic associative rules. Our study is based on a corpus of Fairy Tales, which will later be compared with an oral corpus once the latter is completed. We argue that the lexical knowledge and semantic associations discovered in this corpus will not differ greatly between writing and speech. Fairy Tales constitute privileged material for teachers and psychologists, who argue that they play a crucial role in child socialization and in the structuring of concepts. To identify the specificities of child language, we provide a contrastive analysis of semantic preferences according to production (child- vs. adult-authored texts) and to reception (texts intended for children vs. for adults). We use a shallow ontology to compare the constraints that verbs impose on specific syntactic positions in child vs. adult texts. Preliminary results show, as expected, a significant difference in terms of reception, although they question the idea that adult language is much more constraining; differences in terms of production are less clear-cut and call for a detailed qualitative study.
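The contrastive comparison of verb constraints mentioned above can be sketched in code. The snippet below is a minimal illustration under stated assumptions, not the system used in the study: it assumes (verb, syntactic slot, head noun) triples already extracted from parsed child and adult corpora, a toy shallow ontology mapping nouns to coarse semantic classes, and a simple "constraint strength" score (share of a slot's occurrences covered by its most frequent class). All names, mappings and the score are illustrative assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical shallow ontology: lexical heads mapped to coarse semantic
# classes (class inventory and mappings are illustrative only).
ONTOLOGY = {
    "wolf": "animal", "princess": "human", "apple": "food",
    "castle": "place", "knife": "artifact", "mother": "human",
    "bread": "food", "forest": "place", "dragon": "animal",
}

def collect_constraints(triples):
    """Count the semantic classes filling each syntactic slot of each verb.

    `triples` is an iterable of (verb, slot, head_noun) tuples, e.g.
    ("eat", "object", "apple"), assumed to come from a parsed corpus.
    """
    counts = defaultdict(Counter)
    for verb, slot, noun in triples:
        sem_class = ONTOLOGY.get(noun, "unknown")
        counts[(verb, slot)][sem_class] += 1
    return counts

def constraint_strength(class_counts):
    """Rough selectivity measure: proportion of the slot's occurrences
    covered by its single most frequent semantic class."""
    total = sum(class_counts.values())
    return max(class_counts.values()) / total if total else 0.0

def compare(child_triples, adult_triples):
    """Contrast, verb by verb, how selective each (verb, slot) pair is
    in the child-destined vs. the adult-destined corpus."""
    child = collect_constraints(child_triples)
    adult = collect_constraints(adult_triples)
    for verb, slot in sorted(set(child) & set(adult)):
        print(f"{verb}/{slot}: "
              f"child={constraint_strength(child[(verb, slot)]):.2f} "
              f"adult={constraint_strength(adult[(verb, slot)]):.2f}")

if __name__ == "__main__":
    # Toy (verb, slot, head-noun) triples standing in for parser output.
    child_data = [("eat", "object", "apple"), ("eat", "object", "bread"),
                  ("eat", "object", "apple")]
    adult_data = [("eat", "object", "apple"), ("eat", "object", "knife"),
                  ("eat", "object", "bread")]
    compare(child_data, adult_data)
```

In a real setting the score would be replaced by a statistical comparison of the class distributions (e.g. a chi-square test) over slots observed with sufficient frequency in both corpora.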