Automatic measurement of syntactic complexity in child language acquisition
We describe a heuristics-based system for automatic measurement of syntactic complexity using the revised Developmental Level (D-Level) scale (Rosenberg & Abbeduto 1987; Covington et al. 2006). The system takes a raw sentence as input and assigns it to the appropriate developmental level on the scale. Because the system is designed with child language acquisition and psycholinguistic research in mind, it is developed and evaluated using both written data from the Penn Treebank (Marcus et al. 1993) and spoken child language acquisition data from the CHILDES database (MacWhinney 2000). Experimental results show that the system achieves accuracies of 94.0% and 93.2% on unseen test data from the Penn Treebank and the CHILDES database, respectively. We illustrate the use of the system in an example application that investigates the correlation between average D-Level score and speaker age.
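
The example application mentioned above, relating average D-Level score to speaker age, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the system described here: the function names, the (speaker, level) record layout, the use of Pearson's r, and all data values are hypothetical placeholders, and the D-Level of each sentence is assumed to have been assigned already.

```python
from collections import defaultdict
from math import sqrt


def mean_d_level_by_speaker(records):
    """Average the per-sentence D-Level scores for each speaker.

    `records` is an iterable of (speaker_id, d_level) pairs, one per
    sentence. This layout is an assumption made for illustration.
    """
    totals = defaultdict(lambda: [0, 0])  # speaker -> [sum of levels, count]
    for speaker, level in records:
        totals[speaker][0] += level
        totals[speaker][1] += 1
    return {s: total / count for s, (total, count) in totals.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up placeholder data, NOT results from the Penn Treebank or CHILDES.
records = [("a", 0), ("a", 1), ("b", 2), ("b", 3), ("c", 5), ("c", 6)]
ages_in_months = {"a": 24, "b": 36, "c": 48}

means = mean_d_level_by_speaker(records)
speakers = sorted(means)
r = pearson([ages_in_months[s] for s in speakers],
            [means[s] for s in speakers])
print(f"correlation between age and mean D-Level: {r:.3f}")
```

Averaging per speaker before correlating keeps speakers with many sentences from dominating the estimate; whether Pearson's r or a rank-based measure is more suitable depends on the data and is left open here.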