Extracting script features from a large corpus of handwritten documents