Mining Large Data Sets for the Humanities