Towards a corpus-based identification