Proposal and study of statistical features for string similarity computation and classification