BIOINFORMATICS ORIGINAL PAPER

A practical algorithm for finding maximal exact matches in large sequence datasets using sparse suffix arrays