Detection of Spelling Errors in Swedish Clinical Text

Spelling errors are common in clinical text because such text is written under time pressure and is mostly used for internal communication. To improve text mining and other types of text processing tools, spelling error detection and correction are needed. In this paper we count the spelling errors in Swedish clinical text. The developed algorithm detects errors using word lists: a Swedish general dictionary, a medical dictionary and a list of abbreviations. The final algorithm was tested on a Swedish clinical corpus and yielded a spelling error rate of 12 per cent. Error analysis showed that many of the words flagged by the algorithm were in fact correct and had been flagged because of inadequate word lists and faulty preprocessing, such as lemmatization and compound splitting. After these correct words were manually removed from the detected errors, the spelling error rate decreased to 7.6 per cent.
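
The word-list lookup described above can be illustrated with a minimal Python sketch. It is not the algorithm evaluated in this paper: the function names, the toy word lists and the tokenization rule are all assumptions made for illustration, lemmatization and compound splitting are left out, and the error rate is assumed to be the share of tokens not found in any list.

```python
import re

# Toy word lists (hypothetical); real lists would hold many thousands of entries.
GENERAL_WORDS = {"patienten", "har", "ont", "i", "och"}
MEDICAL_WORDS = {"buken", "feber"}
ABBREVIATIONS = {"pat", "ua"}


def tokenize(text):
    """Lowercase the text and return runs of letters, including å, ä, ö and é."""
    return re.findall(r"[a-zåäöé]+", text.lower())


def find_unknown_tokens(text, word_lists):
    """Return the tokens that occur in none of the given word lists."""
    known = set().union(*word_lists)
    return [token for token in tokenize(text) if token not in known]


def error_rate(text, word_lists):
    """Share of tokens flagged as unknown (assumed definition of the rate)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(find_unknown_tokens(text, word_lists)) / len(tokens)


if __name__ == "__main__":
    lists = [GENERAL_WORDS, MEDICAL_WORDS, ABBREVIATIONS]
    note = "Pat har ont i bukne och feber"
    print(find_unknown_tokens(note, lists))
    print(error_rate(note, lists))
```

In this sketch a correct word that is missing from every list is flagged in the same way as a misspelling, which is the kind of false detection that the error analysis attributes to inadequate word lists.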