Recognition complexity with large vocabulary

In this paper, we study the complexity of the recognition process for a large vocabulary, with a specific application to a large French dictionary. In the first part, we review the lexical and grammatical differences between French and English that affect recognition complexity, and we then compute various measurements on a large pseudo-phonetic French dictionary. In the second part, we report experiments on a large real text. To measure recognition complexity, we compute the average branching factor over all possible spellings of the phonetic transcription of the text, according to our dictionary. We show that a Markov model of French significantly reduces the branching factor by discarding improbable choices.
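As an illustration of the measurement described above, the following minimal sketch computes an average branching factor on a toy example and then prunes candidates with a bigram (first-order Markov) model. It is not the procedure used in the paper: the lexicon, the bigram probabilities, the threshold, the use of an arithmetic mean, and all function names are assumptions chosen only to make the idea concrete.

```python
from statistics import mean

# Hypothetical toy lexicon: pseudo-phonetic form -> possible spellings.
# Real dictionaries are far larger; these entries are illustrative only.
LEXICON = {
    "il": ["il", "ils"],
    "parl": ["parle", "parles", "parlent"],
}

# Hypothetical bigram probabilities P(word | previous word).
# Pairs that are absent are treated as having probability 0.
BIGRAM = {
    ("il", "parle"): 0.9,
    ("ils", "parlent"): 0.9,
}


def candidates(phonetic_words):
    """Return, for each phonetic word, the list of spellings in the lexicon."""
    return [list(LEXICON[p]) for p in phonetic_words]


def average_branching_factor(candidate_lists):
    """Arithmetic mean of the number of candidate spellings per word
    (an assumed definition; the paper may use a different average)."""
    return mean(len(c) for c in candidate_lists)


def prune_with_bigrams(candidate_lists, threshold=0.1):
    """Keep a spelling only if at least one surviving spelling of the
    previous word gives it a bigram probability >= threshold."""
    pruned = [list(candidate_lists[0])]
    for cands in candidate_lists[1:]:
        kept = [
            w for w in cands
            if any(BIGRAM.get((prev, w), 0.0) >= threshold for prev in pruned[-1])
        ]
        # If everything would be discarded, keep the original candidates.
        pruned.append(kept or list(cands))
    return pruned


if __name__ == "__main__":
    cands = candidates(["il", "parl"])
    print(average_branching_factor(cands))
    print(average_branching_factor(prune_with_bigrams(cands)))
```

In this toy case the spelling "parles" is discarded because no candidate for the preceding word makes it probable under the assumed bigram table, so the average number of choices per word decreases.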