A speech spectrogram expert

Various authors have pointed out that humans can become quite adept at deriving phonetic transcriptions from speech spectrograms (as good as 90% accuracy at the phoneme level). In this paper, we describe an expert system which attempts to simulate this performance. The speech spectrogram expert (SPEX) is actually a society made up of three experts: a 2-dimensional vision expert, an acoustic-phonetic expert, and a phonetics expert. The visual reasoning expert finds important visual features of the spectrogram. The acoustic-phonetic expert reasons about how visual features relate to phonemes, and about how phonemes change visually in different contexts. The phonetics expert reasons about allowable phoneme sequences and transformations, and deduces an English spelling for phoneme strings. The speech spectrogram expert is highly interactive, allowing users to investigate hypotheses and edit rules.