From Speech Recognition to Language and Multimodal Processing