Integration of natural language and vision processing in a cognitive architecture