Integrating Speech and Selective Visual Perception Using a Semantic Network