Language-Model-Based Paired Variational Autoencoders for Robotic Language Learning