Evaluating Human-Language Model Interaction