TOUCHDOWN: Natural Language Navigation and Spatial Reasoning in Visual Street Environments