A Closer Look at Temporal Sentence Grounding in Videos: Datasets and Metrics