A Closer Look at Temporal Sentence Grounding in Videos: Dataset and Metric