Visual Matching is Enough for Scene Text Retrieval