Filling the Gap of Utterance-aware and Speaker-aware Representation for Multi-turn Dialogue