DIBERT: Dependency Injected Bidirectional Encoder Representations from Transformers
In this paper, we propose a new model named DIBERT, which stands for Dependency Injected Bidirectional Encoder Representations from Transformers. DIBERT is a variation of BERT that adds a third pre-training objective, Parent Prediction (PP), alongside Masked Language Modeling (MLM) and Next Sentence Prediction (NSP). PP injects the syntactic structure of a dependency tree during pre-training, so that DIBERT learns syntax-aware generic representations. We use the WikiText-103 benchmark dataset to pre-train both BERT-Base and DIBERT. After fine-tuning, we observe that DIBERT outperforms BERT-Base on various downstream tasks, including Semantic Similarity, Natural Language Inference, and Sentiment Analysis.
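To make the Parent Prediction idea concrete, the following is a minimal sketch of one possible formulation: a head on top of the BERT hidden states that scores, for every token, which other token is its parent in the dependency tree, trained with cross-entropy against parent indices from an automatic parse. The class name `ParentPredictionHead`, the bilinear-style scoring, and the loss wiring are illustrative assumptions, not the paper's exact architecture.

```python
# Sketch of a Parent Prediction (PP) head; names and scoring scheme are assumptions.
import torch
import torch.nn as nn

class ParentPredictionHead(nn.Module):
    """Scores, for each token, every position in the sentence as its
    candidate dependency-tree parent."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.child_proj = nn.Linear(hidden_size, hidden_size)
        self.parent_proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden)
        child = self.child_proj(hidden_states)    # (B, T, H)
        parent = self.parent_proj(hidden_states)  # (B, T, H)
        # logits[b, i, j] = score that token j is the parent of token i
        return torch.matmul(child, parent.transpose(1, 2))  # (B, T, T)

def parent_prediction_loss(logits: torch.Tensor,
                           parent_ids: torch.Tensor,
                           ignore_index: int = -100) -> torch.Tensor:
    """Cross-entropy over gold parent positions from a dependency parse.
    parent_ids: (batch, seq_len); special tokens and padding use ignore_index."""
    B, T, _ = logits.shape
    return nn.functional.cross_entropy(
        logits.reshape(B * T, T),
        parent_ids.reshape(B * T),
        ignore_index=ignore_index)
```

Under this sketch, the overall pre-training loss would combine the three objectives, e.g. as a (possibly weighted) sum of the MLM, NSP, and PP losses; the exact combination used by DIBERT is described in the paper.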