Task-Specific Dependency-based Word Embedding Methods

Two task-specific dependency-based word embedding methods are proposed for text classification in this work. In contrast to universal word embedding methods designed for generic tasks, our task-specific methods are tailored to offer better performance on a particular task. Both methods follow the PPMI matrix factorization framework and derive word contexts from the dependency parse tree. The first, called dependency-based word embedding (DWE), chooses keywords and neighbor words of a target word in the dependency parse tree as contexts to build the word-context matrix. The second, named class-enhanced dependency-based word embedding (CEDWE), learns from word-context as well as word-class co-occurrence statistics. DWE and CEDWE are evaluated on popular text classification datasets to demonstrate their effectiveness. Experimental results show that they outperform several state-of-the-art word embedding methods.
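
To make the pipeline concrete, the sketch below illustrates the general idea of PPMI matrix factorization over dependency-based contexts: each word takes its dependency neighbors (head and dependents) as contexts, a PPMI word-context matrix is built, and truncated SVD yields low-dimensional embeddings. This is a minimal sketch of the generic technique, not the paper's exact algorithm; the hard-coded triples, the keyword selection step (omitted here), and names such as `build` variables and `embed_dim` are illustrative assumptions, and in practice the (head, relation, dependent) triples would come from a dependency parser such as Stanza or spaCy. CEDWE would additionally augment the context columns with word-class co-occurrence counts before factorization.

```python
# Minimal sketch: PPMI matrix factorization with dependency-based contexts.
# The dependency triples are hard-coded toy data; a real pipeline would
# obtain them from a dependency parser (e.g., Stanza or spaCy).
import numpy as np
from collections import Counter

# Toy dependency triples: (head word, relation, dependent word).
triples = [
    ("eats", "nsubj", "cat"),
    ("eats", "obj", "fish"),
    ("cat", "det", "the"),
    ("fish", "amod", "fresh"),
]

# Each word takes its dependency neighbors (head and dependents) as contexts.
pair_counts = Counter()
for head, rel, dep in triples:
    pair_counts[(dep, head)] += 1   # dependent sees its head as a context
    pair_counts[(head, dep)] += 1   # head sees its dependent as a context

words = sorted({w for pair in pair_counts for w in pair})
idx = {w: i for i, w in enumerate(words)}

# Word-context co-occurrence matrix.
M = np.zeros((len(words), len(words)))
for (w, c), n in pair_counts.items():
    M[idx[w], idx[c]] = n

# Positive pointwise mutual information (PPMI).
total = M.sum()
p_w = M.sum(axis=1, keepdims=True) / total
p_c = M.sum(axis=0, keepdims=True) / total
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log((M / total) / (p_w * p_c))
ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)

# Factorize the PPMI matrix with truncated SVD to obtain word embeddings.
embed_dim = 2  # illustrative dimensionality
U, S, _ = np.linalg.svd(ppmi)
embeddings = U[:, :embed_dim] * np.sqrt(S[:embed_dim])
print({w: embeddings[idx[w]].round(3) for w in words})
```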
