An Advantage Actor-Critic Algorithm with Confidence Exploration for Open Information Extraction

Open Information Extraction (OIE) is the task of generating structured representations of information from natural language sentences. In recent years, many works have trained end-to-end OIE extractors based on the Sequence-to-Sequence (Seq2Seq) model and applied the REINFORCE algorithm to update the model. However, model performance often suffers from large training variance and limited exploration. This paper introduces a reinforcement learning framework that applies an Advantage Actor-Critic (AAC) algorithm to update the Seq2Seq model with samples drawn from a novel Confidence Exploration (CE). The AAC algorithm reduces training variance through a fine-grained evaluation of each individual word. The confidence exploration provides effective training samples by exploring alternative words at key positions. Empirical evaluations demonstrate that our Advantage Actor-Critic algorithm with Confidence Exploration outperforms the comparison methods.
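
The abstract describes a per-word advantage actor-critic update combined with confidence-driven sampling. The sketch below is a minimal, hypothetical illustration of that idea in PyTorch; it is not the authors' implementation, and the interfaces (`actor`, `critic`, `sequence_reward`, `confidence_explore`, the confidence threshold) are assumptions made for illustration only.

```python
# Illustrative sketch only: an advantage actor-critic step for a Seq2Seq extractor
# with per-word advantages and confidence-based exploration. All interfaces here
# are hypothetical placeholders, not the paper's released code.
import torch
import torch.nn.functional as F
from torch.distributions import Categorical


def confidence_explore(logits, greedy_tokens, threshold=0.5):
    """Resample tokens only where the actor's confidence is low.

    logits:        (batch, seq_len, vocab) decoder scores
    greedy_tokens: (batch, seq_len) argmax decoding
    Returns a sequence that differs from the greedy one only at low-confidence
    ("key") positions -- a simplified reading of Confidence Exploration.
    """
    probs = F.softmax(logits, dim=-1)
    confidence = probs.max(dim=-1).values            # (batch, seq_len)
    sampled = Categorical(probs=probs).sample()      # (batch, seq_len)
    explore_mask = confidence < threshold
    return torch.where(explore_mask, sampled, greedy_tokens)


def aac_update(actor, critic, optimizer, src, sequence_reward):
    """One AAC step with a per-word critic baseline (assumed interfaces):

    actor(src)          -> logits  (batch, seq_len, vocab)
    critic(src, tokens) -> values  (batch, seq_len), per-word value estimates
    sequence_reward(tokens) -> rewards (batch, seq_len), per-word rewards
    """
    logits = actor(src)
    greedy_tokens = logits.argmax(dim=-1)
    tokens = confidence_explore(logits, greedy_tokens)

    log_probs = Categorical(logits=logits).log_prob(tokens)   # (batch, seq_len)
    values = critic(src, tokens)                               # (batch, seq_len)
    rewards = sequence_reward(tokens)                          # (batch, seq_len)

    # Fine-grained, per-word advantage: reward minus the critic's value estimate.
    advantage = rewards - values.detach()

    actor_loss = -(advantage * log_probs).mean()
    critic_loss = F.mse_loss(values, rewards)

    optimizer.zero_grad()
    (actor_loss + critic_loss).backward()
    optimizer.step()
    return actor_loss.item(), critic_loss.item()
```

The per-word advantage is what distinguishes this from plain REINFORCE with a scalar sequence reward: each generated word is credited or penalized individually, which is the variance-reduction mechanism the abstract attributes to the AAC algorithm.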
