Pretraining Financial Text Encoder Enhanced by Lifelong Learning

As the volume of financial literature grows rapidly, financial text mining, the extraction of valuable information from financial documents, has gained significant popularity within research communities. Although deep learning-based financial text mining has made remarkable progress in recent years, it still suffers from a lack of task-specific labeled training data in the financial domain. To alleviate this issue, we present F-BERT, a pretrained financial text encoder: a domain-specific language model pretrained on large-scale financial corpora. Unlike the original BERT, F-BERT is trained continually on both a general corpus and a financial-domain corpus, and its four pretraining tasks are learned through lifelong learning, which enables F-BERT to continually capture language knowledge and semantic information. Experimental results on several financial text mining tasks demonstrate that F-BERT achieves strong performance, and extensive experiments further confirm its effectiveness and robustness. The source code and pretrained models of F-BERT are available online.
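To make the continual, multi-task pretraining setup more concrete, the following is a minimal sketch of a shared BERT-style encoder with several lightweight task heads trained sequentially. It assumes the Hugging Face Transformers API; the task names, head shapes, and training loop are hypothetical illustrations, not the actual F-BERT pretraining tasks.

```python
# Minimal sketch: continual multi-task pretraining over a shared encoder.
# Task heads and data are placeholders, not the real F-BERT tasks.
import torch
import torch.nn as nn
from transformers import BertModel


class SharedEncoderWithHeads(nn.Module):
    """Shared Transformer encoder with one lightweight head per pretraining task."""

    def __init__(self, encoder_name: str, task_output_dims: dict):
        super().__init__()
        self.encoder = BertModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # One linear head per task; real pretraining heads (e.g. a masked-LM head)
        # would be more elaborate.
        self.heads = nn.ModuleDict(
            {name: nn.Linear(hidden, dim) for name, dim in task_output_dims.items()}
        )

    def forward(self, task_name: str, **inputs):
        outputs = self.encoder(**inputs)
        pooled = outputs.last_hidden_state[:, 0]  # [CLS] representation
        return self.heads[task_name](pooled)


# Hypothetical task set: names and output sizes are placeholders.
tasks = {"task_a": 2, "task_b": 3, "task_c": 2, "task_d": 2}
model = SharedEncoderWithHeads("bert-base-uncased", tasks)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss_fn = nn.CrossEntropyLoss()


def continual_step(task_name: str, batch: dict) -> float:
    """One update on one task. Tasks are introduced stage by stage, so the
    shared encoder keeps accumulating knowledge across stages (lifelong learning)."""
    logits = model(
        task_name,
        input_ids=batch["input_ids"],
        attention_mask=batch["attention_mask"],
    )
    loss = loss_fn(logits, batch["labels"])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In a continual schedule of this kind, earlier tasks can be replayed alongside newly introduced ones so the encoder retains previously learned knowledge while adapting to the new objective.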
