ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering

With recent advances in large pre-trained language models, researchers have achieved record performances on NLP tasks that mostly focus on language pattern matching. The community is experiencing a shift in the challenge from modeling language to imitating human-like complex reasoning abilities. In this work, we investigate the application domain of finance, which involves real-world, complex numerical reasoning. We propose a new large-scale dataset, ConvFinQA, that aims to study the chain of numerical reasoning in conversational question answering. Our dataset poses a great challenge in modeling long-range, complex numerical reasoning paths over real-world conversations. We conduct comprehensive experiments and analyses with both neural symbolic methods and prompting-based methods to provide insights into the reasoning mechanisms of these two approaches. We believe our new dataset should serve as a valuable resource to push forward the exploration of real-world, complex reasoning tasks as the next research focus. Our dataset and code are publicly available at https://github.com/czyssrs/ConvFinQA.
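
To make the notion of a numerical reasoning chain concrete, the short Python sketch below executes a toy operation program in the style used by the FinQA line of work (steps such as subtract and divide, with "#k" referring to the result of step k). The specific question, the figures, and the execute helper are illustrative assumptions for exposition, not the dataset's official executor.

    # A minimal sketch: executing a FinQA/ConvFinQA-style operation chain.
    # The op names and "#k" back-references follow the program format the
    # paper builds on; the concrete values below are made up for illustration.
    from typing import List, Tuple

    OPS = {
        "add": lambda a, b: a + b,
        "subtract": lambda a, b: a - b,
        "multiply": lambda a, b: a * b,
        "divide": lambda a, b: a / b,
    }

    def execute(program: List[Tuple[str, str, str]]) -> float:
        """Run (op, arg1, arg2) steps; '#k' refers to the result of step k."""
        results = []
        def resolve(arg: str) -> float:
            if arg.startswith("#"):      # back-reference to an earlier step
                return results[int(arg[1:])]
            return float(arg)            # literal number from the text or table
        for op, a, b in program:
            results.append(OPS[op](resolve(a), resolve(b)))
        return results[-1]

    # Conversational turn: "and what was the percentage change in revenue?"
    # given a prior-year revenue of 181,001 and a current-year revenue of
    # 206,588 (hypothetical figures).
    program = [("subtract", "206588", "181001"),
               ("divide", "#0", "181001")]
    print(round(execute(program), 4))    # -> 0.1414, i.e. about a 14.1% increase

Such a chained program, where a later conversation turn reuses results computed in earlier turns, illustrates the kind of long-range dependency the dataset is designed to probe.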
