Paper Information - Segment Information Extraction from Financial Annual Reports Using Neural Network

This paper extends a selected paper from JSAI2019. Automatically extracting business content from financial reports is an important problem in finance. Segment names and their explanations are especially important items to extract, yet no established method exists for extracting them from financial reports. In this study, we aim to develop a practical solution. We built a manually annotated dataset for the task of extracting each company's segment names and their explanations from financial reports, and then developed a recurrent neural network model for this task. Using the manually annotated dataset, our method outperformed the baseline methods at extracting each company's segment names and their explanations from annual financial reports. We also demonstrated experimentally that our method remains usable for this task even when only a small training dataset is available. This is the first work to apply a machine learning method to the task of extracting segment names and their explanations. The insights from this work should be valuable to industry.

Hiroki Sakaji | Tomoki Ito | Kiyoshi Izumi
