This project demonstrates a methodology for estimating corporate credibility with a Natural Language Processing approach. Because corporate transparency affects both the credibility and the possible future earnings of a firm, it is an important factor for banks and investors to consider in risk assessments of listed firms. Estimating corporate credibility in this way can bypass human bias and inconsistency in risk assessment, and the use of large quantitative data and neural network models can provide more accurate estimates, more efficiently, than manual assessment. The model first employs Latent Dirichlet Allocation and the THU Open Chinese Lexicon from Tsinghua University to classify topics in articles that are potentially related to corporate credibility. Then, using the keywords related to each topic, we train a residual convolutional neural network on data labeled according to surveys of fund managers' and accountants' opinions on corporate credibility. After training, we run the model on preprocessed news reports covering all 3065 listed companies; the model is expected to return a ranking of the companies based on their level of transparency.
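The final ranking step can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes the trained network emits one transparency score per news article, and that a company's score is the mean over its articles. The aggregation rule, the function name `rank_companies`, and the sample scores are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def rank_companies(article_scores):
    """Rank companies by mean per-article transparency score (highest first).

    article_scores: iterable of (company_id, score) pairs, one pair per article.
    Mean aggregation is an assumption; the source does not specify the rule.
    """
    by_company = defaultdict(list)
    for company, score in article_scores:
        by_company[company].append(score)
    averaged = {company: mean(scores) for company, scores in by_company.items()}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-article scores standing in for the network's output.
scores = [("A", 0.9), ("B", 0.4), ("A", 0.7), ("C", 0.6), ("B", 0.2)]
ranking = rank_companies(scores)
for company, score in ranking:
    print(f"{company}: {score:.2f}")
```

Other aggregation rules (for example a median, or weighting recent articles more heavily) would slot into the same structure by replacing `mean`.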