Cluster Analysis for SME Risk Analysis Documents Based on Pillar K-Means

In small and medium enterprise (SME) financing risk analysis, a qualitative model, in which analysts give opinions on business risk, is implemented to overcome the subjectivity of the quantitative model. However, this raises another problem: decision makers have difficulty quantifying the weight of the risk conveyed through those opinions. We therefore focused on three objectives to address the problems that often occur when the qualitative model is implemented. First, we modelled risk clusters using K-Means clustering, optimized by the Pillar algorithm to obtain the optimum number of clusters. Second, we measured risk by calculating term-importance scores using TF-IDF, combined with term-sentiment scores based on SentiWordNet 3.0 for Bahasa Indonesia. Finally, we summarized the result by correlating the featured terms in each cluster with the 5Cs credit criteria. The results show that the model is effective in grouping risks and measuring their level, and that it can serve as a basis for decision makers when approving loan proposals.
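As an illustration of the risk-measurement step, the following is a minimal sketch of how a TF-IDF term-importance score could be combined with a term-sentiment score. It is not the authors' implementation: the way the two scores are combined (a sum of products), the function names `tf_idf` and `risk_score`, the toy documents, and the polarity values in `LEXICON` are all assumptions made for the example. A real run would take polarities from SentiWordNet 3.0 for Bahasa Indonesia, and the sample terms are in English only for readability.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: tf-idf weight} dict per tokenized, non-empty document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def risk_score(doc_weights, lexicon):
    """Sum tf-idf weight times polarity; terms missing from the lexicon count as 0."""
    return sum(w * lexicon.get(term, 0.0) for term, w in doc_weights.items())


# Hypothetical polarity lexicon standing in for SentiWordNet scores (assumed values).
LEXICON = {"default": -0.8, "late": -0.5, "stable": 0.6, "profit": 0.7}

# Toy, already-tokenized risk opinions (assumed data, not from the paper).
DOCS = [
    ["payment", "late", "default", "risk"],
    ["stable", "profit", "payment", "growth"],
]

weights = tf_idf(DOCS)
scores = [risk_score(w, LEXICON) for w in weights]
```

In this sketch a negative score indicates that the important terms of an opinion carry negative sentiment, which would correspond to higher risk. A term that appears in every document, such as "payment" here, receives a TF-IDF weight of zero and does not contribute.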
