Clustering analysis for determining the key interests in the financial literature using text mining

In the era of globalization, the new economy once again calls for novel investment strategies supported by modern analysis methods. Over the last decade, data mining has become a major input to strategy implementation in the banking sector. The volume of unstructured textual data generated and stored on the web has been increasing rapidly, and so has the need to process it. Text mining aims to extract novel and useful information from large amounts of unstructured textual data in a semi-automatic way. In this study, we apply text mining steps to a corpus containing the abstracts of scientific articles from five major journals in the fields of finance, economics, and banking. To cluster terms and detect the main topics of the text, k-means clustering and singular value decomposition (SVD) are applied to the term-document data produced by the text mining process. The findings demonstrate that text mining can be a very useful tool for investigating changes in trends and for summarizing the literature.
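The pipeline named above (term-document data, k-means on terms, SVD for the main topic) can be sketched on a toy corpus. This is a minimal illustration under stated assumptions, not the study's implementation: the five short documents, the function names, the regex tokenizer, and the farthest-point seeding of k-means are all hypothetical choices made here, and power iteration stands in for a full SVD, recovering only the leading singular triplet.

```python
import math
import re
from collections import Counter


def term_document_matrix(docs):
    """Return (terms, matrix), where matrix[i][j] counts term i in document j."""
    counts = [Counter(re.findall(r"[a-z]+", d.lower())) for d in docs]
    terms = sorted(set().union(*counts))
    return terms, [[float(c[t]) for c in counts] for t in terms]


def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point seeding (an assumption)."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        far = max(rows, key=lambda r: min(sqdist(r, c) for c in centroids))
        centroids.append(list(far))
    labels = []
    for _ in range(iters):
        # Assignment step: each row goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sqdist(r, centroids[c])) for r in rows]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def top_singular_triplet(matrix, iters=200):
    """Leading singular value and vectors via power iteration on A^T A."""
    n_cols = len(matrix[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    for _ in range(iters):
        u = [sum(a * x for a, x in zip(row, v)) for row in matrix]  # u = A v
        w = [sum(row[j] * ui for row, ui in zip(matrix, u)) for j in range(n_cols)]  # w = A^T u
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    u = [sum(a * x for a, x in zip(row, v)) for row in matrix]
    sigma = math.sqrt(sum(x * x for x in u))
    return sigma, [x / sigma for x in u], v


# Hypothetical stand-ins for journal abstracts.
docs = [
    "bank credit risk",
    "credit risk bank loan",
    "bank loan credit",
    "stock market return",
    "market return stock volatility",
]

terms, matrix = term_document_matrix(docs)

# Cluster the terms (rows of the term-document matrix) into two groups.
labels = kmeans(matrix, k=2)
clusters = {}
for term, lab in zip(terms, labels):
    clusters.setdefault(lab, []).append(term)

# Terms with the largest loadings on the leading left-singular vector.
sigma, u, v = top_singular_triplet(matrix)
top_terms = sorted(zip(terms, u), key=lambda p: -abs(p[1]))[:3]

print(clusters)
print(top_terms)
```

On a real corpus one would typically add stop-word removal, stemming, and term weighting before clustering, and use a library SVD to obtain several singular vectors rather than only the first.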
