A Rough-Set-Refined Text Mining Approach for Crude Oil Market Tendency Forecasting

In this study, we propose a knowledge-based forecasting system, the rough-set-refined text mining (RSTM) approach, for crude oil price tendency forecasting. The system consists of two modules. In the first module, text mining techniques are used to construct a metadata repository and generate rough knowledge from unstructured text documents; this involves gathering related text documents, preprocessing them, extracting features, mining the metadata, and generating rough knowledge. In the second module, rough set theory serves as a knowledge refiner for the rough knowledge, through information table formulation, information reduction, and rough knowledge refinement. Combining the two modules yields useful patterns and rules (“knowledge”) that can be used for crude oil market tendency forecasting. To evaluate the forecasting ability of RSTM, we compare its performance with that of conventional methods (e.g., statistical models and time series models) and neural network models. The empirical results show that RSTM outperforms these forecasting models, suggesting that the proposed approach is a promising alternative to conventional methods for crude oil market tendency forecasting and may be applicable to a wide range of practical prediction problems under uncertainty.
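The second module (information table formulation, information reduction, and rule refinement) can be illustrated with a minimal sketch in Python. This is not the authors' implementation: the attribute names (`opec_cut`, `war_news`, `demand`), the toy table, and the greedy reduction strategy are all assumptions made for illustration, standing in for whatever features the text mining module would actually produce.

```python
from collections import defaultdict

# Hypothetical information table: one row per observation period.
# Condition attributes are illustrative text-derived features;
# "tendency" is the decision attribute (price direction).
TABLE = [
    {"opec_cut": "yes", "war_news": "yes", "demand": "high", "tendency": "up"},
    {"opec_cut": "yes", "war_news": "no",  "demand": "high", "tendency": "up"},
    {"opec_cut": "no",  "war_news": "no",  "demand": "low",  "tendency": "down"},
    {"opec_cut": "no",  "war_news": "yes", "demand": "low",  "tendency": "down"},
    {"opec_cut": "no",  "war_news": "no",  "demand": "high", "tendency": "up"},
    {"opec_cut": "no",  "war_news": "no",  "demand": "high", "tendency": "down"},
]
CONDITIONS = ["opec_cut", "war_news", "demand"]
DECISION = "tendency"


def partition(table, attrs):
    """Group row indices into indiscernibility classes over `attrs`."""
    blocks = defaultdict(set)
    for i, row in enumerate(table):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())


def lower_approximation(table, attrs, target):
    """Rows that certainly belong to `target` given only `attrs`."""
    lower = set()
    for block in partition(table, attrs):
        if block <= target:
            lower |= block
    return lower


def upper_approximation(table, attrs, target):
    """Rows that possibly belong to `target` given only `attrs`."""
    upper = set()
    for block in partition(table, attrs):
        if block & target:
            upper |= block
    return upper


def positive_region(table, attrs, decision):
    """Rows whose decision is uniquely determined by `attrs`."""
    pos = set()
    for decision_class in partition(table, [decision]):
        pos |= lower_approximation(table, attrs, decision_class)
    return pos


def greedy_reduce(table, attrs, decision):
    """Drop attributes one at a time while the positive region is preserved.

    A simple greedy heuristic (an assumption of this sketch); it returns a
    reduced attribute set, not necessarily a minimal reduct.
    """
    full = positive_region(table, attrs, decision)
    kept = list(attrs)
    for a in attrs:
        trial = [x for x in kept if x != a]
        if trial and positive_region(table, trial, decision) == full:
            kept = trial
    return kept


def certain_rules(table, attrs, decision):
    """Emit one rule per indiscernibility class with a single decision value."""
    rules = []
    for block in partition(table, attrs):
        decisions = {table[i][decision] for i in block}
        if len(decisions) == 1:
            first = table[min(block)]
            rules.append(({a: first[a] for a in attrs}, decisions.pop()))
    return rules


reduced = greedy_reduce(TABLE, CONDITIONS, DECISION)
rules = certain_rules(TABLE, reduced, DECISION)
for condition, outcome in rules:
    print(condition, "->", outcome)
```

In this toy table the last two rows share identical condition values but differ in tendency, so they fall in the boundary region (upper minus lower approximation) and produce no certain rule. That is the sense in which rough sets "refine" inconsistent rough knowledge into rules that hold without exception in the data.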
