OUTLIER DETECTION AND INFLUENTIAL POINT OBSERVATION IN LINEAR REGRESSION USING CLUSTERING TECHNIQUES IN FINANCIAL TIME SERIES DATA

Modern computing technology makes data gathering and storage easier, which creates a new range of problems and challenges for data analysis. Detection of outliers in time series data has gained much attention in recent years. We present a new approach to outlier detection based on clustering techniques. The Expectation Maximization clustering (EM-Cluster) algorithm is used to find the “optimal” parameters of the distributions, that is, the parameters that maximize the likelihood function. A regression-based outlier technique is used to detect influential points. The analysis of outliers and influential points is an important step in regression diagnostics, and several indicators are used to identify and analyze them. The proposed approach was effective and efficient in time and space when applied to a synthetic data set. The paper also investigates outliers, volatility clustering, and the risk-return trade-off in the Indian stock market, using the NSE Nifty and BSE SENSEX indices. Engle's ARCH test and an AR(1)-EGARCH(p, q)-in-Mean model were employed for this purpose. The results reveal that volatility is persistent and that there is a leverage effect in the Indian stock markets.
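The regression-diagnostics step can be illustrated with a minimal sketch. The abstract does not name the indicators used, so the choice of leverage, internally studentized residuals, and Cook's distance here is an assumption, as are the function name `influence_diagnostics`, the synthetic data, and the common rule-of-thumb cutoff of 4/n for Cook's distance.

```python
import numpy as np


def influence_diagnostics(x, y):
    """Leverage, studentized residuals, and Cook's distance for OLS of y on x.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept
    n, p = X.shape

    # Ordinary least squares fit and raw residuals
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta

    # Diagonal of the hat matrix: h_ii = x_i^T (X^T X)^{-1} x_i
    xtx_inv = np.linalg.inv(X.T @ X)
    leverage = np.einsum("ij,jk,ik->i", X, xtx_inv, X)

    # Internally studentized residuals and Cook's distance
    s2 = resid @ resid / (n - p)
    student = resid / np.sqrt(s2 * (1.0 - leverage))
    cooks = student**2 * leverage / (p * (1.0 - leverage))
    return leverage, student, cooks


# Synthetic data (assumed): a clean linear trend plus one planted point
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 30)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=x.size)
x = np.append(x, 25.0)  # far from the other x values: high leverage
y = np.append(y, 2.0)   # far below the trend line: large residual

leverage, student, cooks = influence_diagnostics(x, y)

# Rule-of-thumb cutoff (an assumption, not taken from the paper)
flagged = np.flatnonzero(cooks > 4.0 / x.size)
print("most influential index:", int(np.argmax(cooks)))
print("flagged indices:", flagged.tolist())
```

A point is influential when it combines high leverage with a large residual, which is why the planted observation dominates Cook's distance here. On real return series the cutoff would need to be chosen with more care than this rule of thumb.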
