Clustering High-frequency Stock Data for Trading Volatility Analysis

This paper proposes a Realized Trading Volatility (RTV) model for dynamically monitoring anomalous volatility in stock trading. The RTV model first extracts three sequences from the data: price volatility, volume volatility, and realized trading volatility. It then applies the K-means algorithm to cluster summary data for the different stocks. The RTV model examines the joint volatility of share price and trading volume and can capture anomalous trading volatility dynamically. As a case study, we apply the RTV model to real-world high-frequency stock data. Among the resulting clusters, we focus on those with large volatility and study their statistical properties. Finally, we offer some empirical insights into the use of the RTV model.
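The two-stage pipeline described above (extract volatility measures per stock, then cluster the per-stock summaries with K-means) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, since the abstract does not give the definitions: price and volume volatility are taken to be the standard realized volatility of log changes, the joint term in `summarize` is a hypothetical stand-in and not the paper's RTV formula, the three-value summary is an assumed form of the "summary data", and the toy series are made up. The K-means routine is a plain Lloyd iteration with a deterministic farthest-point initialization chosen only to keep the example reproducible.

```python
import math


def log_changes(series):
    """Log changes between consecutive observations of a positive series."""
    return [math.log(b / a) for a, b in zip(series, series[1:])]


def realized_volatility(changes):
    """Square root of the sum of squared log changes."""
    return math.sqrt(sum(c * c for c in changes))


def summarize(prices, volumes):
    """Hypothetical per-stock summary: (price vol, volume vol, joint term)."""
    dp = log_changes(prices)
    dv = log_changes(volumes)
    # ASSUMPTION: an illustrative joint measure, not the paper's RTV definition.
    joint = math.sqrt(sum(abs(p * v) for p, v in zip(dp, dv)))
    return (realized_volatility(dp), realized_volatility(dv), joint)


def kmeans(points, k, iters=100):
    """Lloyd's algorithm with farthest-point initialization (an assumed choice)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members))
                )
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Made-up toy data: two calm stocks followed by two volatile ones.
stocks = [
    ([100.0, 100.1, 100.0, 100.2, 100.1], [1000, 1010, 990, 1005, 1000]),
    ([20.0, 20.02, 20.0, 20.03, 20.01], [500, 505, 498, 503, 500]),
    ([50.0, 55.0, 48.0, 56.0, 47.0], [1000, 3000, 800, 4000, 700]),
    ([10.0, 11.2, 9.5, 11.5, 9.4], [2000, 6500, 1500, 7000, 1600]),
]
features = [summarize(prices, volumes) for prices, volumes in stocks]
labels, centers = kmeans(features, k=2)
print(labels)
```

In this sketch the cluster whose center has the larger coordinates plays the role of the "large volatility" category singled out for further study. With real data the features would likely need to be put on a common scale before clustering, because K-means relies on Euclidean distance.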