Heterogeneous human-generated data streams are measurands that provide opportunities to identify patterns, detect novelties, and explore the evolution of complex social systems. Communication technologies, with their very high penetration into society, can serve as particularly rich sources of information. For many observable communication channels, however, one has little or no access to the content of human-to-human communications, whereas data streams recording the intensity of such events are more commonly available. This paper presents a framework of methods for exploratory analysis and visualization of such data streams. In particular, we demonstrate how atypical activity levels can be identified by fitting a non-homogeneous Markov-modulated Poisson process and spatialising the component that corresponds to unusual bursts or lulls of activity as heat maps. The approach is illustrated with a case study analysing geo-referenced data streams of instant messaging activity on the Internet.
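To make the idea of flagging atypical activity levels concrete, here is a minimal sketch in Python. It is not the Markov-modulated Poisson process fitted in the paper. As a much simpler stand-in, it assumes a periodic Poisson baseline whose rate for each time-of-period slot is estimated by the median count in that slot, and it flags counts that fall in either Poisson tail. The function names (`poisson_cdf`, `flag_unusual`) and the parameters `period` and `alpha` are hypothetical and chosen only for illustration.

```python
import math
from statistics import median


def poisson_cdf(k, lam):
    """Return P(X <= k) for X ~ Poisson(lam), summing the pmf directly."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)  # pmf at 0
    total = 0.0
    for i in range(k + 1):
        total += term
        term *= lam / (i + 1)  # pmf at i + 1
    return min(1.0, total)


def flag_unusual(counts, period, alpha=0.01):
    """Flag bursts and lulls against a periodic Poisson baseline.

    Assumption (not from the paper): the normal rate for each slot of the
    period is the median of all counts observed in that slot, which keeps a
    single extreme count from inflating its own baseline.
    """
    slots = [[] for _ in range(period)]
    for t, c in enumerate(counts):
        slots[t % period].append(c)
    rates = [float(median(s)) if s else 0.0 for s in slots]

    flags = []
    for t, c in enumerate(counts):
        lam = rates[t % period]
        upper_tail = 1.0 - poisson_cdf(c - 1, lam)  # P(X >= c)
        lower_tail = poisson_cdf(c, lam)            # P(X <= c)
        if upper_tail < alpha:
            flags.append((t, "burst"))
        elif lower_tail < alpha:
            flags.append((t, "lull"))
    return flags
```

Running such a test independently on the count series of each spatial cell would give one burst or lull indicator per cell and time step, which could then be rendered as a heat map. A full Markov-modulated model would instead infer the probability of an unusual state jointly with the baseline rate.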