Random Matrix Asymptotics of Inner Product Kernel Spectral Clustering

In this article, we study the asymptotic performance of spectral clustering with an inner product kernel for high-dimensional Gaussian mixture models with numerous samples. As is now classical in large dimensional spectral analysis, we establish a phase transition phenomenon by which a minimum distance between the class means and covariances is required for clustering to be possible from the dominant eigenvectors. Beyond this phase transition, we evaluate the asymptotic content of the dominant eigenvectors, thus allowing for a full characterization of clustering performance. A surprising finding, however, is that in some particular scenarios the phase transition does not occur and clustering can be achieved irrespective of the class means and covariances. This is evidenced here in the case of a mixture of two Gaussian datasets having the same means and an arbitrary difference between covariances.
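
As an illustration of the setting, the following minimal sketch clusters a two-class Gaussian mixture from the dominant eigenvector of an inner product kernel matrix. Everything in it is an assumption made for illustration rather than the configuration analyzed in the article: the kernel form K_ij = f(x_i^T x_j / p), the choice of f as the identity, the dimensions n and p, the means ±mu, the identity covariances, and the helper names `inner_product_kernel` and `spectral_bipartition`. It shows only the easy regime, in which the means are well separated.

```python
import numpy as np

def inner_product_kernel(X, f=lambda t: t):
    """Kernel matrix K_ij = f(x_i^T x_j / p) for the rows x_i of X (n x p).

    The identity f is an illustrative assumption, not the article's choice.
    """
    n, p = X.shape
    return f(X @ X.T / p)

def spectral_bipartition(K):
    """Split the samples by the sign of the dominant eigenvector of K."""
    eigvals, eigvecs = np.linalg.eigh(K)  # eigenvalues in ascending order
    v = eigvecs[:, -1]                    # eigenvector of the largest eigenvalue
    return (v > 0).astype(int)

# Hypothetical two-class mixture: means -mu and +mu, identity covariances.
rng = np.random.default_rng(0)
n, p = 200, 100
labels = np.repeat([0, 1], n // 2)
mu = np.zeros(p)
mu[0] = 3.0
X = rng.standard_normal((n, p)) + np.where(labels[:, None] == 0, -mu, mu)

K = inner_product_kernel(X)
pred = spectral_bipartition(K)

# Cluster labels are defined only up to a swap, so score both assignments.
acc = max(np.mean(pred == labels), np.mean(pred != labels))
print(f"clustering accuracy: {acc:.3f}")
```

Shrinking the norm of `mu` toward zero in this sketch gives a rough empirical feel for the phase transition: below some separation the dominant eigenvector is expected to stop carrying class information, at least for this linear choice of f.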