DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning