Conceptualized phrase clustering with distributed k-means