A study on document representation for clustering using similarity rough set model and semantic similarity