Finding near replicas of Web pages based on Fourier transform

Removing duplicate Web pages can improve search accuracy and reduce data storage space. Current de-duplication algorithms mainly focus on