A novel linguistic phenomenon description for text similarity computing

This paper presents a method for computing text similarity based on a novel description of linguistic phenomena. First, a word-sense ontology for each keyword is constructed from multiple kinds of context information. Next, the features shared by a pair of texts are extracted, and the context co-occurrence usage of each shared feature is characterized by part of speech, semantics, location, and average co-occurrence probability, and expressed as linguistic ontology knowledge. Finally, a text similarity evaluation value is calculated for each text to judge the degree of similarity. The method was evaluated on the Chinese document set from the NTCIR-3 workshop collection; the results show average increases in precision of 15.45%-18.49% at the top 10 ranked documents and 11.96%-15.35% at the top 100 ranked documents.
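
To make the idea of comparing the context co-occurrence of shared features concrete, the following is a minimal sketch and not the method evaluated in this paper. It assumes pre-tokenized input, uses only a fixed co-occurrence window, and leaves out the part-of-speech, semantic, and location information as well as the word-sense ontology. The function names `context_profile`, `profile_overlap`, and `text_similarity`, the `window` parameter, and the overlap score itself are illustrative assumptions.

```python
from collections import Counter


def context_profile(tokens, keyword, window=2):
    """Relative frequency of words found within +/- `window` positions of `keyword`.

    Hypothetical helper: a stand-in for the context co-occurrence feature,
    using surface co-occurrence only.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != keyword:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}


def profile_overlap(p, q):
    """Overlap of two context profiles: sum of the smaller probability per shared word."""
    return sum(min(p[w], q[w]) for w in p.keys() & q.keys())


def text_similarity(tokens_a, tokens_b, window=2):
    """Average context overlap over the words that both texts share.

    Assumption: every shared word counts equally; no keyword selection
    or weighting is applied.
    """
    shared = set(tokens_a) & set(tokens_b)
    if not shared:
        return 0.0
    scores = [
        profile_overlap(
            context_profile(tokens_a, word, window),
            context_profile(tokens_b, word, window),
        )
        for word in shared
    ]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    doc_a = "the bank raised the interest rate".split()
    doc_b = "the bank lowered the interest rate".split()
    doc_c = "the river bank was covered in grass".split()
    print(text_similarity(doc_a, doc_b))
    print(text_similarity(doc_a, doc_c))
```

In this sketch, a word shared by two texts contributes to the score only to the extent that it appears in similar surroundings in both, which is the intuition behind using context co-occurrence rather than plain word overlap. Chinese text would need word segmentation before this step.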