Authorship Identification Based on Semantic Analysis

Authorship identification techniques are widely used in various research areas. The key problems of authorship identification are extracting style markers and evaluating the similarity of documents in terms of writing style. Traditional methods examine features that reveal an author's writing habits, such as how the author uses words, constructs sentences, and organizes paragraphs; among these, analyzing the frequency of punctuation marks or function words is the most prevalent. Drawing on theoretical stylistics, this paper proposes a new similarity evaluation method based on semantic analysis using HowNet. Experimental results show that content words can also be used as style markers to discriminate among authors.
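
To make the traditional frequency-based approach mentioned above concrete, the following minimal Python sketch builds a style vector from the relative frequencies of a few function words and punctuation marks and compares two texts by cosine similarity. It illustrates only the baseline idea, not the HowNet-based semantic method proposed in this paper. The marker lists, the tokenization rule, and the function names `style_vector` and `cosine_similarity` are assumptions made for illustration; the example uses English text for readability.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny marker lists chosen for illustration only.
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "a", "that", "is"]
PUNCTUATION = [",", ".", ";", "!", "?"]
MARKERS = FUNCTION_WORDS + PUNCTUATION


def style_vector(text):
    """Return the relative frequency of each marker among all tokens."""
    # Assumed tokenization: lowercase words and single punctuation marks.
    tokens = re.findall(r"[a-z']+|[,.;!?]", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty text
    return [counts[m] / total for m in MARKERS]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a text with no markers has no comparable style
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    doc_a = "The style of the author is visible in the words, and in the pauses."
    doc_b = "To write is to choose; that choice is a mark of the writer."
    print(cosine_similarity(style_vector(doc_a), style_vector(doc_b)))
```

A higher score suggests that two texts use these markers in similar proportions. A realistic system would use a much larger marker inventory and, for Chinese text, a word segmenter in place of the regular expression.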