A Radical-Based Method for Chinese Named Entity Recognition

Chinese characters are composed of radicals, which divide into semantic components (indicating meaning) and phonetic components (indicating pronunciation). Because Chinese is a logographic script, many radicals carry semantic information that can improve the performance of Chinese named entity recognition (NER). Many related studies use Bi-LSTM to extract semantic features from radicals. However, LSTM-based models cannot extract this information effectively, because the granularity at which characters are split into radicals is ambiguous and the dependency between radicals is weak. This paper therefore presents RCBC (Radical CNN-BiLSTM-CRF), a radical-based neural network method. Experimental results on the SIGHAN 2006 Bakeoff MSRA dataset and the 1998 Peking University People's Daily dataset indicate that the model effectively extracts the semantic information of Chinese radicals and improves Chinese NER performance compared with traditional models.
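
To make the radical-level CNN idea concrete, the following is a minimal sketch of how a convolution with max pooling could turn a variable-length radical sequence into a fixed-size feature vector for each character. It is not the RCBC implementation: the decomposition table `RADICALS`, the sizes `EMB_DIM`, `NUM_FILTERS`, and `WINDOW`, and the function `radical_features` are all hypothetical, and the weights are random rather than trained.

```python
import random

# Hypothetical radical decompositions, for illustration only.
RADICALS = {
    "河": ["氵", "可"],
    "湖": ["氵", "古", "月"],
    "林": ["木", "木"],
    "山": ["山"],
}

EMB_DIM = 4      # size of each radical embedding (assumed)
NUM_FILTERS = 3  # number of convolution filters (assumed)
WINDOW = 2       # radicals covered by each filter (assumed)
PAD = "<pad>"

rng = random.Random(0)
vocab = sorted({r for rads in RADICALS.values() for r in rads})

# Padding maps to a zero vector; real radicals get random (untrained) vectors.
embedding = {PAD: [0.0] * EMB_DIM}
for r in vocab:
    embedding[r] = [rng.uniform(-1, 1) for _ in range(EMB_DIM)]

# Each filter holds WINDOW * EMB_DIM weights plus a bias.
filters = [
    ([rng.uniform(-1, 1) for _ in range(WINDOW * EMB_DIM)], rng.uniform(-1, 1))
    for _ in range(NUM_FILTERS)
]


def radical_features(char):
    """Convolve over a character's radical sequence, then max-pool over positions."""
    rads = list(RADICALS[char])
    # Pad short sequences so that at least one full window exists.
    while len(rads) < WINDOW:
        rads.append(PAD)
    vectors = [embedding[r] for r in rads]
    pooled = []
    for weights, bias in filters:
        activations = []
        for start in range(len(vectors) - WINDOW + 1):
            # Flatten the embeddings of the radicals inside the current window.
            window = [x for vec in vectors[start:start + WINDOW] for x in vec]
            score = sum(w * x for w, x in zip(weights, window)) + bias
            activations.append(max(0.0, score))  # ReLU
        pooled.append(max(activations))  # max-over-time pooling
    return pooled


for ch in RADICALS:
    print(ch, radical_features(ch))
```

Because pooling takes the maximum over window positions, each character yields a vector of length `NUM_FILTERS` regardless of how many radicals it contains, and each filter only looks at a short local window rather than modelling long-range order. In a CNN-BiLSTM-CRF pipeline such a vector would presumably be combined with a character embedding before the BiLSTM layer, but the abstract does not specify that detail.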