CHISE: Character Processing Based on Character Ontology

Currently, in the field of information processing, characters are defined and shared using coded character sets. Character processing based on coded character sets, however, has two problems: (1) coded character sets may lack some necessary characters; (2) characters in coded character sets have fixed semantics. These problems can make it difficult to implement classical text databases for philological studies. For Kanji (Chinese characters) in particular, they are serious obstacles to digitizing classical texts. To resolve these problems, we proposed the "Chaon" model, a new model of character processing based on character ontology. Realizing this model requires a character ontology, and for Kanji it requires a large-scale one. We therefore built a large-scale character ontology that covers 98 thousand characters, including both Unicode and non-Unicode characters. This paper focuses on the design principles of a large-scale character ontology based on the Chaon model and gives an overview of its implementation, named CHISE (Character Information Service Environment).