Design and Structure of The Juman++ Morphological Analyzer Toolkit

[1]  Y. Shafranovich.  Common Format and MIME Type for Comma-Separated Values (CSV) Files , 2005, RFC 4180.

[2]  Graham Neubig,et al.  Pointwise Prediction for Robust, Adaptable Japanese Morphological Analysis , 2011, ACL.

[3]  Yuji Matsumoto,et al.  Applying Conditional Random Fields to Japanese Morphological Analysis , 2004, EMNLP.

[4]  Qun Liu,et al.  Word Lattice Reranking for Chinese Word Segmentation and Part-of-Speech Tagging , 2008, COLING.

[5]  Daisuke Kawahara,et al.  Morphological Analysis for Unsegmented Languages using Recurrent Neural Network Language Model , 2015, EMNLP.

[6]  Hitoshi Isahara,et al.  An Error-Driven Word-Character Hybrid Model for Joint Chinese Word Segmentation and POS Tagging , 2009, ACL/IJCNLP.

[7]  Hiroya Takamura,et al.  An Efficient Algorithm for Unsupervised Word Segmentation with Branching Entropy and MDL , 2010, EMNLP.

[8]  Yasuharu Den,et al.  A Proper Approach to Japanese Morphological Analysis: Dictionary, Model, and Evaluation , 2008, LREC.

[9]  Naonori Ueda,et al.  Bayesian Unsupervised Word Segmentation with Nested Pitman-Yor Language Modeling , 2009, ACL.

[10]  Kugatsu Sadamitsu,et al.  Morphological Analysis for Japanese Noisy Text based on Character-level and Word-level Normalization , 2014, COLING.

[11]  Daisuke Kawahara,et al.  Juman++: A Morphological Analysis Toolkit for Scriptio Continua , 2018, EMNLP.

[12]  Yuji Matsumoto,et al.  Japanese Morphological Analysis System ChaSen version 2.0 Manual , 1999.

[13]  Masaru Kitsuregawa,et al.  Efficient Word Lattice Generation for Joint Word Segmentation and POS Tagging in Japanese , 2013, IJCNLP.

[14]  Mamoru Komachi,et al.  Long Short-Term Memory for Japanese Word Segmentation , 2017, PACLIC.

[15]  Taku Kudo,et al.  SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing , 2018, EMNLP.

[16]  Daisuke Kawahara,et al.  Shrinking Japanese Morphological Analyzers With Neural Networks and Semi-supervised Learning , 2019, NAACL-HLT.

[17]  Daichi Mochihashi,et al.  Inducing Word and Part-of-Speech with Pitman-Yor Hidden Semi-Markov Models , 2015, ACL.

[18]  Manabu Okumura,et al.  A Simple Approach to Unknown Word Processing in Japanese Morphological Analysis , 2013, IJCNLP.

[19]  Xiaoqing Zheng,et al.  Deep Learning for Chinese Word Segmentation and POS Tagging , 2013, EMNLP.

[20]  Yuji Matsumoto,et al.  Sudachi: a Japanese Tokenizer for Business , 2018, LREC.

[21]  Daisuke Kawahara,et al.  Automatically Acquired Lexical Knowledge Improves Japanese Joint Morphological and Dependency Analysis , 2017, IWPT.