NWJC2Vec: Word embedding dataset from ‘NINJAL Web Japanese Corpus’
[1] Daisuke Kawahara, et al. Morphological Analysis for Unsegmented Languages using Recurrent Neural Network Language Model, 2015, EMNLP.
[2] Masayuki Asahara, et al. Archiving and Analysing Techniques of the Ultra-Large-Scale Web-Based Corpus Project of NINJAL, Japan, 2014.
[3] Vít Suchomel, et al. Efficient Web Crawling for Large Text Corpora, 2012.
[4] Adam Kilgarriff, et al. A Web Corpus and Word Sketches for Japanese, 2008.
[5] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[6] Daisuke Kawahara, et al. Case Frame Compilation from the Web using High-Performance Computing, 2006, LREC.
[7] Jeffrey Dean, et al. Efficient Estimation of Word Representations in Vector Space, 2013, ICLR.
[8] Yuji Matsumoto, et al. Applying Conditional Random Fields to Japanese Morphological Analysis, 2004, EMNLP.
[9] Adam Kilgarriff, et al. A Corpus Factory for Many Languages, 2010, LREC.
[10] Marco Baroni, et al. Building general- and special-purpose corpora by Web crawling, 2006.
[11] David A. Shamma, et al. YFCC100M, 2015, Commun. ACM.
[12] Yugo Murawaki, et al. Online Acquisition of Japanese Unknown Morphemes using Morphological Constraints, 2008, EMNLP.
[13] Marco Baroni, et al. Automated construction and evaluation of Japanese Web-based reference corpora, 2005.
[14] Yuji Matsumoto, et al. Japanese Dependency Analysis using Cascaded Chunking, 2002, CoNLL.