Bootphon/Wordseg: Wordseg-0.7
暂无分享,去创建一个
Added tools/wordseg-qsub.sh , a script to schedule a list of
segmentation jobs to a cluster running Sun Grid Engine and the
qsub scheduler.
Added example phonological rules and updated contributong guide in
documentation.
In wordseg-prep ignore empty lines in both gold and segmented
texts.
In wordseg-syll the syllabification is improved: syllabification
of words with no vowel, better error messages (see #35, #36).
In wordseg-tp add of the mutual information dependancy
measure. In the bash command, the argument --probability
{forward,backward} is replaced by --dependency {ftp,btp,mi}
(maintained for backward compatibility). See #40.
In wordseg-ag :
niteration is now 2000 by default (was 100),
improved log of iterations with -vv ,
refactored postprocessing code:
parallelized
constant memory usage (was linear wrt niterations*nutts)
tree to words conversion in C++ instead of Python
temporary parses file is now gziped (gains a factor of 20 in disk usage)
new --temdir option to specify another path for tempfile (default is /tmp)
detection of incomplete parses (if any issues a warning)
better comments in code, more unit tests