Layering and Merging Linguistic Annotations
The American National Corpus (ANC) and its annotations are represented in a stand-off XML format that complies with the Linguistic Annotation Framework (LAF) specified by ISO TC37 SC4 WG1. Because few systems for searching and accessing the corpus currently support stand-off markup, the project has developed a SAX-like parser that generates ANC data with the annotations in-line, in a variety of output formats.
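To make the merging step concrete, here is a minimal sketch in Python of turning stand-off annotations into in-line markup. It is not the ANC project's parser: the function name `merge_standoff` and the `(start, end, tag)` tuple representation are assumptions made for illustration, and the actual stand-off XML format is richer than character-offset triples. The sketch also assumes the selected annotations nest properly and rejects overlapping spans, since those cannot be written as well-formed in-line XML without further conventions.

```python
from xml.sax.saxutils import escape


def merge_standoff(text, annotations):
    """Merge stand-off annotations into the primary text as in-line XML.

    `annotations` is an iterable of (start, end, tag) triples, a hypothetical
    simplification: character offsets into `text`, with `end` exclusive.
    """
    for start, end, _ in annotations:
        if not 0 <= start <= end <= len(text):
            raise ValueError(f"span ({start}, {end}) lies outside the text")

    # Outer spans must open before inner ones: sort by start ascending,
    # then by end descending.
    spans = sorted(annotations, key=lambda a: (a[0], -a[1]))

    out = []    # output fragments, joined at the end
    stack = []  # currently open elements as (end, tag)
    pos = 0     # offset of the next character of `text` still to be emitted

    def close_until(limit):
        """Close every open element that ends at or before `limit`."""
        nonlocal pos
        while stack and stack[-1][0] <= limit:
            end, tag = stack.pop()
            out.append(escape(text[pos:end]))
            pos = end
            out.append(f"</{tag}>")

    for start, end, tag in spans:
        close_until(start)
        if stack and end > stack[-1][0]:
            # The new span starts inside an open element but ends after it.
            raise ValueError(f"<{tag}> overlaps <{stack[-1][1]}>")
        out.append(escape(text[pos:start]))
        pos = start
        out.append(f"<{tag}>")
        stack.append((end, tag))

    close_until(len(text))
    out.append(escape(text[pos:]))
    return "".join(out)


if __name__ == "__main__":
    layers = [(0, 11, "s"), (0, 3, "tok"), (4, 7, "tok"), (8, 11, "tok")]
    print(merge_standoff("The cat sat", layers))
```

Because the annotations stay separate from the text until this step, a caller can pass only the layers it wants (for example, sentence boundaries without tokens) and obtain a different in-line rendering of the same primary data. An event-driven tool would emit start-element, character, and end-element events at the points where this sketch appends strings.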