We describe a web application called ANC2Go that enables the user to select data from the Open American National Corpus (OANC) and the Manually Annotated Sub-corpus (MASC), together with some or all of the available annotations. The user may also choose among a variety of output formats, or may receive the selected portions of the corpus and annotations in their original GrAF XML standoff format. The request is processed by merging the selected annotations and rendering them in the desired output format, then bundling the results and making them available for download. Thus, users can create a customized corpus with data and annotations of their choosing, delivered in the format that is most convenient for their use. ANC2Go will be released as a web service in the near future. Both the OANC and MASC are freely available for any use from the American National Corpus website; they may be accessed through the ANC2Go application or downloaded in their entirety.