Brill Tagging using the Micron Automata Processor

Brill tagging is a classic rule-based algorithm for part-of-speech tagging in Natural Language Processing. However, the tagger is inherently slow on conventional von Neumann architectures. In this paper, we accelerate the second stage of Brill tagging on the Micron Automata Processor (AP), a new computing architecture that performs massively parallel pattern matching. We test the design on a subset of the Brown Corpus using 218 contextual rules. The results show a 38x speed-up for the second-stage tagger implemented on a single AP chip, compared to a single-threaded implementation on a CPU. This paper introduces the use of this new accelerator for computational linguistic tasks, particularly those that involve rule-based or pattern-matching approaches.

Keywords: Part-of-speech tagging; Brill tagging; the Automata Processor; Natural Language Processing
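To make the second stage concrete, the following is a minimal Python sketch of how contextual rules rewrite an initial tag sequence. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the `ContextualRule` type, the `apply_rules` function, and the single rule template (change one tag to another when a given tag appears at a fixed relative offset) are hypothetical simplifications, and the example rule is chosen for illustration rather than taken from the 218 rules used in the experiments.

```python
from typing import List, NamedTuple


class ContextualRule(NamedTuple):
    """Hypothetical, simplified rule: change from_tag to to_tag when the
    tag at position (i + offset) equals context_tag."""
    from_tag: str
    to_tag: str
    offset: int        # -1 = previous tag, +1 = next tag
    context_tag: str


def apply_rules(tags: List[str], rules: List[ContextualRule]) -> List[str]:
    """Apply each rule in order, scanning the sequence left to right.

    This is a plain sequential CPU loop. Every rule is checked against
    every position, which is the pattern-matching workload that a
    parallel matcher could take over.
    """
    tags = list(tags)  # work on a copy so the caller's list is untouched
    for rule in rules:
        for i, tag in enumerate(tags):
            j = i + rule.offset
            if (tag == rule.from_tag
                    and 0 <= j < len(tags)
                    and tags[j] == rule.context_tag):
                tags[i] = rule.to_tag
    return tags


# Illustrative input: "He is expected to race", with "race" initially
# tagged as a noun (NN) by a hypothetical first stage.
initial_tags = ["PRP", "VBZ", "VBN", "TO", "NN"]
rules = [ContextualRule(from_tag="NN", to_tag="VB", offset=-1, context_tag="TO")]

print(apply_rules(initial_tags, rules))
# prints ['PRP', 'VBZ', 'VBN', 'TO', 'VB']
```

In this sketch the cost grows with the number of rules times the number of tokens, which suggests why a sequential implementation becomes slow as the rule set grows and why matching all rules against the input in parallel is attractive.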
