An Analysis of Tree Topological Features in Classifier-Based Unlexicalized Parsing

A novel set of "tree topological features" (TTFs) is investigated for improving a classifier-based unlexicalized parser. The features capture the location and shape of subtrees in the treebank. Four main categories of TTFs are proposed and compared. Experimental results show that each of the four categories independently yields a significant improvement in parsing accuracy over the baseline model. When these are combined using an ensemble technique, the best unlexicalized parser achieves 84% accuracy without any extra language resources, matching the performance of early lexicalized parsers. Linguistically, TTFs approximate notions such as grammatical weight, branching properties, and structural parallelism. This is illustrated by examining how the features capture structural parallelism in the processing of coordinate structures.
