Syntax and Parsing

Parsing uncovers the hidden structure of linguistic input. In many applications involving natural language, the underlying predicate-argument structure of sentences can be useful. The syntactic analysis of language provides a means to explicitly discover the various predicate-argument dependencies that may exist in a sentence. In natural language processing, syntactic analysis can vary from the very low-level, such as tagging each word in a sentence with a part of speech, to the very high-level, such as recovering a structural analysis that identifies the dependency between each predicate in the sentence and its explicit and implicit arguments. The major bottleneck in parsing natural language is that ambiguity is pervasive: the most plausible analysis has to be chosen from an exponentially large number of alternatives (a short sketch at the end of this section makes this growth concrete). From tagging to full parsing, algorithms must be chosen carefully to handle such ambiguity. This chapter explores syntactic analysis methods, from tagging to full parsing, and the use of supervised machine learning to deal with ambiguity.

1 Parsing Natural Language

In a text-to-speech application, input sentences are converted to spoken output that should sound as if it had been spoken by a native speaker of the language. Consider the following pair of sentences (imagine them spoken rather than written):

1. He wanted to go for a drive in movie.¹
2. He wanted to go for a drive in the country.

¹ When written, "drive in" would probably be hyphenated in the second utterance.

There is a natural pause between the words "drive" and "in" in sentence 2 that reflects an underlying hidden structure of the sentence. Parsing can provide a structural description that identifies such a break in the intonation. A simpler case occurs in the following sentence:

3. The cat who lives dangerously had nine lives.

In this case, a text-to-speech system needs to know that the first instance of the word "lives" is a verb and the second instance is a noun before it can begin to produce the natural intonation for this sentence. This is an instance of the part-of-speech tagging problem, in which each word in the sentence is assigned its most likely part of speech.
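As one way to make the tagging example concrete, here is a minimal Python sketch that runs an off-the-shelf tagger over sentence 3. It assumes NLTK and its pretrained English models are available (exact resource names vary slightly across NLTK releases); any comparable tagger would serve equally well.

    import nltk

    # One-time model downloads, assumed already done (names vary by NLTK version):
    #   nltk.download("punkt")
    #   nltk.download("averaged_perceptron_tagger")

    tokens = nltk.word_tokenize("The cat who lives dangerously had nine lives.")
    print(nltk.pos_tag(tokens))
    # A correct tagger labels the first "lives" as a verb (VBZ) and the
    # second as a plural noun (NNS); a text-to-speech system can then
    # select the verb pronunciation /livz/ and the noun pronunciation /laivz/.

Note that the tagger must use the surrounding context to disambiguate: the word "lives" in isolation is compatible with either tag.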
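The claim that the number of alternative analyses grows exponentially can also be made precise: the number of distinct binary bracketings of an n-word sentence is the Catalan number C(n-1). The sketch below (an illustration, not part of the chapter's own material) computes a few values using only the Python standard library.

    from math import comb

    def catalan(n):
        # C(n) = (2n choose n) / (n + 1): the number of distinct
        # binary trees with n + 1 leaves.
        return comb(2 * n, n) // (n + 1)

    # An n-word sentence has catalan(n - 1) binary bracketings.
    for n in (5, 10, 20):
        print(n, catalan(n - 1))
    # Prints 14 for 5 words, 4862 for 10, and 1767263190 for 20:
    # already over a billion candidate analyses for a modest 20-word
    # sentence, which is why a parser cannot simply enumerate them.

This growth is what forces parsing algorithms to use dynamic programming over shared substructures, and statistical models to rank the alternatives, rather than scoring each analysis separately.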
