Beyond the Pipeline: Discrete Optimization in NLP

We present a discrete optimization model, based on a linear programming formulation, as an alternative to the cascade of classifiers used in many language processing systems. Because NLP tasks are correlated with one another, processing them sequentially does not guarantee an optimal overall solution. We apply our model to a natural language generation (NLG) application and show that it outperforms a pipeline-based system.
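
To illustrate why sequential processing can miss the optimum, here is a minimal sketch with two correlated tasks. All task names, labels, and cost values are hypothetical and chosen only for illustration; they do not come from the paper. Exhaustive enumeration stands in for a linear programming solver, which is feasible only because the toy label space is tiny.

```python
from itertools import product

# Hypothetical classifier costs (lower is better) for two tasks.
LABEL_COST = {
    "t1": {"a": 4, "b": 5},
    "t2": {"x": 3, "y": 6},
}

# Hypothetical pairwise costs that encode how the two tasks are correlated.
PAIR_COST = {
    ("a", "x"): 9, ("a", "y"): 8,
    ("b", "x"): 1, ("b", "y"): 7,
}


def total_cost(l1, l2):
    """Cost of a full assignment: both label costs plus the pairwise cost."""
    return LABEL_COST["t1"][l1] + LABEL_COST["t2"][l2] + PAIR_COST[(l1, l2)]


def pipeline():
    """Decide t1 first from its own cost only, then t2 given the fixed t1 label."""
    l1 = min(LABEL_COST["t1"], key=LABEL_COST["t1"].get)
    l2 = min(LABEL_COST["t2"],
             key=lambda l: LABEL_COST["t2"][l] + PAIR_COST[(l1, l)])
    return (l1, l2), total_cost(l1, l2)


def joint():
    """Search all label combinations for the lowest total cost."""
    best = min(product(LABEL_COST["t1"], LABEL_COST["t2"]),
               key=lambda pair: total_cost(*pair))
    return best, total_cost(*best)


if __name__ == "__main__":
    print("pipeline:", pipeline())
    print("joint:   ", joint())
```

With these numbers, the pipeline commits to label `a` for the first task because it is locally cheapest, and the second task cannot undo that choice. The joint search accepts a slightly worse first label in exchange for a much cheaper pairing, so its total cost is lower.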
