Manifold Learning for the Semi-Supervised Induction of FrameNet Predicates: An Empirical Investigation

This work presents an empirical investigation of distributional models for the automatic acquisition of frame-inspired predicate words. Several semantic spaces, both word-based and syntax-based, are employed, and the impact of geometric representations obtained through dimensionality reduction is investigated. Data statistics are accordingly studied along two orthogonal perspectives: Latent Semantic Analysis exploits global properties, while Locality Preserving Projection emphasizes local regularities. The latter is applied by embedding prior FrameNet-derived knowledge in the corresponding non-Euclidean transformation. The empirical investigation reported here sheds some light on the role played by these spaces as complex kernels for supervised algorithms (i.e., Support Vector Machines): their use constitutes a novel approach to semi-supervised lexical learning and a highly appealing research direction for knowledge-rich scenarios such as FrameNet-based semantic parsing.
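
The contrast between the two reductions can be made concrete with their standard textbook formulations, sketched here rather than taken from this abstract. The symbols are illustrative assumptions: M is a word-by-context co-occurrence matrix, X holds the word vectors x_i as columns, W is the affinity matrix of a neighbourhood graph, and t is a kernel width. The last line, which restricts W to word pairs evoking a common frame, is only one hypothetical way of injecting FrameNet knowledge; the abstract does not specify how this is done.

```latex
% LSA (global): rank-k truncated singular value decomposition of M.
\begin{align}
  M &\approx U_k \Sigma_k V_k^{\top}
\end{align}

% LPP (local): find a projection direction a that keeps graph neighbours close.
\begin{align}
  \min_{a}\; & \sum_{i,j} \bigl(a^{\top} x_i - a^{\top} x_j\bigr)^{2} W_{ij}
  \quad \text{subject to} \quad a^{\top} X D X^{\top} a = 1, \\
  & D_{ii} = \sum_{j} W_{ij}, \qquad L = D - W, \\
  & X L X^{\top} a = \lambda\, X D X^{\top} a .
\end{align}

% Hypothetical supervised affinity (an assumption, not stated in the abstract):
\begin{align}
  W_{ij} &=
  \begin{cases}
    \exp\!\bigl(-\lVert x_i - x_j \rVert^{2} / t\bigr)
      & \text{if } x_i, x_j \text{ are neighbours and evoke a common frame,} \\
    0 & \text{otherwise.}
  \end{cases}
\end{align}
```

In this formulation the projection is built from the eigenvectors of the generalized eigenproblem with the smallest eigenvalues, so the prior knowledge enters only through the choice of W.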
