EACL 2014 14th Conference of the European Chapter of the Association for Computational Linguistics Proceedings of the 2nd Workshop on Continuous Vector Space Models and their Compositionality (CVSC)

In this paper, we introduce several vector space manipulation methods that are applied to trained vector space models in a post-hoc fashion, and present an application of these techniques in semantic role labeling for Finnish and English. Specifically, we show that the vectors can be circularly shifted to encode syntactic information and subsequently averaged to produce representations of predicate senses and arguments. Further, we show that it is possible to effectively learn a linear transformation between the vector representations of predicates and their arguments, within the same vector space.
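
The shift-and-average idea can be illustrated with a minimal sketch. It is not the authors' implementation: the relation names, the per-relation shift offsets in `SHIFT_BY_RELATION`, and the toy four-dimensional vectors are all assumptions made for illustration. It only shows how rotating each argument vector by a relation-specific offset before averaging makes the result sensitive to which word fills which syntactic slot.

```python
from collections import deque

# Hypothetical mapping from syntactic relation to shift offset (an assumption).
SHIFT_BY_RELATION = {"nsubj": 1, "dobj": 2}


def circular_shift(vec, k):
    """Rotate vec to the right by k positions, wrapping around the end."""
    d = deque(vec)
    d.rotate(k)
    return list(d)


def average(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def encode_arguments(args):
    """args is a list of (relation, vector) pairs; shift each, then average."""
    shifted = [circular_shift(v, SHIFT_BY_RELATION[rel]) for rel, v in args]
    return average(shifted)


# Toy word vectors (assumed, not taken from any trained model).
dog = [1.0, 0.0, 0.0, 0.0]
bone = [0.0, 1.0, 0.0, 0.0]

rep = encode_arguments([("nsubj", dog), ("dobj", bone)])
swapped = encode_arguments([("nsubj", bone), ("dobj", dog)])
```

Without the shift, averaging `dog` and `bone` gives the same vector regardless of their roles; with it, `rep` and `swapped` differ, so the syntactic configuration is preserved in the averaged representation.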
