Linguistic Information in Word Embeddings

We study the presence of linguistically motivated information in word embeddings generated with statistical methods. The nominal aspects of uter/neuter, common/proper, and count/mass in Swedish are selected to represent grammatical, semantic, and mixed types of nominal categories, respectively. Our results indicate that typical grammatical and semantic features are easily captured by word embeddings. In our experiments with a single-layer feed-forward neural network, classifying semantic features required significantly fewer neurons than classifying grammatical features, although the semantic features also produced higher entropy in the classification output despite their high accuracy. The count/mass distinction, by contrast, proved difficult for the model even when the number of neurons was tuned close to its maximum.
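The experimental setup described above can be illustrated with a minimal sketch: a feed-forward network with a single hidden layer that classifies word embeddings into two categories, reporting both accuracy and the mean entropy of the softmax output. This is not the authors' exact implementation; the embedding dimensionality, hidden-layer size, synthetic data, and training hyperparameters are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "embeddings": 200 words, 50 dimensions, with a binary label
# (e.g. uter vs. neuter) that is linearly recoverable by construction.
X = rng.normal(size=(200, 50))
y = (X @ rng.normal(size=50) > 0).astype(int)

n_hidden = 8  # the "number of neurons" tuned in the experiments

W1 = rng.normal(scale=0.1, size=(50, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, 2))
b2 = np.zeros(2)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

lr = 0.5
for _ in range(1000):  # plain gradient descent on cross-entropy
    h = np.maximum(X @ W1 + b1, 0.0)   # hidden layer, ReLU
    p = softmax(h @ W2 + b2)           # class probabilities
    g2 = p.copy()
    g2[np.arange(len(y)), y] -= 1.0    # gradient w.r.t. logits
    g2 /= len(y)
    gW2, gb2 = h.T @ g2, g2.sum(axis=0)
    gh = (g2 @ W2.T) * (h > 0)         # backprop through ReLU
    gW1, gb1 = X.T @ gh, gh.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

h = np.maximum(X @ W1 + b1, 0.0)
p = softmax(h @ W2 + b2)
accuracy = (p.argmax(axis=1) == y).mean()
# Mean entropy (in bits) of the output distribution: low entropy means
# confident predictions, high entropy means uncertain ones.
entropy = -(p * np.log2(p + 1e-12)).sum(axis=1).mean()
print(f"accuracy={accuracy:.2f}  mean entropy={entropy:.3f} bits")
```

In this framing, a feature is "easy" for the embeddings if high accuracy is reached with a small `n_hidden`, while the entropy of the output distribution measures how confident those correct predictions are.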