Lymphocyte-style word representations

Measuring semantic similarity between words is a common problem in many applications of computational linguistics and artificial intelligence, and the choice of word representation is the most important determinant of how well similarity can be measured. Inspired by analogies between words and lymphocytes, a lymphocyte-style word representation is proposed. The representation is built on the dependency syntax of sentences and represents the context of a word as the head properties and dependent properties of that word. To learn the representations, a multi-word-agent autonomous learning model (MWAALM) based on an artificial immune system is presented. This research offers a new perspective on language and words. Its main advantages are twofold: first, the lymphocyte-style word representation can express both similarities and dependency relations between words; second, the MWAALM is concise to implement and has the potential for continuous learning, since the simulated targets are themselves adaptive. The lymphocyte-style word representations are evaluated by computing similarities between words, with experiments conducted on the Penn Chinese Treebank 5.1. The experimental results indicate that the proposed word representations are effective.
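The idea of describing a word by its dependency context and then comparing words can be illustrated with a minimal sketch. This is not the MWAALM and not the paper's similarity measure: the toy arcs, the function names, the reading of "head properties" as the heads a word attaches to and "dependent properties" as the dependents it governs, and the use of averaged Jaccard overlap are all assumptions made for illustration only.

```python
from collections import defaultdict

# Toy dependency arcs as (head, dependent) pairs; invented for illustration.
ARCS = [
    ("eats", "cat"), ("eats", "fish"),
    ("eats", "dog"), ("eats", "meat"),
    ("cat", "black"), ("dog", "black"),
]

def build_properties(arcs):
    """Collect, for each word, the heads it attaches to and the dependents it governs."""
    heads = defaultdict(set)       # word -> set of heads (assumed "head" context)
    dependents = defaultdict(set)  # word -> set of dependents (assumed "dependent" context)
    for head, dep in arcs:
        heads[dep].add(head)
        dependents[head].add(dep)
    return dict(heads), dict(dependents)

def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def similarity(w1, w2, heads, dependents):
    """Average the overlap of the two context sets (an assumed, simple measure)."""
    head_sim = jaccard(heads.get(w1, set()), heads.get(w2, set()))
    dep_sim = jaccard(dependents.get(w1, set()), dependents.get(w2, set()))
    return 0.5 * (head_sim + dep_sim)

if __name__ == "__main__":
    heads, dependents = build_properties(ARCS)
    for pair in [("cat", "dog"), ("cat", "fish"), ("cat", "eats")]:
        print(pair, similarity(*pair, heads, dependents))
```

In this toy data, "cat" and "dog" share both their head and their dependent, so they come out as more similar to each other than "cat" is to "fish", which shares only the head. The same two sets also record which words can attach to which, which is the sense in which a dependency-based representation carries relations as well as similarity.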
