Word sense disambiguation: A complex network approach

Abstract The word sense disambiguation (WSD) task aims to identify the meaning, in a given context, of words that convey multiple meanings. This task plays a prominent role in a myriad of real-world applications, such as machine translation, word processing and information retrieval. Recently, concepts and methods of complex networks have been employed to tackle this task by representing words as nodes, which are connected if they are semantically similar. Despite the increasing number of studies carried out with such models, most of them use networks only to represent the data, while pattern recognition on the attribute space is performed using traditional learning techniques. In other words, the structural relationships between words have not been explicitly used in the pattern recognition process. In addition, only a few investigations have probed the suitability of representations based on bipartite networks and graphs (bigraphs) for the problem, as many approaches consider all possible links between words. In this context, we assess the relevance of a bipartite network model representing both feature words (i.e. the words characterizing the context) and target (ambiguous) words to solve ambiguities in written texts. Here, we focus on semantic relationships between these two types of words, disregarding relationships between feature words. The adopted method not only serves to represent texts as graphs, but also constructs a structure on which the discrimination of senses is accomplished. Our results revealed that the adopted learning algorithm in such bipartite networks provides excellent results, mostly when local features are employed to characterize the context. Surprisingly, our method even outperformed the support vector machine algorithm in particular cases, with the advantage of being robust even when only a small training dataset is available. Taken together, the results obtained here show that the representation/classification used for the WSD problem might be useful to improve the semantic characterization of written texts without the use of deep linguistic information.
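
To make the bipartite representation concrete, the following minimal Python sketch links each occurrence of an ambiguous target word to the feature words found in a small window around it, with no links between feature words. It is an illustration under stated assumptions, not the learning algorithm evaluated in the paper: the function names (`build_bipartite`, `predict_sense`), the window size, the toy sentences and the simple neighbour-voting rule are all hypothetical choices made for this example.

```python
from collections import defaultdict


def build_bipartite(instances, window=2):
    """Map each feature word to the set of target-instance ids it is linked to.

    instances: list of (tokens, target_index, sense_or_None).
    Only feature-to-target links are created; feature words are never
    linked to each other, which is what keeps the graph bipartite.
    """
    feature_to_instances = defaultdict(set)
    for inst_id, (tokens, target_idx, _sense) in enumerate(instances):
        lo = max(0, target_idx - window)
        hi = min(len(tokens), target_idx + window + 1)
        for pos in range(lo, hi):
            if pos != target_idx:
                feature_to_instances[tokens[pos].lower()].add(inst_id)
    return feature_to_instances


def predict_sense(instances, feature_to_instances, query_id):
    """Assumed voting rule: every labelled instance that shares a feature
    word with the query contributes one vote for its own sense."""
    votes = defaultdict(int)
    for inst_ids in feature_to_instances.values():
        if query_id not in inst_ids:
            continue
        for other in inst_ids:
            sense = instances[other][2]
            if other != query_id and sense is not None:
                votes[sense] += 1
    return max(votes, key=votes.get) if votes else None


# Toy data (invented): two labelled occurrences of "bank" and one unlabelled.
instances = [
    ("she sat on the river bank watching boats".split(), 5, "shore"),
    ("he deposited money at the bank today".split(), 5, "finance"),
    ("we fished from the river bank yesterday".split(), 5, None),
]
graph = build_bipartite(instances, window=2)
predicted = predict_sense(instances, graph, query_id=2)
print(predicted)
```

Here the small window plays the role of the local features mentioned in the abstract. The unlabelled occurrence shares "the" with both labelled instances but "river" only with the first, so the vote favours that instance's sense.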
