The semantic similarity ensemble

Computational measures of semantic similarity between geographic terms provide valuable support across geographic information retrieval, data mining, and information integration. To date, a wide variety of approaches to geo-semantic similarity have been devised. A judgment of similarity is not intrinsically right or wrong, but obtains a certain degree of cognitive plausibility, depending on how closely it mimics human behavior. Thus, selecting the most appropriate measure for a specific task is a significant challenge. To address this issue, we make an analogy between computational similarity measures and soliciting domain expert opinions, which incorporate a subjective set of beliefs, perceptions, hypotheses, and epistemic biases. Following this analogy, we define the semantic similarity ensemble (SSE) as a composition of different similarity measures, acting as a panel of experts having to reach a decision on the semantic similarity of a set of geographic terms. The approach is evaluated in comparison to human judgments, and results indicate that an SSE performs better than the average of its parts. Although the best member tends to outperform the ensemble, all ensembles outperform the average performance of each ensemble's member. Hence, in contexts where the best measure is unknown, the ensemble provides a more cognitively plausible approach.
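The ensemble idea can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in the abstract: members are combined by averaging their min-max normalized scores, cognitive plausibility is approximated by Spearman's rank correlation with human ratings, and the measure names and all numbers are hypothetical toy values, not results from the paper.

```python
from statistics import mean

def normalize(scores):
    """Min-max scale scores to [0, 1] so members on different scales are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def ensemble_scores(member_scores):
    """Combine members by averaging normalized scores per term pair (assumed rule)."""
    normalized = [normalize(s) for s in member_scores]
    return [mean(column) for column in zip(*normalized)]

def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))

if __name__ == "__main__":
    # Hypothetical human similarity ratings for four pairs of geographic terms.
    human = [0.9, 0.7, 0.4, 0.1]
    # Hypothetical member measures, each scoring the same four pairs.
    members = {
        "measure_a": [0.8, 0.9, 0.3, 0.2],
        "measure_b": [5.0, 2.0, 3.0, 1.0],
    }
    combined = ensemble_scores(list(members.values()))
    member_rhos = [spearman(scores, human) for scores in members.values()]
    print("ensemble rho:", spearman(combined, human))
    print("mean member rho:", mean(member_rhos))
```

Comparing the ensemble's correlation with the mean of its members' correlations mirrors the comparison described in the abstract; other combination rules, such as rank-based aggregation, could be substituted in `ensemble_scores`.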
