MaSiMe: A Customized Similarity Measure and Its Application for Tag Cloud Refactoring

The popularity of tag clouds on websites has increased notably, but their generation is criticized because the lack of control makes inconsistent and redundant results more likely. It is well known that when tags are freely chosen (instead of taken from a given set of terms), synonyms (multiple tags for the same meaning), problems of word normalization, and even heterogeneity among users are likely to arise, lowering the efficiency of content indexing and searching. To solve this problem, we have designed the Maximum Similarity Measure (MaSiMe), a dynamic and flexible similarity measure that is able to take into account and optimize several considerations of the user who wishes to obtain a redundancy-free tag cloud. Moreover, we include an algorithm to compute the measure effectively and a parametric study to determine the best configuration for this algorithm.
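
The abstract does not give the formula for MaSiMe, so the following Python sketch is only an illustration under stated assumptions, not the authors' definition or algorithm. It assumes that the measure can be read as the maximum, over weight vectors on a discrete grid that sum to 1, of a weighted sum of basic string similarity scores. The two basic measures (a normalized edit similarity and a character-bigram Dice coefficient), the function names, the `steps` granularity parameter, and the 0.8 threshold are all hypothetical choices made for this example.

```python
from itertools import combinations, product


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1], where 1 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def bigram_similarity(a: str, b: str) -> float:
    """Dice coefficient over the sets of character bigrams of both strings."""
    grams_a = {a[i:i + 2] for i in range(len(a) - 1)}
    grams_b = {b[i:i + 2] for i in range(len(b) - 1)}
    if not grams_a and not grams_b:
        return 1.0 if a == b else 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def max_weighted_similarity(a, b, measures, steps=10):
    """Assumed reading of a 'maximum similarity': try every weight vector on a
    grid of resolution 1/steps whose weights sum to 1, keep the best score."""
    scores = [measure(a, b) for measure in measures]
    best = 0.0
    for combo in product(range(steps + 1), repeat=len(measures)):
        if sum(combo) != steps:
            continue  # keep only weight vectors that sum to 1
        value = sum(w / steps * s for w, s in zip(combo, scores))
        best = max(best, value)
    return best


def redundant_pairs(tags, threshold=0.8):
    """Return tag pairs whose combined similarity reaches the threshold."""
    measures = [edit_similarity, bigram_similarity]
    return [
        (x, y)
        for x, y in combinations(tags, 2)
        if max_weighted_similarity(x, y, measures) >= threshold
    ]


print(redundant_pairs(["blog", "blogs", "python"]))
```

Under this reading, the weighted sum is linear in the weights, so the grid search always ends up at the score of the single best basic measure. The exhaustive loop is kept only to make the search over weightings explicit; pairs flagged by `redundant_pairs` would be candidates for merging when refactoring a tag cloud.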
