AUTOMATED WORD PUZZLE GENERATION USING TOPIC MODELS AND SEMANTIC RELATEDNESS MEASURES

We propose a knowledge-lean method for generating word puzzles from unstructured, unannotated document collections. The method generates three types of puzzles: odd one out, choose the related word, and separate the topics. The algorithm is based on topic models, semantic similarity, and network capacity. Puzzle difficulty is adjustable, and puzzles are generated at two levels: beginner and intermediate. Beginner puzzles could be suitable for, e.g., beginner language learners. Intermediate puzzles require more knowledge to solve, and that knowledge is often domain-specific. As an example, domain-specific puzzles are generated from a corpus of NIPS proceedings. The method can help puzzle designers compile a collection of word puzzles in a semi-automated manner: it produces a large number of candidate puzzles, from which designers choose, and optionally modify, the ones they want to include in the collection.
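To make the odd-one-out puzzle type concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that each topic is already available as a ranked word list (as a topic model might produce) and that each word has a vector from which cosine similarity can be computed. The function name `odd_one_out`, the parameter `max_sim`, and the toy topics and vectors are all hypothetical. Using a similarity ceiling as a difficulty knob is likewise an assumption made for illustration; the network-capacity component mentioned above is not modeled.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def odd_one_out(topics, vectors, topic_id, n_related=4, max_sim=0.5):
    """Build one odd-one-out puzzle (hypothetical helper).

    Takes the top `n_related` words of the chosen topic and searches the
    other topics for the word whose highest similarity to any of them is
    lowest. Returns (related_words, odd_word), or None when no candidate
    stays at or below the `max_sim` ceiling.
    """
    related = topics[topic_id][:n_related]
    best = None  # (similarity, word) of the best candidate so far
    for other_id, words in topics.items():
        if other_id == topic_id:
            continue
        for word in words:
            if word in related:
                continue
            sim = max(cosine(vectors[word], vectors[r]) for r in related)
            if sim <= max_sim and (best is None or sim < best[0]):
                best = (sim, word)
    if best is None:
        return None
    return related, best[1]


# Toy data, invented for illustration only.
topics = {
    "vision": ["image", "pixel", "camera", "edge"],
    "control": ["reward", "policy", "agent", "action"],
}
vectors = {
    "image": (1.0, 0.1), "pixel": (0.9, 0.2),
    "camera": (0.8, 0.1), "edge": (0.9, 0.3),
    "reward": (0.1, 1.0), "policy": (0.2, 0.9),
    "agent": (0.3, 0.8), "action": (0.2, 1.0),
}

puzzle = odd_one_out(topics, vectors, "vision")
print(puzzle)
```

In this sketch, lowering `max_sim` only admits odd words that are clearly unrelated to the rest, while raising it allows closer words and so, presumably, harder puzzles.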
