Shannon Meets Shortz: A Probabilistic Model of Crossword Puzzle Difficulty

This article addresses the difficulty of crossword puzzles. We propose a model that quantifies the difficulty of a puzzle P with respect to its clues. Given a clue–answer pair (c, a), we model the difficulty of guessing a from c by the conditional probability P(a|c); easier mappings should have a higher conditional probability. We test the model in two experiments, each of which estimates the difficulty of puzzles taken from The New York Times. We also discuss how the notion of information implicit in our model relates to more easily quantifiable types of information that figure in crossword puzzles. © 2008 Wiley Periodicals, Inc.
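
As a rough illustration of scoring a clue–answer pair by P(a|c), the sketch below estimates the conditional probability from relative frequencies in a toy clue–answer list and converts it to surprisal in bits. The abstract does not say how P(a|c) is estimated or how clue-level scores are combined into a puzzle-level score, so the add-one smoothing, the use of surprisal, the averaging over clues, and every name and data value here are assumptions made for illustration, not the article's method.

```python
import math
from collections import Counter, defaultdict


def fit_counts(pairs):
    """Count how often each answer appears with each clue (toy estimator)."""
    counts = defaultdict(Counter)
    for clue, answer in pairs:
        counts[clue][answer] += 1
    return counts


def cond_prob(counts, clue, answer, vocab_size, alpha=1.0):
    """Smoothed relative-frequency estimate of P(answer | clue).

    Assumption: additive (Laplace) smoothing over a fixed answer vocabulary,
    so unseen clues fall back to a uniform distribution.
    """
    seen = counts.get(clue, Counter())
    return (seen[answer] + alpha) / (sum(seen.values()) + alpha * vocab_size)


def surprisal_bits(p):
    """Information content -log2(p); a higher value means a harder mapping."""
    return -math.log2(p)


def puzzle_difficulty(counts, puzzle, vocab_size):
    """Assumed aggregate: mean surprisal over the puzzle's clue-answer pairs."""
    total = sum(
        surprisal_bits(cond_prob(counts, clue, answer, vocab_size))
        for clue, answer in puzzle
    )
    return total / len(puzzle)


# Hypothetical training data: one clue seen with two different answers.
corpus = [("Sandwich cookie", "OREO")] * 3 + [("Sandwich cookie", "HYDROX")]
counts = fit_counts(corpus)
vocab_size = 2  # the toy answer vocabulary is {OREO, HYDROX}

easy = cond_prob(counts, "Sandwich cookie", "OREO", vocab_size)
hard = cond_prob(counts, "Sandwich cookie", "HYDROX", vocab_size)
```

In this toy setting the frequent answer receives the higher probability and therefore the lower surprisal, which matches the abstract's statement that easier mappings should have a higher conditional probability.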
