Solving Italian crosswords using the Web
Technical Report

We designed and implemented a software system, called WebCrow, that is the first solver for Italian crosswords and the first system to tackle a language game using the Web as its knowledge base. Its core feature is the Web Search Module, which performs a special form of web-based question answering that we call clue-answering. This paper focuses on that task. The web-search approach has proved very consistent: using a limited set of documents (30 for each clue), the clue-answering process retrieves over two thirds of the correct answers. In many cases the target word appears among the highest-ranked candidates, and for nearly 15% of the clues the correct answer is in first position. To complete the crossword, the system then has to fill the grid with the best set of candidate answers. WebCrow's current performance is encouraging: crosswords that are “easy” for expert humans (i.e. crosswords from the cover pages of La Settimana Enigmistica™) are solved, within a 15-minute time limit, with 80% of the words and over 90% of the letters correct. On crosswords designed for experts (i.e. examples by S. & A. Bartezzaghi in both La Settimana Enigmistica and La Repubblica), WebCrow correctly places two thirds of the words and around 80% of the letters.
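The two clue-answering figures quoted above (the share of clues whose correct answer is retrieved at all, and the share for which it is ranked first) can be illustrated with a minimal Python sketch. The function names, the data layout (a ranked candidate list per clue), and the three example clues are assumptions made for illustration only; they are not taken from WebCrow itself.

```python
from typing import Dict, List


def coverage(candidates: Dict[str, List[str]], gold: Dict[str, str]) -> float:
    """Fraction of clues whose correct answer appears anywhere in the candidate list."""
    hits = sum(1 for clue, answer in gold.items()
               if answer in candidates.get(clue, []))
    return hits / len(gold)


def first_position_rate(candidates: Dict[str, List[str]], gold: Dict[str, str]) -> float:
    """Fraction of clues whose correct answer is the top-ranked candidate."""
    hits = sum(1 for clue, answer in gold.items()
               if candidates.get(clue, [])[:1] == [answer])
    return hits / len(gold)


# Hypothetical clues with their correct answers (illustrative data only).
gold = {
    "Capitale d'Italia": "roma",
    "Fiume di Firenze": "arno",
    "Autore della Divina Commedia": "dante",
}

# Hypothetical ranked candidate lists, most probable candidate first.
candidates = {
    "Capitale d'Italia": ["roma", "milano"],        # correct answer ranked first
    "Fiume di Firenze": ["tevere", "arno"],         # correct answer present, not first
    "Autore della Divina Commedia": ["petrarca"],   # correct answer missing
}

print(f"coverage: {coverage(candidates, gold):.2f}")
print(f"first position: {first_position_rate(candidates, gold):.2f}")
```

In this toy data set two of the three correct answers are retrieved and one is ranked first. A candidate list that misses the first position but still contains the answer remains useful, because the grid-filling step can choose among the candidates using the crossing letters.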