Enriching textbooks with images

Textbooks have a direct bearing on the quality of education imparted to students, so it is important that their content provide a rich learning experience. Recent studies of learning behavior suggest that incorporating digital visual material can greatly enhance learning. However, textbooks used in many developing regions are largely text-oriented and lack good visual material. We propose techniques for finding images from the web that are most relevant for augmenting a section of a textbook, while respecting the constraint that the same image is not repeated in different sections of the same chapter. We devise a rigorous formulation of the image assignment problem and present a polynomial-time algorithm that solves it optimally. We also present two image mining algorithms that utilize orthogonal signals and hence obtain different sets of relevant images. Finally, we provide an ensembling algorithm for combining the assignments. To evaluate our techniques empirically, we use a corpus of high school textbooks in use in India. Our user study on the Amazon Mechanical Turk platform indicates that the proposed techniques obtain images that can help increase understanding of the textbook material.
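
The abstract does not spell out the formulation, so the following is only a minimal sketch of one way the no-repeat constraint could be modeled, not the authors' algorithm. It assumes a simplified setting in which each section of a chapter receives exactly one image, the relevance of every image to every section is already known as an integer score, and there are at least as many candidate images as sections. Under those assumptions the task becomes a classic assignment problem, which the Hungarian method solves optimally in polynomial time. The function name assign_images and the toy scores are hypothetical.

```python
from typing import List

INF = float("inf")


def assign_images(relevance: List[List[int]]) -> List[int]:
    """Pick one distinct image per section, maximizing total relevance.

    relevance[s][i] is a (hypothetical) integer relevance score of image i
    for section s. Returns a list where entry s is the chosen image index.
    Uses the Hungarian method with potentials, O(n^2 * m) time.
    """
    n = len(relevance)           # number of sections
    m = len(relevance[0])        # number of candidate images
    if n > m:
        raise ValueError("need at least as many images as sections")

    # The Hungarian method minimizes cost, so negate relevance (1-indexed).
    def cost(i: int, j: int) -> int:
        return -relevance[i - 1][j - 1]

    u = [0] * (n + 1)            # potentials for sections
    v = [0] * (m + 1)            # potentials for images
    p = [0] * (m + 1)            # p[j] = section matched to image j (0 = free)
    way = [0] * (m + 1)          # back-pointers along the augmenting path

    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost(i0, j) - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Shift potentials so that one more edge becomes tight.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:       # reached a free image: augment
                break
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1

    result = [0] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            result[p[j] - 1] = j - 1
    return result


# Toy chapter: 3 sections, 4 candidate images, scores on a 0-100 scale.
scores = [
    [90, 80, 10, 0],
    [90, 20, 30, 10],
    [40, 70, 60, 20],
]
assignment = assign_images(scores)
total = sum(scores[s][img] for s, img in enumerate(assignment))
```

In this toy instance, picking the best image for each section in order would give sections 0 and 1 a conflict over image 0; the optimal assignment instead gives image 1 to section 0 and image 0 to section 1, which is the kind of trade-off a global formulation is meant to capture. Allowing several images per section would need a richer model than this sketch.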
