Identifying enrichment candidates in textbooks

Many textbooks written in emerging countries lack clear and adequate coverage of important concepts. We propose a method for algorithmically identifying the sections of a book that are poorly written and would benefit from better exposition. The method is a decision model based on two properties of a section: the syntactic complexity of its writing and the dispersion of its key concepts. The model parameters are learned from a tuning set that is generated algorithmically, using a versioned authoritative web resource as a proxy. An evaluation over a corpus of Indian textbooks demonstrates the effectiveness of the methodology in identifying enrichment candidates.
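To make the shape of such a decision model concrete, here is a minimal sketch in Python. It is not the model from the paper: the two feature functions, their names, and the threshold values are all assumptions made for illustration. Average sentence length stands in for syntactic complexity, and the number of distinct key concepts per sentence stands in for dispersion. The thresholds are hand-picked placeholders, whereas the abstract says the real parameters are learned from a tuning set.

```python
import re


def split_sentences(text):
    """Naive sentence splitter on ., ! and ? (an assumption, not the paper's)."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def avg_sentence_length(text):
    """Mean words per sentence: a crude stand-in for syntactic complexity."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)


def concept_dispersion(text, key_concepts):
    """Distinct key concepts mentioned per sentence (hypothetical measure)."""
    sentences = [s.lower() for s in split_sentences(text)]
    if not sentences:
        return 0.0
    mentioned = {
        c for c in key_concepts
        if any(c.lower() in s for s in sentences)
    }
    return len(mentioned) / len(sentences)


def is_enrichment_candidate(text, key_concepts,
                            max_length=25.0, max_dispersion=1.5):
    """Flag a section when either feature exceeds its placeholder threshold."""
    return (avg_sentence_length(text) > max_length
            or concept_dispersion(text, key_concepts) > max_dispersion)


section = "Plants make food. Plants need light."
print(is_enrichment_candidate(section, ["plants", "light"]))
```

In a fuller version, the two thresholds would be fitted on labeled sections rather than fixed by hand, which is the role the tuning set plays in the abstract.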
