Understanding Instructional Support Needs of Emerging Internet Users for Web-based Information Seeking

As the wealth of information available on the Web increases, Web-based information seeking becomes an increasingly important skill for supporting both formal education and lifelong learning. However, Web-based information access poses hurdles for certain student populations, such as users with low English competency, users with low literacy, or what we will refer to as emerging Internet users. The challenge springs from the fact that the bulk of information available on the Web is provided in a small number of high-profile languages such as English, Korean, and Chinese. These issues remain problematic despite research in cross-linguistic information retrieval and machine translation, because these technologies are still too brittle for extensive use by these user populations for the purpose of bridging the language gulf. In this paper, we propose a mixed-methods approach to addressing these issues specifically in connection with emerging Internet users, with data mining as a key component. Our target emerging Internet users are rural children who have recently become part of a technical university student population in the Indian state of Andhra Pradesh. As Internet penetration increases in the developing world and populations simultaneously shift from rural to urban life, such populations of emerging Internet users will be an important target for the design of scaffolding and educational support. In this context, in addition to using the Internet for their own personal information needs, students are expected to be able to receive assignments in English and use the Web to meet the information needs specified in those assignments. Thus, we begin our investigation with a small qualitative study in which we examine in detail the problems these students face when responding to search tasks given to them in English.
We first present a qualitative analysis of the result write-ups produced in response to the given information-seeking task, along with some observations about the corresponding search behavior. This analysis reveals problems caused by the strategies students were observed to employ to compensate for difficulty understanding the search task statement and the retrieved materials. Based on these specific observations, we present an extensive controlled study in which we manipulate both the characteristics of the search task and the manner in which it is presented (i.e., in English only, in the native language of Telugu only, or in both English and Telugu) in order to understand how a light form of support might affect task success for these information-seeking tasks. One important contribution of this work is a dataset from roughly 2,000 users that includes their pre-search response to the task statement, a log of their click behavior during search, and their post-search write-up. We present a data mining methodology that allows us to understand more broadly the difficulties faced by this student population, as well as how the experimental manipulation affects their search behavior. Results suggest that using machine translation for the limited task of translating information-seeking task statements, which is more feasible than translating queries or translating search results at large scale, may be beneficial for these users depending on the type of task. The data mining methodology itself, which can be applied as an assessment technique for evaluating search behavior in subsequent research, is a second contribution. Finally, the findings from the statistical analysis of the study results and from the data mining are a third contribution of the work.
