Open Data Kit: Technologies for Mobile Data Collection and Deployment Experiences in Developing Regions

Gathering information accurately and quickly is essential if organizations working in low-resource settings are to have timely and sustainable impact. Because infrastructure is insufficient, many organizations currently collect data on paper in the field and rely on data entry clerks to digitize it later, which introduces latency and potential sources of error. However, the growth of cellular infrastructure, combined with the rapid decline in the cost of smartphones, presents an opportunity to shift the primary collection medium from paper to mobile devices. This dissertation presents our contribution to data collection in developing regions: Open Data Kit (ODK), an extensible, open-source suite of tools designed to facilitate tasks at every level of a data collection campaign. ODK currently provides three tools to this end: Collect, Aggregate, and Build. Collect is a mobile client that provides simple interfaces for collecting data. Aggregate is an easy-to-deploy data storage system hosted in the "cloud" or on local servers. Build is a web-based, drag-and-drop form designer created to simplify the creation of complex digital forms. By making it possible to both capture and present richer data (e.g., images, video, and location), ODK tools have given organizations new ways to collect and analyze information. We present the system architecture and, through example real-world deployments, highlight specific design decisions that have enabled new directions in data collection and workforce management. Finally, we discuss lessons learned in building the system and present promising future directions in the space.
