Digital Crop Health Monitoring by Analyzing Social Media Streams

This paper introduces the idea of using social media streams such as Twitter to identify occurrences of crop diseases. Climate change and changes in agricultural practices have altered crop disease dynamics, leading to an increase in crop damage. Monitoring crop disease occurrences across regions helps farmers prepare for such adverse situations and make effective use of crop protection products, thus ensuring enough produce for the growing population while protecting the environment. We investigate Machine Learning and Natural Language Processing techniques to spot agricultural discussions on Twitter and then analyze, categorize, and group them so that a stakeholder can use them to identify crop disease incidences, patterns, and trends at the regional scale. Current systems that rely on keyword-based search for agricultural diseases do not always yield agriculturally relevant tweets, and the tweets that are relevant can cover a range of sub-topics. Therefore, text classification forms the core component of this work. A two-fold classification process is employed: first separating agriculturally relevant tweets from the rest, and then performing fine-grained categorization on the relevant ones. The resulting model for agricultural tweet classification achieves 93% accuracy, and the fine-grained categorization model, which sorts tweets into 6 categories, achieves 75% accuracy. A prototype of an interactive web-based disease monitoring application is also presented. Location estimation is not always accurate; nonetheless, this work acts as a proof of concept for the introduction of social media as a novel data source in precision farming.
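
The two-fold classification described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the classifier or the six categories, so the sketch swaps in a tiny Naive Bayes model written with the Python standard library, and the training tweets, the labels "agri" and "other", and the categories "observation" and "treatment" are all hypothetical placeholders.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing.

    Illustrative stand-in only; the paper's actual model is not specified here.
    """

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.label_counts = Counter(labels)      # label -> number of documents
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training carry no evidence, so they are skipped.
        tokens = [t for t in text.lower().split() if t in self.vocab]
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)  # smoothed likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data: (tweet text, fine-grained category) for relevant tweets.
AGRI_TWEETS = [
    ("wheat rust spotted in my field today", "observation"),
    ("late blight on potato crop after the rain", "observation"),
    ("spraying fungicide on barley this morning", "treatment"),
    ("applied fungicide to the wheat crop", "treatment"),
]
# Hypothetical tweets that match disease keywords or are simply off-topic.
OTHER_TWEETS = [
    "my old car has rust on the door",
    "watching football with friends tonight",
    "new phone arrived this morning",
]

agri_texts = [text for text, _ in AGRI_TWEETS]

# Stage 1: agriculturally relevant vs. everything else.
relevance_model = NaiveBayes().fit(
    agri_texts + OTHER_TWEETS,
    ["agri"] * len(agri_texts) + ["other"] * len(OTHER_TWEETS),
)
# Stage 2: fine-grained category, trained on relevant tweets only.
category_model = NaiveBayes().fit(agri_texts, [cat for _, cat in AGRI_TWEETS])


def classify(tweet):
    """Return (relevance, category); category is None for irrelevant tweets."""
    if relevance_model.predict(tweet) != "agri":
        return ("other", None)
    return ("agri", category_model.predict(tweet))


if __name__ == "__main__":
    for tweet in ["yellow rust in the wheat field", "rust on my old car door"]:
        print(tweet, "->", classify(tweet))
```

The point of the cascade is that the second model never sees off-topic tweets, such as a keyword match on "rust" that refers to a car, so it only has to separate sub-topics within agricultural content.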
