Machine learning for automatic rule classification of agricultural regulations: A case study in Spain

Abstract. Pest management now depends on modern equipment and on complex information such as regulations and guidelines. This regulatory complexity has motivated automated compliance assessment, in which regulations are translated into sets of machine-processable rules that specialized modules of farm management information systems (FMIS) can execute. Translating the rules manually is prohibitively costly, so the translation should be supported by artificial intelligence techniques. In this paper, we use the official Spanish phytosanitary products registry to empirically evaluate four popular machine learning algorithms on the task of classifying pesticide regulations as prohibitions or obligations. We also evaluate whether preprocessing the texts with natural language processing techniques improves the performance of these algorithms. Finally, because of the specific characteristics of the texts found in pesticide regulations, we evaluate resampling techniques as well. Experiments show that the best-performing approach combines logistic regression as the machine learning algorithm, part-of-speech tagging as the natural language processing technique, and Tomek links as the resampling technique, reaching an F1 score of 68.8%, a precision of 84.46% and a recall of 60%. These results are promising and indicate that the approach can be used to develop a computer-aided tool for transforming textual pesticide regulations into machine-processable rules. To the best of our knowledge, this is the first study that evaluates artificial intelligence methods for the automatic translation of agricultural regulations into machine-processable representations.
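As an illustration of the resampling step named in the abstract, the following is a minimal sketch of Tomek-link undersampling. A Tomek link is a pair of samples from different classes that are each other's nearest neighbour; removing the majority-class member of each pair cleans up the class boundary. The 2-D feature vectors, the choice of "obligation" as the majority class, and the helper names are all assumptions made for this example; the abstract does not describe the implementation or the class distribution used in the paper.

```python
from math import dist


def nearest_neighbor(i, points):
    """Index of the point closest to points[i], excluding i itself."""
    return min((j for j in range(len(points)) if j != i),
               key=lambda j: dist(points[i], points[j]))


def tomek_links(points, labels):
    """Pairs (i, j), i < j, that are mutual nearest neighbours with different labels."""
    nn = [nearest_neighbor(i, points) for i in range(len(points))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and labels[i] != labels[j]]


def remove_majority_in_links(points, labels, majority):
    """Drop the majority-class member of every Tomek link."""
    drop = {k for pair in tomek_links(points, labels)
            for k in pair if labels[k] == majority}
    keep = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in keep], [labels[k] for k in keep]


# Hypothetical toy data: 2-D stand-ins for text feature vectors.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (3.0, 3.0), (3.2, 3.0), (5.0, 5.0)]
labels = ["obligation"] * 5 + ["prohibition"] * 2

links = tomek_links(points, labels)
clean_points, clean_labels = remove_majority_in_links(
    points, labels, majority="obligation")  # assumed majority class
print(links)
print(clean_labels)
```

In this toy set, the only cross-class mutual nearest neighbours are the samples at (3.0, 3.0) and (3.2, 3.0), so only the "obligation" sample at (3.0, 3.0) is removed. In practice the same step would be applied to the vectorized regulation texts before training the classifier.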
