Application of Machine Learning to support production planning of a food industry in the context of waste generation under uncertainty

Abstract

Food production is a complex process subject to considerable uncertainty (e.g., stochastic yield and demand, variability in raw materials and ingredients), which leads to differences between planned production and actual output. These discrepancies carry an economic cost for the company (e.g., waste disposal) as well as an environmental impact (food waste and an increased carbon footprint). This research aims to develop data analytics tools that predict the magnitude of these discrepancies, improving enterprise profitability while reducing environmental impact and supporting food waste management. A food company that produces liquid products based on fruits and vegetables was analyzed. Data were gathered on 1,795 batches, including the characteristics of the product (e.g., recipe, components used) and the difference between the input and the output weight. Machine Learning (ML) algorithms were used to predict deviations in production, reducing the uncertainty about the amount of waste produced. The ML models had greater predictive capacity than a linear model with stepwise parameter selection. Uncertainty was then included in the predictions using a normal distribution based on the residuals of the model. Furthermore, we demonstrate that ML models can be used as a tool to identify possible production anomalies. This research shows innovative ways to deal with uncertainty in production planning using modern methods from the field of operations research. These tools improve on classical methods and provide production managers with valuable information for assessing the economic benefits of improved machinery or process controls. Consequently, accurate predictive models can potentially improve the profitability of food companies while also reducing their environmental impact.
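The residual-based uncertainty step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the residual values, and the 95% coverage level are hypothetical, and it assumes residuals are defined as observed minus predicted deviation and are approximately normal. A batch whose observed deviation falls outside the resulting interval is flagged as a possible anomaly.

```python
from statistics import NormalDist, mean, stdev


def residual_interval(prediction, residuals, coverage=0.95):
    """Return (low, high) around a point prediction.

    Assumes residuals = observed - predicted and that they are roughly normal.
    """
    dist = NormalDist(mu=mean(residuals), sigma=stdev(residuals))
    tail = (1.0 - coverage) / 2.0
    # observed = prediction + residual, so shift the residual quantiles
    return prediction + dist.inv_cdf(tail), prediction + dist.inv_cdf(1.0 - tail)


def is_anomalous(observed, prediction, residuals, coverage=0.95):
    """Flag a batch whose observed deviation lies outside the interval."""
    low, high = residual_interval(prediction, residuals, coverage)
    return not (low <= observed <= high)


# Hypothetical residuals from a fitted model (kg of weight lost per batch)
residuals = [-4.0, -2.0, -1.0, 0.0, 0.0, 1.0, 2.0, 4.0]

# Interval around a hypothetical predicted loss of 50 kg
low, high = residual_interval(50.0, residuals)
print(f"95% interval: [{low:.1f}, {high:.1f}] kg")

# A batch that lost 60 kg would be flagged for inspection
print(is_anomalous(60.0, 50.0, residuals))
```

In practice the residuals would come from held-out batches rather than the training data, and the normality assumption should be checked before the interval is relied on.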
