Distributed dynamic elastic nets: A scalable approach for regularization in dynamic manufacturing environments

In this paper, we focus on the task of learning influential parameters in unsteady and dynamic environments. Such environments often occur in the ramp-up phase of manufacturing. We propose a novel regularization-based framework for this problem, called Distributed Dynamic Elastic Nets (DDEN), and formulate it as a convex optimization objective. Our approach solves the optimization problem in a distributed framework; consequently, it is highly scalable and can easily be applied to very large datasets. We implement an L-BFGS-based solver in the Apache Spark framework. To validate our algorithm, we consider the problem of scrap reduction at an assembly line during the ramp-up phase of a manufacturing plant. We evaluate the performance of our approach using logistic regression as a sample model, although extensions of the proposed regularizer to other classification and regression techniques are straightforward. Through experiments on data collected at an operating manufacturing plant, we show that the proposed method not only reduces model variance but also helps preserve the relative importance of features under dynamic conditions, compared to standard approaches. The experiments further show that the classification performance of DDEN is often better than that of logistic regression with standard elastic nets on datasets from dynamic and unsteady environments. We are collaborating with manufacturing units to use this process for improving production yields during the ramp-up phase. This work serves as a demonstration of how data mining can be used to solve problems in manufacturing.
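
For orientation, the baseline that the abstract compares against, logistic regression with a standard elastic net penalty, is commonly written as the convex objective below. This is a sketch of that standard formulation only. The DDEN regularizer itself is not stated in the abstract and is not reproduced here. The symbols are assumptions introduced for illustration: n training pairs (x_i, y_i) with labels y_i in {-1, +1}, weight vector w, overall regularization strength lambda >= 0, and mixing parameter alpha in [0, 1].

```latex
% Standard elastic-net-regularized logistic regression (baseline, not DDEN).
% Assumed notation: x_i \in \mathbb{R}^d, y_i \in \{-1, +1\}, \lambda \ge 0, \alpha \in [0, 1].
\min_{w \in \mathbb{R}^d} \;
  \frac{1}{n} \sum_{i=1}^{n} \log\!\left(1 + \exp\!\left(-y_i \, w^{\top} x_i\right)\right)
  \;+\; \lambda \left( \alpha \, \lVert w \rVert_1
  \;+\; \frac{1 - \alpha}{2} \, \lVert w \rVert_2^2 \right)
```

Setting alpha = 1 recovers the lasso penalty and alpha = 0 the ridge penalty. Because the L1 term is not differentiable at zero, quasi-Newton solvers in the L-BFGS family typically need an adaptation such as an orthant-wise variant to handle it.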
