Big data analytics for intelligent manufacturing systems: A review

Abstract With the development of Internet of Things (IoT), 5G, and cloud computing technologies, the amount of data generated by manufacturing systems has been increasing rapidly. With massive industrial data, achievements beyond expectations have been made in product design, manufacturing, and maintenance processes. Big data analytics (BDA) has become a core technology for empowering intelligent manufacturing systems. To give a full account of BDA for intelligent manufacturing systems, this paper provides a comprehensive review of associated topics, such as the concept of big data and model-driven and data-driven methodologies. The framework, development, key technologies, and applications of BDA for intelligent manufacturing systems are discussed, and the challenges and opportunities for future research are highlighted. It is hoped that this work will spark new ideas in the effort to realize BDA for intelligent manufacturing systems.
