Correct and Control Complex IoT Systems: Evaluation of a Classification for System Anomalies

In practice, teams operating complex IoT systems lack precise inter-team communication about system anomalies when troubleshooting and conducting postmortem analyses. We evaluate the quality in use of an adaptation of IEEE Std 1044-2009 whose objective is to differentiate the handling of fault detection and fault reaction from the handling of defects and the options for defect correction. To this end, we extended the scope of IEEE Std 1044-2009 from anomalies related to software only to anomalies related to complex IoT systems. To evaluate the quality in use of our classification, we conducted a study at Robert Bosch GmbH: we applied our adaptation to a postmortem analysis of an IoT solution and interviewed three stakeholders. Our adaptation was applied effectively, and it enhanced inter-team communication as well as iterative and inductive learning for product improvement.
