A large-scale evaluation of automated metadata inference approaches on sensors from air handling units

Abstract: Building automation systems provide abundant sensor data that data analytics can use to, among other things, improve a building's energy efficiency. However, deploying such applications, for example fault detection and diagnosis (FDD), across multiple buildings remains a challenge because of the non-trivial effort of organizing, managing, and extracting the metadata associated with sensors (e.g., information about their location and function) that these applications require. One reason for the problem is that varying conventions, acronyms, and standards are used to define this metadata. To better understand the nature of the problem, as well as the performance and scalability of existing solutions, we implement and test 6 different time-series-based metadata inference approaches on sensors from 614 air handling units (AHUs) instrumented in 35 building sites, accounting for more than 400 buildings distributed across the United States of America. We infer 12 types of sensors and actuators in AHUs required by a rule-based FDD application: AHU performance assessment rules (APAR). Our results show that: (1) the average accuracy of these approaches is similar across building sites, though there is significant variance; (2) the expected accuracy of classifying the point types required by APAR for a new, unseen building is, on average, 75%; (3) model performance does not decrease as long as the training and testing data are extracted from adjacent months.
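The evaluation setting described above, estimating accuracy on a new, unseen building, can be illustrated with a leave-one-site-out loop. The following is only a minimal sketch under stated assumptions: the summary-statistic features, the nearest-centroid classifier, the point-type names, and the synthetic data are all hypothetical stand-ins chosen for illustration, and none of them is one of the 6 approaches evaluated in the paper.

```python
import random
import statistics
from collections import defaultdict


def features(series):
    """Summarize one time series with a few simple statistics (assumed features)."""
    return (statistics.mean(series), statistics.pstdev(series),
            min(series), max(series))


def fit_centroids(samples):
    """Average the feature vectors of each point type (nearest-centroid model)."""
    grouped = defaultdict(list)
    for feats, label in samples:
        grouped[label].append(feats)
    return {label: tuple(statistics.mean(col) for col in zip(*rows))
            for label, rows in grouped.items()}


def predict(centroids, feats):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2
                                   for a, b in zip(centroids[lab], feats)))


def leave_one_site_out(data):
    """data: list of (site, series, label). Returns {held-out site: accuracy}."""
    scores = {}
    for held_out in sorted({site for site, _, _ in data}):
        # Train on every other site, test only on the held-out one.
        train = [(features(s), y) for site, s, y in data if site != held_out]
        test = [(features(s), y) for site, s, y in data if site == held_out]
        model = fit_centroids(train)
        hits = sum(predict(model, f) == y for f, y in test)
        scores[held_out] = hits / len(test)
    return scores


def make_toy_data(seed=0):
    """Synthetic readings for illustration only; not data from the study."""
    rng = random.Random(seed)
    # Hypothetical point types with an assumed (mean, spread) per type.
    profiles = {"supply_air_temp": (13.0, 1.0),
                "outdoor_air_temp": (20.0, 8.0),
                "damper_position": (50.0, 30.0)}
    data = []
    for site in ("site_a", "site_b", "site_c"):
        for label, (mu, sigma) in profiles.items():
            for _ in range(5):
                series = [rng.gauss(mu, sigma) for _ in range(96)]
                data.append((site, series, label))
    return data


scores = leave_one_site_out(make_toy_data())
mean_accuracy = statistics.mean(list(scores.values()))
```

Holding out a whole site, rather than a random subset of sensors, keeps every point from the test building out of training, which is what an estimate for an unseen building requires. The per-site entries in `scores` also expose the variance across sites that an average alone would hide.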
