Treatment of Missing Data in Intelligent Lighting Applications

Intelligent lighting uses machine learning models to adapt the behavior of a lighting application to changing context. Ideally, the learning algorithm is trained on a complete dataset without missing values. In practice, datasets commonly contain missing values, for example because sufficiently rich user interfaces, such as smartphones, are not available. In this paper, we study various probabilistic approaches to treating missing feature values in a dataset collected from an office breakout area. This dataset is used to train the learning model that provides intelligent lighting solutions. We evaluate the performance of five different approaches by simulation, using four rule-based classification algorithms and various proportions of missing data. We find that none of these approaches gives the best performance over the necessary range of conditions, and that an adaptive strategy is better suited.
