Mining the Quantified Self: Personal Knowledge Discovery as a Challenge for Data Science

The last several years have seen an explosion of interest in wearable computing, personal tracking devices, and the so-called quantified self (QS) movement. Quantified self involves ordinary people recording and analyzing numerous aspects of their lives to understand and improve themselves. This is now a mainstream phenomenon, attracting a great deal of attention, participation, and funding. As more people join the movement, companies are offering new hardware and software platforms that allow ever more aspects of daily life to be tracked. Nearly every aspect of the QS ecosystem is advancing rapidly, except for analytic capabilities, which remain surprisingly primitive. With increasing numbers of quantified self participants collecting ever greater amounts and types of data, many people literally have more data than they know what to do with. This article reviews the opportunities and challenges posed by the QS movement. Data science provides well-tested techniques for knowledge discovery, but making these useful for the QS domain poses unique challenges that derive from the characteristics of the data collected as well as the specific types of actionable insights that people want from the data. Using a small sample of QS time series data containing information about personal health, we provide a formulation of the QS problem that connects data to the decisions of interest to the user.
