Towards multimodal deep learning for activity recognition on mobile devices

Current smartphones and smartwatches come equipped with a variety of sensors, from light and inertial sensors to radio interfaces, enabling applications running on these devices to make sense of their surrounding environment. Rather than using each sensor independently, combining their sensing capabilities allows more interesting and complex applications to emerge (e.g., user activity recognition). However, differences between sensors, ranging from sampling rate to data-generation model (event-triggered or continuous sampling), make the integration of sensor streams challenging. Here we investigate the opportunity to use deep learning to integrate sensor data from multiple sensors. The intuition is that neural networks can identify non-intuitive features, largely from cross-sensor correlations, which can result in more accurate estimation. Initial results with a variant of a Restricted Boltzmann Machine (RBM) show better performance for this new approach compared to classic solutions.
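
To make the idea of RBM-based multimodal fusion concrete, the sketch below shows one possible setup, not the exact model used in this work: per-window features from two sensor streams are concatenated ("early fusion") so that a scikit-learn BernoulliRBM can learn hidden features capturing cross-sensor correlations, which are then fed to a simple classifier. The sensor names, feature dimensions, number of windows, and activity classes are placeholder assumptions for illustration only.

```python
# Illustrative sketch only: fuse features from two sensor streams with an RBM
# before classification. This is NOT the model from the paper; dimensions,
# window counts, and labels below are placeholder assumptions.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)

# Placeholder per-window features: 1000 windows, 30 accelerometer features
# and 30 gyroscope features per window (e.g., statistics over a time window).
acc_features = rng.normal(size=(1000, 30))
gyro_features = rng.normal(size=(1000, 30))
labels = rng.integers(0, 5, size=1000)  # 5 hypothetical activity classes

# Early fusion: concatenate the per-sensor feature vectors so the RBM's
# hidden units can model correlations across sensors, not just within one.
X = np.hstack([acc_features, gyro_features])

model = Pipeline([
    ("scale", MinMaxScaler()),          # BernoulliRBM expects values in [0, 1]
    ("rbm", BernoulliRBM(n_components=64, learning_rate=0.05,
                         n_iter=20, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X, labels)
print("training accuracy:", model.score(X, labels))
```

In this kind of pipeline the RBM layer acts as an unsupervised feature extractor over the fused input, which is one way the cross-sensor correlations mentioned above could be captured before a conventional classifier is applied.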