Finding dependencies between actions using the crowd

Activity recognition can provide computers with the context underlying user inputs, enabling more relevant responses and more fluid interaction. However, training these systems is difficult because it requires observing every possible sequence of actions that comprise a given activity. Prior work has enabled the crowd to provide labels in real-time to train automated systems on-the-fly, but numerous examples are still needed before the system can recognize an activity on its own. To reduce the need to collect this data by observing users, we introduce ARchitect, a system that uses the crowd to capture the dependency structure of the actions that make up activities. Our tests show that over seven times as many examples can be collected using our approach versus relying on direct observation alone, demonstrating that by leveraging the understanding of the crowd, it is possible to more easily train automated systems.