Scrupulous, scrutable, and sumptuous

In creating a multimedia travelogue of my recent trip to Europe, I checked in to the most interesting places I visited using the location-based service Foursquare [1]. Launched in March 2009 and serving more than 50 million users, Foursquare is a social networking site that lets you bookmark (i.e., "check in" at) venues based on your geographic location. As with all such services, the more information about you available to the system, the better the interactions. However, as much as I love it, Foursquare suffers from data sparsity and the closed world problem for me: I'm an irregular user, so the service knows very little about me. For example, in Cambridge I checked in to the River Cam, a river I rowed on frequently once upon a time, and Foursquare excitedly exclaimed, "Your first river!" Not so, Foursquare. Not my first river. I concede it is perhaps the first river we have shared together. While this error is charming and amusing, information-poor user models can be dangerous. More trivially, they are a waste of our attentional resources, distracting us with irrelevant content. In the world of product recommendation, this manifests most irritatingly in recommendations for things we already own or would never purchase. Enough experiences like these with a service and one is likely to feel bemused at best, frustrated at worst. Users have a low threshold for how many poor experiences they are willing to endure before a service loses its allure. Such reduced engagement negatively impacts service viability from a business perspective. Thus, users and services have the same goal: to improve inference and recommendation quality. And that requires data, not just "what I did" data (behavioral and transaction logs) but also "why I did it" (motivation) data and "other things I'd like to do/explore" (aspiration) data.
Internet services that offer recommendations usually rely on "big data" and machine-learning algorithms to crunch the data to find patterns and make inferences and predictions. These are in some sense user models. But many such "user models" behind information targeting are not really focused on us as individual users, as people or persons. This is their power (these generic user models scale well) and also their weakness (none of us is entirely generic). These techniques fail in the face of data sparsity: without enough data to crunch on, there are no conclusions (or poor ones), no …