Repair from a Chair: Computer Repair as an Untrusted Cloud Service

Today, when people need their computers repaired, their process is not very different from hiring someone to fix a television: they either bring the computer to a repair service [1, 3], or they call a technician (or family member) and ask for a house call. This process is inconvenient. It also risks the privacy of customers’ data and the integrity of their systems: repair services have gained notoriety for stealing personal data from customers [22] or otherwise intruding on their privacy [7].

Remote desktops [2, 4, 5] avoid physical movement but still require that the customer make a choice. The customer can spend time and monitor the repairer (though many customers of repair services probably do not have the technical savvy to detect spurious actions in the first place). Or, the customer can save time and ignore the repairer, giving him carte blanche, just as if the computer were in the shop. Neither choice seems great.

The purpose of this paper is to articulate a new vision for repair. This vision is motivated by three trends:

• Software repairs. We surveyed retail repair services and learned that a large majority of repairs today involve software changes only (§2). Such repairs do not inherently require travel.

• Virtual machines. Many computers today, even desktops, include virtualization technology, which could allow customers to ship their computers electronically [9, 19] (by sending the virtual machine image) and enforce guarantees against the repairer (by implementing protections in the virtual machine monitor).

• Outsourcing. Service providers have long offered value-added services, such as desktop management and IT consulting. Lately, commodity computing has followed an analogous path, migrating to well-provisioned, off-site, partially anonymous service providers (often known as the cloud).
We call our vision repair from a chair: let a customer, at the press of a button, electronically ship a computer to a third-party repairer whom the customer never meets; let the repairer be untrusted by the customer (meaning that the customer is protected against repairer error, whether accidental or intentional); and let the repair happen asynchronously. By asynchronously, we mean that the customer does not need to monitor the repair in real time.

Note that the context for this vision is retail repair, in contrast with much academic work on troubleshooting [8, 11, 14, 15, 17, 18, 20, 21, 23, 25–28]. There, an experienced system administrator is faced with a complex configuration issue (say, a subtly wrong token in httpd.conf), and this person trusts the troubleshooter (or they are the same person). In our setting, however, users are inexperienced (so much so that they would be unable to use the above-cited tools), the problems are relatively easy for the troubleshooter (as indicated by our survey), and the customer does not fully trust the repairer. Thus, the technical challenges in our scenario are different (though the tools above would be useful to the repairer and so are complementary to our work).

Our first challenge is to protect the privacy of the customer’s data. For example, if the repairer needs to correct a misconfiguration in a virus checker, he should not be able to see private vacation photos. We also have to protect the integrity of the customer’s system: if the repairer executes an invalid repair, a customer-side module should reject it, or roll it back if the customer later discovers a problem. Last, we want to protect availability: the customer should be able to keep working during the repair. This requires a way to merge the repairer’s changes with those of the customer.
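To make the merge requirement concrete, here is a minimal sketch of one possible approach: a file-level three-way merge. The paper does not prescribe this technique. The snapshot representation (path mapped to content hash), the function name three_way_merge, and the policy of keeping the customer's version on conflict are all assumptions made only for illustration.

```python
from typing import Dict, Set, Tuple

# Hypothetical representation: a snapshot maps a file path to a content hash.
# A path missing from the dict means the file does not exist in that snapshot.
Snapshot = Dict[str, str]


def three_way_merge(
    base: Snapshot, customer: Snapshot, repairer: Snapshot
) -> Tuple[Snapshot, Set[str]]:
    """Merge two snapshots that both diverged from `base`.

    Returns (merged snapshot, set of conflicting paths). On a conflict the
    customer's version is kept; this policy is an assumption of the sketch.
    """
    merged: Snapshot = {}
    conflicts: Set[str] = set()
    for path in set(base) | set(customer) | set(repairer):
        b, c, r = base.get(path), customer.get(path), repairer.get(path)
        if c == r:
            chosen = c  # both sides agree, or neither side changed the path
        elif c == b:
            chosen = r  # only the repairer changed (or deleted) the path
        elif r == b:
            chosen = c  # only the customer changed (or deleted) the path
        else:
            chosen = c  # both changed it differently: record a conflict
            conflicts.add(path)
        if chosen is not None:
            merged[path] = chosen
    return merged, conflicts


# Example: the repairer fixes the virus checker's configuration while the
# customer keeps editing a personal document.
base = {"/etc/av.conf": "h1", "/home/u/notes.txt": "h2"}
customer = {"/etc/av.conf": "h1", "/home/u/notes.txt": "h3"}
repairer = {"/etc/av.conf": "h4", "/home/u/notes.txt": "h2"}
merged, conflicts = three_way_merge(base, customer, repairer)
```

In this example the two change sets touch disjoint files, so the merge takes the repairer's configuration fix and the customer's edited document and reports no conflicts. A real system would also have to handle changes that are not visible at file granularity, which this sketch ignores.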
A key building block for solving these problems comes from the rich literature on dependency tracking [8, 15, 17, 18, 20, 21, 23, 28] and, in particular, selective redo [11, 14], which we (ironically) use to protect against the repairer. We will also borrow other work, including virtual machine migration [6, 10, 19]. There are also new problems to solve: protecting customer privacy while allowing the repairer to work, statically validating repairs, merging the repairer’s changes with the user’s, dependency tracking across OS upgrades, coherently composing the aforementioned, and more. However, we do not yet have complete solutions. Rather, this paper’s primary contributions are articulating both the vision and the research agenda that must be addressed to realize it. Secondary contributions are targeting retail repair (which implies a new model: easy repairs but untrusted repairers) and conducting an inquiry of current retail repair services, which we report next.

[1] Mona Attariyan, et al. Automating Configuration Troubleshooting with Dynamic Information Flow Analysis, 2010, OSDI.

[2] Steven D. Gribble, et al. Configuration Debugging as Search: Finding the Needle in the Haystack, 2004, OSDI.

[3] Xi Wang, et al. Intrusion Recovery Using Selective Re-execution, 2010, OSDI.

[4] Shan Lu, et al. Flight data recorder: monitoring persistent-state interactions to improve systems management, 2006, OSDI '06.

[5] Helen J. Wang, et al. Privacy-Preserving Friends Troubleshooting Network, 2005, NDSS.

[6] Michael Vrable, et al. Scalability, fidelity, and containment in the potemkin virtual honeyfarm, 2005, SOSP '05.

[7] Helen J. Wang, et al. Automatic Misconfiguration Troubleshooting with PeerPressure, 2004, OSDI.

[8] Wei-Ying Ma, et al. Automated known problem diagnosis with event traces, 2006, EuroSys.

[9] Michael Austin, et al. eCryptfs: An Enterprise-class Cryptographic Filesystem for Linux, 2005.

[10] Eyal de Lara, et al. The taser intrusion recovery system, 2005, SOSP '05.

[11] Dina Katabi, et al. Enabling Configuration-Independent Automation by Non-Expert Users, 2010, OSDI.

[12] Yi-Min Wang, et al. Flight Data Recorder: Always-on Tracing and Scalable Analysis of Persistent State Interactions to Improve Systems and Security Management, 2006.

[13] Andrew Warfield, et al. Live migration of virtual machines, 2005, NSDI.

[14] Monica S. Lam, et al. Optimizing the migration of virtual computers, 2002, OPSR.

[15] Monica S. Lam, et al. The collective: a cache-based system management architecture, 2005, NSDI.

[16] Margo I. Seltzer, et al. Provenance-Aware Storage Systems, 2006, USENIX ATC, General Track.

[17] Kiran-Kumar Muniswamy-Reddy, et al. Causality-based versioning, 2009, TOS.

[18] Brian D. Noble, et al. Using Provenance to Aid in Personal File Search, 2007, USENIX Annual Technical Conference.

[19] Mona Attariyan, et al. AutoBash: improving configuration management with operating system causality analysis, 2007, SOSP.

[20] Samuel T. King, et al. Backtracking intrusions, 2003, SOSP '03.

[21] Tzi-cker Chiueh, et al. Design, implementation, and evaluation of repairable file service, 2003, International Conference on Dependable Systems and Networks.