Tutorial T6: Variability-resistant Software and Hardware for Nano-Scale Computing

As semiconductor manufacturers build ever smaller components, circuits and chips at the nano scale become less reliable and more expensive to produce, no longer behaving like precisely chiseled machines with tight tolerances. Modern computing systems tend to ignore the variability in behavior of their underlying components from device to device, the wear-out of those components over time, and the environment in which the system is placed. This makes them expensive, fragile, and vulnerable to even the smallest environmental changes or component failures. This tutorial presents an approach to tame and exploit variability through a strategy in which system components, led by proactive software, routinely monitor, predict, and adapt to the variability of manufactured systems. Unlike conventional system design, where variability is hidden behind the conservative specifications of "over-designed" hardware, we describe strategies that expose spatiotemporal variations in hardware to the highest layers of software. After presenting the background and positioning the new approach, the tutorial will proceed in a bottom-up fashion. Causes of variability at the circuit and hardware levels are presented first, followed by classical approaches to hiding such variability. The tutorial then presents a number of strategies at successively higher levels of abstraction, covering the circuit, microarchitecture, compiler, operating system, and software applications, to monitor, detect, adapt to, and exploit the exposed variability. Adaptable software will use online statistical modeling to learn and predict actual hardware characteristics, opportunistically adjust to variability, and proactively conform to deliberately underdesigned hardware with relaxed design and manufacturing constraints. The resulting class of UnO (Underdesigned and Opportunistic) computing machines is adaptive yet highly energy efficient.
They will continue working while using components that vary in performance or grow less reliable over time and across technology generations. A fluid software-hardware interface will mitigate the variability of manufactured systems and make machines robust, reliable, and responsive to changing operating conditions, offering the best hope for perpetuating, at lower cost, the fundamental gains in computing performance of the past 40 years.