EMPROF: Memory Profiling Via EM-Emanation in IoT and Hand-Held Devices

This paper presents EMPROF, a new method for profiling the performance impact of the memory subsystem without any support on, or interference with, the profiled system. Rather than relying on hardware support and/or software instrumentation on the profiled system, EMPROF analyzes the system's EM emanations to identify processor stalls that are associated with last-level cache (LLC) misses. This enables EMPROF to accurately pinpoint LLC misses in the execution timeline and to measure the cost (stall time) of each miss. Since EMPROF has zero "observer effect", it can be used to profile applications that adjust their activity to their performance. It imposes no overhead on the target machine, so it can be used for profiling embedded, hand-held, and IoT devices, which usually have limited support for collecting, and limited resources for storing, the profiling data. Finally, since EMPROF can profile the system as-is, its profiling of boot code and other hard-to-profile software components is as accurate as its profiling of application code. To illustrate the effectiveness of EMPROF, we first validate its results using microbenchmarks with known memory behavior, and also using SPEC benchmarks running on a cycle-accurate simulator that can provide detailed ground-truth data about LLC misses and processor stalls. We then demonstrate the effectiveness of EMPROF on real systems, including profiling of boot activity, show how its results can be attributed to specific parts of the application code when that code is available, and provide additional insight into the statistics reported by EMPROF and how they are affected by the EM signal bandwidth provided to EMPROF.
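The abstract does not specify how stalls are extracted from the EM signal, so the following is only a minimal sketch of the general idea, under the assumption that a processor stall appears as a sustained dip in the amplitude envelope of the received signal. The function name `find_stalls`, its parameters, and the fixed-threshold rule are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def find_stalls(
    envelope: Sequence[float],
    sample_rate_hz: float,
    threshold: float,
    min_duration_s: float,
) -> List[Tuple[float, float]]:
    """Return (start_time_s, duration_s) for each sustained dip in the envelope.

    Assumption (not from the paper): a stall shows up as a run of
    consecutive samples whose amplitude is below `threshold`.
    """
    stalls: List[Tuple[float, float]] = []
    # Shortest run, in samples, that is reported as a stall.
    min_samples = max(1, round(min_duration_s * sample_rate_hz))
    run_start = None
    # The infinite sentinel closes a dip that extends to the end of the trace.
    for i, level in enumerate(list(envelope) + [float("inf")]):
        if level < threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            length = i - run_start
            if length >= min_samples:
                stalls.append((run_start / sample_rate_hz, length / sample_rate_hz))
            run_start = None
    return stalls


# Synthetic envelope sampled at 10 Hz: one 4-sample dip and one 1-sample glitch.
trace = [1.0] * 5 + [0.1] * 4 + [1.0] * 3 + [0.1] * 1 + [1.0] * 2
stalls = find_stalls(trace, sample_rate_hz=10.0, threshold=0.5, min_duration_s=0.2)
```

In this sketch each reported interval gives both the position of a candidate miss in the timeline and its stall time. The time resolution is bounded by the sample rate, which is one simple way to see why the available signal bandwidth would influence the reported statistics.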
