Optimizing thin client caches for mobile cloud computing

The rapid growth of interest in cloud computing as an accessible, on-demand, expandable computing facility is closely tied to the proliferation of intelligent mobile devices such as smartphones and tablets. Together, these technologies have the potential to leave nobody behind when it comes to computing applications, whether small and personal or large and organizational, regardless of geographic boundaries and economic conditions. However, many technical challenges still delay the realization of this vision with the responsiveness and quality that users expect. In this paper, we examine user requirements for accessing the cloud through thin clients and handheld mobile devices. In light of these requirements, we characterize some of the research developments that are needed, particularly in the area of device architecture. We present our work on exploring the cache design space of embedded processors for mobile and thin client devices. Our exploration uses a heuristic, evolutionary approach (a genetic algorithm) that significantly cuts down on the time and resources required while obtaining a near-optimal design. We demonstrate the real-world utility of our tool-chain, "CERE" (pronounced SIRI, short for CachE Recommendation Engine), by rapidly and efficiently designing a cache hierarchy that maximizes the performance of a web browser navigating to a set of popular websites while running on a single ARM core. The goal is to improve the user's experience of web browsing. "CERE" made the right choices: we observed a 17.1% speedup of the "best" hierarchy over the "worst" hierarchy. We also detail potential future directions.
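
To make the idea of evolutionary cache design space exploration concrete, the following is a minimal sketch of a genetic algorithm searching over cache hierarchy parameters. It is not the CERE tool-chain: the parameter ranges in `SPACE`, the function names, and above all `toy_cost` (a made-up analytic stand-in for a simulator run of the browser workload) are assumptions introduced purely for illustration.

```python
import random

# Hypothetical design space (assumed values, not taken from the paper).
SPACE = {
    "l1_kb":   [4, 8, 16, 32, 64],
    "l1_ways": [1, 2, 4, 8],
    "l2_kb":   [128, 256, 512, 1024, 2048],
    "l2_ways": [2, 4, 8, 16],
    "line_b":  [16, 32, 64, 128],
}
KEYS = list(SPACE)


def decode(genome):
    """Map a tuple of option indices to a concrete cache configuration."""
    return {k: SPACE[k][i] for k, i in zip(KEYS, genome)}


def toy_cost(cfg):
    """Made-up average memory access time in cycles (lower is better).

    Placeholder for an expensive simulation; NOT a validated cache model.
    """
    l1_hit = 1 + 0.02 * cfg["l1_kb"] + 0.1 * cfg["l1_ways"]
    l2_hit = 8 + 0.004 * cfg["l2_kb"] + 0.2 * cfg["l2_ways"]
    l1_miss = (0.5 / cfg["l1_kb"] ** 0.5 / (1 + 0.1 * cfg["l1_ways"])
               * (32 / cfg["line_b"]) ** 0.25)
    l2_miss = 4.0 / cfg["l2_kb"] ** 0.5 / (1 + 0.05 * cfg["l2_ways"])
    mem = 100 + 0.5 * cfg["line_b"]
    return l1_hit + l1_miss * (l2_hit + l2_miss * mem)


def evolve(pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Return (best config, its cost, number of distinct configs evaluated)."""
    rng = random.Random(seed)
    seen = {}  # genome -> cost, so no configuration is evaluated twice

    def cost(genome):
        if genome not in seen:
            seen[genome] = toy_cost(decode(genome))
        return seen[genome]

    def tournament(pop):
        # Pick the fittest of three randomly chosen individuals.
        return min(rng.sample(pop, 3), key=cost)

    pop = [tuple(rng.randrange(len(SPACE[k])) for k in KEYS)
           for _ in range(pop_size)]
    for _ in range(generations):
        children = [min(pop, key=cost)]  # elitism: keep the current best
        while len(children) < pop_size:
            a, b = tournament(pop), tournament(pop)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            for i, k in enumerate(KEYS):
                if rng.random() < mutation_rate:  # random-reset mutation
                    child[i] = rng.randrange(len(SPACE[k]))
            children.append(tuple(child))
        pop = children

    best = min(seen, key=seen.get)
    return decode(best), seen[best], len(seen)


if __name__ == "__main__":
    config, amat, evaluated = evolve()
    print(config, round(amat, 2), evaluated)
```

The memoization in `seen` reflects the motivation stated above: when each evaluation is a costly simulation, the search should visit only a fraction of the design space (1,600 configurations in this toy example) and never pay for the same configuration twice.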
