Performance‐steered design of software architectures for embedded multicore systems

Many software applications that demand considerable computing power are moving to embedded systems, and to hand‐held devices in particular. One way to increase the computing power of such platforms while keeping both cost and power consumption low is to place multiple CPU cores on the same chipset. It is therefore essential to design applications that meet their performance requirements by exploiting the underlying parallel platform. Because embedded applications are usually assembled from components supplied by different companies, often without source code, the designer can usually operate only at the architectural level. So far, methodologies for designing software architectures have mainly addressed general‐purpose systems, often relying on hardware platforms with a high degree of parallelism. In this paper, we present our experience in the architectural design of parallel embedded applications and, based on it, propose a methodology for application design at the architectural level, targeted at embedded systems built upon multicore chipsets with a low degree of parallelism. The methodology relies on performance predictions obtained by simulation. It can be employed both for retargeting existing sequential applications to parallel processing platforms and for designing complete applications from scratch. We show the application of the proposed methodology to an embedded digital cartographic system. Starting from a software description expressed in UML diagrams, candidate software architectures that use different parallel solutions are first defined and then evaluated, and the one yielding the highest performance gain is selected. Copyright © 2002 John Wiley & Sons, Ltd.
