Methods to explore design space for MPEG RMC codec specifications

The recent MPEG Reconfigurable Media Coding (RMC) standard aims to define media processing specifications (e.g. video codecs) in a form that abstracts from the implementation platform while remaining an appropriate starting point for implementation on specific targets. To this end, the RMC framework has standardized both an asynchronous dataflow model of computation and an associated specification language. Together, these provide the formalism and the theoretical foundation for multimedia specifications. Even though these specifications are abstract and platform-independent, the new approach of developing implementations from such initial specifications presents clear advantages over approaches based on classical sequential specifications. The advantages are particularly appealing when targeting current and emerging homogeneous and heterogeneous manycore or multicore processing platforms. These highly parallel computing machines are gradually replacing single-core processors, particularly when the system design aims at reducing power dissipation or increasing throughput. However, a straightforward mapping of an abstract dataflow specification onto a concurrent and heterogeneous platform often does not produce an efficient result. Before an abstract specification can be translated into an efficient implementation in software and hardware, the dataflow networks need to be partitioned and then mapped to individual processing elements. Moreover, system performance requirements need to be accounted for in the design optimization process. This paper discusses the state of the art for the combinatorial problems that arise at this design space exploration step. Some recent developments and experimental results for image and video coding applications are illustrated. Both well-known and novel heuristics for problems such as mapping, scheduling and buffer minimization are investigated in the specific context of exploring the design space of dataflow program implementations.
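To make the mapping problem concrete, the sketch below shows one simple greedy heuristic for assigning dataflow actors to heterogeneous processing elements so that the largest per-element workload stays small. It is an illustration only and is not claimed to be one of the heuristics studied in the paper. The actor names, the cost figures and the function name `greedy_map` are all hypothetical, and communication and buffer costs are ignored.

```python
from typing import Dict, List, Tuple


def greedy_map(costs: Dict[str, List[float]]) -> Tuple[Dict[str, int], List[float]]:
    """Greedily map actors to processing elements (PEs).

    costs[actor][p] is an assumed workload estimate of `actor` when it
    runs on PE `p` (values may differ per PE on a heterogeneous platform).
    Returns the actor-to-PE mapping and the resulting load of each PE.
    """
    if not costs:
        return {}, []

    num_pe = len(next(iter(costs.values())))
    load = [0.0] * num_pe
    mapping: Dict[str, int] = {}

    # Place the heaviest actors first (ordered by their best-case cost).
    order = sorted(costs, key=lambda a: min(costs[a]), reverse=True)
    for actor in order:
        # Pick the PE whose load after adding this actor is smallest.
        best = min(range(num_pe), key=lambda p: load[p] + costs[actor][p])
        mapping[actor] = best
        load[best] += costs[actor][best]

    return mapping, load


# Hypothetical four-actor decoder network on two PEs.
example_costs = {
    "parser":  [4.0, 8.0],
    "idct":    [6.0, 3.0],
    "motion":  [5.0, 5.0],
    "display": [2.0, 6.0],
}
mapping, load = greedy_map(example_costs)
print(mapping)
print("bottleneck load:", max(load))
```

A greedy pass like this gives no optimality guarantee; in a design space exploration flow it would typically serve as a starting point that a local search or metaheuristic then refines, with scheduling and buffer sizing evaluated on each candidate mapping.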
