DS-DSE: Domain-specific design space exploration for streaming applications

Domain-specific computing is promising for high-performance, low-power execution of applications with similar functionality. In particular, streaming applications with significant functional and structural similarities can benefit tremendously. However, current Design Space Exploration (DSE) optimizes individual applications in isolation and therefore misses many domain-level optimization opportunities. DSE methodologies need to broaden their scope from individual applications to optimization across the applications within a domain. This paper introduces a novel Domain-Specific DSE (DS-DSE) approach for domain-specific computing with a focus on streaming applications. The key contributions are: (1) a formalized method to extract the functional and structural similarities of domain applications, (2) a novel algorithm for hardware/software partitioning of a domain-specific platform that maximizes throughput across domain applications (under certain constraints), and (3) a methodology to evaluate a domain platform. This paper demonstrates the benefits using four domains: OpenVX (vision processing) and three synthetic domains of greater complexity. In our experiments, the DS-DSE-generated platform improves average throughput by 36.8% for OpenVX and 46.2% for the synthetic domains compared to an application-specific platform.
