Efficient algorithms for task mapping on heterogeneous CPU/GPU platforms for fast completion time

[1]  Sarfraz Khurshid,et al.  An Empirical Study of Boosting Spectrum-Based Fault Localization via PageRank , 2021, IEEE Transactions on Software Engineering.

[2]  Yuqun Zhang,et al.  Simulee: Detecting CUDA Synchronization Bugs via Memory-Access Modeling , 2020, 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE).

[3]  R. Buyya,et al.  Data Allocation Mechanism for Internet-of-Things Systems With Blockchain , 2020, IEEE Internet of Things Journal.

[4]  Xi Zheng,et al.  A survey on security issues in services communication of Microservices‐enabled fog applications , 2019, Concurr. Comput. Pract. Exp.

[5]  Marcelo Cogo Miletto,et al.  OpenMP and StarPU Abreast: the Impact of Runtime in Task-Based Block QR Factorization Performance , 2019, Anais do Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD).

[6]  Yuqun Zhang,et al.  Automating CUDA Synchronization via Program Transformation , 2019, 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE).

[7]  Nikil Dutt,et al.  HESSLE-FREE , 2019, ACM Trans. Embed. Comput. Syst.

[8]  Prasun Ghosal,et al.  Dynamic Task Mapping and Scheduling with Temperature-Awareness on Network-on-Chip based Multicore Systems , 2019, J. Syst. Archit.

[9]  Daniel Cordeiro,et al.  PLB-HAC: Dynamic Load-Balancing for Heterogeneous Accelerator Clusters , 2019, Euro-Par.

[10]  Marco Di Natale,et al.  Pessimism in multicore global schedulability analysis , 2019, J. Syst. Archit.

[11]  Ulrich Margull,et al.  GPUart - An application-based limited preemptive GPU real-time scheduler for embedded systems , 2019, J. Syst. Archit.

[12]  Shuai Che,et al.  Northup: Divide-and-Conquer Programming in Systems with Heterogeneous Memories and Processors , 2019, 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS).

[13]  Farokh B. Bastani,et al.  Service-Oriented IoT Modeling and Its Deviation from Software Services , 2018, 2018 IEEE Symposium on Service-Oriented System Engineering (SOSE).

[14]  Hoi-Jun Yoo,et al.  UNPU: A 50.6TOPS/W unified deep neural network accelerator with 1b-to-16b fully-variable weight bit-precision , 2018, 2018 IEEE International Solid-State Circuits Conference (ISSCC).

[15]  Sarfraz Khurshid,et al.  EdSketch: execution-driven sketching for Java , 2019, International Journal on Software Tools for Technology Transfer.

[16]  Farokh B. Bastani,et al.  A Framework for IoT-Based Monitoring and Diagnosis of Manufacturing Systems , 2017, 2017 IEEE Symposium on Service-Oriented System Engineering (SOSE).

[17]  Shaoli Liu,et al.  Cambricon-X: An accelerator for sparse neural networks , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[18]  Miao Hu,et al.  ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).

[19]  Yu Wang,et al.  Going Deeper with Embedded FPGA Platform for Convolutional Neural Network , 2016, FPGA.

[20]  Song Han,et al.  EIE: Efficient Inference Engine on Compressed Deep Neural Network , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).

[21]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[22]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  Mehmet Deveci,et al.  Fast and High Quality Topology-Aware Task Mapping , 2015, 2015 IEEE International Parallel and Distributed Processing Symposium.

[24]  Ross B. Girshick,et al.  Fast R-CNN , 2015, arXiv:1504.08083.

[25]  Karin Strauss,et al.  Accelerating Deep Convolutional Neural Networks Using Specialized Hardware , 2015.

[26]  Jason Cong,et al.  Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks , 2015, FPGA.

[27]  Wei Quan,et al.  A Hybrid Task Mapping Algorithm for Heterogeneous MPSoCs , 2015, ACM Trans. Embed. Comput. Syst.

[28]  James H. Anderson,et al.  Exploring the Multitude of Real-Time Multi-GPU Configurations , 2014, 2014 IEEE Real-Time Systems Symposium.

[29]  Cong Liu,et al.  Task mapping in heterogeneous embedded systems for fast completion time , 2014, 2014 International Conference on Embedded Software (EMSOFT).

[30]  Hamid Arabnejad,et al.  List Scheduling Algorithm for Heterogeneous Systems by an Optimistic Cost Table , 2014, IEEE Transactions on Parallel and Distributed Systems.

[31]  Scott A. Mahlke,et al.  Transparent CPU-GPU collaboration for data-parallel kernels on heterogeneous systems , 2013, Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques.

[32]  R. Namyst,et al.  Composing multiple StarPU applications over heterogeneous machines: A supervised approach , 2013, 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum.

[33]  Cédric Augonnet,et al.  StarPU-MPI: Task Programming over Clusters of Machines Enhanced with Accelerators , 2012, EuroMPI.

[34]  Kyoung-Don Kang,et al.  Supporting Preemptive Task Executions and Memory Copies in GPGPUs , 2012, 2012 24th Euromicro Conference on Real-Time Systems.

[35]  Mark Silberstein,et al.  PTask: operating system abstractions to manage GPUs as compute devices , 2011, SOSP.

[36]  Rosa M. Badia,et al.  Productive Cluster Programming with OmpSs , 2011, Euro-Par.

[37]  Shinpei Kato,et al.  TimeGraph: GPU Scheduling for Real-Time Multi-Tasking Environments , 2011, USENIX Annual Technical Conference.

[38]  Assaf Schuster,et al.  Processing data streams with hard real-time constraints on heterogeneous systems , 2011, ICS '11.

[39]  Michael F. P. O'Boyle,et al.  A Static Task Partitioning Approach for Heterogeneous Systems Using OpenCL , 2011, CC.

[40]  Cédric Augonnet,et al.  StarPU: a unified platform for task scheduling on heterogeneous multicore architectures , 2011, Concurr. Comput. Pract. Exp.

[41]  Joshua S. Auerbach,et al.  Lime: a Java-compatible and synthesizable language for heterogeneous architectures , 2010, OOPSLA.

[42]  Christoph W. Kessler,et al.  SkePU: a multi-backend skeleton programming library for multi-GPU systems , 2010, HLPP '10.

[43]  Rizos Sakellariou,et al.  DAG Scheduling Using a Lookahead Variant of the Heterogeneous Earliest Finish Time Algorithm , 2010, 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing.

[44]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[45]  Shan Shan Huang,et al.  Liquid Metal: Object-Oriented Programming Across the Hardware/Software Boundary , 2008, ECOOP.

[46]  Gregory Diamos,et al.  Harmony: an execution model and runtime for heterogeneous many core systems , 2008, HPDC '08.

[47]  L. Kalé,et al.  Application-specific topology-aware mapping for three dimensional topologies , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.

[48]  Emmanuel Jeannot,et al.  Comparative Evaluation Of The Robustness Of DAG Scheduling Heuristics , 2008, CoreGRID Integration Workshop.

[49]  Muli Ben-Yehuda,et al.  Tapping into the fountain of CPUs: on operating system support for programmable devices , 2008, ASPLOS.

[50]  Rizos Sakellariou,et al.  Scheduling multiple DAGs onto heterogeneous systems , 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.

[51]  Rizos Sakellariou,et al.  A hybrid heuristic for DAG scheduling on heterogeneous systems , 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings.

[52]  Rizos Sakellariou,et al.  An Experimental Investigation into the Rank Function of the Heterogeneous Earliest Finish Time Scheduling Algorithm , 2003, Euro-Par.

[53]  William Thies,et al.  StreamIt: A Language for Streaming Applications , 2002, CC.

[54]  Salim Hariri,et al.  Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing , 2002, IEEE Trans. Parallel Distributed Syst.

[55]  Philip S. Yu,et al.  Reinforcement-Learning-Guided Source Code Summarization Using Hierarchical Attention , 2022, IEEE Transactions on Software Engineering.

[56]  Wolfgang Blochinger,et al.  TASKWORK: A Cloud-aware Runtime System for Elastic Task-parallel HPC Applications , 2019, CLOSER.

[57]  Lei Sun,et al.  Building real-time parallel task systems on multi-cores: A hierarchical scheduling approach , 2019, J. Syst. Archit.

[58]  Tongquan Wei,et al.  Thermal-aware correlated two-level scheduling of real-time tasks with reduced processor energy on heterogeneous MPSoCs , 2018, J. Syst. Archit.

[59]  Santanu Chattopadhyay,et al.  Task mapping and scheduling for network-on-chip based multi-core platform with transient faults , 2018, J. Syst. Archit.

[60]  Rui Zhang,et al.  SmartVM: a SLA-aware microservice deployment framework , 2018, World Wide Web.

[61]  Xiao Liu,et al.  BigVM: A Multi-Layer-Microservice-Based Platform for Deploying SaaS , 2017, 2017 Fifth International Conference on Advanced Cloud and Big Data (CBD).

[62]  Rashad Al-Jawfi,et al.  Handwriting Arabic character recognition LeNet using neural network , 2009, Int. Arab J. Inf. Technol.