Efficient algorithms for task mapping on heterogeneous CPU/GPU platforms for fast completion time
Cong Liu | Husheng Zhou | Yuqun Zhang | Zexin Li | Ao Ding
[1] Sarfraz Khurshid,et al. An Empirical Study of Boosting Spectrum-Based Fault Localization via PageRank , 2021, IEEE Transactions on Software Engineering.
[2] Yuqun Zhang,et al. Simulee: Detecting CUDA Synchronization Bugs via Memory-Access Modeling , 2020, 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE).
[3] R. Buyya,et al. Data Allocation Mechanism for Internet-of-Things Systems With Blockchain , 2020, IEEE Internet of Things Journal.
[4] Xi Zheng,et al. A survey on security issues in services communication of Microservices‐enabled fog applications , 2019, Concurr. Comput. Pract. Exp..
[5] Marcelo Cogo Miletto,et al. OpenMP and StarPU Abreast: the Impact of Runtime in Task-Based Block QR Factorization Performance , 2019, Anais do Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD).
[6] Yuqun Zhang,et al. Automating CUDA Synchronization via Program Transformation , 2019, 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE).
[7] Nikil Dutt,et al. HESSLE-FREE , 2019, ACM Trans. Embed. Comput. Syst..
[8] Prasun Ghosal,et al. Dynamic Task Mapping and Scheduling with Temperature-Awareness on Network-on-Chip based Multicore Systems , 2019, J. Syst. Archit..
[9] Daniel Cordeiro,et al. PLB-HAC: Dynamic Load-Balancing for Heterogeneous Accelerator Clusters , 2019, Euro-Par.
[10] Marco Di Natale,et al. Pessimism in multicore global schedulability analysis , 2019, J. Syst. Archit..
[11] Ulrich Margull,et al. GPUart - An application-based limited preemptive GPU real-time scheduler for embedded systems , 2019, J. Syst. Archit..
[12] Shuai Che,et al. Northup: Divide-and-Conquer Programming in Systems with Heterogeneous Memories and Processors , 2019, 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS).
[13] Farokh B. Bastani,et al. Service-Oriented IoT Modeling and Its Deviation from Software Services , 2018, 2018 IEEE Symposium on Service-Oriented System Engineering (SOSE).
[14] Hoi-Jun Yoo,et al. UNPU: A 50.6TOPS/W unified deep neural network accelerator with 1b-to-16b fully-variable weight bit-precision , 2018, 2018 IEEE International Solid - State Circuits Conference - (ISSCC).
[15] Sarfraz Khurshid,et al. EdSketch: execution-driven sketching for Java , 2019, International Journal on Software Tools for Technology Transfer.
[16] Farokh B. Bastani,et al. A Framework for IoT-Based Monitoring and Diagnosis of Manufacturing Systems , 2017, 2017 IEEE Symposium on Service-Oriented System Engineering (SOSE).
[17] Shaoli Liu,et al. Cambricon-X: An accelerator for sparse neural networks , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[18] Miao Hu,et al. ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[19] Yu Wang,et al. Going Deeper with Embedded FPGA Platform for Convolutional Neural Network , 2016, FPGA.
[20] Song Han,et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[21] Wei Liu,et al. SSD: Single Shot MultiBox Detector , 2015, ECCV.
[22] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[23] Mehmet Deveci,et al. Fast and High Quality Topology-Aware Task Mapping , 2015, 2015 IEEE International Parallel and Distributed Processing Symposium.
[24] Ross B. Girshick,et al. Fast R-CNN , 2015, arXiv:1504.08083.
[25] Karin Strauss,et al. Accelerating Deep Convolutional Neural Networks Using Specialized Hardware , 2015.
[26] Jason Cong,et al. Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks , 2015, FPGA.
[27] Wei Quan,et al. A Hybrid Task Mapping Algorithm for Heterogeneous MPSoCs , 2015, ACM Trans. Embed. Comput. Syst..
[28] James H. Anderson,et al. Exploring the Multitude of Real-Time Multi-GPU Configurations , 2014, 2014 IEEE Real-Time Systems Symposium.
[29] Cong Liu,et al. Task mapping in heterogeneous embedded systems for fast completion time , 2014, 2014 International Conference on Embedded Software (EMSOFT).
[30] Hamid Arabnejad,et al. List Scheduling Algorithm for Heterogeneous Systems by an Optimistic Cost Table , 2014, IEEE Transactions on Parallel and Distributed Systems.
[31] Scott A. Mahlke,et al. Transparent CPU-GPU collaboration for data-parallel kernels on heterogeneous systems , 2013, Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques.
[32] R. Namyst,et al. Composing multiple StarPU applications over heterogeneous machines: A supervised approach , 2013, 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum.
[33] Cédric Augonnet,et al. StarPU-MPI: Task Programming over Clusters of Machines Enhanced with Accelerators , 2012, EuroMPI.
[34] Kyoung-Don Kang,et al. Supporting Preemptive Task Executions and Memory Copies in GPGPUs , 2012, 2012 24th Euromicro Conference on Real-Time Systems.
[35] Mark Silberstein,et al. PTask: operating system abstractions to manage GPUs as compute devices , 2011, SOSP.
[36] Rosa M. Badia,et al. Productive Cluster Programming with OmpSs , 2011, Euro-Par.
[37] Shinpei Kato,et al. TimeGraph: GPU Scheduling for Real-Time Multi-Tasking Environments , 2011, USENIX Annual Technical Conference.
[38] Assaf Schuster,et al. Processing data streams with hard real-time constraints on heterogeneous systems , 2011, ICS '11.
[39] Michael F. P. O'Boyle,et al. A Static Task Partitioning Approach for Heterogeneous Systems Using OpenCL , 2011, CC.
[40] Cédric Augonnet,et al. StarPU: a unified platform for task scheduling on heterogeneous multicore architectures , 2011, Concurr. Comput. Pract. Exp..
[41] Joshua S. Auerbach,et al. Lime: a Java-compatible and synthesizable language for heterogeneous architectures , 2010, OOPSLA.
[42] Christoph W. Kessler,et al. SkePU: a multi-backend skeleton programming library for multi-GPU systems , 2010, HLPP '10.
[43] Rizos Sakellariou,et al. DAG Scheduling Using a Lookahead Variant of the Heterogeneous Earliest Finish Time Algorithm , 2010, 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing.
[44] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[45] Shan Shan Huang,et al. Liquid Metal: Object-Oriented Programming Across the Hardware/Software Boundary , 2008, ECOOP.
[46] Gregory Diamos,et al. Harmony: an execution model and runtime for heterogeneous many core systems , 2008, HPDC '08.
[47] L. Kalé,et al. Application-specific topology-aware mapping for three dimensional topologies , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[48] Emmanuel Jeannot,et al. Comparative Evaluation Of The Robustness Of DAG Scheduling Heuristics , 2008, CoreGRID Integration Workshop.
[49] Muli Ben-Yehuda,et al. Tapping into the fountain of CPUs: on operating system support for programmable devices , 2008, ASPLOS.
[50] Rizos Sakellariou,et al. Scheduling multiple DAGs onto heterogeneous systems , 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.
[51] Rizos Sakellariou,et al. A hybrid heuristic for DAG scheduling on heterogeneous systems , 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings..
[52] Rizos Sakellariou,et al. An Experimental Investigation into the Rank Function of the Heterogeneous Earliest Finish Time Scheduling Algorithm , 2003, Euro-Par.
[53] William Thies,et al. StreamIt: A Language for Streaming Applications , 2002, CC.
[54] Salim Hariri,et al. Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing , 2002, IEEE Trans. Parallel Distributed Syst..
[55] Philip S. Yu,et al. Reinforcement-Learning-Guided Source Code Summarization Using Hierarchical Attention , 2022, IEEE Transactions on Software Engineering.
[56] Wolfgang Blochinger,et al. TASKWORK: A Cloud-aware Runtime System for Elastic Task-parallel HPC Applications , 2019, CLOSER.
[57] Lei Sun,et al. Building real-time parallel task systems on multi-cores: A hierarchical scheduling approach , 2019, J. Syst. Archit..
[58] Tongquan Wei,et al. Thermal-aware correlated two-level scheduling of real-time tasks with reduced processor energy on heterogeneous MPSoCs , 2018, J. Syst. Archit..
[59] Santanu Chattopadhyay,et al. Task mapping and scheduling for network-on-chip based multi-core platform with transient faults , 2018, J. Syst. Archit..
[60] Rui Zhang,et al. SmartVM: a SLA-aware microservice deployment framework , 2018, World Wide Web.
[61] Xiao Liu,et al. BigVM: A Multi-Layer-Microservice-Based Platform for Deploying SaaS , 2017, 2017 Fifth International Conference on Advanced Cloud and Big Data (CBD).
[62] Rashad Al-Jawfi,et al. Handwriting Arabic character recognition LeNet using neural network , 2009, Int. Arab J. Inf. Technol..