Deployment and verification of machine learning tool-chain based on Kubernetes distributed clusters
Xuehai Zhou | Haoyu Cai | Chao Wang
[1] Ekaba Bisong,et al. Building Machine Learning and Deep Learning Models on Google Cloud Platform: A Comprehensive Guide for Beginners , 2019 .
[2] Silvio Savarese,et al. Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Gang Yin,et al. An Insight Into the Impact of Dockerfile Evolutionary Trajectories on Quality and Latency , 2018, 2018 IEEE 42nd Annual Computer Software and Applications Conference (COMPSAC).
[4] Joseph Redmon,et al. YOLOv3: An Incremental Improvement , 2018, ArXiv.
[5] Marko Lukša,et al. Kubernetes in Action , 2018 .
[6] Dhabaleswar K. Panda,et al. Efficient Large Message Broadcast using NCCL and CUDA-Aware MPI for Deep Learning , 2016, EuroMPI.
[7] Mike Amundsen,et al. Microservice Architecture: Aligning Principles, Practices, and Culture , 2016 .
[8] Andy Davis,et al. TensorFlow: A System for Large-Scale Machine Learning , 2016, 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI '16).
[9] Oliver Kramer,et al. Machine Learning for Evolution Strategies , 2016 .
[10] Pooyan Jamshidi,et al. Microservices Architecture Enables DevOps: Migration to a Cloud-Native Architecture , 2016, IEEE Software.
[11] Zheng Zhang,et al. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems , 2015, ArXiv.
[12] Claus Pahl,et al. Containerization and the PaaS Cloud , 2015, IEEE Cloud Computing.
[13] David Bernstein,et al. Containers and Cloud: From LXC to Docker to Kubernetes , 2014, IEEE Cloud Computing.
[14] Dirk Merkel,et al. Docker: lightweight Linux containers for consistent development and deployment , 2014 .
[15] M. Zaharia,et al. Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center , 2011, NSDI.
[16] George Bosilca,et al. Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation , 2004, PVM/MPI.
[17] Richard Koo,et al. Checkpointing and Rollback-Recovery for Distributed Systems , 1986, IEEE Transactions on Software Engineering.
[18] Nikhil Ketkar,et al. Introduction to PyTorch , 2021, Deep Learning with Python.
[19] Ekaba Bisong,et al. Kubeflow and Kubeflow Pipelines , 2019, Building Machine Learning and Deep Learning Models on Google Cloud Platform.
[20] Thomas Kluyver,et al. Jupyter Notebooks - a publishing format for reproducible computational workflows , 2016, ELPUB.
[21] IEEE Conference on Computer Vision and Pattern Recognition , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).
[22] Dan Walsh,et al. Design and implementation of the Sun network filesystem , 1985, USENIX Conference Proceedings.